OCR-RCNN-v2 is designed for autonomous elevator manipulation, with the goal of enabling a robot to operate previously unvisited elevators on its own. This repository contains the perception part of the project. We published the initial version in the paper "A Novel OCR-RCNN for Elevator Button Recognition"; this version improves the accuracy by 20% and runs in real time at 640×480 resolution. The current version can also run on laptops with at least 2GB of GPU memory. An Nvidia TX-2 compatible version will be released soon, together with the dataset and the post-processing code.
- Ubuntu == 16.04
- TensorFlow == 1.9.0
- Python == 2.7
- 2GB GPU (or shared) memory
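The version requirements above can be checked before running the scripts. The following is a minimal sketch, not part of this repository: the helper names `installed_tensorflow_version` and `check_environment` are hypothetical, and it only compares the Python and TensorFlow versions against the ones listed (it does not check the Ubuntu release or GPU memory).

```python
REQUIRED_PYTHON = (2, 7)        # from the requirements list above
REQUIRED_TENSORFLOW = "1.9.0"   # from the requirements list above


def installed_tensorflow_version():
    """Return the installed TensorFlow version string, or None if it is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    return tf.__version__


def check_environment(python_version, tensorflow_version):
    """Return a list of human-readable mismatches (empty if everything matches).

    python_version is a (major, minor, ...) sequence such as sys.version_info;
    tensorflow_version is a version string or None.
    """
    problems = []
    if tuple(python_version[:2]) != REQUIRED_PYTHON:
        problems.append("Python {}.{} found, 2.7 expected".format(
            python_version[0], python_version[1]))
    if tensorflow_version is None:
        problems.append("TensorFlow is not installed, 1.9.0 expected")
    elif tensorflow_version != REQUIRED_TENSORFLOW:
        problems.append("TensorFlow {} found, 1.9.0 expected".format(
            tensorflow_version))
    return problems
```

Calling `check_environment(sys.version_info, installed_tensorflow_version())` after `import sys` returns an empty list when both versions match the requirements.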
For laptops and desktops:
sudo apt install libjpeg-dev libpng12-dev libfreetype6-dev libxml2-dev libxslt1-dev
pip install pillow matplotlib lxml
git clone https://github.com/zhudelong/ocr-rcnn-v2.git
cd ocr-rcnn-v2
python ocr-rcnn-v2-infer.py
python ocr-rcnn-v2-visual.py
(for visualization)
For Nvidia TX-2 platform:
- Will be available soon.
- If you are interested in converting the model yourself, please check here
Two demo images are shown below. They are screenshots from two YouTube videos. The character recognition results are visualized at the center of each bounding box.
Image Source: [https://www.youtube.com/watch?v=bQpEYpg1kLg&t=8s]
Image Source: [https://www.youtube.com/watch?v=k1bTibYQjTo&t=9s]
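To illustrate how a recognized label can be placed at the center of its bounding box, here is a minimal sketch using Pillow. It is not the code in `ocr-rcnn-v2-visual.py`: the function names `box_center` and `draw_labels` are hypothetical, and the box format `(ymin, xmin, ymax, xmax)` normalized to [0, 1] is an assumption.

```python
def box_center(box, image_width, image_height):
    """Return the pixel center (x, y) of a normalized box.

    Assumption: box is (ymin, xmin, ymax, xmax) with values in [0, 1].
    """
    ymin, xmin, ymax, xmax = box
    cx = (xmin + xmax) / 2.0 * image_width
    cy = (ymin + ymax) / 2.0 * image_height
    return int(round(cx)), int(round(cy))


def draw_labels(image, boxes, labels, color="red"):
    """Draw each box outline and its label text on a PIL image, in place."""
    from PIL import ImageDraw  # imported here so box_center works without Pillow

    draw = ImageDraw.Draw(image)
    width, height = image.size
    for box, label in zip(boxes, labels):
        ymin, xmin, ymax, xmax = box
        draw.rectangle(
            [xmin * width, ymin * height, xmax * width, ymax * height],
            outline=color)
        # The text's top-left corner is placed at the box center; exact
        # centering would also require offsetting by half the text size.
        draw.text(box_center(box, width, height), label, fill=color)
    return image
```

For a 640×480 image, a box of `(0.25, 0.5, 0.75, 1.0)` has its center at pixel `(480, 240)`, which is where the label text would start.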