Feat: Publish the super fast and accurate 3D object detection based on LiDAR
maudzung committed Aug 24, 2020
0 parents commit 6e141bf
Showing 40 changed files with 27,043 additions and 0 deletions.
136 changes: 136 additions & 0 deletions .gitignore
@@ -0,0 +1,136 @@
results/
.DS_*
dataset/
checkpoints/
logs/
src/.idea/

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2020 Nguyen Mau Dung

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
157 changes: 157 additions & 0 deletions README.md
@@ -0,0 +1,157 @@
# Super Fast and Accurate 3D Object Detection based on LiDAR

[![python-image]][python-url]
[![pytorch-image]][pytorch-url]

---

## 1. Features
- [x] Super fast and accurate 3D object detection based on LiDAR
- [x] Fast training, fast inference
- [x] An Anchor-free approach
- [x] No Non-Max-Suppression
- [x] Support [distributed data parallel training](https://github.com/pytorch/examples/tree/master/distributed/ddp)
- [x] Release pre-trained models

Technical details can be found [here](./Technical_details.md).

## 2. Getting Started
### 2.1. Requirements

```shell script
pip install -U -r requirements.txt
```

### 2.2. Data Preparation
Download the 3D KITTI detection dataset from [here](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d).

The downloaded data includes:

- Velodyne point clouds _**(29 GB)**_
- Training labels of object data set _**(5 MB)**_
- Camera calibration matrices of object data set _**(16 MB)**_
- **Left color images** of object data set _**(12 GB)**_ (For visualization purpose only)


Please make sure that you organize the source code and dataset directories as shown in the Folder structure section below.

### 2.3. How to run

#### 2.3.1. Visualize the dataset

To visualize the 3D point clouds with 3D boxes, run:

```shell script
cd src/data_process
python kitti_dataset.py
```


#### 2.3.2. Inference

```shell script
python test.py --gpu_idx 0 --peak_thresh 0.2
```


#### 2.3.3. Training

##### 2.3.3.1. Single machine, single GPU

```shell script
python train.py --gpu_idx 0
```


#### Tensorboard

- To track the training progress, go to the `logs/` folder and run:

```shell script
cd logs/<saved_fn>/tensorboard/
tensorboard --logdir=./
```

- Then go to [http://localhost:6006/](http://localhost:6006/)


## Contact

If you think this work is useful, please give me a star! <br>
If you find any errors or have any suggestions, please contact me (**Email:** `[email protected]`). <br>
Thank you!


## Citation

```bibtex
@misc{Super-Fast-Accurate-3D-Object-Detection-PyTorch,
author = {Nguyen Mau Dung},
title = {{Super-Fast-Accurate-3D-Object-Detection-PyTorch}},
howpublished = {\url{https://github.com/maudzung/Super-Fast-Accurate-3D-Object-Detection}},
year = {2020}
}
```

## References

[1] CenterNet: [Objects as Points paper](https://arxiv.org/abs/1904.07850), [PyTorch Implementation](https://github.com/xingyizhou/CenterNet) <br>
[2] RTM3D: [PyTorch Implementation](https://github.com/maudzung/RTM3D)

## Folder structure

```
${ROOT}
├── checkpoints/
│   └── fpn_resnet_18/
│       └── fpn_resnet_18_epoch_300.pth
├── dataset/
│   └── kitti/
│       ├── ImageSets/
│       │   ├── test.txt
│       │   ├── train.txt
│       │   └── val.txt
│       ├── training/
│       │   ├── image_2/ (left color camera)
│       │   ├── calib/
│       │   ├── label_2/
│       │   └── velodyne/
│       ├── testing/
│       │   ├── image_2/ (left color camera)
│       │   ├── calib/
│       │   └── velodyne/
│       └── classes_names.txt
├── src/
│   ├── config/
│   │   ├── train_config.py
│   │   └── kitti_config.py
│   ├── data_process/
│   │   ├── kitti_dataloader.py
│   │   ├── kitti_dataset.py
│   │   └── kitti_data_utils.py
│   ├── models/
│   │   ├── fpn_resnet.py
│   │   ├── resnet.py
│   │   └── model_utils.py
│   ├── utils/
│   │   ├── demo_utils.py
│   │   ├── evaluation_utils.py
│   │   ├── logger.py
│   │   ├── misc.py
│   │   ├── torch_utils.py
│   │   ├── train_utils.py
│   │   └── visualization_utils.py
│   ├── demo_2_sides.py
│   ├── demo_front.py
│   ├── test.py
│   └── train.py
├── README.md
└── requirements.txt
```



[python-image]: https://img.shields.io/badge/Python-3.6-ff69b4.svg
[python-url]: https://www.python.org/
[pytorch-image]: https://img.shields.io/badge/PyTorch-1.5-2BAF2B.svg
[pytorch-url]: https://pytorch.org/
42 changes: 42 additions & 0 deletions Technical_details.md
@@ -0,0 +1,42 @@
# Super Fast and Accurate 3D Object Detection based on LiDAR Point Clouds

---

Technical details of the implementation


## 1. Input/Output & Model

- I used the ResNet-based Keypoint Feature Pyramid Network (KFPN) that was proposed in [RTM3D paper](https://arxiv.org/pdf/2001.03343.pdf).
- The model takes a bird's-eye-view (BEV) RGB-map as input. The RGB-map is encoded by the height, intensity, and density of the 3D LiDAR point cloud (a minimal encoding sketch follows this list).
- **Outputs**: **7 degrees of freedom** _(7-DOF)_ of objects: `(cx, cy, cz, l, w, h, θ)`
- `cx, cy, cz`: The center coordinates.
- `l, w, h`: length, width, height of the bounding box.
- `θ`: The heading angle in radians of the bounding box.
- **Objects**: Cars, Pedestrians, Cyclists.
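To make the BEV encoding above concrete, here is a minimal sketch of how such an RGB-map can be built from a raw point cloud. This is not the repository's exact preprocessing: the grid size, coordinate ranges, and the log-density normalization are illustrative assumptions (in the Complex-YOLO style).

```python
import numpy as np

def make_bev_map(points, H=608, W=608, x_range=(0, 50), y_range=(-25, 25)):
    """Encode a LiDAR point cloud (N, 4) = (x, y, z, intensity) as a
    3-channel BEV 'RGB-map': height, intensity, and point density.
    Grid size and metric ranges here are illustrative, not the repo's config."""
    # Keep only points inside the BEV area
    mask = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[mask]

    # Map metric coordinates to integer grid cells
    xi = ((pts[:, 0] - x_range[0]) / (x_range[1] - x_range[0]) * H).astype(np.int32)
    yi = ((pts[:, 1] - y_range[0]) / (y_range[1] - y_range[0]) * W).astype(np.int32)
    xi = np.clip(xi, 0, H - 1)
    yi = np.clip(yi, 0, W - 1)

    bev = np.zeros((3, H, W), dtype=np.float32)
    counts = np.zeros((H, W), dtype=np.float32)
    for x, y, p in zip(xi, yi, pts):
        bev[0, x, y] = max(bev[0, x, y], p[2])   # max height per cell
        bev[1, x, y] = max(bev[1, x, y], p[3])   # max intensity per cell
        counts[x, y] += 1.0
    # Normalized log density, as in Complex-YOLO-style BEV maps
    bev[2] = np.minimum(1.0, np.log(counts + 1.0) / np.log(64.0))
    return bev
```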

## 2. Loss functions

- For the main center heatmap: `focal loss`

- For the heading angle _(direction)_: the model predicts two components, an `imaginary value` (`im = sin(θ)`) and a `real value` (`re = cos(θ)`), which are directly regressed with `l1_loss` (a sketch of these two loss terms follows this list)

- For the `z coordinate` and the `3 dimensions` (height, width, length): the `balanced l1 loss` proposed in the paper
[Libra R-CNN: Towards Balanced Learning for Object Detection](https://arxiv.org/pdf/1904.02701.pdf)
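As a concrete illustration of the first two loss terms above, here is a minimal sketch of a CenterNet-style penalty-reduced focal loss on the center heatmap and an L1 regression of the `(im, re)` heading pair. Tensor shapes, the gathering of predictions at object centers, and the `alpha`/`beta` hyper-parameters are assumptions, not the repository's exact code.

```python
import torch
import torch.nn.functional as F

def heatmap_focal_loss(pred, gt, alpha=2.0, beta=4.0, eps=1e-6):
    """CenterNet-style penalty-reduced focal loss.
    pred, gt: (B, C, H, W); gt holds Gaussian-splatted keypoints in [0, 1]."""
    pos = gt.eq(1.0).float()                 # exact object centers
    neg = 1.0 - pos
    neg_weight = torch.pow(1.0 - gt, beta)   # down-weight pixels near a center
    pred = pred.clamp(eps, 1.0 - eps)

    pos_loss = -torch.log(pred) * torch.pow(1.0 - pred, alpha) * pos
    neg_loss = -torch.log(1.0 - pred) * torch.pow(pred, alpha) * neg_weight * neg
    num_pos = pos.sum().clamp(min=1.0)
    return (pos_loss.sum() + neg_loss.sum()) / num_pos

def direction_loss(pred_im_re, gt_yaw, obj_mask):
    """L1 regression of the heading as (im, re) = (sin θ, cos θ).
    pred_im_re: (N, 2) gathered at object centers; gt_yaw: (N,);
    obj_mask: (N,) boolean mask of valid objects."""
    target = torch.stack([torch.sin(gt_yaw), torch.cos(gt_yaw)], dim=-1)
    return F.l1_loss(pred_im_re[obj_mask], target[obj_mask], reduction='mean')
```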

## 3. Training details

- The weights of the above losses are uniform (`1.0` for all)
- Number of epochs: 300
- Learning rate scheduler: [`cosine`](https://arxiv.org/pdf/1812.01187.pdf), initial learning rate: `0.001` (a minimal schedule sketch follows this list)
- Batch size: `16` (on GTX 1080Ti)
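A minimal sketch of such a cosine schedule using PyTorch's built-in `CosineAnnealingLR`, matching the 300 epochs and `0.001` initial learning rate above; the optimizer choice and the stand-in model are assumptions.

```python
import torch

model = torch.nn.Linear(10, 3)  # stand-in for the detector
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
# Decay the learning rate from 0.001 towards 0 over 300 epochs on a cosine curve
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=300)

for epoch in range(300):
    # ... one training epoch over the KITTI BEV maps ...
    scheduler.step()
```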

## 4. Inference

During inference, a `3 × 3` max-pooling operation is applied to the center heatmap to extract local peaks, which replaces Non-Max-Suppression; the top `50` predictions whose center confidences are larger than `0.2` are kept.
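The max-pooling step is what makes Non-Max-Suppression unnecessary: a heatmap pixel survives only if it is the maximum of its own `3 × 3` neighborhood. A minimal sketch follows; `K = 50` and the `0.2` threshold come from the text, while the tensor shapes and function name are illustrative.

```python
import torch
import torch.nn.functional as F

def extract_peaks(heatmap, K=50, thresh=0.2):
    """heatmap: (B, C, H, W) of center confidences in [0, 1].
    Returns top-K peak scores, their flat indices over C*H*W, and a keep mask."""
    # A pixel survives only if it equals the max of its 3x3 neighborhood,
    # which suppresses duplicate detections without NMS.
    pooled = F.max_pool2d(heatmap, kernel_size=3, stride=1, padding=1)
    peaks = heatmap * (pooled == heatmap).float()

    scores, indices = torch.topk(peaks.flatten(1), K)   # (B, K)
    keep = scores > thresh                              # confidence filter
    return scores, indices, keep
```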

## 5. How to extend this work

You can train the model with more classes and expand the detection area by modifying the configurations in `src/config/kitti_config.py`.
Binary file not shown.