Feat: Publish the super fast and accurate 3D object detection based on LiDAR
maudzung committed Aug 24, 2020
0 parents commit 6e141bf
Showing 40 changed files with 27,043 additions and 0 deletions.
136 changes: 136 additions & 0 deletions .gitignore
@@ -0,0 +1,136 @@
results/
.DS_*
dataset/
checkpoints/
logs/
src/.idea/

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2020 Nguyen Mau Dung

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
157 changes: 157 additions & 0 deletions README.md
@@ -0,0 +1,157 @@
# Super Fast and Accurate 3D Object Detection based on LiDAR

[![python-image]][python-url]
[![pytorch-image]][pytorch-url]

---

## 1. Features
- [x] Super fast and accurate 3D object detection based on LiDAR
- [x] Fast training, fast inference
- [x] An Anchor-free approach
- [x] No Non-Max-Suppression
- [x] Support [distributed data parallel training](https://github.com/pytorch/examples/tree/master/distributed/ddp)
- [x] Release pre-trained models

Technical details can be found [here](./Technical_details.md).

## 2. Getting Started
### 2.1. Requirements

```shell script
pip install -U -r requirements.txt
```

### 2.2. Data Preparation
Download the 3D KITTI detection dataset from [here](http://www.cvlibs.net/datasets/kitti/eval_object.php?obj_benchmark=3d).

The downloaded data includes:

- Velodyne point clouds _**(29 GB)**_
- Training labels of object data set _**(5 MB)**_
- Camera calibration matrices of object data set _**(16 MB)**_
- **Left color images** of object data set _**(12 GB)**_ (For visualization purpose only)


Please make sure that you organize the source code and dataset directories as shown in the Folder structure section below.

### 2.3. How to run

#### 2.3.1. Visualize the dataset

To visualize the 3D point clouds with 3D boxes, run:

```shell script
cd src/data_process
python kitti_dataset.py
```


#### 2.3.2. Inference

```shell script
python test.py --gpu_idx 0 --peak_thresh 0.2
```


#### 2.3.3. Training

##### 2.3.3.1. Single machine, single GPU

```shell script
python train.py --gpu_idx 0
```


#### Tensorboard

- To track the training progress, go to the `logs/` folder and run:

```shell script
cd logs/<saved_fn>/tensorboard/
tensorboard --logdir=./
```

- Then go to [http://localhost:6006/](http://localhost:6006/)


## Contact

If you think this work is useful, please give me a star! <br>
If you find any errors or have any suggestions, please contact me (**Email:** `[email protected]`). <br>
Thank you!


## Citation

```bibtex
@misc{Super-Fast-Accurate-3D-Object-Detection-PyTorch,
author = {Nguyen Mau Dung},
title = {{Super-Fast-Accurate-3D-Object-Detection-PyTorch}},
howpublished = {\url{https://github.com/maudzung/Super-Fast-Accurate-3D-Object-Detection}},
year = {2020}
}
```

## References

[1] CenterNet: [Objects as Points paper](https://arxiv.org/abs/1904.07850), [PyTorch Implementation](https://github.com/xingyizhou/CenterNet) <br>
[2] RTM3D: [PyTorch Implementation](https://github.com/maudzung/RTM3D)

## Folder structure

```
${ROOT}
├── checkpoints/
│   └── fpn_resnet_18/
│       └── fpn_resnet_18_epoch_300.pth
├── dataset/
│   └── kitti/
│       ├── ImageSets/
│       │   ├── test.txt
│       │   ├── train.txt
│       │   └── val.txt
│       ├── training/
│       │   ├── image_2/ (left color camera)
│       │   ├── calib/
│       │   ├── label_2/
│       │   └── velodyne/
│       ├── testing/
│       │   ├── image_2/ (left color camera)
│       │   ├── calib/
│       │   └── velodyne/
│       └── classes_names.txt
├── src/
│   ├── config/
│   │   ├── train_config.py
│   │   └── kitti_config.py
│   ├── data_process/
│   │   ├── kitti_dataloader.py
│   │   ├── kitti_dataset.py
│   │   └── kitti_data_utils.py
│   ├── models/
│   │   ├── fpn_resnet.py
│   │   ├── resnet.py
│   │   └── model_utils.py
│   ├── utils/
│   │   ├── demo_utils.py
│   │   ├── evaluation_utils.py
│   │   ├── logger.py
│   │   ├── misc.py
│   │   ├── torch_utils.py
│   │   ├── train_utils.py
│   │   └── visualization_utils.py
│   ├── demo_2_sides.py
│   ├── demo_front.py
│   ├── test.py
│   └── train.py
├── README.md
└── requirements.txt
```



[python-image]: https://img.shields.io/badge/Python-3.6-ff69b4.svg
[python-url]: https://www.python.org/
[pytorch-image]: https://img.shields.io/badge/PyTorch-1.5-2BAF2B.svg
[pytorch-url]: https://pytorch.org/
42 changes: 42 additions & 0 deletions Technical_details.md
@@ -0,0 +1,42 @@
# Super Fast and Accurate 3D Object Detection based on LiDAR Point Clouds

---

Technical details of the implementation


## 1. Input/Output & Model

- I used the ResNet-based Keypoint Feature Pyramid Network (KFPN) that was proposed in [RTM3D paper](https://arxiv.org/pdf/2001.03343.pdf).
- The model takes a bird's-eye-view (BEV) RGB-map as input. The RGB-map is encoded by the height, intensity, and density of the 3D LiDAR point cloud (a minimal encoding sketch follows this list).
- **Outputs**: **7 degrees of freedom** _(7-DOF)_ of objects: `(cx, cy, cz, l, w, h, θ)`
- `cx, cy, cz`: The center coordinates.
- `l, w, h`: length, width, height of the bounding box.
- `θ`: The heading angle in radians of the bounding box.
- **Objects**: Cars, Pedestrians, Cyclists.
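To make the BEV encoding above concrete, here is a minimal sketch of how such an RGB-map can be built from a raw point cloud. This is not the repository's exact preprocessing: the grid size, coordinate ranges, and the log-density normalization are illustrative assumptions (in the Complex-YOLO style).

```python
import numpy as np

def make_bev_map(points, H=608, W=608, x_range=(0, 50), y_range=(-25, 25)):
    """Encode a LiDAR point cloud (N, 4) = (x, y, z, intensity) as a
    3-channel BEV 'RGB-map': height, intensity, and point density.
    Grid size and metric ranges here are illustrative, not the repo's config."""
    # Keep only points inside the BEV area
    mask = ((points[:, 0] >= x_range[0]) & (points[:, 0] < x_range[1]) &
            (points[:, 1] >= y_range[0]) & (points[:, 1] < y_range[1]))
    pts = points[mask]

    # Map metric coordinates to integer grid cells
    xi = ((pts[:, 0] - x_range[0]) / (x_range[1] - x_range[0]) * H).astype(np.int32)
    yi = ((pts[:, 1] - y_range[0]) / (y_range[1] - y_range[0]) * W).astype(np.int32)
    xi = np.clip(xi, 0, H - 1)
    yi = np.clip(yi, 0, W - 1)

    bev = np.zeros((3, H, W), dtype=np.float32)
    counts = np.zeros((H, W), dtype=np.float32)
    for x, y, p in zip(xi, yi, pts):
        bev[0, x, y] = max(bev[0, x, y], p[2])   # max height per cell
        bev[1, x, y] = max(bev[1, x, y], p[3])   # max intensity per cell
        counts[x, y] += 1.0
    # Normalized log density, as in Complex-YOLO-style BEV maps
    bev[2] = np.minimum(1.0, np.log(counts + 1.0) / np.log(64.0))
    return bev
```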

## 2. Loss functions

- For the main center heatmap: `focal loss`

- For the heading angle _(direction)_: the model predicts two components, an `imaginary value` (`im = sin(θ)`) and a `real value` (`re = cos(θ)`), which are directly regressed with `l1_loss` (a sketch of these two loss terms follows this list)

- For the `z coordinate` and the `3 dimensions` (height, width, length): the `balanced l1 loss` proposed in the paper
[Libra R-CNN: Towards Balanced Learning for Object Detection](https://arxiv.org/pdf/1904.02701.pdf)
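As a concrete illustration of the first two loss terms above, here is a minimal sketch of a CenterNet-style penalty-reduced focal loss on the center heatmap and an L1 regression of the `(im, re)` heading pair. Tensor shapes, the gathering of predictions at object centers, and the `alpha`/`beta` hyper-parameters are assumptions, not the repository's exact code.

```python
import torch
import torch.nn.functional as F

def heatmap_focal_loss(pred, gt, alpha=2.0, beta=4.0, eps=1e-6):
    """CenterNet-style penalty-reduced focal loss.
    pred, gt: (B, C, H, W); gt holds Gaussian-splatted keypoints in [0, 1]."""
    pos = gt.eq(1.0).float()                 # exact object centers
    neg = 1.0 - pos
    neg_weight = torch.pow(1.0 - gt, beta)   # down-weight pixels near a center
    pred = pred.clamp(eps, 1.0 - eps)

    pos_loss = -torch.log(pred) * torch.pow(1.0 - pred, alpha) * pos
    neg_loss = -torch.log(1.0 - pred) * torch.pow(pred, alpha) * neg_weight * neg
    num_pos = pos.sum().clamp(min=1.0)
    return (pos_loss.sum() + neg_loss.sum()) / num_pos

def direction_loss(pred_im_re, gt_yaw, obj_mask):
    """L1 regression of the heading as (im, re) = (sin θ, cos θ).
    pred_im_re: (N, 2) gathered at object centers; gt_yaw: (N,);
    obj_mask: (N,) boolean mask of valid objects."""
    target = torch.stack([torch.sin(gt_yaw), torch.cos(gt_yaw)], dim=-1)
    return F.l1_loss(pred_im_re[obj_mask], target[obj_mask], reduction='mean')
```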

## 3. Training details

- The weights of the above losses are uniform (`1.0` for all)
- Number of epochs: 300
- Learning rate scheduler: [`cosine`](https://arxiv.org/pdf/1812.01187.pdf), initial learning rate: `0.001` (a minimal schedule sketch follows this list)
- Batch size: `16` (on GTX 1080Ti)
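A minimal sketch of such a cosine schedule using PyTorch's built-in `CosineAnnealingLR`, matching the 300 epochs and `0.001` initial learning rate above; the optimizer choice and the stand-in model are assumptions.

```python
import torch

model = torch.nn.Linear(10, 3)  # stand-in for the detector
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
# Decay the learning rate from 0.001 towards 0 over 300 epochs on a cosine curve
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=300)

for epoch in range(300):
    # ... one training epoch over the KITTI BEV maps ...
    scheduler.step()
```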

## 4. Inference

During inference, a `3 × 3` max-pooling operation is applied to the center heatmap to extract local peaks, which replaces Non-Max-Suppression; the top `50` predictions whose center confidences are larger than `0.2` are kept.
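The max-pooling step is what makes Non-Max-Suppression unnecessary: a heatmap pixel survives only if it is the maximum of its own `3 × 3` neighborhood. A minimal sketch follows; `K = 50` and the `0.2` threshold come from the text, while the tensor shapes and function name are illustrative.

```python
import torch
import torch.nn.functional as F

def extract_peaks(heatmap, K=50, thresh=0.2):
    """heatmap: (B, C, H, W) of center confidences in [0, 1].
    Returns top-K peak scores, their flat indices over C*H*W, and a keep mask."""
    # A pixel survives only if it equals the max of its 3x3 neighborhood,
    # which suppresses duplicate detections without NMS.
    pooled = F.max_pool2d(heatmap, kernel_size=3, stride=1, padding=1)
    peaks = heatmap * (pooled == heatmap).float()

    scores, indices = torch.topk(peaks.flatten(1), K)   # (B, K)
    keep = scores > thresh                              # confidence filter
    return scores, indices, keep
```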

## 5. How to extend this work

You can train the model with more classes and expand the detection area by modifying the configurations in `src/config/kitti_config.py`.
Binary file not shown.