In ES6D, we propose an efficient backbone for real-time instance-level 6D pose estimation. In addition, we propose A(M)GPD to resolve the ambiguity problem in the widely used ADD-S metric. For more details, please see our paper, supplement, and video. We recommend using the proposed A(M)GPD metric to train your own model!
2022.06.12: Initial release.
- To do: the experiment details on the YCB-Video dataset and the details of the GP construction.
git clone git@github.com:GANWANSHUI/ES6D.git
cd ES6D
conda env create -f es6d.yaml
Directory structure for the downloaded datasets (only the files used are listed)
datasets
|-- tless # http://cmp.felk.cvut.cz/t-less/download.html
| |-- train_pbr # https://bop.felk.cvut.cz/media/data/bop_datasets/tless_train_pbr.zip
| | |-- 000000
| | | |-- depth
| | | |-- mask
| | | |-- mask_visib
| | | |-- rgb
| | | |-- scene_camera.json
| | | |-- scene_gt.json
| | | |-- scene_gt_info.json
| | |-- 000001
| |
| |-- test_primesense
| | |-- 000001
| | | |-- depth
| | | |-- mask_visib
| | | |-- mask_visib_pred # please obtain the prediction results from Stablepose
| | | |-- rgb
| | | |-- scene_camera.json
| | | |-- scene_gt.json
| | | |-- scene_gt_info.json
| | |-- 000002
|
|
|-- ycb # Link: https://rse-lab.cs.washington.edu/projects/posecnn/
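Before running the preparation scripts, it can help to confirm that the T-LESS folders match the tree above. The following is a minimal sketch, not part of the repository: the function name missing_entries and the two entry lists are hypothetical helpers, with the entry names copied from the tree.

```python
from pathlib import Path

# Entries each scene folder is expected to contain, copied from the tree above.
TRAIN_ENTRIES = ["depth", "mask", "mask_visib", "rgb",
                 "scene_camera.json", "scene_gt.json", "scene_gt_info.json"]
TEST_ENTRIES = ["depth", "mask_visib", "mask_visib_pred", "rgb",
                "scene_camera.json", "scene_gt.json", "scene_gt_info.json"]


def missing_entries(tless_root):
    """Return the expected paths that are absent under tless_root (hypothetical helper)."""
    root = Path(tless_root)
    missing = []
    for split, entries in (("train_pbr", TRAIN_ENTRIES),
                           ("test_primesense", TEST_ENTRIES)):
        split_dir = root / split
        if not split_dir.is_dir():
            # The whole split is missing; no point checking its scenes.
            missing.append(str(split_dir))
            continue
        # Every sub-directory of a split is treated as a scene (000000, 000001, ...).
        for scene in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            for name in entries:
                if not (scene / name).exists():
                    missing.append(str(scene / name))
    return missing
```

Calling missing_entries("./datasets/tless") returns an empty list when every scene folder contains the listed entries.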
Pack each training and testing instance into a .mat file.
An error may occur due to duplicate file names; if so, simply rerun the command.
-------------------------------------------------------------------------------------------------------------------------------------
$ python ./datasets/tless/tless_preparation.py --tless_path ./datasets/tless --train_set True
On success, this produces ./datasets/tless/train_pbr_mat and ./datasets/tless/train_pbr_mat.txt for the dataloader.
-------------------------------------------------------------------------------------------------------------------------------------
$ python ./datasets/tless/tless_preparation.py --tless_path ./datasets/tless --train_set False
On success, this produces ./datasets/tless/test_primesense_gt_mask_mat and ./datasets/tless/test_primesense_gt_mask_mat.txt for the dataloader.
Finally, please download the downsampled pcd files (https://www.dropbox.com/sh/zxq5lx71zpq4nts/AAALVgeSvszpHEy8CUBr8iala?dl=0) and place the models in ./datasets/tless
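Since the preparation commands may fail on a duplicate file name and succeed when rerun, the retry can be automated. The sketch below is an illustration only: run_with_retry is a hypothetical helper, and it assumes the failure surfaces as a non-zero exit code of the script.

```python
import subprocess
import sys


def run_with_retry(cmd, max_attempts=3):
    """Run cmd, rerunning it on a non-zero exit code.

    Returns the number of the attempt that succeeded and raises
    RuntimeError if all attempts fail.
    """
    for attempt in range(1, max_attempts + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return attempt
        print(f"attempt {attempt} failed with exit code {result.returncode}",
              file=sys.stderr)
    raise RuntimeError(f"command failed after {max_attempts} attempts: {cmd}")


# Example call for the training split (same arguments as the command above):
# run_with_retry([sys.executable, "./datasets/tless/tless_preparation.py",
#                 "--tless_path", "./datasets/tless", "--train_set", "True"])
```

The limit of three attempts is an arbitrary choice for the sketch, not a value from the repository.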
To train on tless and evaluate the test set at the end of training (metrics: add(s) and A(M)GPD), run:
$ python train.py --loss_type GADD
Some notes on training on the T-LESS dataset:
- The initial learning rate is set to 0.002, which is much larger than the one used for the YCB-Video dataset.
- The training set is synthetic, so suitable data augmentation can be very helpful for improving performance on the real-world test set. For example, simply adding random noise to the point cloud yields an obvious performance gain. More suitable data augmentation is therefore worth investigating further.
- The training strategy simply cuts the learning rate after 60 epochs; other learning rate schedules may be more helpful. We train the whole network on 8 NVIDIA 2080 Ti GPUs for 120 epochs, which takes nearly 2 days. Judging from the loss curve, however, that many epochs should not be necessary with a more suitable learning rate strategy.
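The two training choices described in these notes can be sketched in plain Python. This is an illustration under stated assumptions, not the code used in train.py: the function names are hypothetical, the decay factor gamma=0.1 and the noise scale sigma=0.001 are assumed values (the notes above do not state them), and a real pipeline would apply the noise to a tensor rather than a list of tuples.

```python
import random


def step_lr(epoch, base_lr=0.002, drop_epoch=60, gamma=0.1):
    """Learning rate for a 0-indexed epoch.

    base_lr is used before drop_epoch and base_lr * gamma from then on.
    gamma is an assumed value, not taken from the repository.
    """
    return base_lr if epoch < drop_epoch else base_lr * gamma


def jitter_points(points, sigma=0.001, rng=random):
    """Return a copy of points with independent Gaussian noise on each coordinate.

    points is an iterable of (x, y, z) tuples; sigma is an assumed noise
    scale in the same unit as the coordinates.
    """
    return [tuple(c + rng.gauss(0.0, sigma) for c in p) for p in points]
```

Passing a seeded random.Random instance as rng makes the jitter reproducible between runs.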
To evaluate only the test set add(s) and A(M)GPD of the trained tless model without re-training, please define the checkpoint_PATH and run:
$ python train.py --test_only True --resume ./experiments/tless/GADD/checkpoint_0120.pth.tar
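For readers unfamiliar with the add(s) numbers reported by the evaluation, the standard ADD and ADD-S definitions can be sketched as below. This is a plain-Python illustration of the commonly used definitions, not the repository's evaluation code, and it does not cover A(M)GPD; the function names are hypothetical. Rotations are 3x3 nested lists and translations are 3-tuples.

```python
import math


def transform(points, R, t):
    """Apply the rigid transform x -> R x + t to each (x, y, z) point."""
    return [tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
            for p in points]


def add_metric(points, R_pred, t_pred, R_gt, t_gt):
    """ADD: mean distance between corresponding model points under the two poses."""
    pred = transform(points, R_pred, t_pred)
    gt = transform(points, R_gt, t_gt)
    return sum(math.dist(a, b) for a, b in zip(pred, gt)) / len(points)


def add_s_metric(points, R_pred, t_pred, R_gt, t_gt):
    """ADD-S: mean distance from each ground-truth point to its nearest predicted point."""
    pred = transform(points, R_pred, t_pred)
    gt = transform(points, R_gt, t_gt)
    return sum(min(math.dist(g, p) for p in pred) for g in gt) / len(points)
```

For a point set that is symmetric under a 180 degree rotation, ADD reports a non-zero error for the rotated pose, whereas ADD-S reports zero because each point is matched to its nearest neighbour.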
We have posted an older version of the code, which still needs to be updated; we hope you can find something useful in it if you urgently need the GP for your project.
This work would not have been possible without the following references; many thanks to the authors for their contributions: DenseFusion, PVN3D, FFB6D, Stablepose.
Please cite ES6D if you use this repository in your publications:
@inproceedings{mo2022es6d,
title={ES6D: A Computation Efficient and Symmetry-Aware 6D Pose Regression Framework},
author={Mo, Ningkai and Gan, Wanshui and Yokoya, Naoto and Chen, Shifeng},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={6718--6727},
year={2022}
}