Demo video: `CMT_nuScenes_testset.mp4` (detection results on the nuScenes test set).
This repository is an official implementation of CMT.
CMT is a robust end-to-end 3D detector for multi-modal detection. A single DETR-like framework covers both multi-modal detection (CMT) and LiDAR-only detection (CMT-L), which reach 73.5% and 70.1% NDS respectively on the nuScenes benchmark. Without explicit view transformation, CMT takes image and point-cloud tokens as inputs and directly outputs accurate 3D bounding boxes. CMT can serve as a strong baseline for further research.
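To make the token-level fusion idea concrete, below is a minimal, illustrative PyTorch sketch: image and point-cloud tokens, each carrying a coordinate-based position encoding, are concatenated and decoded by a DETR-style transformer decoder with learned object queries. This is a sketch only, not the repository's model; every module name, dimension, and head layout here is an assumption for illustration.

```python
import torch
import torch.nn as nn

class TokenFusionSketch(nn.Module):
    """Illustrative DETR-style multi-modal decoder (NOT the actual CMT code)."""

    def __init__(self, dim=256, num_queries=900, num_classes=10):
        super().__init__()
        self.queries = nn.Embedding(num_queries, dim)  # learned object queries
        layer = nn.TransformerDecoderLayer(d_model=dim, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=6)
        self.cls_head = nn.Linear(dim, num_classes)
        self.box_head = nn.Linear(dim, 10)  # e.g. center, size, yaw, velocity

    def forward(self, img_tokens, pts_tokens, img_pos, pts_pos):
        # Coordinate-based position encodings place both modalities in a
        # shared 3D space, so no explicit view transformation is needed.
        memory = torch.cat([img_tokens + img_pos, pts_tokens + pts_pos], dim=1)
        queries = self.queries.weight.unsqueeze(0).expand(memory.size(0), -1, -1)
        feats = self.decoder(queries, memory)  # queries attend to both modalities
        return self.cls_head(feats), self.box_head(feats)

# Shapes are illustrative: (batch, num_tokens, dim).
model = TokenFusionSketch()
img, pts = torch.randn(1, 1000, 256), torch.randn(1, 2000, 256)
cls_logits, boxes = model(img, pts, torch.randn_like(img), torch.randn_like(pts))
```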
## Environments

- Python == 3.8
- CUDA == 11.1
- PyTorch == 1.9.0
- mmdet3d == 1.0.0rc5
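A quick sanity check that the installed packages match the versions above (assumes the environment is already set up):

```python
import torch
import mmdet3d

# Expected per the list above: 1.9.0 / 11.1 / 1.0.0rc5.
print(torch.__version__)
print(torch.version.cuda)
print(mmdet3d.__version__)
```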
## Data

Follow the mmdet3d data preparation guide to process the nuScenes dataset: https://github.com/open-mmlab/mmdetection3d/blob/master/docs/en/data_preparation.md
We provide some results on the nuScenes val set. The default batch size is 2 per GPU.
| config | mAP | NDS | GPUs | schedule | training time |
|---|---|---|---|---|---|
| CMT-pillar0200-r50-704x256 | 53.8% | 58.5% | 8 × RTX 2080 Ti | 20 epochs | 13 hours |
| CMT-voxel0100-r50-800x320 | 60.1% | 63.4% | 8 × RTX 2080 Ti | 20 epochs | 14 hours |
| CMT-voxel0075-vov-1600x640 | 69.4% | 71.9% | 8 × A100 | 15 + 5 epochs (with CBGS) | 45 hours |
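For evaluation, a minimal sketch of building one of the models above with the mmcv/mmdet3d 1.0.0rc5 APIs is shown below; the config path is hypothetical, so substitute the config file corresponding to the table row you want.

```python
from mmcv import Config
from mmdet3d.models import build_model

# Hypothetical path; point this at the config matching a row of the table above.
cfg = Config.fromfile('projects/configs/cmt_voxel0100_r50_800x320.py')

# build_model instantiates the detector class registered in the config.
model = build_model(cfg.model, test_cfg=cfg.get('test_cfg'))
model.eval()
```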
If you find CMT helpful in your research, please consider citing:
@article{yan2023cross,
  title={Cross Modal Transformer via Coordinates Encoding for 3D Object Detection},
  author={Yan, Junjie and Liu, Yingfei and Sun, Jianjian and Jia, Fan and Li, Shuailin and Wang, Tiancai and Zhang, Xiangyu},
  journal={arXiv preprint arXiv:2301.01283},
  year={2023}
}
If you have any questions, feel free to open an issue or contact us at [email protected], [email protected], [email protected] or [email protected].