V2XFusion is a multi-sensor fusion 3D object detection solution for roadside scenarios. By exploiting the relatively concentrated distribution of object heights in roadside scenes and the richer information provided by multi-sensor fusion, V2XFusion achieves strong performance on the DAIR-V2X-I Dataset. This repository contains training, quantization, and ONNX export for V2XFusion. The pictures below show the camera and LiDAR installation locations and the characteristics of the roadside scenarios.
The architecture diagram uses a batch size of 4 as an example. Each sample in the batch consists of one camera image and the corresponding LiDAR data.
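As a rough illustration of that per-batch layout (the tensor shapes and point counts below are assumptions for illustration, not the actual dataloader code):

```python
import torch

batch_size = 4
# One camera image per sample, at the training resolution of 864 x 1536.
images = torch.zeros(batch_size, 3, 864, 1536)
# One LiDAR point cloud per sample; the point count varies per frame in practice.
points = [torch.zeros(20000, 4) for _ in range(batch_size)]  # (x, y, z, intensity)
# Batch element i pairs images[i] with points[i] from the same roadside unit.
```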
Please pull the specified versions of BEVFusion and BEVHeight, and copy our patch package into BEVFusion.
git clone https://github.com/mit-han-lab/bevfusion
cd bevfusion
git checkout db75150717a9462cb60241e36ba28d65f6908607
cd ..
git clone https://github.com/ADLab-AutoDrive/BEVHeight.git
cd BEVHeight
cp -r evaluators ../bevfusion/
cp -r scripts ../bevfusion/
cd ..
cp -r Lidar_AI_Solution/CUDA-V2XFusion/* bevfusion/
Convert the dataset format from DAIR-V2X-I to KITTI. Please refer to this document provided by BEVHeight.
Set dataset_root and dataset_kitti_root in configs/V2X-I/default.yaml to the paths where DAIR-V2X-I and DAIR-V2X-I-KITTI are located.
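As a quick sanity check after editing the file, you can load it back and print the two entries; the paths in the comments are placeholders, not required locations:

```python
import yaml

# Load the config and confirm the dataset paths point at your local copies.
with open("configs/V2X-I/default.yaml") as f:
    cfg = yaml.safe_load(f)

print(cfg.get("dataset_root"))        # e.g. /path/to/DAIR-V2X-I (placeholder)
print(cfg.get("dataset_kitti_root"))  # e.g. /path/to/DAIR-V2X-I-KITTI (placeholder)
```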
We provide two training mechanisms: sparsity training (2:4 structured sparsity) and dense training. 2:4 structured sparsity usually yields a significant performance improvement on GPUs that support it; it is a hardware-supported acceleration technique that we recommend.
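As a rough illustration of the 2:4 pattern (not the pruning logic used by the training script), the sketch below zeroes the two smallest-magnitude weights in every consecutive group of four:

```python
import numpy as np

def prune_2_of_4(weight: np.ndarray) -> np.ndarray:
    """Zero the 2 smallest-magnitude values in each group of 4 (illustration only)."""
    w = weight.reshape(-1, 4).copy()
    drop = np.argsort(np.abs(w), axis=1)[:, :2]   # indices of the two smallest |w| per group
    np.put_along_axis(w, drop, 0.0, axis=1)
    return w.reshape(weight.shape)

w = np.random.randn(8, 16).astype(np.float32)
w_sparse = prune_2_of_4(w)
assert (np.count_nonzero(w_sparse.reshape(-1, 4), axis=1) <= 2).all()
```

GPUs with sparse Tensor Core support can skip the zeroed weights at inference time, which is where the latency gap between the dense and sparsity rows in the benchmark table below comes from.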
For sparsity training and PTQ:
- Training
$ torchpack dist-run -np 8 python scripts/train.py configs/V2X-I/det/centerhead/lssfpn/camera+pointpillar/resnet34/default.yaml --mode sparsity
- PTQ
$ python scripts/ptq_v2xfusion.py configs/V2X-I/det/centerhead/lssfpn/camera+pointpillar/resnet34/default.yaml sparsity_epoch_100.pth --mode sparsity
For dense training and PTQ:
- Training
$ torchpack dist-run -np 8 python scripts/train.py configs/V2X-I/det/centerhead/lssfpn/camera+pointpillar/resnet34/default.yaml --mode dense
- PTQ
$ python scripts/ptq_v2xfusion.py configs/V2X-I/det/centerhead/lssfpn/camera+pointpillar/resnet34/default.yaml dense_epoch_100.pth --mode dense
We have two export configurations available: FP16 (--precision fp16) and INT8 (--precision int8). The FP16 configuration provides sub-optimal TensorRT performance but the best prediction accuracy compared to INT8; there are no QDQ nodes in the exported ONNX file. The INT8 configuration provides optimal TensorRT performance but sub-optimal prediction accuracy compared to FP16; in this configuration, the quantization parameters are present as QDQ nodes in the exported ONNX file. A quick way to tell the two apart is sketched after the commands below.
- FP16
$ python scripts/export_v2xfusion.py configs/V2X-I/det/centerhead/lssfpn/camera+pointpillar/resnet34/default.yaml ptq.pth --precision fp16
- INT8
$ python scripts/export_v2xfusion.py configs/V2X-I/det/centerhead/lssfpn/camera+pointpillar/resnet34/default.yaml ptq.pth --precision int8
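If you want to confirm which flavor a given file is, a minimal check (the ONNX path below is an assumed example, not necessarily the name the export script writes) is to count QuantizeLinear/DequantizeLinear nodes:

```python
import onnx

# Path is an example; point it at the file produced by export_v2xfusion.py.
model = onnx.load("v2xfusion.onnx")
qdq_nodes = [node for node in model.graph.node
             if node.op_type in ("QuantizeLinear", "DequantizeLinear")]
# Expect 0 for the FP16 export and a non-zero count for the INT8 (QDQ) export.
print(f"QDQ nodes found: {len(qdq_nodes)}")
```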
Please refer to the V2XFusion inference sample provided by DeepStream. The configuration parameters can be adjusted during training to adapt to different datasets, especially the dbound parameter. Before inference, confirm that the parameters in the DeepStream precomputation script are the same as those used for training (a small consistency sketch follows the parameter list below).
image_size = [864, 1536]
downsample_factor = 16
C = 80
xbound = [0, 102.4, 0.8]
ybound = [-51.2, 51.2, 0.8]
zbound = [-5, 3, 8]
dbound = [-2, 0, 90]
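As a rough aid for that check, the sketch below derives the feature-map and BEV grid sizes implied by these values; the variable names and arithmetic are assumptions for illustration, not the DeepStream precomputation script itself, and the dbound entries should simply be copied over verbatim.

```python
# Illustrative arithmetic only; not the actual DeepStream precomputation code.
image_size = (864, 1536)
downsample_factor = 16

# Image feature-map size after backbone downsampling.
feat_h = image_size[0] // downsample_factor   # 54
feat_w = image_size[1] // downsample_factor   # 96

# BEV grid size implied by the x/y bounds and cell size.
xbound = [0, 102.4, 0.8]
ybound = [-51.2, 51.2, 0.8]
bev_x = int(round((xbound[1] - xbound[0]) / xbound[2]))  # 128
bev_y = int(round((ybound[1] - ybound[0]) / ybound[2]))  # 128

print(feat_h, feat_w, bev_x, bev_y)
```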
DAIR-V2X-I Dataset
| Method | Sparsity/Dense | FP16/INT8 | Car AP (Easy / Mod. / Hard) | Pedestrian AP (Easy / Mod. / Hard) | Cyclist AP (Easy / Mod. / Hard) | Model |
|--------|----------------|-----------|-----------------------------|------------------------------------|---------------------------------|-------|
| V2XFusion | Sparsity | FP16 | 82.08 / 69.70 / 69.76 | 51.51 / 49.15 / 49.54 | 61.21 / 58.07 / 58.65 | model |
| V2XFusion | Sparsity | INT8-PTQ | 82.06 / 69.70 / 69.75 | 51.13 / 48.86 / 49.22 | 60.95 / 57.81 / 58.43 | ptq.pth |
| V2XFusion | Dense | FP16 | 82.30 / 69.84 / 69.90 | 49.47 / 47.31 / 47.60 | 59.09 / 58.29 / 58.70 | model |
| V2XFusion | Dense | INT8-PTQ | 82.33 / 69.88 / 69.94 | 48.60 / 46.55 / 46.82 | 59.28 / 58.12 / 58.52 | ptq.pth |

V2XFusion Backbone: ResNet34 + PointPillars
Note:
To make the model more robust, the sequence dataset V2X-Seq-SPD can also be added to the training phase. We provide a pre-trained model (dbound=[-1.5, 3.0, 180]) that is used in the DeepStream demo for reference. Please refer to the dataset_merge script for merging the data.

| FP16 Dense (ms) | FP16 Sparsity (ms) | INT8 Dense (ms) | INT8 Sparsity (ms) |
|-----------------|--------------------|-----------------|--------------------|
| 49.3 | 42.2 | 31.6 | 25.3 |

- Device: NVIDIA Jetson AGX Orin Developer Kit (MAXN power mode)
- Version: Jetpack 6.0 DP
- Modality: Camera + LiDAR
- Image Resolution: 864 x 1536
- Batch size: 4
- DAIR-V2X Dataset: https://thudair.baai.ac.cn/coop-forecast
- BEVFusion-R: https://github.com/mit-han-lab/bevfusion
- BEVHeight: https://github.com/ADLab-AutoDrive/BEVHeight