DEIM: DETR with Improved Matching for Fast Convergence


[Shihua Huang](http://www.shihuahuang.cn/)¹, [Zhichao Lu](https://scholar.google.com/citations?user=tIFWBcQAAAAJ&hl=en)², [Xiaodong Cun](https://vinthony.github.io/academic/)³, Yongjun Yu¹, Xiao Zhou⁴, [Xi Shen](https://xwcv.github.io)¹ 📧

¹ Intellindust AI Lab, ² City University of Hong Kong, ³ Great Bay University, ⁴ Hefei Normal University

(📧) corresponding author, [email protected]; (📧) project leader, [email protected]

If you like DEIM, please give us a ⭐! Your support motivates us to keep improving!

DEIM is a training framework for real-time detection with the Detection Transformer (DETR); it builds on DETR and provides state-of-the-art DETR models, including D-FINE and RT-DETR.

🚀 Updates

  • [2024.12.03] Release of the DEIM series. In addition, the DEIM repo supports re-implementations of D-FINE and RT-DETR.

Model Zoo

DEIM-D-FINE

| Model | Dataset | AP^val | #Params | Latency | GFLOPs | config | checkpoint |
|:-----:|:-------:|:------:|:-------:|:-------:|:------:|:------:|:----------:|
| S | COCO | 49.0 | 10M | 3.49ms | 25 | yml | ckpt |
| M | COCO | 52.7 | 19M | 5.62ms | 57 | yml | ckpt |
| L | COCO | 54.7 | 31M | 8.07ms | 91 | yml | ckpt |
| X | COCO | 56.5 | 62M | 12.89ms | 202 | yml | ckpt |

DEIM-RTDETRv2

| Model | Dataset | AP^val | #Params | Latency | GFLOPs | config | checkpoint |
|:-----:|:-------:|:------:|:-------:|:-------:|:------:|:------:|:----------:|
| S | COCO | 49.0 | 20M | 4.59ms | 60 | yml | ckpt |
| M | COCO | 50.9 | 31M | 6.40ms | 92 | yml | ckpt |
| M* | COCO | 53.2 | 33M | 6.90ms | 100 | yml | ckpt |
| L | COCO | 54.3 | 42M | 9.15ms | 136 | yml | ckpt |
| X | COCO | 55.5 | 76M | 13.66ms | 259 | yml | ckpt |

Quick start

Setup

conda create -n deim python=3.11.9
conda activate deim
pip install -r requirements.txt
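
To confirm the environment is usable before moving on, a quick check (assuming requirements.txt installs PyTorch, which the training scripts require):

    # Sanity check: PyTorch imports and sees the GPU(s).
    import torch
    print(torch.__version__, '| CUDA available:', torch.cuda.is_available())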

Data Preparation

COCO2017 Dataset
  1. Download COCO2017 from OpenDataLab or COCO.

  2. Modify paths in coco_detection.yml

    train_dataloader:
        img_folder: /data/COCO2017/train2017/
        ann_file: /data/COCO2017/annotations/instances_train2017.json
    val_dataloader:
        img_folder: /data/COCO2017/val2017/
        ann_file: /data/COCO2017/annotations/instances_val2017.json
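Before launching training, it is worth confirming that these annotation paths resolve. A minimal sketch using pycocotools (assumed to be installed via requirements.txt; the path matches the config above):

    # Sanity-check the COCO2017 annotation path configured above.
    from pycocotools.coco import COCO

    coco = COCO('/data/COCO2017/annotations/instances_val2017.json')
    print(len(coco.getImgIds()), 'images,', len(coco.getCatIds()), 'categories')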
Custom Dataset

To train on your custom dataset, you need to organize it in the COCO format. Follow the steps below to prepare your dataset:

  1. Set remap_mscoco_category to False:

    This prevents the automatic remapping of category IDs to match the MSCOCO categories.

    remap_mscoco_category: False
  2. Organize Images:

    Structure your dataset directories as follows:

    dataset/
    β”œβ”€β”€ images/
    β”‚   β”œβ”€β”€ train/
    β”‚   β”‚   β”œβ”€β”€ image1.jpg
    β”‚   β”‚   β”œβ”€β”€ image2.jpg
    β”‚   β”‚   └── ...
    β”‚   β”œβ”€β”€ val/
    β”‚   β”‚   β”œβ”€β”€ image1.jpg
    β”‚   β”‚   β”œβ”€β”€ image2.jpg
    β”‚   β”‚   └── ...
    └── annotations/
        β”œβ”€β”€ instances_train.json
        β”œβ”€β”€ instances_val.json
        └── ...
    • images/train/: Contains all training images.
    • images/val/: Contains all validation images.
    • annotations/: Contains COCO-formatted annotation files.
  3. Convert Annotations to COCO Format:

    If your annotations are not already in COCO format, you'll need to convert them. You can use the following Python script as a reference or utilize existing tools:

    import json

    def convert_to_coco(input_annotations, output_annotations):
        """Convert dataset-specific annotations into a COCO-format JSON file."""
        with open(input_annotations) as f:
            source = json.load(f)  # your original annotation format
        # A COCO detection file needs these three top-level keys.
        coco = {"images": [], "annotations": [], "categories": []}
        # TODO: populate the three lists above from `source`.
        with open(output_annotations, "w") as f:
            json.dump(coco, f)

    if __name__ == "__main__":
        convert_to_coco('path/to/your_annotations.json', 'dataset/annotations/instances_train.json')
  4. Update Configuration Files:

    Modify your custom_detection.yml; a sanity-check sketch for these settings follows this list.

    task: detection
    
    evaluator:
      type: CocoEvaluator
      iou_types: ['bbox', ]
    
    num_classes: 777 # your dataset classes
    remap_mscoco_category: False
    
    train_dataloader:
      type: DataLoader
      dataset:
        type: CocoDetection
        img_folder: /data/yourdataset/train
        ann_file: /data/yourdataset/train/train.json
        return_masks: False
        transforms:
          type: Compose
          ops: ~
      shuffle: True
      num_workers: 4
      drop_last: True
      collate_fn:
        type: BatchImageCollateFunction
    
    val_dataloader:
      type: DataLoader
      dataset:
        type: CocoDetection
        img_folder: /data/yourdataset/val
        ann_file: /data/yourdataset/val/ann.json
        return_masks: False
        transforms:
          type: Compose
          ops: ~
      shuffle: False
      num_workers: 4
      drop_last: False
      collate_fn:
        type: BatchImageCollateFunction
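
Since remap_mscoco_category is False, the category IDs in your JSON are consumed as-is, so num_classes must cover them. A minimal sanity-check sketch (the path comes from the config above; treating num_classes >= max ID + 1 as the requirement is an assumption to verify against your setup):

    # Check that num_classes in custom_detection.yml covers all category IDs.
    import json

    with open('/data/yourdataset/train/train.json') as f:
        anns = json.load(f)

    cat_ids = sorted(c['id'] for c in anns['categories'])
    print('num categories:', len(cat_ids), '| max id:', cat_ids[-1])
    # With remap_mscoco_category: False, IDs are used directly, so num_classes
    # should be at least max(cat_ids) + 1 (assumed convention; verify for your data).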

Usage

COCO2017
  1. Training
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --master_port=7777 --nproc_per_node=4 train.py -c configs/deim_dfine/deim_hgnetv2_${model}_coco.yml --use-amp --seed=0
  2. Testing
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --master_port=7777 --nproc_per_node=4 train.py -c configs/deim_dfine/deim_hgnetv2_${model}_coco.yml --test-only -r model.pth
  3. Tuning
CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --master_port=7777 --nproc_per_node=4 train.py -c configs/deim_dfine/deim_hgnetv2_${model}_coco.yml --use-amp --seed=0 -t model.pth
Customizing Batch Size

For example, if you want to double the total batch size when training D-FINE-L on COCO2017, here are the steps you should follow:

  1. Modify your dataloader.yml to increase the total_batch_size:

    train_dataloader:
        total_batch_size: 64  # Previously it was 32, now doubled
  2. Modify your deim_hgnetv2_l_coco.yml. Here’s how the key parameters should be adjusted (the arithmetic behind the comments is sketched after this list):

    optimizer:
      type: AdamW
      params:
        -
          params: '^(?=.*backbone)(?!.*norm|bn).*$'
          lr: 0.000025  # doubled, following the linear scaling law
        -
          params: '^(?=.*(?:encoder|decoder))(?=.*(?:norm|bn)).*$'
          weight_decay: 0.
      lr: 0.0005  # doubled, following the linear scaling law
      betas: [0.9, 0.999]
      weight_decay: 0.0001  # needs a grid search

    ema:  # added EMA settings
      decay: 0.9998  # adjusted by 1 - (1 - decay) * 2
      warmups: 500  # halved

    lr_warmup_scheduler:
      warmup_duration: 250  # halved
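
The comments above all follow one rule: when the total batch size grows by a factor k, learning rates scale by k, the EMA decay moves to 1 - (1 - decay) * k, and warmup lengths shrink by k. A small sketch making the arithmetic explicit (the function is illustrative, not part of the DEIM codebase):

    # Illustrative arithmetic behind the "doubled"/"halved" comments above.
    def scale_hyperparams(lr, ema_decay, warmup, old_bs=32, new_bs=64):
        k = new_bs / old_bs  # 2.0 when doubling the batch size
        return {
            'lr': lr * k,                          # linear scaling law
            'ema_decay': 1 - (1 - ema_decay) * k,  # e.g. 0.9999 -> 0.9998
            'warmup': int(warmup / k),             # halved: fewer steps per epoch
        }

    print(scale_hyperparams(lr=0.00025, ema_decay=0.9999, warmup=500))
    # ~ {'lr': 0.0005, 'ema_decay': 0.9998, 'warmup': 250}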
Customizing Input Size

If you'd like to train DEIM on COCO2017 with an input size of 320x320, follow these steps:

  1. Modify your dataloader.yml:

    train_dataloader:
      dataset:
        transforms:
          ops:
            - {type: Resize, size: [320, 320], }
      collate_fn:
        base_size: 320

    val_dataloader:
      dataset:
        transforms:
          ops:
            - {type: Resize, size: [320, 320], }
  2. Modify your dfine_hgnetv2.yml:

    eval_spatial_size: [320, 320]

Tools

Deployment
  1. Setup
pip install onnx onnxsim
  2. Export ONNX
python tools/deployment/export_onnx.py --check -c configs/deim_dfine/deim_hgnetv2_${model}_coco.yml -r model.pth
  3. Export TensorRT
trtexec --onnx="model.onnx" --saveEngine="model.engine" --fp16
Inference (Visualization)
  1. Setup
pip install -r tools/inference/requirements.txt
  2. Inference (onnxruntime / tensorrt / torch)

Inference on images and videos is now supported.

python tools/inference/onnx_inf.py --onnx model.onnx --input image.jpg  # video.mp4
python tools/inference/trt_inf.py --trt model.engine --input image.jpg
python tools/inference/torch_inf.py -c configs/deim_dfine/deim_hgnetv2_${model}_coco.yml -r model.pth --input image.jpg --device cuda:0
Benchmark
  1. Setup
pip install -r tools/benchmark/requirements.txt
  2. Model FLOPs, MACs, and Params
python tools/benchmark/get_info.py -c configs/deim_dfine/deim_hgnetv2_${model}_coco.yml
  3. TensorRT Latency
python tools/benchmark/trt_benchmark.py --COCO_dir path/to/COCO2017 --engine_dir model.engine
Fiftyone Visualization
  1. Setup
pip install fiftyone
  2. Voxel51 Fiftyone Visualization (fiftyone)
python tools/visualization/fiftyone_vis.py -c configs/deim_dfine/deim_hgnetv2_${model}_coco.yml -r model.pth
Others
  1. Auto Resume Training
bash reference/safe_training.sh
  2. Converting Model Weights
python reference/convert_weight.py model.pth

Citation

If you use DEIM or its methods in your work, please cite the following BibTeX entries:

@misc{huang2024deim,
      title={DEIM: DETR with Improved Matching for Fast Convergence},
      author={Shihua Huang and Zhichao Lu and Xiaodong Cun and Yongjun Yu and Xiao Zhou and Xi Shen},
      year={2024},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement

Our work is built upon D-FINE and RT-DETR.

✨ Feel free to contribute and reach out if you have any questions! ✨
