Skip to content
/ CSRA Public
forked from Kevinz-code/CSRA

Official code of ICCV2021 paper "Residual Attention: A Simple but Effective Method for Multi-Label Recognition"

License

Notifications You must be signed in to change notification settings

MLDL/CSRA

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

CSRA

This is the official code of ICCV 2021 paper:
Residual Attention: A Simple But Effective Method for Multi-Label Recoginition

attention

Training and Validation code has been released!

This package is developed by Mr. Ke Zhu (http://www.lamda.nju.edu.cn/zhuk/) and we are trying to add the implementation code of ViT models. If you have any question about the code, please feel free to contact Mr. Ke Zhu ([email protected]). The package is free for academic usage. You can run it at your own risk. For other purposes, please contact Prof. Jianxin Wu (mail to [email protected]).

Requirements

  • Python 3.7
  • pytorch 1.6
  • torchvision 0.7.0
  • pycocotools 2.0
  • tqdm 4.49.0, pillow 7.2.0

Dataset

We expect VOC2007 and COCO2014 dataset to have the following structure:

Dataset/
|-- VOCdevkit/
|---- VOC2007/
|------ JPEGImages/
|------ Annotations/
|------ ImageSets/
......
|-- COCO2014/
|---- annotations/
|---- images/
|------ train2014/
|------ val2014/
...

Then directly run the following command to generate json file (for implementation) of these datasets.

python utils/prepare_voc.py  --data_path  Dataset/VOCdevkit
python utils/prepare_coco.py --data_path  Dataset/COCO2014

which will automatically result in json files in ./data/voc07 and ./data/coco

Validation

We provide pretrained models on Google Drive for validation. ResNet101 trained on ImageNet with CutMix augmentation can be downloaded here.

Dataset Backbone Head nums mAP Resolution Download
VOC2007 ResNet-101 1 94.7 448x448 download
VOC2007 ResNet-cut 1 95.2 448x448 download
COCO ResNet-101 4 83.3 448x448 download
COCO ResNet-cut 6 85.6 448x448 download

After model preparation, you can run the following validation command:

CUDA_VISIBLE_DEVICES=0 python val.py --num_heads 1 --lam 0.1 --dataset voc07 --num_cls 20  --load_from PRETRAINED_MODEL.pth

Training

VOC2007

You can run either of these two lines below

CUDA_VISIBLE_DEVICES=0 python main.py --num_heads 1 --lam 0.1 --dataset voc07 --num_cls 20
CUDA_VISIBLE_DEVICES=0 python main.py --num_heads 1 --lam 0.1 --dataset voc07 --num_cls 20 --cutmix CutMix_ResNet101.pth

Note that the first command uses the Official ResNet-101 backbone while the second command uses the ResNet-101 pretrained on ImageNet with CutMix augmentation link (which is supposed to gain better performance).

MS-COCO

run the ResNet-101 with 4 heads

CUDA_VISIBLE_DEVICES=0 python main.py --num_heads 6 --lam 0.5 --dataset coco --num_cls 80

run the ResNet-101 (pretrained with CutMix) with 6 heads

CUDA_VISIBLE_DEVICES=0 python main.py --num_heads 6 --lam 0.4 --dataset coco --num_cls 80 --cutmix CutMix_ResNet101.pth

You can feel free to adjust the hyper-parameters such as number of attention heads (--num_heads), or the Lambda (--lam). Still, the default values of them in the above command are supposed to be the best.

Notice

To avoid confusion, please note the 4 lines of code in Figure 1 (in paper) is only used in test stage (without training), which is our motivation. When our model is end-to-end training and testing, multi-head-attention (H=1, H=2, H=4, etc.) is used with different T values. Also, when H=1 and T=infty, the implementation code of multi-head-attention is exactly the same with Figure 1.

Acknowledgement

We thank Lin Sui (http://www.lamda.nju.edu.cn/suil/) for his initial contribution to this project.

About

Official code of ICCV2021 paper "Residual Attention: A Simple but Effective Method for Multi-Label Recognition"

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 100.0%