Tricks, corresponding results, experimental settings, and running commands

  • This file contains the results, experimental settings, and running commands of the different tricks. The tricks are divided into four families: re-weighting, re-sampling, mixup training, and two-stage training. For more details on these four trick families, see the original paper.
  • If you run into any problem, such as a bug, feel free to open an issue.
  • Each method entry below lists its experimental-setting configs and running commands.

Re-weighting

  • Strictly speaking, the LDAM loss, CrossEntropyLabelSmooth, CDT, and SEQL are not re-weighting methods, but all of them take the long-tailed distribution into account when calculating the loss, and they can be combined with re-weighting in DRW. We therefore list them under the re-weighting family.
  • The re-weighting methods are implemented in loss.
The four numbers at the end of each entry are the top-1 error rates (%) on CIFAR-10-LT-100, CIFAR-10-LT-50, CIFAR-100-LT-100, and CIFAR-100-LT-50, in that order.
Baseline
  1. CONFIG (from left to right):
    • configs/cao_cifar/baseline/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 29.64 / 25.19 / 61.68 / 56.15
CS_CE
  1. Introduction:
    • The most commonly used re-weighting method; see Eq. (2) in our paper for more details.

  2. CONFIG:
    • configs/cao_cifar/re_weighting/csce/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  3. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 31.70 / 23.20 / 67.73 / 63.49
Square CS_CE
  1. Introduction:
    • This is a smooth version of CS_CE (smooth CS_CE), which adds a hyper-parameter $\gamma$ to vanilla CS_CE. In smooth CS_CE, the loss weight of class i is defined as $(\frac{N_{min}}{N_i})^\gamma$, where $\gamma \in [0, 1]$ and $N_i$ is the number of images in class i. We set $\gamma = 0.5$ to obtain a square-root version of CS_CE (Square CS_CE). A minimal weight sketch follows this entry.

  2. CONFIG:
    • configs/cao_cifar/re_weighting/csce/{cifar10_im100_square.yaml, cifar10_im50_square.yaml, cifar100_im100_square.yaml, cifar100_im50_square.yaml}

  3. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 31.70 / 22.22 / 61.64 / 57.23
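For reference, a minimal sketch (assuming a PyTorch setting; this is not the repository's own code in loss) of the smooth CS_CE weights described above. `num_per_class` is a hypothetical list of per-class image counts; `gamma=1.0` recovers vanilla CS_CE and `gamma=0.5` gives Square CS_CE:

```python
import torch
import torch.nn.functional as F

def csce_weights(num_per_class, gamma=0.5):
    """Per-class loss weights (N_min / N_i) ** gamma; gamma = 1 is vanilla CS_CE."""
    counts = torch.tensor(num_per_class, dtype=torch.float)
    return (counts.min() / counts) ** gamma

# Hypothetical usage with a weighted cross-entropy loss.
num_per_class = [5000, 2997, 1796, 1077, 645, 387, 232, 139, 83, 50]  # a CIFAR-10-LT-100-like distribution
weights = csce_weights(num_per_class, gamma=0.5)
logits, labels = torch.randn(8, 10), torch.randint(0, 10, (8,))
loss = F.cross_entropy(logits, labels, weight=weights)
```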
Focal loss
  1. Introduction:
    • Focal loss makes the model focus training on difficult samples; see Eq. (4) in our paper for more details and the sketch after this entry.
    • The Focal loss paper link: Lin et al., ICCV 2017.

  2. CONFIG:
    • configs/cao_cifar/re_weighting/focal/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  3. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 28.44 / 22.09 / 62.78 / 58.21
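A minimal multi-class focal-loss sketch, assuming logits of shape (batch, num_classes) and integer targets; the actual implementation in loss may differ in details such as alpha weighting:

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    """FL = -(1 - p_t) ** gamma * log(p_t), averaged over the batch."""
    log_pt = F.log_softmax(logits, dim=1).gather(1, targets.unsqueeze(1)).squeeze(1)
    pt = log_pt.exp()
    return (-(1.0 - pt) ** gamma * log_pt).mean()

loss = focal_loss(torch.randn(8, 10), torch.randint(0, 10, (8,)))
```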
ClassBalanceFocal
  1. Introduction:
    • A modified version of Focal loss based on the theory of effective numbers; see Eq. (5) in our paper for more details. The effective-number weights are sketched after the ClassBalanceCE entry below.
    • The ClassBalanceFocal paper link: Cui et al., CVPR 2019.

  2. CONFIG:
    • configs/cao_cifar/re_weighting/cbfocal/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  3. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 24.80 / 21.01 / 61.44 / 57.63
ClassBalanceCE
  1. Introduction:
    • A modified version of cross-entropy loss based on the theory of effective numbers; see Eq. (6) in our paper for more details and the weight sketch after this entry.
    • The ClassBalanceCE paper link: Cui et al., CVPR 2019.

  2. CONFIG:
    • configs/cao_cifar/re_weighting/cbce/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  3. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 29.52 / 22.52 / 61.03 / 56.22
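A minimal sketch of the effective-number weights used by ClassBalanceCE (and, combined with the focal term above, by ClassBalanceFocal); `beta=0.9999` is an assumed value, not necessarily what the configs use:

```python
import torch
import torch.nn.functional as F

def class_balanced_weights(num_per_class, beta=0.9999):
    """Weights proportional to 1 / E_n, with effective number E_n = (1 - beta^n) / (1 - beta)."""
    counts = torch.tensor(num_per_class, dtype=torch.float)
    effective_num = (1.0 - beta ** counts) / (1.0 - beta)
    weights = 1.0 / effective_num
    return weights / weights.sum() * len(num_per_class)  # normalize so the weights sum to the class count

weights = class_balanced_weights([5000, 2997, 1796, 1077, 645, 387, 232, 139, 83, 50])
loss = F.cross_entropy(torch.randn(8, 10), torch.randint(0, 10, (8,)), weight=weights)
```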
CrossEntropyLabelSmooth
  1. Introduction:
    • Label smoothing, a commonly used regularization trick, applied on top of the cross-entropy loss.
    • The CrossEntropyLabelSmooth paper link: Szegedy et al., CVPR 2016.

  2. CONFIG:
    • configs/cao_cifar/re_weighting/cels/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  3. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 27.19 / 23.43 / 61.56 / 57.66
CrossEntropyLabelAwareSmooth
  1. Introduction:
    • Label-aware smoothing, a variant of label smoothing that assigns a different smoothing factor to each class according to the number of training images it contains.
    • The CrossEntropyLabelAwareSmooth paper link: Zhong et al., CVPR 2021.

  2. CONFIG:
    • configs/cao_cifar/re_weighting/celas/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  3. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 27.49 / 22.04 / 62.32 / 56.22
LDAM loss
  1. Introduction:
    • The LDAM loss is a metric-learning-style loss that assigns a different margin to each class; a minimal sketch follows this entry.
    • The LDAM loss paper link: Cao et al., NeurIPS 2019.

  2. CONFIG:
    • configs/cao_cifar/re_weighting/ldam/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  3. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 26.34 / 20.99 / 61.12 / 56.41
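A minimal LDAM sketch following Cao et al.: each class gets a margin proportional to 1 / n_j^(1/4), which is subtracted from the target-class logit before a scaled cross-entropy. `max_m` and `s` are assumed hyper-parameter values:

```python
import torch
import torch.nn.functional as F

def ldam_loss(logits, targets, num_per_class, max_m=0.5, s=30.0):
    """Subtract a per-class margin m_j proportional to 1 / n_j ** 0.25 from the true-class logit."""
    counts = torch.tensor(num_per_class, dtype=torch.float, device=logits.device)
    margins = 1.0 / counts ** 0.25
    margins = margins * (max_m / margins.max())               # largest margin equals max_m
    one_hot = F.one_hot(targets, num_classes=logits.size(1)).float()
    logits_m = logits - one_hot * margins[targets].unsqueeze(1)
    return F.cross_entropy(s * logits_m, targets)
```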
SEQL
  1. Introduction:
    • The softmax equalization loss (SEQL) reduces the gradients contributed by the negative samples of tail classes. The authors argue that the imbalance between the positive and negative gradients of tail classes harms their performance.
    • The SEQL paper link: Tan et al., CVPR 2020.

  2. CONFIG:
    • configs/cao_cifar/re_weighting/seql/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  3. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: -- / -- / 59.51 / 55.19
CDT
  1. Introduction:
    • The authors find that the model significantly over-fits the tail classes and argue that feature deviation between training and test samples causes this problem, so they propose class-dependent temperatures (CDT); a sketch follows this entry.
    • The CDT paper link: Ye et al., arXiv 2020.

  2. CONFIG:
    • configs/cao_cifar/re_weighting/cdt/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  3. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 22.90 / 18.19 / 60.41 / 55.17
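A minimal sketch of class-dependent temperatures as described by Ye et al.: each class logit is divided by a temperature that grows for rarer classes. `gamma=0.3` is an assumed value, not necessarily what the configs use:

```python
import torch
import torch.nn.functional as F

def cdt_loss(logits, targets, num_per_class, gamma=0.3):
    """Divide class j's logit by a_j = (n_max / n_j) ** gamma before cross-entropy."""
    counts = torch.tensor(num_per_class, dtype=torch.float, device=logits.device)
    temperatures = (counts.max() / counts) ** gamma
    return F.cross_entropy(logits / temperatures, targets)
```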
BalancedSoftmaxCE
  1. Introduction:
    • A simple and effective re-weighting method; see Eq. (4) in the original paper and the sketch after this entry.
    • The BalancedSoftmaxCE paper link: Ren et al., NeurIPS 2020.

  2. CONFIG:
    • configs/cao_cifar/re_weighting/bsce/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  3. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 22.46 / 18.89 / 57.01 / 53.45
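A minimal Balanced Softmax sketch following Ren et al.: the per-class sample counts enter the logits as a log-prior before the usual softmax cross-entropy (function name is ours, not the repository's):

```python
import torch
import torch.nn.functional as F

def balanced_softmax_loss(logits, targets, num_per_class):
    """Add log(n_j) to each class logit, then apply standard cross-entropy."""
    counts = torch.tensor(num_per_class, dtype=torch.float, device=logits.device)
    return F.cross_entropy(logits + counts.log(), targets)
```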

Re-sampling

  • The re-sampling methods are implemented in dataset.
The four numbers at the end of each entry are the top-1 error rates (%) on CIFAR-10-LT-100, CIFAR-10-LT-50, CIFAR-100-LT-100, and CIFAR-100-LT-50, in that order.
Baseline
  1. CONFIG (from left to right):
    • configs/cao_cifar/baseline/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 29.64 / 25.19 / 61.68 / 56.15
Class-balanced sampling
  1. Introduction:
    • Class-balanced sampling gives each class an equal probability of being selected; see the Re-sampling section in our paper for more details and the sampler sketch after this entry.
    • The class-balanced sampling paper link: Kang et al., ICLR 2020.

  2. CONFIG:
    • configs/cao_cifar/re_sampling/balance/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  3. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 27.55 / 21.92 / 64.76 / 60.54
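A minimal sketch of how such a sampler could be built with PyTorch's WeightedRandomSampler. The exponent `q` is the knob from Kang et al.: q = 0 gives class-balanced sampling, q = 0.5 the square-root sampling below, and q = 1 plain instance-balanced sampling. The function name and the use of WeightedRandomSampler are assumptions, not the repository's dataset code:

```python
import torch
from torch.utils.data import WeightedRandomSampler

def make_sampler(labels, q=0.0):
    """Sample class j with probability p_j proportional to n_j ** q, split evenly over its images."""
    labels = torch.as_tensor(labels)
    counts = torch.bincount(labels).float()
    class_prob = counts ** q
    class_prob = class_prob / class_prob.sum()
    sample_weights = class_prob[labels] / counts[labels]
    return WeightedRandomSampler(sample_weights, num_samples=len(labels), replacement=True)

# Hypothetical usage: DataLoader(train_set, batch_size=128, sampler=make_sampler(train_labels, q=0.0))
```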
Square-root sampling
  1. Introduction:
    • Square-root sampling produces a less imbalanced sampling distribution; see the Re-sampling section in our paper for more details (the sampler sketch above with q = 0.5 corresponds to this method).
    • The square-root sampling paper link: Kang et al., ICLR 2020.

  2. CONFIG:
    • configs/cao_cifar/re_sampling/square/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  3. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 28.58 / 22.32 / 63.21 / 58.87
Progressively-balanced sampling
  1. Introduction:
    • Progressively-balanced sampling gradually shifts the class sampling probabilities from instance-balanced (random) sampling to class-balanced sampling over the course of training; see the Re-sampling section in our paper for more details and the sketch after this entry.
    • The progressively-balanced sampling paper link: Kang et al., ICLR 2020.

  2. CONFIG:
    • configs/cao_cifar/re_sampling/progressive/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  3. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 28.02 / 21.43 / 60.96 / 56.88
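A minimal sketch of the progressively-balanced schedule: the class probabilities are a per-epoch interpolation between instance-balanced and class-balanced sampling, and the result can be turned into per-sample weights for a sampler like the one sketched above:

```python
import torch

def progressive_class_prob(num_per_class, epoch, total_epochs):
    """Interpolate p_j from instance-balanced to class-balanced as training progresses."""
    counts = torch.tensor(num_per_class, dtype=torch.float)
    p_instance = counts / counts.sum()
    p_class = torch.full_like(p_instance, 1.0 / len(num_per_class))
    t = epoch / total_epochs
    return (1.0 - t) * p_instance + t * p_class
```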
BBN-style sampling
  1. Introduction:
    • We combine the BBN sampling scheme, which consists of a uniform sampler and a reversed sampler, with input mixup. For more details about these two samplers, see the original paper.
    • The BBN paper link: Zhou et al., CVPR 2020.

  2. CONFIG:
    • configs/cao_cifar/re_sampling/bbn-style/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  3. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 28.38 / 21.89 / 62.94 / 57.97

Mixup training

  • The mixup training methods are implemented in combiner.py.
The four numbers at the end of each entry are the top-1 error rates (%) on CIFAR-10-LT-100, CIFAR-10-LT-50, CIFAR-100-LT-100, and CIFAR-100-LT-50, in that order.
Baseline
  1. CONFIG (from left to right):
    • configs/cao_cifar/baseline/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 29.64 / 25.19 / 61.68 / 56.15
Input mixup
  1. Introduction:
    • In input mixup, each new example is formed from two randomly sampled examples by weighted linear interpolation, and only the mixed example is used to train the network. See the Mixup training section in our paper for more details and the sketch after this entry.
    • The mixup paper link: Zhang et al., ICLR 2018.

  2. CONFIG:
    • configs/cao_cifar/mixup/input_mixup/{cifar10_im100_im_alpha10.yaml, cifar10_im50_im_alpha10.yaml, cifar100_im100_im_alpha10.yaml, cifar100_im50_im_alpha10.yaml}

  3. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 25.94 / 21.33 / 59.18 / 54.08
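A minimal input-mixup training step, assuming a standard PyTorch classifier; the mixing coefficient is drawn from Beta(alpha, alpha), and `alpha=1.0` is only an assumed value (the configs set their own alpha):

```python
import numpy as np
import torch
import torch.nn.functional as F

def input_mixup_step(model, images, labels, alpha=1.0):
    """Mix randomly paired images and train only on the mixed examples."""
    lam = np.random.beta(alpha, alpha)
    perm = torch.randperm(images.size(0), device=images.device)
    mixed = lam * images + (1.0 - lam) * images[perm]
    logits = model(mixed)
    return lam * F.cross_entropy(logits, labels) + (1.0 - lam) * F.cross_entropy(logits, labels[perm])
```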
Manifold mixup
  1. Introduction:
    • Manifold mixup encourages neural networks to predict less confidently on interpolations of hidden representations. We apply manifold mixup to only one layer in our experiments. See the Mixup training section in our paper for more details.
    • The manifold mixup paper link: Verma et al., ICML 2019.

  2. CONFIG:
    • configs/cao_cifar/mixup/manifold_mixup/{cifar10_im100_mm_alpha10.yaml, cifar10_im50_mm_alpha10.yaml, cifar100_im100_mm_alpha10.yaml, cifar100_im50_mm_alpha10.yaml}

  3. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 24.81 / 20.42 / 60.12 / 54.76
Remix
  1. Introduction:
    • Remix assigns the mixed label in favor of the minority class by giving it a disproportionately higher weight.
    • The remix paper link: Chou et al., ECCV 2020 workshop.

  2. CONFIG:
    • configs/cao_cifar/mixup/remix/{cifar10_im100_remix_alpha10.yaml, cifar10_im50_remix_alpha10.yaml, cifar100_im100_remix_alpha10.yaml, cifar100_im50_remix_alpha10.yaml}

  3. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 26.57 / 20.74 / 58.61 / 54.30

Two-stage training

DRW
  • DRW (deferred re-weighting) keeps plain cross-entropy during the first stage and switches to the re-weighted loss in the later epochs. The DRW methods are implemented in loss; a minimal sketch of the schedule follows.
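A minimal sketch, assuming a per-epoch training loop; `drw_epoch` is a hypothetical hyper-parameter marking the start of the second stage:

```python
import torch.nn.functional as F

def drw_loss(logits, targets, epoch, drw_epoch, class_weights):
    """Plain cross-entropy in the first stage, weighted loss (e.g. CS_CE weights) afterwards."""
    if epoch < drw_epoch:
        return F.cross_entropy(logits, targets)
    return F.cross_entropy(logits, targets, weight=class_weights)
```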
Each entry lists the first-stage and second-stage methods; the four numbers at the end are the top-1 error rates (%) on CIFAR-10-LT-100, CIFAR-10-LT-50, CIFAR-100-LT-100, and CIFAR-100-LT-50, in that order.
CE
CE
  1. CONFIG (from left to right):
    • configs/cao_cifar/baseline/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 29.64 / 25.19 / 61.68 / 56.15
CE
CS_CE
  1. CONFIG:
    • configs/cao_cifar/two_stage/drw/csce/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 25.18 / 20.18 / 58.38 / 53.20
CE
Focal loss
  1. CONFIG:
    • configs/cao_cifar/two_stage/drw/focal/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 28.85 / 20.68 / 62.47 / 56.39
CE
ClassBalanceFocal
  1. CONFIG:
    • configs/cao_cifar/two_stage/drw/cbfocal/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 24.57 / 18.62 / 61.94 / 55.01
CE
ClassBalanceCE
  1. CONFIG:
    • configs/cao_cifar/two_stage/drw/cbce/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 25.36 / 20.65 / 60.79 / 56.63
CE
CrossEntropyLabelSmooth
  1. CONFIG:
    • configs/cao_cifar/two_stage/drw/cels/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 28.39 / 22.71 / 61.10 / 57.16
CE
CrossEntropyLabelAwareSmooth
  1. CONFIG:
    • configs/cao_cifar/two_stage/drw/celas/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 27.88 / 22.27 / 62.42 / 57.25
CE
LDAM loss
  1. CONFIG:
    • configs/cao_cifar/two_stage/drw/ldam/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 22.27 / 18.40 / 57.53 / 52.71
CE
CDT
  1. CONFIG:
    • configs/cao_cifar/two_stage/drw/cdt/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 22.45 / 18.73 / 57.78 / 53.20
CE
BalancedSoftmaxCE
  1. CONFIG:
    • configs/cao_cifar/two_stage/drw/bsce/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 22.83 / 19.16 / 58.18 / 53.51
CE
InfluenceBalancedLoss
  1. CONFIG:
    • configs/cao_cifar/two_stage/drw/ibloss/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 23.34 / 19.08 / 59.21 / 54.54
DRS
  • DRS (deferred re-sampling) keeps vanilla instance-balanced sampling during the first stage and switches to a re-balanced sampler in the later epochs.
Each entry lists the first-stage and second-stage methods; the four numbers at the end are the top-1 error rates (%) on CIFAR-10-LT-100, CIFAR-10-LT-50, CIFAR-100-LT-100, and CIFAR-100-LT-50, in that order.
CE
Vanilla sampling
  1. CONFIG (from left to right):
    • configs/cao_cifar/baseline/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 29.64 / 25.19 / 61.68 / 56.15
CE
Square-root sampling
  1. CONFIG:
    • configs/cao_cifar/two_stage/drs/squre/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 26.78 / 21.15 / 59.64 / 55.46
CE
Progressively-balanced sampling
  1. CONFIG:
    • configs/cao_cifar/two_stage/drs/progressive/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 26.15 / 19.47 / 59.59 / 54.79
CE
Class-balanced sampling
  1. CONFIG:
    • configs/cao_cifar/two_stage/drs/balance/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 24.93 / 19.27 / 59.36 / 54.58
CE
CAM-based square-sampling
  1. CONFIG:
    • FIRST-STAGE-CONFIG: configs/cao_cifar/two_stage/drs/cam_based_sampling/first_stage/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}
    • CAM-GENERATION-CONFIG: configs/cao_cifar/two_stage/drs/cam_based_sampling/cam_generation/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}
    • SECOND-STAGE-CONFIG: configs/cao_cifar/two_stage/drs/cam_based_sampling/second_stage/square/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • Training takes three steps: run the codebase with the first-stage, CAM-generation, and second-stage configs, in that order.
      • bash data_parallel_train.sh FIRST-STAGE-CONFIG GPU
        bash data_parallel_train.sh CAM-GENERATION-CONFIG GPU
        bash data_parallel_train.sh SECOND-STAGE-CONFIG GPU
Results: 26.45 / 20.46 / 59.33 / 54.58
CE
CAM-based progressive-sampling
  1. CONFIG:
    • FIRST-STAGE-CONFIG: configs/cao_cifar/two_stage/drs/cam_based_sampling/first_stage/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}
    • CAM-GENERATION-CONFIG: configs/cao_cifar/two_stage/drs/cam_based_sampling/cam_generation/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}
    • SECOND-STAGE-CONFIG: configs/cao_cifar/two_stage/drs/cam_based_sampling/second_stage/progressive/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • Training takes three steps: run the codebase with the first-stage, CAM-generation, and second-stage configs, in that order.
      • bash data_parallel_train.sh FIRST-STAGE-CONFIG GPU
        bash data_parallel_train.sh CAM-GENERATION-CONFIG GPU
        bash data_parallel_train.sh SECOND-STAGE-CONFIG GPU
Results: 27.08 / 20.76 / 58.92 / 53.90
CE
CAM-based balance-sampling
  1. CONFIG:
    • FIRST-STAGE-CONFIG: configs/cao_cifar/two_stage/drs/cam_based_sampling/first_stage/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}
    • CAM-GENERATION-CONFIG: configs/cao_cifar/two_stage/drs/cam_based_sampling/cam_generation/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}
    • SECOND-STAGE-CONFIG: configs/cao_cifar/two_stage/drs/cam_based_sampling/second_stage/balance/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • Training takes three steps: run the codebase with the first-stage, CAM-generation, and second-stage configs, in that order.
      • bash data_parallel_train.sh FIRST-STAGE-CONFIG GPU
        bash data_parallel_train.sh CAM-GENERATION-CONFIG GPU
        bash data_parallel_train.sh SECOND-STAGE-CONFIG GPU
Results: 23.10 / 19.28 / 58.05 / 53.27

Classifier-balancing

  • Classifier-balancing is another way to balance the backbone and classifier. Unlike DRS and DRW, it first trains the backbone and then freezes it while re-balancing the classifier.

  • The classifier-balancing methods introduced in Kang et al., ICLR 2020, include tau_normalization, LWS, and cRT. See Section 4 of the paper for details of these methods.

  • tau_normalization, LWS, and cRT are implemented in network.py.

The four numbers at the end of each entry are the top-1 error rates (%) on CIFAR-10-LT-100, CIFAR-10-LT-50, CIFAR-100-LT-100, and CIFAR-100-LT-50, in that order.
Baseline
  1. CONFIG (from left to right):
    • configs/cao_cifar/baseline/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 29.64 / 25.19 / 61.68 / 56.15
Tau_normalization
  1. Introduction:
    • The tau_normalization paper link: Kang et al., ICLR 2020.
    • When using tau_normalization, you first need a trained model; then set TEST.MODEL_FILE to your own checkpoint path. A minimal sketch of the normalization follows this entry.

  2. CONFIG:
    • configs/cao_cifar/two_stage/classifier_balance/tau_norm/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  3. Running commands:
    • python main/valid.py --cfg CONFIG --gpus GPU
Results: 27.30 / 20.22 / 59.04 / 54.31
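A minimal tau-normalization sketch: the trained classifier's per-class weight vectors are rescaled by their norms raised to the power tau, with no retraining. `model.fc` and `tau=0.9` are assumptions; the repository's own logic lives in network.py:

```python
import torch

def tau_normalize(fc_weight, tau=1.0):
    """Divide each class's weight vector by ||w_j|| ** tau."""
    norms = fc_weight.norm(p=2, dim=1, keepdim=True)
    return fc_weight / norms.pow(tau)

# Hypothetical usage on a trained model's last linear layer:
# model.fc.weight.data = tau_normalize(model.fc.weight.data, tau=0.9)
```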
cRT (Classifier Re-training)
  1. Introduction:
    • The cRT paper link: Kang et al., ICLR 2020.
    • When using cRT, you first need a trained model; then set NETWORK.PRETRAINED_MODEL to your own checkpoint path (a sketch follows this entry).

  2. CONFIG:
    • configs/cao_cifar/two_stage/classifier_balance/cRT/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  3. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 25.01 / 19.83 / 59.74 / 54.64
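A minimal cRT sketch: freeze the trained backbone, re-initialize the classifier, and retrain only the classifier with class-balanced sampling (e.g. the q = 0 sampler sketched in the Re-sampling section). `model.fc` and `feat_dim` are assumed names, not the repository's API:

```python
import torch.nn as nn

def prepare_crt(model, num_classes, feat_dim):
    """Freeze all trained parameters and attach a fresh, trainable classifier."""
    for p in model.parameters():
        p.requires_grad = False
    model.fc = nn.Linear(feat_dim, num_classes)  # only this layer is optimized in the second stage
    return model
```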
LWS (Learnable Weight Scaling)
  1. Introduction:
    • The LWS paper link: Kang et al., ICLR 2020.
    • When using LWS, you first need a trained model; then set NETWORK.PRETRAINED_MODEL to your own checkpoint path.

  2. CONFIG:
    • configs/cao_cifar/two_stage/classifier_balance/LWS/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  3. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 27.37 / 21.34 / 59.84 / 54.79

Knowledge distillation and knowledge transfer

The four numbers at the end of each entry are the top-1 error rates (%) on CIFAR-10-LT-100, CIFAR-10-LT-50, CIFAR-100-LT-100, and CIFAR-100-LT-50, in that order.
Baseline
  1. CONFIG (from left to right):
    • configs/cao_cifar/baseline/{cifar10_im100.yaml, cifar10_im50.yaml, cifar100_im100.yaml, cifar100_im50.yaml}

  2. Running commands:
    • bash data_parallel_train.sh CONFIG GPU
Results: 29.64 / 25.19 / 61.68 / 56.15
DiVE
  1. Introduction:
    • The DiVE paper link: He et al., ICCV 2021.
    • When using DiVE, first train a teacher model and then use the trained teacher to distill a student model.

  2. CONFIG:
    • Teacher (step 1): configs/cao_cifar/DiVE/{cifar10_im100, cifar10_im50, cifar100_im100, cifar100_im50}/teacher.yaml
    • Student (step 2): configs/cao_cifar/DiVE/{cifar10_im100, cifar10_im50, cifar100_im100, cifar100_im50}/student.yaml

Results: 21.12 / 17.56 / 54.48 / 49.17