Merge branch 'dev-ssl' into master
JunguangJiang authored Oct 27, 2021
2 parents 68e84a4 + 230df3a commit f6e9262
Showing 39 changed files with 3,601 additions and 33 deletions.
2 changes: 1 addition & 1 deletion docs/dalib/benchmarks/re_identification.rst
@@ -16,7 +16,7 @@ We adopt cross dataset setting (another one is cross camera setting). The model

For a fair comparison, our model is trained with the standard cross-entropy loss and triplet loss. We adopt the modified ResNet architecture from `Mutual Mean-Teaching: Pseudo Label Refinery for Unsupervised Domain Adaptation on Person Re-identification (ICLR 2020) <https://arxiv.org/pdf/2001.01526.pdf>`_.

As we are given unlabeled samples from the target domain, we can utilize clustering algorithms to produce pseudo labels on the target domain and then use them as supervision signals to perform self-training. This simple method turns out to be a strong baseline. We use ``Baseline_Cluster`` to represent this baseline in our results.
As we are given unlabelled samples from the target domain, we can utilize clustering algorithms to produce pseudo labels on the target domain and then use them as supervision signals to perform self-training. This simple method turns out to be a strong baseline. We use ``Baseline_Cluster`` to represent this baseline in our results.
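
As a rough illustration of this baseline, the sketch below clusters pre-extracted target-domain features and uses the cluster ids as pseudo identities. The feature array, the choice of KMeans, and the number of clusters are illustrative assumptions, not the library's actual pipeline.

```python
# Illustrative sketch of clustering-based pseudo labels for self-training.
# Assumptions (not the library's actual pipeline): target-domain features are
# already extracted into an (N, D) array, KMeans stands in for whichever
# clustering algorithm is used, and the number of clusters is a guess.
import numpy as np
from sklearn.cluster import KMeans

def generate_pseudo_labels(target_features: np.ndarray, num_clusters: int) -> np.ndarray:
    """Cluster target-domain features; cluster ids serve as pseudo identities."""
    kmeans = KMeans(n_clusters=num_clusters, random_state=0).fit(target_features)
    return kmeans.labels_  # shape (N,), used as supervision for self-training

features = np.random.randn(1000, 256).astype(np.float32)  # stand-in for extracted features
pseudo_labels = generate_pseudo_labels(features, num_clusters=50)
```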

.. note::

5 changes: 4 additions & 1 deletion docs/index.rst
@@ -80,7 +80,10 @@ Transfer Learning
:caption: Semi Supervised Learning Methods
:titlesonly:

ssllib/semi_supervised_learning.rst
ssllib/consistency_regularization.rst
ssllib/contrastive_learning.rst
ssllib/holistic_methods.rst
ssllib/proxy_label.rst



42 changes: 42 additions & 0 deletions docs/ssllib/consistency_regularization.rst
@@ -0,0 +1,42 @@
=======================================
Consistency Regularization
=======================================

.. _PI_MODEL:

Pi Model
------------------

.. autofunction:: ssllib.pi_model.sigmoid_rampup

.. autofunction:: ssllib.pi_model.softmax_mse_loss

.. autofunction:: ssllib.pi_model.symmetric_mse_loss

.. autoclass:: ssllib.pi_model.SoftmaxMSELoss

.. autoclass:: ssllib.pi_model.SoftmaxKLLoss


.. _MEAN_TEACHER:

Mean Teacher
------------------

.. autofunction:: ssllib.mean_teacher.update_ema_variables

.. autoclass:: ssllib.mean_teacher.SymmetricMSELoss

.. autoclass:: ssllib.mean_teacher.MeanTeacher
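
For intuition, here is a minimal sketch of the exponential-moving-average teacher update that Mean Teacher relies on; the actual signature and ramp-up schedule of `ssllib.mean_teacher.update_ema_variables` may differ.

```python
# Sketch of the EMA teacher update behind Mean Teacher:
# teacher_param <- alpha * teacher_param + (1 - alpha) * student_param.
# Illustrative only; the real update_ema_variables in ssllib may differ.
import torch

@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module, alpha: float = 0.999) -> None:
    for t_param, s_param in zip(teacher.parameters(), student.parameters()):
        t_param.mul_(alpha).add_(s_param, alpha=1.0 - alpha)
```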


.. _UDA:

Unsupervised Data Augmentation (UDA)
------------------------------------

.. autoclass:: ssllib.rand_augment.RandAugment

.. autoclass:: ssllib.uda.SupervisedUDALoss

.. autoclass:: ssllib.uda.UnsupervisedUDALoss
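
As a rough sketch of the unsupervised consistency term UDA optimizes (not the actual `ssllib.uda.UnsupervisedUDALoss` API), sharpened predictions on the original unlabeled batch serve as fixed targets for predictions on a strongly augmented view:

```python
# Sketch of UDA's unsupervised consistency term: sharpened, detached predictions on the
# original unlabeled batch are targets for the strongly augmented view.
# The temperature, the 0.8 confidence mask, and the function name are illustrative
# choices, not necessarily what ssllib.uda.UnsupervisedUDALoss implements.
import torch
import torch.nn.functional as F

def uda_consistency_loss(logits_orig: torch.Tensor, logits_aug: torch.Tensor,
                         temperature: float = 0.4, threshold: float = 0.8) -> torch.Tensor:
    probs_orig = F.softmax(logits_orig.detach() / temperature, dim=1)   # sharpened targets
    mask = (probs_orig.max(dim=1).values >= threshold).float()          # confidence mask
    kl = F.kl_div(F.log_softmax(logits_aug, dim=1), probs_orig, reduction="none").sum(dim=1)
    return (kl * mask).mean()
```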
12 changes: 12 additions & 0 deletions docs/ssllib/contrastive_learning.rst
@@ -0,0 +1,12 @@
=======================================
Contrastive Learning
=======================================

.. _SELF_TUNING:

Self-Tuning
------------------

.. autoclass:: ssllib.self_tuning.Classifier

.. autoclass:: ssllib.self_tuning.SelfTuning
10 changes: 10 additions & 0 deletions docs/ssllib/holistic_methods.rst
@@ -0,0 +1,10 @@
=======================================
Holistic Methods
=======================================

.. _FIXMATCH:

FixMatch
------------------

.. autoclass:: ssllib.fix_match.FixMatchConsistencyLoss
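
The idea behind this loss can be sketched as follows; the function name and the 0.95 confidence threshold are illustrative assumptions, not necessarily what `ssllib.fix_match.FixMatchConsistencyLoss` implements:

```python
# Sketch of the FixMatch consistency idea: confident argmax predictions on the weakly
# augmented view become hard pseudo labels for the strongly augmented view.
import torch
import torch.nn.functional as F

def fixmatch_consistency(logits_weak: torch.Tensor, logits_strong: torch.Tensor,
                         threshold: float = 0.95) -> torch.Tensor:
    probs_weak = F.softmax(logits_weak.detach(), dim=1)
    confidence, pseudo_labels = probs_weak.max(dim=1)
    mask = (confidence >= threshold).float()            # keep only confident pseudo labels
    loss = F.cross_entropy(logits_strong, pseudo_labels, reduction="none")
    return (loss * mask).mean()
```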
12 changes: 12 additions & 0 deletions docs/ssllib/proxy_label.rst
@@ -0,0 +1,12 @@
=======================================
Proxy-Label Based Methods
=======================================

.. _PSEUDO:

Pseudo Label
------------------

Given model predictions :math:`y` on unlabeled samples, we can directly utilize them to generate
pseudo labels :math:`label=\mathop{\arg\max}\limits_{i}~y[i]`. Then we use these pseudo labels as supervision to train
our model. Details can be found at `projects/self_tuning/pseudo_label.py`.
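
A minimal sketch of this step (illustrative only; see `projects/self_tuning/pseudo_label.py` for the actual implementation):

```python
# Minimal sketch of argmax pseudo labelling on an unlabeled batch (illustrative only).
import torch
import torch.nn.functional as F

def pseudo_label_loss(model: torch.nn.Module, unlabeled_images: torch.Tensor) -> torch.Tensor:
    logits = model(unlabeled_images)
    pseudo_labels = logits.detach().argmax(dim=1)   # label = argmax_i y[i]
    return F.cross_entropy(logits, pseudo_labels)   # train against the pseudo labels
```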
61 changes: 30 additions & 31 deletions examples/task_adaptation/image_classification/README.md
@@ -32,13 +32,12 @@ and prepare them following [Documentation for Retinopathy](/common/vision/datase

Supported methods include:

- [Explicit inductive bias for transfer learning with convolutional networks (L2-SP, ICML 2018)](https://arxiv.org/abs/1802.01483)
- [Catastrophic Forgetting Meets Negative Transfer: Batch Spectral Shrinkage for Safe Transfer Learning (BSS, NIPS 2019)](https://proceedings.neurips.cc/paper/2019/file/c6bff625bdb0393992c9d4db0c6bbe45-Paper.pdf)
- [DEep Learning Transfer using Feature Map with Attention for convolutional networks (DELTA, ICLR 2019)](https://openreview.net/pdf?id=rkgbwsAcYm)
- [Co-Tuning for Transfer Learning (Co-Tuning, NIPS 2020)](http://ise.thss.tsinghua.edu.cn/~mlong/doc/co-tuning-for-transfer-learning-nips20.pdf)
- [Stochastic Normalization (StochNorm, NIPS 2020)](https://papers.nips.cc/paper/2020/file/bc573864331a9e42e4511de6f678aa83-Paper.pdf)
- [Learning Without Forgetting (LWF, ECCV 2016)](https://arxiv.org/abs/1606.09282)
- [Bi-tuning of Pre-trained Representations (Bi-Tuning)](https://arxiv.org/abs/2011.06182?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+arxiv%2FQSXk+%28ExcitingAds%21+cs+updates+on+arXiv.org%29)

## Experiment and Results
@@ -78,52 +77,52 @@ If you use these methods in your research, please consider citing.

```
@inproceedings{LWF,
    author={Zhizhong Li and Derek Hoiem},
    title={Learning without Forgetting},
    booktitle={ECCV},
    year={2016}
}
@inproceedings{L2SP,
    title={Explicit inductive bias for transfer learning with convolutional networks},
    author={Xuhong, LI and Grandvalet, Yves and Davoine, Franck},
    booktitle={ICML},
    year={2018}
}
@inproceedings{BSS,
    title={Catastrophic forgetting meets negative transfer: Batch spectral shrinkage for safe transfer learning},
    author={Chen, Xinyang and Wang, Sinan and Fu, Bo and Long, Mingsheng and Wang, Jianmin},
    booktitle={NeurIPS},
    year={2019}
}
@inproceedings{DELTA,
    title={Delta: Deep learning transfer using feature map with attention for convolutional networks},
    author={Li, Xingjian and Xiong, Haoyi and Wang, Hanchao and Rao, Yuxuan and Liu, Liping and Chen, Zeyu and Huan, Jun},
    booktitle={ICLR},
    year={2019}
}
@inproceedings{StocNorm,
    title={Stochastic Normalization},
    author={Kou, Zhi and You, Kaichao and Long, Mingsheng and Wang, Jianmin},
    booktitle={NeurIPS},
    year={2020}
}
@inproceedings{CoTuning,
    title={Co-Tuning for Transfer Learning},
    author={You, Kaichao and Kou, Zhi and Long, Mingsheng and Wang, Jianmin},
    booktitle={NeurIPS},
    year={2020}
}
@article{BiTuning,
    title={Bi-tuning of Pre-trained Representations},
    author={Zhong, Jincheng and Wang, Ximei and Kou, Zhi and Wang, Jianmin and Long, Mingsheng},
    journal={arXiv preprint arXiv:2011.06182},
    year={2020}
}
```
6 changes: 6 additions & 0 deletions projects/README.md
@@ -0,0 +1,6 @@
Here are a few projects that are built on Trans-Learn. They are examples of how to use Trans-Learn as a library to
facilitate your own research.

## Projects by [THUML](https://github.com/thuml)

- [Self-Tuning for Data-Efficient Deep Learning (ICML 2021)](http://ise.thss.tsinghua.edu.cn/~mlong/doc/Self-Tuning-for-Data-Efficient-Deep-Learning-icml21.pdf)
105 changes: 105 additions & 0 deletions projects/self_tuning/README.md
@@ -0,0 +1,105 @@
# Self-Tuning

In this repository, we implement self-tuning and various SSL (semi-supervised learning) algorithms in Trans-Learn.

## Installation

Example scripts support all models in [PyTorch-Image-Models](https://github.com/rwightman/pytorch-image-models). You
need to install timm to use PyTorch-Image-Models.

```
pip install timm
```

## Dataset

The following datasets can be downloaded automatically:

- [CUB200](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html)
- [StanfordCars](https://ai.stanford.edu/~jkrause/cars/car_dataset.html)
- [Aircraft](https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/)

## Supported Methods

Supported methods include:

- Pseudo Label (directly utilize model predictions as pseudo labels on unlabeled samples)
- [Temporal Ensembling for Semi-Supervised Learning (pi model, ICLR 2017)](https://arxiv.org/abs/1610.02242)
- [Weight-averaged consistency targets improve semi-supervised deep learning results (mean teacher, NIPS 2017)](https://openreview.net/references/pdf?id=ry8u21rtl)
- [Unsupervised Data Augmentation for Consistency Training (uda, NIPS 2020)](https://proceedings.neurips.cc/paper/2020/file/44feb0096faa8326192570788b38c1d1-Paper.pdf)
- [FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence (FixMatch, NIPS 2020)](https://proceedings.neurips.cc/paper/2020/file/f7ac67a9aa8d255282de7d11391e1b69-Paper.pdf)
- [Self-Tuning for Data-Efficient Deep Learning (self-tuning, ICML 2021)](http://ise.thss.tsinghua.edu.cn/~mlong/doc/Self-Tuning-for-Data-Efficient-Deep-Learning-icml21.pdf)

## Experiments and Results

### SSL with supervised pre-trained model

The shell files give the scripts to reproduce our [results](benchmark.md) with the specified hyper-parameters. For example,
if you want to run the baseline on CUB200 with 15% labeled samples, use the following script:

```shell script
# SSL with ResNet50 backbone on CUB200.
# Assume you have put the datasets under the path `data/cub200`,
# or are fine with having them downloaded automatically from the Internet to this path
CUDA_VISIBLE_DEVICES=0 python baseline.py data/cub200 -d CUB200 -sr 15 --seed 0 --log logs/baseline/cub200_15
```

### SSL with unsupervised pre-trained model

Take MoCo as an example.

1. Download the MoCo pretrained checkpoints from https://github.com/facebookresearch/moco
2. Convert the MoCo checkpoints to the standard PyTorch format

```shell
mkdir checkpoints
python convert_moco_to_pretrained.py checkpoints/moco_v1_200ep_pretrain.pth.tar checkpoints/moco_v1_200ep_backbone.pth checkpoints/moco_v1_200ep_fc.pth
```
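
Roughly, such a conversion strips the MoCo-specific prefixes from the checkpoint's `state_dict`. The sketch below assumes the usual `module.encoder_q.` prefix of the official MoCo checkpoints and is not the actual `convert_moco_to_pretrained.py`:

```python
# Rough sketch of the conversion idea, NOT the actual convert_moco_to_pretrained.py:
# official MoCo checkpoints store query-encoder weights under "state_dict" with keys
# prefixed "module.encoder_q."; stripping that prefix yields a standard backbone state dict.
import torch

def convert_moco_backbone(moco_ckpt_path: str, backbone_out_path: str) -> None:
    checkpoint = torch.load(moco_ckpt_path, map_location="cpu")
    prefix = "module.encoder_q."
    backbone = {k[len(prefix):]: v for k, v in checkpoint["state_dict"].items()
                if k.startswith(prefix) and not k.startswith(prefix + "fc")}
    torch.save(backbone, backbone_out_path)
```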

3. Start training

```shell
CUDA_VISIBLE_DEVICES=0 python baseline.py data/cub200 -d CUB200 -sr 15 --seed 0 --log logs/baseline_moco/cub200_15 \
--pretrained checkpoints/moco_v1_200ep_backbone.pth
```

## TODO

Support datasets: CIFAR10, CIFAR100, ImageNet

## Citation

If you use these methods in your research, please consider citing.

```
@inproceedings{pi-model,
title={Temporal ensembling for semi-supervised learning},
author={Laine, Samuli and Aila, Timo},
booktitle={ICLR},
year={2017}
}
@inproceedings{mean-teacher,
title={Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results},
author={Tarvainen, Antti and Valpola, Harri},
booktitle={NIPS},
year={2017}
}
@inproceedings{uda,
title={Unsupervised data augmentation for consistency training},
author={Xie, Qizhe and Dai, Zihang and Hovy, Eduard and Luong, Minh-Thang and Le, Quoc V},
booktitle={NIPS},
year={2020}
}
@inproceedings{fixmatch,
title={Fixmatch: Simplifying semi-supervised learning with consistency and confidence},
author={Sohn, Kihyuk and Berthelot, David and Li, Chun-Liang and Zhang, Zizhao and Carlini, Nicholas and Cubuk, Ekin D and Kurakin, Alex and Zhang, Han and Raffel, Colin},
booktitle={NIPS},
year={2020}
}
@inproceedings{self-tuning,
title={Self-tuning for data-efficient deep learning},
author={Wang, Ximei and Gao, Jinghan and Long, Mingsheng and Wang, Jianmin},
booktitle={ICML},
year={2021},
}
```