This repo contains the implementation of CMC, which learns representations from multiview data in a self-supervised way. By "multiview" we mean multi-sensory data, multi-modal data, or literally multiple viewpoints; what counts as a "view" is flexible. The method is introduced in:
"Contrastive Multiview Coding" Paper, Project Page.
(1) Representation quality as a function of number of contrasted views.
We found that the more views we train with, the better the representation of each single view.
(2) Contrastive objective vs. Predictive objective
We compare the contrastive objective to cross-view prediction, finding an advantage to the contrastive approach.
(3) Unsupervised vs. Supervised
ResNet-101 and ResNet-50 trained with our unsupervised CMC objective surpass a supervised AlexNet on ImageNet classification (60.1% and 63.0% vs. 59.3%). For the first time on ImageNet classification, unsupervised methods surpass the classic supervised AlexNet proposed in 2012 (CPC++ and DIM++ also achieve this milestone concurrently).
This repo was tested with Ubuntu 16.04.5 LTS, Python 3.5, PyTorch 0.4.0, and CUDA 9.0, but it should also run with more recent PyTorch versions (>= 0.4.0).
Note: It seems to us that training with PyTorch >= 1.0 yields slightly worse results. If you observe a similar discrepancy and figure out the cause, please report it, since we are trying to fix it as well.
NCE flags:
--nce_k: number of negatives to contrast for each positive. Default: 4096
--nce_m: the momentum for dynamically updating the memory. Default: 0.5
--nce_t: temperature that modulates the distribution. Default: 0.07 for ImageNet, 0.1 for STL-10
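To make the three NCE flags concrete, here is a minimal pure-Python sketch. It is not the repo's actual criterion: it assumes a simplified softmax (InfoNCE-style) loss over one positive and the negatives, and it assumes the memory is updated as a momentum-weighted mix of old and new features followed by L2 normalization. The function names contrastive_loss, update_memory, and l2_normalize are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit L2 norm (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def contrastive_loss(anchor, positive, negatives, nce_t=0.07):
    """Cross-entropy of picking the positive among 1 + len(negatives) candidates.

    Assumption: a softmax-style contrastive loss, used here only to illustrate
    the role of the number of negatives (--nce_k) and the temperature (--nce_t).
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    # Similarity scores, sharpened by dividing by the temperature.
    logits = [dot(anchor, positive) / nce_t]
    logits += [dot(anchor, neg) / nce_t for neg in negatives]

    # Numerically stable log-sum-exp.
    top = max(logits)
    log_sum = top + math.log(sum(math.exp(x - top) for x in logits))
    return log_sum - logits[0]


def update_memory(old_feature, new_feature, nce_m=0.5):
    """Momentum update of one stored feature (assumed rule, see lead-in)."""
    mixed = [nce_m * o + (1.0 - nce_m) * n
             for o, n in zip(old_feature, new_feature)]
    return l2_normalize(mixed)


# Tiny demo with 2-D unit vectors and two negatives (so nce_k would be 2).
anchor = [1.0, 0.0]
negatives = [[0.0, 1.0], [-1.0, 0.0]]
loss_matching = contrastive_loss(anchor, [1.0, 0.0], negatives)
loss_mismatched = contrastive_loss(anchor, [0.0, 1.0], negatives)
refreshed = update_memory([1.0, 0.0], [0.0, 1.0], nce_m=0.5)
```

In this sketch a lower temperature makes the loss more sensitive to small similarity differences, and a larger momentum makes stored features change more slowly between updates.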
Path flags:
--data_folder: specify the ImageNet data folder.
--model_path: specify the path to save model.
--tb_path: specify where to save tensorboard monitoring events.
An example command line for training CMC (default: AlexNet on a single GPU):
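For readers who want to see how such flags fit together, the following is a hypothetical argparse sketch that mirrors the flag names and defaults documented above. It is not copied from the training script; in particular, marking the path flags as required is an assumption made for illustration.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented NCE and path flags."""
    parser = argparse.ArgumentParser(
        description="Sketch of CMC training flags (illustrative only)")

    # NCE flags, with the defaults stated in this README.
    parser.add_argument("--nce_k", type=int, default=4096,
                        help="number of negatives per positive")
    parser.add_argument("--nce_m", type=float, default=0.5,
                        help="momentum for updating the memory")
    parser.add_argument("--nce_t", type=float, default=0.07,
                        help="temperature (0.07 for ImageNet, 0.1 for STL-10)")

    # Path flags; 'required=True' is an assumption of this sketch.
    parser.add_argument("--data_folder", type=str, required=True,
                        help="ImageNet data folder")
    parser.add_argument("--model_path", type=str, required=True,
                        help="where to save the model")
    parser.add_argument("--tb_path", type=str, required=True,
                        help="where to save tensorboard events")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is testable.
args = build_parser().parse_args([
    "--data_folder", "path/to/data",
    "--model_path", "path/to/save",
    "--tb_path", "path/to/tensorboard",
])
```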
CUDA_VISIBLE_DEVICES=0 python train_CMC.py --data_folder path/to/data --model_path path/to/save --tb_path path/to/tensorboard
By default, the training scripts use the L and ab channels as the two views to contrast with each other. If you want to use other image channels as views, simply modify the code here.
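As an illustration of what "L and ab as two views" means, here is a self-contained sketch of the standard sRGB to CIE Lab conversion (D65 white point) for individual pixels, followed by a split into a one-channel L view and a two-channel ab view. The helper names rgb_to_lab and split_views are hypothetical, and the repo's own color transform may be implemented differently (for example with an image library).

```python
def _srgb_to_linear(c):
    """Undo the sRGB gamma curve for one component in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _lab_f(t):
    """Piecewise cube-root function used by the CIE Lab definition."""
    delta = 6.0 / 29.0
    if t > delta ** 3:
        return t ** (1.0 / 3.0)
    return t / (3.0 * delta ** 2) + 4.0 / 29.0


def rgb_to_lab(red, green, blue):
    """Convert one sRGB pixel (components in [0, 1]) to (L, a, b)."""
    r, g, bl = (_srgb_to_linear(c) for c in (red, green, blue))

    # Linear RGB -> CIE XYZ (sRGB primaries, D65).
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * bl
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * bl
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * bl

    # Normalize by the D65 reference white, then apply the Lab formulas.
    fx, fy, fz = _lab_f(x / 0.95047), _lab_f(y / 1.0), _lab_f(z / 1.08883)
    return (116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz))


def split_views(pixels):
    """Split a list of sRGB pixels into an L view and an ab view."""
    lab = [rgb_to_lab(*p) for p in pixels]
    l_view = [(l,) for l, _, _ in lab]
    ab_view = [(a, b) for _, a, b in lab]
    return l_view, ab_view


# White, black, and pure red pixels.
l_view, ab_view = split_views([(1.0, 1.0, 1.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)])
```

Choosing different views would amount to replacing split_views with another way of partitioning the channels, for example a different color space.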
Path flags:
--data_folder: specify the ImageNet data folder. Should be the same as above.
--save_path: specify the path to save the linear classifier.
--tb_path: specify where to save tensorboard events monitoring linear classifier training.
Specify the checkpoint that you want to evaluate with the --model_path flag; this path should point directly to the .pth file.
Therefore, an example command line for evaluating, say, ./models/ckpt.pth should look like:
CUDA_VISIBLE_DEVICES=0 python LinearProbing.py --data_folder path/to/data --save_path path/to/save --tb_path path/to/tensorboard --model_path ./models/ckpt.pth
If you find this repo useful for your research, please consider citing the paper:
@article{tian2019contrastive,
title={Contrastive Multiview Coding},
author={Tian, Yonglong and Krishnan, Dilip and Isola, Phillip},
journal={arXiv preprint arXiv:1906.05849},
year={2019}
}
For any questions, please contact Yonglong Tian ([email protected]).
Part of this code is inspired by Zhirong Wu's unsupervised learning algorithm lemniscate.