Code accompanying the paper
The algorithm is based on continuous relaxation and gradient descent in the architecture space. It is able to efficiently design high-performance convolutional architectures for image classification (on CIFAR-10 and ImageNet) and recurrent architectures for language modeling (on Penn Treebank and WikiText-2). Only a single GPU is required.DARTS: Differentiable Architecture Search
Hanxiao Liu, Karen Simonyan, Yiming Yang.
arXiv:1806.09055.
Python >= 3.5.5, PyTorch == 0.3.1, torchvision == 0.2.0
NOTE: PyTorch 0.4 is not supported at this moment and would lead to OOM.
Instructions for acquiring PTB and WT2 can be found here. While CIFAR-10 can be automatically downloaded by torchvision, ImageNet needs to be manually downloaded (preferably to a SSD) following the instructions here.
To carry out architecture search using 1st-order approximation, run
cd cnn && python train_search.py # for conv cells on CIFAR-10
cd rnn && python train_search.py # for recurrent cells on PTB
2nd-order approximation can be enabled by adding the --unrolled
flag.
Snapshots of the most likely convolutional & recurrent cells over time:
To reproduce our results using the best cells, run
cd cnn && python train.py --auxiliary --cutout # CIFAR-10
cd rnn && python train.py # PTB
cd rnn && python train.py --data ../data/wikitext-2 \ # WT2
--dropouth 0.15 --emsize 700 --nhidlast 700 --nhid 700 --wdecay 5e-7
cd cnn && python train_imagenet.py --auxiliary # ImageNet
Customized architectures are supported through the --arch
flag once specified in genotypes.py
.
@article{liu2018darts,
title={DARTS: Differentiable Architecture Search},
author={Liu, Hanxiao and Simonyan, Karen and Yang, Yiming},
journal={arXiv preprint arXiv:1806.09055},
year={2018}
}