This is the code for our paper "Automated Domain Discovery from Multiple Sources to Improve Zero-Shot Generalization" (https://arxiv.org/abs/2112.09802).
MulDEns is built on top of DomainBed, a PyTorch suite of benchmark datasets and algorithms for domain generalization, introduced in "In Search of Lost Domain Generalization" (https://arxiv.org/abs/2007.01434).
Download the datasets:
python -m domainbed.scripts.download \
--data_dir='DATA'
Train a MULDENS ensemble with M=2 models:
python3 -m domainbed.scripts.train_aug --data_dir='DATA' \
       --dataset OfficeHome --test_env 0 --trial_seed 0 \
       --output_dir='muldens_ouputs/OfficeHome_M2/env0/trial_seed0/' \
       --hparams='{"batch_size":32,"data_augmentation":1,"MULDENS_num_models":2}'
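To repeat the training command for every held-out environment and several trial seeds, the invocations can be generated rather than typed by hand. The following is a minimal sketch, not part of the repository: the helper names `build_train_command` and `build_sweep` are hypothetical, the output path simply follows the pattern of the example command, and it assumes OfficeHome's four environments are indexed 0 to 3 as in DomainBed. It only prints the commands; it does not launch them.

```python
import json
import shlex

# Assumption: OfficeHome has four environments (Art, Clipart, Product,
# Real World), indexed 0-3 as in DomainBed.
NUM_ENVS = 4


def build_train_command(test_env, trial_seed, num_models=2, data_dir="DATA"):
    """Return the train_aug command (as an argument list) for one run."""
    hparams = {
        "batch_size": 32,
        "data_augmentation": 1,
        "MULDENS_num_models": num_models,
    }
    # Output path follows the pattern of the example command in this README.
    output_dir = (
        f"muldens_ouputs/OfficeHome_M{num_models}/"
        f"env{test_env}/trial_seed{trial_seed}/"
    )
    return [
        "python3", "-m", "domainbed.scripts.train_aug",
        f"--data_dir={data_dir}",
        "--dataset", "OfficeHome",
        "--test_env", str(test_env),
        "--trial_seed", str(trial_seed),
        f"--output_dir={output_dir}",
        # json.dumps produces valid JSON with double-quoted keys.
        "--hparams=" + json.dumps(hparams, separators=(",", ":")),
    ]


def build_sweep(trial_seeds=(0,), num_models=2):
    """Return one command per (trial_seed, test_env) pair."""
    return [
        build_train_command(env, seed, num_models)
        for seed in trial_seeds
        for env in range(NUM_ENVS)
    ]


if __name__ == "__main__":
    for cmd in build_sweep():
        # shlex.join quotes the JSON so the line can be pasted into a shell.
        print(shlex.join(cmd))
```

Building the `--hparams` value with `json.dumps` avoids quoting mistakes in the JSON string, and `shlex.join` (Python 3.8+) takes care of shell quoting when the command is printed.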
- MULDENS is implemented in domainbed/algorithms.py.
- Training runs through domainbed/scripts/train_aug.py and domainbed/lib/misc_aug.py.
- To load saved checkpoints and re-evaluate them, use domainbed/scripts/eval_muldens_aug.py.
We use two model selection criteria: Overall Average and Overall Ensemble. See the paper for details.
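This README does not define the two criteria, so the sketch below only illustrates one plausible reading of the names: scoring a checkpoint by the mean of the individual members' validation accuracies versus scoring it by the accuracy of the combined ensemble (here, averaged class probabilities). This is an assumption, not the repository's implementation; the function names and toy data are hypothetical, and the paper remains the authoritative definition.

```python
def argmax(values):
    """Index of the largest entry in a list."""
    return max(range(len(values)), key=values.__getitem__)


def accuracy(predicted_labels, labels):
    """Fraction of predictions that match the labels."""
    correct = sum(p == y for p, y in zip(predicted_labels, labels))
    return correct / len(labels)


def average_of_member_accuracies(member_probs, labels):
    """Mean of the accuracies each ensemble member achieves on its own."""
    accs = [
        accuracy([argmax(p) for p in probs], labels)
        for probs in member_probs
    ]
    return sum(accs) / len(accs)


def ensemble_accuracy(member_probs, labels):
    """Accuracy after averaging the members' class probabilities."""
    num_members = len(member_probs)
    preds = []
    # zip(*member_probs) groups the members' outputs for the same example.
    for per_example in zip(*member_probs):
        num_classes = len(per_example[0])
        mean_probs = [
            sum(p[c] for p in per_example) / num_members
            for c in range(num_classes)
        ]
        preds.append(argmax(mean_probs))
    return accuracy(preds, labels)


# Toy validation set: 3 examples, 2 classes, M=2 members (made-up numbers).
labels = [0, 1, 1]
member_probs = [
    [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]],  # member 1: wrong on example 2
    [[0.4, 0.6], [0.1, 0.9], [0.3, 0.7]],  # member 2: wrong on example 1
]

avg_score = average_of_member_accuracies(member_probs, labels)
ens_score = ensemble_accuracy(member_probs, labels)
```

In this toy case each member gets two of three examples right, while the averaged ensemble gets all three, so the two scores can rank checkpoints differently.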
We report baseline results from the "In Search of Lost Domain Generalization" paper (https://openreview.net/pdf?id=lQdXeXDoWtI) and its GitHub repository. Full results for commit 7df6f06 in LaTeX format are available here.