This project is a showcase for the `dvc exp` commands, which manage a large
number of experiments. It trains a CNN on the Fashion-MNIST dataset using
TensorFlow.
### Installation Instructions
After installing DVC and cloning the repository, you can run:
virtualenv .venv
. .venv/bin/activate
pip install -r requirements.txt
Retrieve all the required data and model files:
dvc pull
You can run the experiment defined in the project:
dvc exp run
This command, new in DVC 2.0, also lets you change parameters on the fly with the `--set-param` option:
dvc exp run --set-param model.conv_units=128
`params.yaml` defines two parameters that can be modified with the
`--set-param`/`-S` option of `dvc exp run`. The above command updates
`params.yaml` with `conv_units: 128` before running the experiment.
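To make the effect of `-S model.conv_units=128` concrete, here is a minimal Python sketch of how a dotted key can be mapped onto the nested structure of `params.yaml`. It only illustrates the result and is not DVC's implementation. The helper name `apply_override` and its simple value parsing are assumptions made for this example.

```python
import copy


def apply_override(params, assignment):
    """Return a copy of params with one 'dotted.key=value' override applied."""
    dotted_key, raw_value = assignment.split("=", 1)
    *parents, leaf = dotted_key.split(".")
    updated = copy.deepcopy(params)
    node = updated
    # Walk down the nested mapping, creating levels that do not exist yet.
    for key in parents:
        node = node.setdefault(key, {})
    # Illustrative value parsing only: try int, then float, else keep the string.
    try:
        value = int(raw_value)
    except ValueError:
        try:
            value = float(raw_value)
        except ValueError:
            value = raw_value
    node[leaf] = value
    return updated


# The two parameters of this project, as a nested mapping.
params = {"train": {"epochs": 1}, "model": {"conv_units": 16}}
updated = apply_override(params, "model.conv_units=128")
print(updated["model"])
```

The original mapping is left untouched, so the same helper can be used to preview several overrides side by side.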
The experiment produces `metrics.json` along with the model file
`models/model.h5`.
You can check the changes in metrics:
dvc exp diff
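As a rough picture of the comparison described above, the following Python sketch computes the change in each metric between two runs. It assumes `metrics.json` holds a flat mapping with the `loss` and `acc` keys; the function name `metric_changes` and the numbers are hypothetical and are not results from this project.

```python
def metric_changes(baseline, experiment):
    """Map each shared metric name to (old, new, new - old)."""
    return {
        name: (baseline[name], experiment[name], experiment[name] - baseline[name])
        for name in sorted(baseline.keys() & experiment.keys())
    }


# Hypothetical metric values, chosen only to illustrate the comparison.
baseline = {"loss": 0.5, "acc": 0.75}
experiment = {"loss": 0.25, "acc": 0.875}

for name, (old, new, change) in metric_changes(baseline, experiment).items():
    print(f"{name}: {old} -> {new} ({change:+.3f})")
```

In practice the two mappings would come from `json.load` on the `metrics.json` files of the two runs being compared.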
It's also possible to queue experiments with the `--queue` option and run them
all in a single batch with `--run-all`.
dvc exp run --queue -S model.conv_units=32
dvc exp run --queue -S model.conv_units=64
dvc exp run --queue -S model.conv_units=96
The queued experiments can be run in parallel with `--jobs`:
dvc exp run --run-all --jobs 2
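When the list of values grows, the queue commands can be generated instead of typed by hand. The sketch below builds the same command lines as above for a list of `model.conv_units` values. It only prints them; the helper name `queue_commands` is an assumption made for this example.

```python
import shlex


def queue_commands(param, values):
    """Build one `dvc exp run --queue` argument list per parameter value."""
    return [
        ["dvc", "exp", "run", "--queue", "-S", f"{param}={value}"]
        for value in values
    ]


commands = queue_commands("model.conv_units", [32, 64, 96])
# Final step: run everything that was queued, two jobs at a time.
commands.append(["dvc", "exp", "run", "--run-all", "--jobs", "2"])

for argv in commands:
    # Printed rather than executed, so the sketch runs without DVC installed.
    print(shlex.join(argv))
```

To actually run the commands, each `argv` could be passed to `subprocess.run(argv, check=True)` from inside the repository.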
You can get the summary of experiments with:
dvc exp show
Limit the parameters and metrics shown with the `--include-params` and
`--include-metrics` options, respectively.
By default, experiments are given auto-generated names derived from their inputs
and environment. They may be easier to review if you name them with the
`--name`/`-n` option:
dvc exp run -n my-baseline-experiment
Artifacts produced by experiments are normally not checked out to the repository. If you want to do so, you can use:
dvc exp apply exp-123456
where `exp-123456` is the experiment ID shown by `dvc exp show`.
Then, you can use DVC and Git commands on the artifacts and code as usual.
You can push and pull the changes related to an experiment with `dvc exp push`
and `dvc exp pull`, respectively. These two commands work with Git remotes and
DVC remotes together. Changes in the text files tracked by Git are transferred
to and from Git repositories, and binary files tracked by DVC are transferred
to and from DVC remotes.
You can clean up the unused experiments with:
dvc exp gc --workspace
There are two parameters in the project, both set in `params.yaml`.
`model.conv_units` defines the number of convolutional units in the model, and
`train.epochs` sets the number of epochs to train the model.
train:
  epochs: 1
model:
  conv_units: 16
You can select these parameters in `dvc exp show` with `--include-params`.
There are two metrics produced by the training stage:

- `loss`: Categorical cross-entropy loss value.
- `acc`: Categorical accuracy across the classes.
You can select these metrics in `dvc exp show` with `--include-metrics`.
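For readers unfamiliar with the two metrics, the following pure-Python sketch shows how categorical cross-entropy and categorical accuracy are commonly defined for one-hot labels. It assumes the project's `loss` and `acc` follow these standard definitions, which the training code would confirm. The toy labels and probabilities are made up for illustration.

```python
import math


def categorical_crossentropy(y_true, y_pred):
    """Mean over samples of -sum(true * log(pred)) across classes."""
    eps = 1e-12  # guards against log(0) for zero probabilities
    per_sample = [
        -sum(t * math.log(max(p, eps)) for t, p in zip(true_row, pred_row))
        for true_row, pred_row in zip(y_true, y_pred)
    ]
    return sum(per_sample) / len(per_sample)


def categorical_accuracy(y_true, y_pred):
    """Fraction of samples whose highest-probability class matches the label."""
    hits = sum(
        true_row.index(max(true_row)) == pred_row.index(max(pred_row))
        for true_row, pred_row in zip(y_true, y_pred)
    )
    return hits / len(y_true)


# Toy one-hot labels and predicted probabilities for three classes.
y_true = [[1, 0, 0], [0, 1, 0]]
y_pred = [[0.5, 0.25, 0.25], [0.5, 0.25, 0.25]]

loss = categorical_crossentropy(y_true, y_pred)
acc = categorical_accuracy(y_true, y_pred)
print(f"loss={loss:.4f} acc={acc:.2f}")
```

A lower `loss` means the predicted probabilities put more weight on the correct class, while `acc` only counts whether the top prediction is right.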
The data files used in the project are found in `data/images`. All of these
files are tracked by DVC and should be retrieved from the configured remote
using `dvc pull`.
This repository is generated by example-repos-dev. For pull requests with
fixes, please use that repository.