`pip install -e .`
Place datasets in `data/datasets/DATASET`, where `DATASET` is the name of the dataset.
Complete all TODOs in `preprocess/`, then run `python -m ppgs.preprocess DATASET`. All preprocessed data is saved in `data/cache/DATASET`.
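The preprocessing TODOs generally amount to a function that reads each file from the dataset directory, computes features, and writes the result to the cache. Here is a minimal stdlib sketch of that shape; the function name, file format, and "feature" transform are illustrative assumptions, not the repo's actual API:

```python
import json
from pathlib import Path


def preprocess(dataset, input_dir='data/datasets', cache_dir='data/cache'):
    """Sketch: run each example through a placeholder feature transform.

    NOTE: Illustrative only. The real preprocessing lives in preprocess/
    and is invoked via `python -m ppgs.preprocess DATASET`.
    """
    input_directory = Path(input_dir) / dataset
    output_directory = Path(cache_dir) / dataset
    output_directory.mkdir(parents=True, exist_ok=True)

    for file in sorted(input_directory.glob('*.json')):
        samples = json.loads(file.read_text())

        # Placeholder "feature extraction": mean-center the values
        mean = sum(samples) / len(samples)
        features = [x - mean for x in samples]

        # Mirror the input filename into the cache
        (output_directory / file.name).write_text(json.dumps(features))
```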
Complete all TODOs in `partition/`, then run `python -m ppgs.partition DATASET`.
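A typical partition step shuffles the cached file stems and splits them into train/valid/test sets saved as JSON. A stdlib sketch of that idea follows; the split ratios, filenames, and output location are assumptions, not the repo's actual conventions:

```python
import json
import random
from pathlib import Path


def partition(dataset, cache_dir='data/cache', seed=0):
    """Sketch: split cached file stems into train/valid/test partitions.

    NOTE: Illustrative only. The real logic belongs in partition/ and is
    run via `python -m ppgs.partition DATASET`. An 80/10/10 split and a
    partition.json output file are assumptions.
    """
    directory = Path(cache_dir) / dataset
    stems = sorted(f.stem for f in directory.glob('*.json'))

    # Deterministic shuffle so partitions are reproducible
    random.Random(seed).shuffle(stems)

    n = len(stems)
    partitions = {
        'train': stems[:int(.8 * n)],
        'valid': stems[int(.8 * n):int(.9 * n)],
        'test': stems[int(.9 * n):]}

    (directory / 'partition.json').write_text(json.dumps(partitions, indent=4))
    return partitions
```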
Complete all TODOs in `data/` and `model.py`, then run `python -m ppgs.train --config <config> --dataset DATASET --gpus <gpus>`.
Complete all TODOs in `evaluate/`, then run `python -m ppgs.evaluate --datasets <datasets> --checkpoint <checkpoint> --gpu <gpu>`.
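Evaluation TODOs usually reduce to accumulating a metric over batches and reporting a single number per dataset. A minimal stdlib sketch of such an accumulator (the class name and API are assumptions, not the repo's):

```python
class Accuracy:
    """Sketch: running accuracy accumulated over batches of predictions."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, predictions, targets):
        # Count elementwise matches in this batch
        self.correct += sum(p == t for p, t in zip(predictions, targets))
        self.total += len(targets)

    def __call__(self):
        # Final metric value; None if no batches were seen
        return self.correct / self.total if self.total else None
```

The same shape (an `update` per batch, a final call for the value) extends naturally to other metrics you might report per dataset.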
Run `tensorboard --logdir runs/`. If you are running training remotely, you must create an SSH connection with port forwarding to view Tensorboard. This can be done with `ssh -L 6006:localhost:6006 <user>@<server-ip-address>`. Then, open `localhost:6006` in your browser.
Tests are written using `pytest`. Run `pip install pytest` to install `pytest`.
Complete all TODOs in `test_model.py` and `test_data.py`, then run `pytest`.
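As an example of the kind of project-specific test you might add, here is a pytest-style check that no file appears in more than one data partition. The hard-coded partition mapping is a stand-in; in practice you would load it from the cache:

```python
def test_partitions_disjoint():
    # Hypothetical partition mapping; in a real test, load it from
    # the partition file your partition step produces
    partitions = {
        'train': ['a', 'b', 'c'],
        'valid': ['d'],
        'test': ['e']}

    stems = [stem for part in partitions.values() for stem in part]

    # No stem may appear in more than one partition
    assert len(stems) == len(set(stems))
```

`pytest` discovers any function whose name starts with `test_`, so dropping a file like this into the test directory is enough for it to run.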
Adding project-specific tests for preprocessing, training, and inference is encouraged.
This directory is for package data. When you pip install a package, pip will automatically copy the Python files to the installation folder (in `site-packages`). Pip will not automatically copy files that are not Python files. So if your code depends on non-Python files to run (e.g., a pretrained model, normalizing statistics, or data partitions), you have to manually specify these files in `setup.py`. This is done for you in this repo. In general, only small files that are essential at runtime should be placed in this folder.
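The relevant part of `setup.py` might look like the following sketch (the package name and glob patterns are illustrative; check this repo's actual `setup.py` for the real values):

```python
from setuptools import find_packages, setup

setup(
    name='ppgs',
    packages=find_packages(),
    # Bundle non-Python runtime files that live inside the package.
    # Globs are relative to the package directory (paths illustrative).
    package_data={'ppgs': ['assets/*', 'assets/**/*']})
```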
Code release involves making sure that `setup.py` is up-to-date and then uploading your code to PyPI. Here is a good tutorial for this process.