forked from grok-ai/nn-template
* Add dynamic badges for tests, docs and nn-core version
* Remove unnecessary information in the README
* Update README
* Update structure in README
* Update checks badges

Co-authored-by: Valentino Maiorca <[email protected]>
Showing 1 changed file with 96 additions and 166 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,197 +1,127 @@ | ||
# NN Template | ||
|
||
<p align="center"> | ||
<a href="https://pytorch.org/get-started/locally/"><img alt="PyTorch" src="https://img.shields.io/badge/-PyTorch-red?logo=pytorch&labelColor=gray"></a> | ||
<a href="https://pytorchlightning.ai/"><img alt="Lightning" src="https://img.shields.io/badge/code-Lightning-blueviolet"></a> | ||
<a href="https://hydra.cc/"><img alt="Conf: hydra" src="https://img.shields.io/badge/conf-hydra-blue"></a> | ||
<a href="https://wandb.ai/site"><img alt="Logging: wandb" src="https://img.shields.io/badge/logging-wandb-yellow"></a> | ||
<a href="https://dvc.org/"><img alt="Data: dvc" src="https://img.shields.io/badge/data-dvc-9cf"></a> | |
<a href="https://streamlit.io/"><img alt="UI: streamlit" src="https://img.shields.io/badge/ui-streamlit-orange"></a> | ||
<a href="https://github.com/grok-ai/nn-template/actions/workflows/test_suite.yml"><img alt="CI" src="https://img.shields.io/github/workflow/status/grok-ai/nn-template/Test%20Suite/main?label=main%20checks"></a> | |
<a href="https://github.com/grok-ai/nn-template/actions/workflows/test_suite.yml"><img alt="CI" src="https://img.shields.io/github/workflow/status/grok-ai/nn-template/Test%20Suite/develop?label=develop%20checks"></a> | |
<a href="https://grok-ai.github.io/nn-template"><img alt="Docs" src="https://img.shields.io/github/workflow/status/grok-ai/nn-template/pages%20build%20and%20deployment/gh-pages?label=docs"></a> | |
<a href="https://pypi.org/project/nn-template-core/"><img alt="Release" src="https://img.shields.io/pypi/v/nn-template-core?label=nn-core"></a> | ||
<a href="https://black.readthedocs.io/en/stable/"><img alt="Code style: black" src="https://img.shields.io/badge/code%20style-black-000000.svg"></a> | ||
</p> | ||
|
||
[comment]: <> (<p align="center">) | ||
|
||
Generic template to bootstrap your [PyTorch](https://pytorch.org/get-started/locally/) project. Click on [![](https://img.shields.io/badge/-Use_this_template-success?style=flat)](https://github.com/lucmos/nn-template/generate) and avoid writing boilerplate code for: | ||
[comment]: <> ( <a href="https://pytorch.org/get-started/locally/"><img alt="PyTorch" src="https://img.shields.io/badge/-PyTorch-red?logo=pytorch&labelColor=gray"></a>) | ||
|
||
- [PyTorch Lightning](https://github.com/PyTorchLightning/pytorch-lightning), lightweight PyTorch wrapper for high-performance AI research. | ||
- [Hydra](https://github.com/facebookresearch/hydra), a framework for elegantly configuring complex applications. | ||
- [DVC](https://dvc.org/doc/start/data-versioning), track large files, directories, or ML models. Think "Git for data". | ||
- [Weights and Biases](https://wandb.ai/home), organize and analyze machine learning experiments. *(educational account available)* | ||
- [Streamlit](https://streamlit.io/), turns data scripts into shareable web apps in minutes. | ||
|
||
*`nn-template`* is opinionated so you don't have to be. | ||
If you use this template, please add | ||
[![](https://shields.io/badge/-nn--template-emerald?style=flat&logo=github&labelColor=gray)](https://github.com/lucmos/nn-template) | ||
to your `README`. | ||
|
||
|
||
### Usage Examples | ||
|
||
Check out the [`mwe` branch](https://github.com/lucmos/nn-template/tree/mwe) to view a minimal working example on MNIST. | |
|
||
# Structure | ||
|
||
```bash | ||
. | ||
├── .cache | ||
├── conf # hydra compositional config | ||
│ ├── nn | ||
│ ├── default.yaml # current experiment configuration | ||
│ ├── hydra | ||
│ └── train | ||
├── data # datasets | ||
├── .env # system-specific env variables, e.g. PROJECT_ROOT | ||
├── requirements.txt # basic requirements | ||
├── src | ||
│ ├── common # common modules and utilities | ||
│ ├── data # PyTorch Lightning datamodules and datasets | ||
│ ├── modules # PyTorch Lightning modules | ||
│ ├── run.py # entry point to run current conf | ||
│ └── ui # interactive streamlit apps | ||
└── wandb # local experiments (auto-generated) | ||
``` | ||
|
||
# Streamlit | ||
[Streamlit](https://docs.streamlit.io/) is an open-source Python library that makes | ||
it easy to create and share beautiful, custom web apps for machine learning and data science. | ||
|
||
In just a few minutes, you can build and deploy powerful data apps to: | ||
|
||
- **Explore** your data | ||
- **Interact** with your model | ||
- **Analyze** your model behavior and input sensitivity | ||
- **Showcase** your prototype with [awesome web apps](https://streamlit.io/gallery) | ||
|
||
Moreover, Streamlit enables interactive development with automatic reruns on file changes. | |
|
||
Launch a minimal app with `PYTHONPATH=. streamlit run src/ui/run.py`. There is a built-in function to restore a model checkpoint stored on W&B, with automatic download if the checkpoint is not present on the local machine: | |
|
||
![](https://i.imgur.com/3lTnOA1.png) | ||
|
||
|
||
|
||
# Data Version Control | ||
|
||
DVC runs alongside `git` and uses the current commit hash to version control the data. | ||
|
||
Initialize the `dvc` repository: | ||
|
||
```bash | ||
$ dvc init | ||
``` | ||
|
||
To start tracking a file or directory, use `dvc add`: | ||
|
||
```bash | ||
$ dvc add data/ImageNet | ||
``` | ||
|
||
DVC stores information about the added file (or a directory) in a special `.dvc` file named `data/ImageNet.dvc`, a small text file with a human-readable format. | ||
This file can be easily versioned like source code with Git, as a placeholder for the original data (which gets listed in `.gitignore`): | ||
|
||
```bash | ||
git add data/ImageNet.dvc data/.gitignore | ||
git commit -m "Add raw data" | ||
``` | ||
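
The placeholder idea described above can be illustrated with a toy sketch: a small, human-readable text file records a checksum of the data, and only that file is committed to Git. This is an illustration of the concept only. The helper names (`file_md5`, `write_placeholder`) and the `.placeholder.json` layout are hypothetical and are not the schema of real `.dvc` files.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Hash a file in chunks so large datasets need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_placeholder(data_path):
    """Write a small text file that stands in for the data (hypothetical format)."""
    data_path = Path(data_path)
    placeholder = data_path.with_name(data_path.name + ".placeholder.json")
    record = {"path": data_path.name, "md5": file_md5(data_path)}
    placeholder.write_text(json.dumps(record, indent=2))
    return placeholder


# Demo on a throwaway file; a real dataset would be tracked with `dvc add`.
with tempfile.TemporaryDirectory() as tmp:
    data = Path(tmp) / "sample.bin"
    data.write_bytes(b"hello")
    record = json.loads(write_placeholder(data).read_text())

print(record)
```

Because the placeholder changes whenever the data changes, versioning the placeholder in Git is enough to pin a specific version of the data.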
|
||
## Making changes | ||
|
||
When you make a change to a file or directory, run `dvc add` again to track the latest version: | ||
[comment]: <> ( <a href="https://pytorchlightning.ai/"><img alt="Lightning" src="https://img.shields.io/badge/code-Lightning-blueviolet"></a>) | ||
|
||
```bash | ||
$ dvc add data/ImageNet | ||
``` | ||
|
||
## Switching between versions | ||
|
||
The regular workflow is to use `git checkout` first (to switch branches, or to check out a commit or a revision of a `.dvc` file) and then run `dvc checkout` to sync the data: | |
|
||
```bash | ||
$ git checkout <...> | ||
$ dvc checkout | ||
``` | ||
|
||
--- | ||
|
||
Read more in the [docs](https://dvc.org/doc/start/data-versioning)! | ||
|
||
|
||
# Weights and Biases | ||
[comment]: <> ( <a href="https://hydra.cc/"><img alt="Conf: hydra" src="https://img.shields.io/badge/conf-hydra-blue"></a>) | ||
|
||
Weights & Biases helps you keep track of your machine learning projects. Use tools to log hyperparameters and output metrics from your runs, then visualize and compare results and quickly share findings with your colleagues. | ||
[comment]: <> ( <a href="https://wandb.ai/site"><img alt="Logging: wandb" src="https://img.shields.io/badge/logging-wandb-yellow"></a>) | ||
|
||
[This](https://wandb.ai/gladia/nn-template?workspace=user-lucmos) is an example of a simple dashboard. | ||
[comment]: <> ( <a href="https://dvc.org/"><img alt="Data: dvc" src="https://img.shields.io/badge/data-dvc-9cf"></a>) | |
|
||
## Quickstart | ||
[comment]: <> ( <a href="https://streamlit.io/"><img alt="UI: streamlit" src="https://img.shields.io/badge/ui-streamlit-orange"></a>) | ||
|
||
Log in to your `wandb` account by running `wandb login` once. | |
Configure the logging in `conf/logging/*`. | ||
[comment]: <> (</p>) | ||
|
||
|
||
--- | ||
|
||
|
||
Read more in the [docs](https://docs.wandb.ai/). Particularly useful is the [`log` method](https://docs.wandb.ai/library/log), accessible from inside a PyTorch Lightning module via `self.logger.experiment.log`. | |
|
||
> W&B is our logger of choice, but that is a purely subjective decision. Since we are using Lightning, you can replace | ||
`wandb` with the logger you prefer (you can even build your own). | ||
More about Lightning loggers [here](https://pytorch-lightning.readthedocs.io/en/latest/extensions/logging.html). | ||
|
||
# Hydra | ||
|
||
Hydra is an open-source Python framework that simplifies the development of research and other complex applications. The key feature is the ability to dynamically create a hierarchical configuration by composition and override it through config files and the command line. The name Hydra comes from its ability to run multiple similar jobs - much like a Hydra with multiple heads. | ||
|
||
The basic functionality is intuitive: just edit the configuration files in `conf/*` according to your preferences. Everything will be logged in `wandb` automatically. | |
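
To make the idea of overriding a hierarchical configuration from the command line more concrete, here is a minimal plain-Python sketch. It is not Hydra's implementation: the `apply_override` helper is hypothetical, the config is a plain nested dict, and values are kept as strings, whereas Hydra parses and type-checks them.

```python
import copy


def apply_override(config, override):
    """Apply one `dotted.key=value` override to a nested dict, in place."""
    key, _, raw_value = override.partition("=")
    *parents, leaf = key.split(".")
    node = config
    for part in parents:
        # Walk down the hierarchy, creating intermediate groups if needed.
        node = node.setdefault(part, {})
    node[leaf] = raw_value
    return config


# Hypothetical base configuration, standing in for the files under `conf/`.
base = {"optim": {"optimizer": {"lr": "0.001", "weight_decay": "0"}}}

cfg = apply_override(copy.deepcopy(base), "optim.optimizer.lr=0.02")
print(cfg["optim"]["optimizer"])
```

The base configuration is copied before the override is applied, so the defaults stay untouched and each run gets its own resolved configuration.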
|
||
Consider creating new root configurations `conf/myawesomeexp.yaml` instead of always using the default `conf/default.yaml`. | ||
<p align="center"> | ||
<i> | ||
nn-template is opinionated so you don't have to be | ||
</i> | ||
</p> | ||
|
||
|
||
## Sweeps | ||
Generic cookiecutter template to bootstrap your [PyTorch](https://pytorch.org/get-started/locally/) project, | ||
read more in the [documentation](https://lucmos.github.io/nn-template). | ||
|
||
You can easily perform hyperparameter [sweeps](https://hydra.cc/docs/advanced/override_grammar/extended), which override the configuration defined in `conf/*`. | |
## Get started | ||
|
||
The simplest sweep is a grid search, which runs the code with every possible combination of the specified hyperparameters: | |
Generate your project with cookiecutter: | ||
|
||
```bash | ||
PYTHONPATH=. python src/run.py -m optim.optimizer.lr=0.02,0.002,0.0002 optim.lr_scheduler.T_mult=1,2 optim.optimizer.weight_decay=0,1e-5 | ||
cookiecutter https://github.com/lucmos/nn-template | ||
``` | ||
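
As a rough illustration of what the grid-search sweep above expands to, the following sketch enumerates the Cartesian product of the comma-separated values. It only mimics the combinatorics with `itertools.product`. It does not launch anything, and the `expand_grid` helper is hypothetical rather than part of Hydra.

```python
from itertools import product

# Values copied from the `-m` sweep command above.
sweep = {
    "optim.optimizer.lr": ["0.02", "0.002", "0.0002"],
    "optim.lr_scheduler.T_mult": ["1", "2"],
    "optim.optimizer.weight_decay": ["0", "1e-5"],
}


def expand_grid(sweep):
    """Return one list of `key=value` overrides per point of the grid."""
    keys = list(sweep)
    return [
        [f"{key}={value}" for key, value in zip(keys, combo)]
        for combo in product(*(sweep[key] for key in keys))
    ]


jobs = expand_grid(sweep)
print(len(jobs))  # 3 * 2 * 2 = 12 runs
print(jobs[0])
```

The number of runs is the product of the number of values per hyperparameter, so grids grow quickly as more parameters are swept.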
|
||
You can explore aggregate statistics or compare and analyze each run in the W&B dashboard. | ||
> This is a *parametrized* template that uses [cookiecutter](https://github.com/cookiecutter/cookiecutter). | ||
> Install cookiecutter with: | ||
> | ||
> ```pip install cookiecutter``` | ||
--- | ||
|
||
We recommend going through at least the [Basic Tutorial](https://hydra.cc/docs/tutorials/basic/your_first_app/simple_cli) and the docs about [Instantiating objects with Hydra](https://hydra.cc/docs/patterns/instantiate_objects/overview). | |
## Integrations | ||
|
||
Avoid writing boilerplate code to integrate: | ||
|
||
# PyTorch Lightning | ||
|
||
Lightning makes coding complex networks simple. | ||
It is not a high-level framework like `keras`, but it enforces neat code organization and encapsulation. | |
|
||
You should be somewhat familiar with PyTorch and [PyTorch Lightning](https://pytorch-lightning.readthedocs.io/en/stable/index.html) before using this template. | ||
|
||
# Environment Variables | ||
|
||
System-specific variables (e.g. absolute paths to datasets) should not be under version control; otherwise there will be conflicts between different users. | |
|
||
The best way to handle system-specific variables is through environment variables. | |
- [PyTorch Lightning](https://github.com/PyTorchLightning/pytorch-lightning), lightweight PyTorch wrapper for high-performance AI research. | ||
- [Hydra](https://github.com/facebookresearch/hydra), a framework for elegantly configuring complex applications. | ||
- [Weights and Biases](https://wandb.ai/home), organize and analyze machine learning experiments. *(educational account available)* | ||
- [Streamlit](https://streamlit.io/), turns data scripts into shareable web apps in minutes. | ||
- [MkDocs](https://www.mkdocs.org/) and [Material for MkDocs](https://squidfunk.github.io/mkdocs-material/), a fast, simple and downright gorgeous static site generator. | ||
- [DVC](https://dvc.org/doc/start/data-versioning), track large files, directories, or ML models. Think "Git for data". | ||
- [GitHub Actions](https://github.com/features/actions), to run the tests and to publish the documentation and the package to PyPI automatically. | |
- Python best practices for developing and publishing research projects. | ||
|
||
You can define new environment variables in a `.env` file in the project root. A copy of this file (e.g. `.env.template`) can be kept under version control to ease the setup of new projects. | |
## Structure | ||
|
||
To define a new variable, add a line to `.env`: | |
The generated projects will contain the following files: | ||
|
||
```bash | ||
export MY_VAR=/home/user/my_system_path | ||
``` | ||
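
For intuition about how a `.env` file like the one above ends up as environment variables, here is a minimal standard-library sketch. The `parse_env_line` and `load_env_file` helpers are hypothetical and handle only simple `NAME=value` and `export NAME=value` lines. The template may load the file differently, for example through a dedicated dotenv library.

```python
import os
import tempfile
from pathlib import Path


def parse_env_line(line):
    """Return (name, value) for a simple assignment line, or None otherwise."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # skip blank lines and comments
    if line.startswith("export "):
        line = line[len("export "):].lstrip()
    name, separator, value = line.partition("=")
    if not separator or not name.strip():
        return None
    # Drop optional surrounding quotes around the value.
    return name.strip(), value.strip().strip("'\"")


def load_env_file(path, environ=os.environ):
    """Copy every assignment found in `path` into `environ`."""
    for line in Path(path).read_text().splitlines():
        parsed = parse_env_line(line)
        if parsed is not None:
            name, value = parsed
            environ[name] = value
    return environ


# Demo with a temporary file and a plain dict instead of the real environment.
with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / ".env"
    env_file.write_text("# system-specific paths\nexport MY_VAR=/home/user/my_system_path\n")
    env = load_env_file(env_file, environ={})

print(env["MY_VAR"])
```

Passing a plain dict as `environ` keeps the demo from touching the process environment; omitting the argument would write into `os.environ` instead.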
|
||
You can dynamically resolve the variable's value from Python code with: | |
|
||
```python | ||
get_env("MY_VAR") | ||
``` | ||
|
||
and in the Hydra `.yaml` configuration files with: | ||
|
||
```yaml | ||
${oc.env:MY_VAR} | ||
. | ||
├── conf | ||
│ ├── default.yaml | ||
│ ├── hydra | ||
│ │ └── default.yaml | ||
│ ├── nn | ||
│ │ └── default.yaml | ||
│ └── train | ||
│ └── default.yaml | ||
├── data | ||
│ └── .gitignore | ||
├── docs | ||
│ ├── index.md | ||
│ └── overrides | ||
│ └── main.html | ||
├── .editorconfig | ||
├── .env | ||
├── .env.template | ||
├── env.yaml | ||
├── .flake8 | ||
├── .github | ||
│ └── workflows | ||
│ ├── publish.yml | ||
│ └── test_suite.yml | ||
├── .gitignore | ||
├── LICENSE | ||
├── mkdocs.yml | ||
├── .pre-commit-config.yaml | ||
├── pyproject.toml | ||
├── README.md | ||
├── setup.cfg | ||
├── setup.py | ||
├── src | ||
│ └── awesome_project | ||
│ ├── data | ||
│ │ ├── datamodule.py | ||
│ │ ├── dataset.py | ||
│ │ └── __init__.py | ||
│ ├── __init__.py | ||
│ ├── modules | ||
│ │ ├── __init__.py | ||
│ │ └── module.py | ||
│ ├── pl_modules | ||
│ │ ├── __init__.py | ||
│ │ └── pl_module.py | ||
│ ├── run.py | ||
│ └── ui | ||
│ ├── __init__.py | ||
│ └── run.py | ||
└── tests | ||
├── conftest.py | ||
├── __init__.py | ||
├── test_checkpoint.py | ||
├── test_configuration.py | ||
├── test_nn_core_integration.py | ||
├── test_resume.py | ||
├── test_storage.py | ||
└── test_training.py | ||
``` |