👟 Trainer

An opinionated general purpose model trainer on PyTorch with a simple code base.

Installation

From Github:

git clone https://github.com/coqui-ai/Trainer
cd Trainer
make install

From PyPI:

pip install trainer

Prefer installing from Github as it is more stable.

Implementing a model

Subclass and overload the functions in the TrainerModel()

Training a model

See the test script here training a basic MNIST model.

Training with DDP

$ python -m trainer.distribute --script path/to/your/train.py --gpus "0,1"

We don't use .spawn() to initiate multi-gpu training since it causes certain limitations.

Everything must the pickable.
.spawn() trains the model in subprocesses and the model in the main process is not updated.
DataLoader with N processes gets really slow when the N is large.

Profiling example

Create the torch profiler as you like and pass it to the trainer.

import torch
profiler = torch.profiler.profile(
    activities=[
        torch.profiler.ProfilerActivity.CPU,
        torch.profiler.ProfilerActivity.CUDA,
    ],
    schedule=torch.profiler.schedule(wait=1, warmup=1, active=3, repeat=2),
    on_trace_ready=torch.profiler.tensorboard_trace_handler("./profiler/"),
    record_shapes=True,
    profile_memory=True,
    with_stack=True,
)
prof = trainer.profile_fit(profiler, epochs=1, small_run=64)
then run Tensorboard

Run the tensorboard.
```
tensorboard --logdir="./profiler/"
```

Supported Experiment Loggers

Tensorboard - actively maintained
ClearML - actively maintained
MLFlow
Aim
WandDB

To add a new logger, you must subclass BaseDashboardLogger and overload its functions.

Name		Name	Last commit message	Last commit date
Latest commit History 151 Commits
.github		.github
bin		bin
tests		tests
trainer		trainer
.gitignore		.gitignore
.pylintrc		.pylintrc
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
MANIFEST.in		MANIFEST.in
Makefile		Makefile
README.md		README.md
pyproject.toml		pyproject.toml
requirements.dev.txt		requirements.dev.txt
requirements.test.txt		requirements.test.txt
requirements.txt		requirements.txt
setup.cfg		setup.cfg
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

👟 Trainer

Installation

Implementing a model

Training a model

Training with DDP

Profiling example

Supported Experiment Loggers

About

Releases

Packages

Languages

lucifar777/Trainer

Folders and files

Latest commit

History

Repository files navigation

👟 Trainer

Installation

Implementing a model

Training a model

Training with DDP

Profiling example

Supported Experiment Loggers

About

Resources

Code of conduct

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages