DALI_pytorch_demo

Example code showing how to use Nvidia DALI in pytorch, with fallback to torchvision. Contains a few differences to the official Nvidia example:

Reimport DALI & recreate dataloaders at end of every epoch to reduce long term memory usage
Move CPU DALI pipeline completely to CPU, freeing up GPU resources
Keep DALI validation pipeline off GPU during training, reducing GPU memory usage

Compared to the official example, these mods allow for a ~50% increase in max batch size (tested using ResNet18 on a GCloud V100 instance with 10 workers:

Dataloader Type	Max Batch Size
DALI GPU reference	640
DALI GPU	928 / 45% increase
DALI CPU reference	800
DALI CPU	1216 / 52% increase
Torchvision w/ PIL-SIMD	1248

Here are some benchmarks on a Google Cloud V100 instance with 12 vCPUs (6 physical cores), 78GB RAM, Apex FP16 training with Shufflenet V2 0.5 & batch size 512:

Dataloader Type	Speed (images/s)
DALI GPU	3910
DALI CPU	1816
Torchvision w/ PIL-SIMD	1058

You can read the correspondig blog post here

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
dali.py		dali.py
dataloader.py		dataloader.py
main.py		main.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

DALI_pytorch_demo

About

Releases

Packages

Languages

License

ML2ViS/DALI_pytorch_demo

Folders and files

Latest commit

History

Repository files navigation

DALI_pytorch_demo

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages