NVIDIA DALI

Today’s deep learning applications rely on complex, multi-stage data pre-processing pipelines whose compute-intensive steps run mainly on the CPU. For instance, loading data from disk, decoding, cropping, random resizing, color and spatial augmentations, and format conversions are all carried out on the CPU, which limits the performance and scalability of training and inference tasks. In addition, today’s deep learning frameworks come with multiple data pre-processing implementations, which makes training and inference workflows harder to port and the code harder to maintain.

NVIDIA Data Loading Library (DALI) is a collection of highly optimized building blocks and an execution engine that accelerate input data pre-processing for deep learning applications. As a single library, DALI provides both the performance and the flexibility to accelerate different data pipelines, and it can be easily integrated into different deep learning training and inference applications.

Key highlights of DALI include:

Full data pipeline acceleration, from reading data from disk to preparing it for training and inference
Flexibility through configurable graphs and custom operators
Support for image classification and segmentation workloads
Ease of integration through direct framework plugins and open source bindings
Portable training workflows with multiple input formats - JPEG, LMDB, RecordIO, TFRecord
Extensible for user-specific needs through an open source license

DALI and NGC

DALI is preinstalled in the NVIDIA GPU Cloud TensorFlow, PyTorch, and MXNet containers in versions 18.07 and later.

Installing prebuilt DALI packages

Prerequisites

Linux x64
NVIDIA Driver supporting CUDA 9.0 or later (i.e., 384.xx or later driver releases)
One or more of the following Deep Learning frameworks:
- MXNet 1.3 beta (mxnet-cu90==1.3.0b20180612) or later
- PyTorch 0.4
- TensorFlow 1.7 or later
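
The driver requirement above can be checked before installing. The following is a minimal sketch, not part of DALI: the helper name driver_supports_cuda9 is hypothetical, the 384 threshold is taken from the list above, and the script assumes nvidia-smi is on the PATH when a driver is installed.

```shell
#!/bin/sh
# Minimum driver branch from the prerequisites above (384.xx for CUDA 9.0).
MIN_DRIVER_MAJOR=384

# Hypothetical helper: succeeds if the given driver version string
# (for example "384.81") is at or above the minimum branch.
driver_supports_cuda9() {
    local version="$1"
    local major="${version%%.*}"   # keep the part before the first dot
    # Reject empty or non-numeric input instead of guessing.
    case "$major" in
        ''|*[!0-9]*) return 2 ;;
    esac
    [ "$major" -ge "$MIN_DRIVER_MAJOR" ]
}

# Query the installed driver only when nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
    installed="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if driver_supports_cuda9 "$installed"; then
        echo "Driver $installed is new enough for CUDA 9.0"
    else
        echo "Driver $installed is older than ${MIN_DRIVER_MAJOR}.xx" >&2
    fi
fi
```

Only the major version is compared, since the requirement is stated as a driver branch (384.xx) rather than an exact release.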

Installation

pip install --extra-index-url https://developer.download.nvidia.com/compute/redist nvidia-dali

Compiling DALI from source

Prerequisites

Linux x64
NVIDIA CUDA 9.0 (CUDA 8.0 compatibility is provided unofficially)
nvJPEG library (This can be unofficially disabled; see below)
protobuf version 2 or later (version 3 or later is required for TensorFlow TFRecord file format support)
CMake 3.5 or later
libjpeg-turbo 1.5.x or later (This can be unofficially disabled; see below)
OpenCV 3 or later (OpenCV 2.x compatibility is provided unofficially)
(Optional) liblmdb 0.9.x or later
One or more of the following Deep Learning frameworks:
- MXNet 1.3 beta (mxnet-cu90==1.3.0b20180612) or later
- PyTorch 0.4
- TensorFlow 1.7 or later

Note

TensorFlow installation is required to build the TensorFlow plugin for DALI

Note

Items marked "unofficial" are community contributions that are believed to work but are not officially tested or maintained by NVIDIA.

Get the DALI source

git clone --recursive https://github.com/NVIDIA/dali
cd dali

Make the build directory

mkdir build
cd build

Compile DALI

To build DALI without LMDB support:

cmake ..
make -j"$(nproc)"

To build DALI with LMDB support:

cmake -DBUILD_LMDB=ON ..
make -j"$(nproc)"

Optional CMake build parameters:

BUILD_PYTHON - build Python bindings (default: ON)
BUILD_TEST - include building test suite (default: ON)
BUILD_BENCHMARK - include building benchmarks (default: ON)
BUILD_LMDB - build with support for LMDB (default: OFF)
BUILD_NVTX - build with NVTX profiling enabled (default: OFF)
BUILD_TENSORFLOW - build TensorFlow plugin (default: OFF)
(Unofficial) BUILD_JPEG_TURBO - build with libjpeg-turbo (default: ON)
(Unofficial) BUILD_NVJPEG - build with nvJPEG (default: ON)
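
Several of these parameters can be combined in one cmake call. The sketch below is illustrative only: dali_cmake_flags is a hypothetical helper, not something shipped with DALI. It accepts only the option names listed above, with ON or OFF as values, so a typo is caught before CMake runs.

```shell
#!/bin/sh
# Hypothetical helper: turn NAME=VALUE pairs into -DNAME=VALUE CMake
# arguments, accepting only the build parameters documented above.
dali_cmake_flags() {
    local flags="" pair name value
    for pair in "$@"; do
        name="${pair%%=*}"    # text before the first '='
        value="${pair#*=}"    # text after the first '='
        case "$name" in
            BUILD_PYTHON|BUILD_TEST|BUILD_BENCHMARK|BUILD_LMDB|BUILD_NVTX|BUILD_TENSORFLOW|BUILD_JPEG_TURBO|BUILD_NVJPEG) ;;
            *) echo "unknown option: $name" >&2; return 1 ;;
        esac
        case "$value" in
            ON|OFF) ;;
            *) echo "value for $name must be ON or OFF" >&2; return 1 ;;
        esac
        # Append with a separating space only after the first flag.
        flags="${flags:+$flags }-D${name}=${value}"
    done
    echo "$flags"
}

# Print the resulting command for review instead of running it.
flags="$(dali_cmake_flags BUILD_LMDB=ON BUILD_TEST=OFF BUILD_BENCHMARK=OFF)" \
    && echo "cmake $flags .."
```

The printed command can then be run from the build directory, followed by make -j"$(nproc)" as in the steps above.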

Install Python bindings

pip install dali/python

Getting started

The docs/examples directory contains a series of examples (in the form of Jupyter notebooks) that demonstrate different features of DALI. It also contains examples of how to use DALI with DL frameworks.

Documentation for the latest stable release is available here. A nightly version of the documentation, which stays in sync with the master branch, is available here.

Additional resources

GPU Technology Conference 2018 presentation about DALI, T. Gale, S. Layton and P. Tredak: slides, recording.

Contributing to DALI

Contributions to DALI are more than welcome. To make the pull request process smooth, please follow these guidelines.

Contributors

DALI was built with major contributions from Trevor Gale, Przemek Tredak, Simon Layton, Andrei Ivanov, and Serge Panev.

