FBGEMM_GPU

FBGEMM_GPU (FBGEMM GPU kernel library) is a collection of high-performance CUDA GPU operators for GPU training.

The library provides efficient table batched embedding bag operators, data layout transformations, and quantization support.

Examples

The tests (in the test folder) and benchmarks (in the bench folder) are good examples of how to use FBGEMM_GPU.

Build Notes

FBGEMM_GPU uses the standard CMake-based build flow, together with the PyTorch TorchScript extension build flow for custom C++ operators.

Dependencies

FBGEMM_GPU requires nvcc and an NVIDIA GPU with compute capability 3.5 or higher.
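As a quick pre-build sanity check for this requirement, the following is a minimal sketch that reports whether nvcc is on the PATH and whether each visible GPU meets the 3.5 minimum. The helper name capability_ok is illustrative and not part of FBGEMM_GPU. The nvidia-smi compute_cap query field is an assumption about the installed driver; older drivers may not support it, in which case the GPU check prints nothing useful.

```sh
#!/bin/sh
# Minimum compute capability stated in this README.
MIN_MAJOR=3
MIN_MINOR=5

# capability_ok "MAJOR.MINOR" (hypothetical helper): succeeds if the
# capability is at least MIN_MAJOR.MIN_MINOR, returns 2 on malformed input.
capability_ok() {
    case $1 in
        '' | *[!0-9.]* | *.*.* | .* | *.) return 2 ;;
        *.*) ;;
        *) return 2 ;;
    esac
    major=${1%%.*}
    minor=${1#*.}
    [ "$major" -gt "$MIN_MAJOR" ] ||
        { [ "$major" -eq "$MIN_MAJOR" ] && [ "$minor" -ge "$MIN_MINOR" ]; }
}

# nvcc is required for the build; report whether it is on the PATH.
if command -v nvcc >/dev/null 2>&1; then
    echo "nvcc found: $(command -v nvcc)"
else
    echo "nvcc not found on PATH"
fi

# Assumption: this nvidia-smi supports the compute_cap query field.
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=compute_cap --format=csv,noheader 2>/dev/null |
    while read -r cap; do
        if capability_ok "$cap"; then
            echo "GPU compute capability $cap: ok"
        else
            echo "GPU compute capability '$cap': below 3.5 or not recognized"
        fi
    done
else
    echo "nvidia-smi not found; cannot query GPU compute capability"
fi
```

The comparison is done on the integer major and minor parts rather than as a string or float, so a value such as 10.0 is still treated as newer than 3.5.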

  • CUB

For the CUB build-time dependency, if you are using conda, you can install it with:

conda install -c bottler nvidiacub

Otherwise, download the CUB library from https://github.com/NVIDIA/cub/releases and unpack it to a folder of your choice. Before building, define the environment variable CUB_DIR and point it to the directory that contains the CMakeLists.txt for CUB. For example, on Linux/Mac:

curl -LO https://github.com/NVIDIA/cub/archive/1.10.0.tar.gz
tar xzf 1.10.0.tar.gz
export CUB_DIR=$PWD/cub-1.10.0

  • googletest

googletest is required only to build and run FBGEMM_GPU's tests. Tests are built by default; to turn this off, set FBGEMMGPU_BUILD_TESTS to off.

  • PyTorch, Jinja2

PyTorch and Jinja2 are required to build and run the table batched embedding bag operator. Note that the implementation of this op relies on a recent version of PyTorch (1.8+), so it requires installing PyTorch Nightly:

conda uninstall pytorch
# update with the corresponding CUDA version
conda install pytorch cudatoolkit=9.2 -c pytorch-nightly
conda install jinja2
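
To confirm that the installed PyTorch satisfies the 1.8+ requirement before building, a check along the following lines may help. This is a sketch only: version_at_least is a hypothetical helper, and it assumes that the python on your PATH is the interpreter from the conda environment you just updated.

```sh
#!/bin/sh
# version_at_least VERSION MIN_MAJOR MIN_MINOR (hypothetical helper):
# succeeds if VERSION's major.minor is >= MIN_MAJOR.MIN_MINOR.
# Suffixes such as ".0.dev20201201" after the minor part are ignored.
version_at_least() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    case $major in '' | *[!0-9]*) return 2 ;; esac
    case $minor in '' | *[!0-9]*) return 2 ;; esac
    [ "$major" -gt "$2" ] ||
        { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

# Ask the active Python for the PyTorch version, if PyTorch is importable.
if torch_version=$(python -c 'import torch; print(torch.__version__)' 2>/dev/null); then
    if version_at_least "$torch_version" 1 8; then
        echo "PyTorch $torch_version: ok"
    else
        echo "PyTorch $torch_version: older than 1.8"
    fi
else
    echo "PyTorch is not importable from 'python'"
fi
```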

You can download googletest and set GOOGLETEST_SOURCE_DIR so that cmake can find it. If this variable is not set, cmake will build the git submodule found in the third_party directory.
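
Because the build depends on these environment variables pointing at the right directories, it can be worth checking them before running cmake. The sketch below is illustrative: has_cmakelists is a hypothetical helper, and the only layout it assumes is the one described above, namely that CUB_DIR contains a CMakeLists.txt.

```sh
#!/bin/sh
# has_cmakelists DIR (hypothetical helper): succeeds if DIR is non-empty
# and contains a CMakeLists.txt file.
has_cmakelists() {
    [ -n "$1" ] && [ -f "$1/CMakeLists.txt" ]
}

# CUB_DIR should point at the directory that contains CUB's CMakeLists.txt.
if [ -z "${CUB_DIR:-}" ]; then
    echo "CUB_DIR is not set"
elif has_cmakelists "$CUB_DIR"; then
    echo "CUB_DIR looks valid: $CUB_DIR"
else
    echo "CUB_DIR is set, but $CUB_DIR/CMakeLists.txt is missing"
fi

# GOOGLETEST_SOURCE_DIR is optional; cmake uses third_party when it is unset.
if [ -z "${GOOGLETEST_SOURCE_DIR:-}" ]; then
    echo "GOOGLETEST_SOURCE_DIR is not set; the third_party submodule will be used"
elif [ -d "$GOOGLETEST_SOURCE_DIR" ]; then
    echo "GOOGLETEST_SOURCE_DIR: $GOOGLETEST_SOURCE_DIR"
else
    echo "GOOGLETEST_SOURCE_DIR does not exist: $GOOGLETEST_SOURCE_DIR"
fi
```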

General build instructions are as follows:

git clone --recursive https://github.com/pytorch/FBGEMM.git
cd FBGEMM/fbgemm_gpu
# if you are updating an existing checkout
git submodule sync
git submodule update --init --recursive
# configure the NVCC and CUB path
export CUDACXX=/usr/local/cuda/bin/nvcc
export CUB_DIR=${CUB_DIR}
# in fbgemm_gpu folder
# build the data layout transform op, quantized ops, etc.
mkdir build && cd build
cmake ..
make
# build the table batched embedding bag op
cd ..
python setup.py build develop
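
The dependency notes mention that building the tests can be turned off with FBGEMMGPU_BUILD_TESTS. A minimal sketch of how that might look at configure time follows. It assumes FBGEMMGPU_BUILD_TESTS is a CMake cache option passed with -D, which this README implies but does not spell out; cmake_configure_cmd is a hypothetical helper that only prints the command so you can review it first.

```sh
#!/bin/sh
# cmake_configure_cmd on|off (hypothetical helper): print the cmake
# configure command for the requested test setting without running it.
# Assumption: FBGEMMGPU_BUILD_TESTS is a CMake option set via -D.
cmake_configure_cmd() {
    case $1 in
        on | ON)   tests=ON ;;
        off | OFF) tests=OFF ;;
        *) echo "usage: cmake_configure_cmd on|off" >&2; return 2 ;;
    esac
    echo "cmake -DFBGEMMGPU_BUILD_TESTS=$tests .."
}

# Print the configure command with tests disabled.
cmake_configure_cmd off
```

Run the printed command from inside the build directory in place of the plain cmake .. step.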

Running FBGEMM_GPU

To run the tests or benchmarks after building FBGEMM_GPU (if tests or benchmarks are built), use the following commands:

# run the tests for the data layout transform op, quantized ops, etc.
cd build && make test
# run the tests and benchmarks of table batched embedding bag op
cd ..
python test/split_table_batched_embeddings_test.py
python bench/split_table_batched_embeddings_benchmark.py

How FBGEMM_GPU works

For a high-level overview, the design philosophy, and brief descriptions of the various parts of FBGEMM_GPU, please see our Wiki (work in progress).

Full documentation

We have used comments extensively in our source files. The best and most up-to-date documentation is available in the source files.

Join the FBGEMM community

See the CONTRIBUTING file for how to help out.

License

FBGEMM is BSD licensed, as found in the LICENSE file.