FBGEMM_GPU (FBGEMM GPU Kernels Library) is a collection of high-performance PyTorch GPU operator libraries for training and inference. The library provides efficient table batched embedding bag, data layout transformation, and quantization support.
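To illustrate what "table batched embedding bag" means, here is a concept sketch in plain PyTorch (this is not FBGEMM_GPU's API): lookups across several embedding tables are combined, where FBGEMM_GPU's batched kernels would perform the equivalent work in a single fused operator instead of one kernel launch per table.

```python
import torch

# Concept sketch (not the FBGEMM_GPU API): a "table batched" embedding
# lookup handles several embedding tables together, rather than invoking
# one EmbeddingBag per table.
tables = [torch.nn.EmbeddingBag(100, 8, mode="sum") for _ in range(3)]

# Per-table indices and offsets for a batch of 2 samples.
indices = [torch.tensor([1, 2, 3, 4]), torch.tensor([5, 6]), torch.tensor([7, 8])]
offsets = [torch.tensor([0, 2]), torch.tensor([0, 1]), torch.tensor([0, 1])]

# Naive loop: one lookup per table. FBGEMM_GPU's table batched kernels
# fuse this pattern into a single efficient GPU operator.
outputs = [t(i, o) for t, i, o in zip(tables, indices, offsets)]
result = torch.cat(outputs, dim=1)  # shape: (2, 3 * 8)
```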
FBGEMM_GPU is currently tested with CUDA 11.7.1 and 11.8 in CI, and with PyTorch packages (1.13+) that are built against those CUDA versions.
Only Intel/AMD CPUs with AVX2 extensions are currently supported.
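As background for the quantization support mentioned above, the sketch below shows row-wise 8-bit quantization in plain PyTorch. This is only an illustration of the general scheme, not the library's fused operators; the function names here are hypothetical.

```python
import torch

# Concept sketch of row-wise int8 quantization: each row gets its own
# scale and offset, mapping [row_min, row_max] onto [0, 255].
# These helper names are illustrative, not FBGEMM_GPU API.
def rowwise_quantize(x):
    mins = x.min(dim=1, keepdim=True).values
    maxs = x.max(dim=1, keepdim=True).values
    scales = ((maxs - mins) / 255.0).clamp(min=1e-8)  # avoid divide-by-zero
    q = torch.clamp(torch.round((x - mins) / scales), 0, 255).to(torch.uint8)
    return q, scales, mins

def rowwise_dequantize(q, scales, mins):
    return q.to(torch.float32) * scales + mins

x = torch.randn(4, 16)
q, scales, mins = rowwise_quantize(x)
x_hat = rowwise_dequantize(q, scales, mins)
# x_hat approximates x to within about half a quantization step per row.
```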
See our Documentation for more information.
The full installation instructions for the CUDA, ROCm, and CPU-only variants of FBGEMM_GPU can be found here. In addition, instructions for running example tests and benchmarks can be found here.
This section is intended for FBGEMM_GPU developers only. The full build instructions for the CUDA, ROCm, and CPU-only variants of FBGEMM_GPU can be found here.
For questions, support, news updates, or feature requests, please feel free to:
- File a ticket in GitHub Issues
- Post a discussion in GitHub Discussions
- Reach out to us on the `#fbgemm` channel in PyTorch Slack
For contributions, please see the CONTRIBUTING file for ways to help out.
FBGEMM_GPU is BSD licensed, as found in the LICENSE file.