maqtech/cutlass_fpA_intB_gemm: A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer

This repository contains the CUTLASS GEMM kernels for fp16 activations (A) and int8/int4 quantized weights (B), extracted from FasterTransformer for easier integration into third-party projects. See the original code below.
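To make the "fp16 A, int8/4 B" computation concrete, here is a minimal pure-Python sketch of the arithmetic such a kernel performs: the quantized weight is dequantized on the fly and multiplied by the activation. This is not the repository's API. The function names `quantize_per_column` and `fpa_intb_gemm_reference` are hypothetical, symmetric per-column scales are an assumption, and Python floats stand in for fp16. The real kernels run on the GPU and their weight storage format is not reproduced here.

```python
def quantize_per_column(weights, bits=8):
    """Symmetric per-column quantization of a K x N matrix (assumed scheme)."""
    qmax = 2 ** (bits - 1) - 1  # 127 for int8, 7 for int4
    k, n = len(weights), len(weights[0])
    scales = []
    for j in range(n):
        max_abs = max(abs(weights[i][j]) for i in range(k))
        # Avoid a zero scale for an all-zero column.
        scales.append(max_abs / qmax if max_abs > 0 else 1.0)
    quantized = [
        [max(-qmax, min(qmax, round(weights[i][j] / scales[j]))) for j in range(n)]
        for i in range(k)
    ]
    return quantized, scales


def fpa_intb_gemm_reference(activations, quantized, scales):
    """Compute C = A @ (B_q * scale), dequantizing each weight as it is used."""
    m, k, n = len(activations), len(quantized), len(quantized[0])
    out = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for j in range(n):
            acc = 0.0
            for p in range(k):
                acc += activations[i][p] * (quantized[p][j] * scales[j])
            out[i][j] = acc
    return out


# Example: a 2x2 activation times a 2x2 weight quantized to int8.
A = [[1.0, 2.0], [0.5, -1.0]]
B = [[0.5, -1.0], [0.25, 2.0]]
B_q, s = quantize_per_column(B, bits=8)
C = fpa_intb_gemm_reference(A, B_q, s)
```

The result `C` approximates the full-precision product of `A` and `B`, with an error bounded by the quantization step of each column. Passing `bits=4` shows the coarser int4 case.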

Build with

mkdir build && cd build
cmake ..
make

Folders and files:

cmake/utils
cutlass @ cc85b64
cutlass_extensions/include/cutlass_extensions
cutlass_kernels
tvm_binding
utils
weightOnlyBatchedGemv
.clang-format
.gitmodules
CMakeLists.txt
LICENSE
README.md
