Tags: codecat-he/cutlass
Fix Parallel Split-K on Gemm Operation Profiler (NVIDIA#1109)
* Debug and fix for parallel split-k in profiler
* Restore debug files and remove prints
Add simple hash and eq methods for gemm_operations. (NVIDIA#1053)
Updates for 3.0 (NVIDIA#857) Co-authored-by: Aniket Shivam
New updates for 2.11 (NVIDIA#775)
* New updates.
* Minor profiler updates
Co-authored-by: Aniket Shivam
Update linear_combination_generic.h (NVIDIA#472) Add `skip_elementwise_` to support serial split-K in `linear_combination_generic.h`
Update CMakeLists.txt (NVIDIA#473)
* Update CMakeLists.txt: add 128-bit int support if using nvc++ to solve NVIDIA#310. @jeffhammond, would you please give it a try?
* Update CMakeLists.txt: correct copy-paste error
Updated GEMM performance plot with CUTLASS 2.8 compiled with CUDA 11.5 Toolkit (NVIDIA#375) GPUs under test: NVIDIA A100, NVIDIA A2, NVIDIA TitanV, NVIDIA GeForce 2080 Ti