-
flashinfer Public
Forked from flashinfer-ai/flashinfer
FlashInfer: Kernel Library for LLM Serving
Cuda Apache License 2.0 Updated Jan 13, 2025 -
vattention Public
Forked from microsoft/vattention
Dynamic Memory Management for Serving LLMs without PagedAttention
C MIT License Updated Dec 6, 2024 -
tiny-flash-attention Public
Forked from 66RING/tiny-flash-attention
Flash attention tutorial written in Python, Triton, CUDA, and CUTLASS
Cuda Updated Nov 18, 2024 -
flash-attention Public
Forked from Dao-AILab/flash-attention
Fast and memory-efficient exact attention
Python BSD 3-Clause "New" or "Revised" License Updated Oct 28, 2024 -
TensorRT-LLM Public
Forked from NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
C++ Apache License 2.0 Updated Sep 26, 2024 -
mlc-llm Public
Forked from mlc-ai/mlc-llm
Universal LLM Deployment Engine with ML Compilation
Python Apache License 2.0 Updated Sep 23, 2024 -
Nanoflow Public
Forked from efeslab/Nanoflow
A throughput-oriented high-performance serving framework for LLMs
Cuda Apache License 2.0 Updated Sep 2, 2024 -
Paddle Public
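Forked from PaddlePaddle/Paddle
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)
C++ Apache License 2.0 Updated Jun 20, 2024 -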
PaddleNLP Public
Forked from PaddlePaddle/PaddleNLP
👑 Easy-to-use and powerful NLP library with 🤗 Awesome model zoo, supporting wide-range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search, ❓ Ques…
Python Updated Jun 19, 2024 -
fast-hadamard-transform Public
Forked from Dao-AILab/fast-hadamard-transform
Fast Hadamard transform in CUDA, with a PyTorch interface
C BSD 3-Clause "New" or "Revised" License Updated May 24, 2024 -
Paddle-Inference-Demo Public
Forked from PaddlePaddle/Paddle-Inference-Demo
C++ Apache License 2.0 Updated Mar 14, 2024 -
how-to-optim-algorithm-in-cuda Public
Forked from BBuf/how-to-optim-algorithm-in-cuda
How to optimize some algorithms in CUDA.
Cuda Updated Jan 27, 2024 -
cutlass Public
Forked from NVIDIA/cutlass
CUDA Templates for Linear Algebra Subroutines
C++ Other Updated Jan 26, 2024 -
stable-diffusion-webui Public
Forked from AUTOMATIC1111/stable-diffusion-webui
Stable Diffusion web UI
Python GNU Affero General Public License v3.0 Updated Dec 19, 2023 -
hello-algo Public
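Forked from krahets/hello-algo
"Hello Algo": a data structures and algorithms tutorial with animated illustrations and one-click runnable code, supporting Java, C++, Python, Go, JS, TS, C#, Swift, Rust, Dart, Zig, and other languages.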
-
openmlsys-zh Public
Forked from openmlsys/openmlsys-zh
"Machine Learning Systems: Design and Implementation" - Chinese Version
TeX Updated Aug 6, 2023 -
How_to_optimize_in_GPU Public
Forked from Liu-xiandong/How_to_optimize_in_GPU
This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, s…
Cuda Apache License 2.0 Updated Jul 29, 2023 -
cpp_backend_awsome_blog Public
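Forked from 0voice/cpp_backend_awsome_blog
Newly compiled in 2023: C++ backend development, 1,000 excellent blog posts covering memory, networking, architecture design, high performance, data structures, basic components, middleware, and distributed systems
Updated Mar 17, 2023 -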
FasterTransformer Public
Forked from NVIDIA/FasterTransformer
Transformer related optimization, including BERT, GPT
C++ Apache License 2.0 Updated Feb 17, 2023 -
PaddleFleetX Public
Forked from PaddlePaddle/PaddleFleetX
Paddle Distributed Training Examples. Resnet Bert GPT MOE DataParallel ModelParallel PipelineParallel HybridParallel AutoParallel Zero Sharding Recompute GradientMerge Offload AMP DGC Loc…
Python Apache License 2.0 Updated Feb 13, 2023 -
PaddleHub Public
Forked from PaddlePaddle/PaddleHub
Awesome pre-trained models toolkit based on PaddlePaddle. (400+ models including Image, Text, Audio, Video and Cross-Modal with Easy Inference & Serving)
Python Apache License 2.0 Updated Sep 15, 2022 -
ppl.nn Public
Forked from OpenPPL/ppl.nn
A primitive library for neural network
C++ Apache License 2.0 Updated Aug 12, 2022 -
docs Public
Forked from PaddlePaddle/docs
Documentation for PaddlePaddle
Python Apache License 2.0 Updated Aug 10, 2022 -
community Public
Forked from PaddlePaddle/community
PaddlePaddle Developer Community
Apache License 2.0 Updated Jul 14, 2022 -
cmake-examples-Chinese Public
Forked from SFUMECJF/cmake-examples-Chinese
Quick start with CMake: learn the syntax through examples. Read online: https://sfumecjf.github.io/cmake-examples-Chinese/
C++ Updated Jul 10, 2022 -
ailearning Public
Forked from apachecn/ailearning
AiLearning: data analysis + machine learning in practice + linear algebra + PyTorch + NLTK + TF2
Python Other Updated Jul 7, 2022