grapevine-AI

1 follower · 1 following

Stars

4 stars written in Python

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python · 40,500 stars · 6,094 forks · Updated Mar 7, 2025

Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Python · 16,124 stars · 1,527 forks · Updated Mar 5, 2025

NVIDIA-AI-IOT / torch2trt

An easy to use PyTorch to TensorRT converter

Python · 4,688 stars · 683 forks · Updated Aug 17, 2024

NVIDIA / TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…

Python · 2,242 stars · 373 forks · Updated Mar 6, 2025
