grapevine-AI

1 follower · 1 following

Stars

open-webui / desktop

WIP: Open WebUI desktop application, based on Electron.

TypeScript · 280 stars · 22 forks · Updated Jan 18, 2025

ggml-org / llama.vscode

VS Code extension for LLM-assisted code/text completion

TypeScript · 576 stars · 34 forks · Updated Mar 5, 2025

NVIDIA-AI-IOT / torch2trt

An easy-to-use PyTorch to TensorRT converter

Python · 4,688 stars · 683 forks · Updated Aug 17, 2024

NVIDIA / TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ · 9,625 stars · 1,130 forks · Updated Mar 6, 2025

NVIDIA / TensorRT

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ · 11,291 stars · 2,162 forks · Updated Feb 1, 2025

NVIDIA / TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilizatio…

Python · 2,242 stars · 373 forks · Updated Mar 6, 2025

Dao-AILab / flash-attention

Fast and memory-efficient exact attention

Python · 16,125 stars · 1,527 forks · Updated Mar 5, 2025

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python · 40,492 stars · 6,087 forks · Updated Mar 6, 2025

open-webui / open-webui

User-friendly AI Interface (Supports Ollama, OpenAI API, ...)

JavaScript · 81,358 stars · 9,766 forks · Updated Mar 6, 2025

ggml-org / llama.cpp

LLM inference in C/C++

C++ · 75,975 stars · 10,992 forks · Updated Mar 6, 2025