Repositories list
16 repositories
public_access_scripts (Public)
ads-triton-server (Public)
vllm-xx (Public)
vllm-pr (Public)
foundation-model-stack (Public)
vllm (Public)
ads-TensorRT-LLM (Public): TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
WizardLM (Public archive)
azure-openai-proxy (Public archive)
flash-attention-rocm (Public archive)
tvm.dx (Public archive)
deort (Public archive)
dxpy (Public archive)
DirectXShaderCompiler (Public archive)
DirectX-Headers (Public archive)