    Repositories list

    • Shell
      Apache License 2.0
      0010 Updated Feb 5, 2025
    • Third-party source packages that are modified for use in Triton.
      C
      BSD 3-Clause "New" or "Revised" License
      59000 Updated Jan 28, 2025
    • The Triton Inference Server provides an optimized cloud and edge inferencing solution.
      Python
      BSD 3-Clause "New" or "Revised" License
      1.5k001 Updated Jan 28, 2025
    • vllm-xx

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      5.9k008 Updated Sep 26, 2024
    • vllm-pr

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      5.9k000 Updated Jul 10, 2024
    • Python
      Apache License 2.0
      62000 Updated May 10, 2024
    • vllm

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      5.9k000 Updated May 8, 2024
    • TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
      C++
      Apache License 2.0
      1.1k000 Updated Jan 26, 2024
    • WizardLM

      Public archive
      WizardLM: Empowering Large Pre-Trained Language Models to Follow Complex Instructions
      Python
      731000 Updated May 12, 2023
    • azure-openai-proxy

      Public archive
      A proxy for Azure OpenAI API that can convert an OpenAI request into an Azure OpenAI request.
      Go
      MIT License
      69000 Updated Apr 7, 2023
    • flash-attention-rocm

      Public archive
      Fast and memory-efficient exact attention
      C++
      BSD 3-Clause "New" or "Revised" License
      1.5k100 Updated Feb 25, 2023
    • tvm.dx

      Public archive
      TVM with DirectX support
      Python
      Apache License 2.0
      1001 Updated Dec 12, 2022
    • deort

      Public archive
      Python
      MIT License
      0000 Updated Sep 24, 2022
    • dxpy

      Public archive
      DirectX Python Runtime
      C++
      MIT License
      0000 Updated Apr 3, 2022
    • DirectXShaderCompiler

      Public archive
      This repo hosts the source for the DirectX Shader Compiler which is based on LLVM/Clang.
      C++
      Other
      722001 Updated Mar 26, 2022
    • DirectX-Headers

      Public archive
      Official DirectX headers available under an open source license
      C
      MIT License
      165000 Updated Mar 11, 2022