wenxcs-msft
Popular repositories
- flash-attention-rocm Public archive Forked from ROCm/flash-attention
Fast and memory-efficient exact attention
C++ 1
- DirectX-Headers Public archive Forked from microsoft/DirectX-Headers
Official DirectX headers available under an open source license
C
- DirectXShaderCompiler Public archive Forked from microsoft/DirectXShaderCompiler
This repo hosts the source for the DirectX Shader Compiler which is based on LLVM/Clang.
C++
Repositories
- public_access_scripts Public
- ads-triton-server-third_party Public Forked from JiushengChen/third_party
Third-party source packages that are modified for use in Triton.
- ads-triton-server Public Forked from JiushengChen/triton-server
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
- vllm-xx Public Forked from xiaoxiawu-microsoft/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
- vllm-pr Public Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
- vllm Public Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
- ads-TensorRT-LLM Public Forked from JiushengChen/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
- WizardLM Public archive Forked from nlpxucan/WizardLM
WizardLM: Empowering Large Pre-Trained Language Models to Follow Complex Instructions
- azure-openai-proxy Public archive Forked from diemus/azure-openai-proxy
A proxy for Azure OpenAI API that can convert an OpenAI request into an Azure OpenAI request.
People
This organization has no public members. You must be a member to see who’s a part of this organization.