Popular repositories
-
TransformerCompression (Public, forked from microsoft/TransformerCompression)
For releasing code related to compression methods for transformers, accompanying our publications
Python
-
BiTA (Public, forked from linfeng93/BiTA)
An innovative method expediting LLMs via streamlined semi-autoregressive generation and draft verification.
Python
-
QUICK (Public, forked from SqueezeBits/QUICK)
QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference
Python
-
sglang (Public, forked from sgl-project/sglang)
SGLang is a structured generation language designed for large language models (LLMs). It makes your interaction with models faster and more controllable.
Python