jianyuheng (Tencent)


Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 8,118 1,031 Updated Dec 18, 2024

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 2,670 185 Updated Nov 14, 2024

High-quality multi-lingual text-to-speech library by MyShell.ai. Support English, Spanish, French, Chinese, Japanese and Korean.

Python 5,036 668 Updated Dec 16, 2024

End-to-end speech synthesis system with knowledge distillation

Jupyter Notebook 16 7 Updated Jul 16, 2022

EfficientQAT: Efficient Quantization-Aware Training for Large Language Models

Python 232 18 Updated Oct 8, 2024

20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.

Python 10,946 1,086 Updated Dec 16, 2024

Go ahead and axolotl questions

Python 8,120 894 Updated Dec 20, 2024

QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving

Python 466 28 Updated Nov 9, 2024

Code for the NeurIPS 2024 paper: QuaRot, end-to-end 4-bit inference of large language models.

Python 301 25 Updated Nov 26, 2024

FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation

Python 46 2 Updated Jul 11, 2024

Official implementation of Half-Quadratic Quantization (HQQ)

Python 719 72 Updated Nov 22, 2024

[ICML 2024] CLLMs: Consistency Large Language Models

Python 363 18 Updated Nov 16, 2024

Spec-Bench: A Comprehensive Benchmark and Unified Evaluation Platform for Speculative Decoding (ACL 2024 Findings)

Python 201 21 Updated Oct 25, 2024

Medusa: Simple Framework for Accelerating LLM Generation with Multiple Decoding Heads

Jupyter Notebook 2,345 163 Updated Jun 25, 2024

Python 165 32 Updated May 24, 2024

Official Pytorch repository for Extreme Compression of Large Language Models via Additive Quantization https://arxiv.org/pdf/2401.06118.pdf and PV-Tuning: Beyond Straight-Through Estimation for Ext…

Python 1,191 180 Updated Nov 28, 2024

[ICLR 2024] The Need for Speed: Pruning Transformers with One Recipe

Python 22 2 Updated Sep 2, 2024

[ICLR 2024] Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

Python 566 48 Updated Mar 4, 2024

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++ 8,933 1,028 Updated Dec 17, 2024

[ICLR 2024 Spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

Python 742 56 Updated Oct 8, 2024

A fast inference library for running LLMs locally on modern consumer-class GPUs

Python 3,775 289 Updated Dec 18, 2024

Simple implementation of Speculative Sampling in NumPy for GPT-2.

Python 89 9 Updated Aug 20, 2023

A curated list for Efficient Large Language Models

Python 1,340 97 Updated Dec 9, 2024

Implementation of the 2023 CVPR Award Candidate: On Distillation of Guided Diffusion Models

Python 42 3 Updated Aug 16, 2023

A Compressed Stable Diffusion for Efficient Text-to-Image Generation [ECCV'24]

Python 267 17 Updated Jul 6, 2024

Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees"

Python 354 32 Updated Feb 24, 2024

General technology for enabling AI capabilities w/ LLMs and MLLMs

Python 3,755 284 Updated Dec 18, 2024

Segmind Distilled diffusion

Python 579 37 Updated Oct 18, 2023

Universal LLM Deployment Engine with ML Compilation

Python 19,431 1,599 Updated Dec 19, 2024