Python 3,270 289 Updated Oct 16, 2024

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Python 4,504 478 Updated Jan 16, 2025
Python 1,339 85 Updated Jan 16, 2025

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Python 2,761 221 Updated Jan 16, 2025

📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).

Cuda 1,980 207 Updated Jan 16, 2025

Tile primitives for speedy kernels

Cuda 1,932 95 Updated Jan 14, 2025

Papers and their code for AI systems

257 17 Updated Jan 16, 2025

Disaggregated serving system for Large Language Models (LLMs).

Jupyter Notebook 438 50 Updated Aug 19, 2024

A generative world for general-purpose robotics & embodied AI learning.

Python 22,817 1,869 Updated Jan 12, 2025

A benchmark of SOTA text-to-image diffusion models using X-IQE, a new benchmarking strategy based on MiniGPT-4.

115 3 Updated Jun 22, 2023
Jupyter Notebook 1,464 271 Updated Jan 18, 2024

[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…

Jupyter Notebook 6,394 418 Updated Jan 12, 2025

Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).

Python 6,678 475 Updated Jan 16, 2025

Resource-adaptive cluster scheduler for deep learning training.

Python 435 79 Updated Mar 5, 2023

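This project aims to share the technical principles behind large models along with hands-on experience (large-model engineering and putting large-model applications into production).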

HTML 12,986 1,450 Updated Jan 15, 2025

A LaTeX resume template designed for optimal information density and aesthetic appeal.

TeX 345 45 Updated Jun 26, 2024

📄 A collection of résumé templates suited to Chinese (LaTeX, HTML/JS, and so on), maintained by @hoochanlon

4,857 407 Updated Oct 18, 2024

LaTeX source for a personal Chinese résumé https://hijiangtao.github.io/

TeX 1,950 557 Updated Sep 4, 2024

KV cache compression for high-throughput LLM inference

Python 103 5 Updated Dec 13, 2024

[ICML'2024] "LoCoCo: Dropping In Convolutions for Long Context Compression", Ruisi Cai, Yuandong Tian, Zhangyang Wang, Beidi Chen

Python 16 Updated Sep 7, 2024

Cold Compress is a hackable, lightweight, and open-source toolkit for creating and benchmarking cache compression methods built on top of GPT-Fast, a simple, PyTorch-native generation codebase.

Python 106 10 Updated Aug 9, 2024

DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads

Python 417 24 Updated Oct 31, 2024

Triton-based implementation of Sparse Mixture of Experts.

Python 192 15 Updated Nov 28, 2024

Official PyTorch implementation of FlatQuant: Flatness Matters for LLM Quantization

Python 90 8 Updated Nov 12, 2024

Next-Token Prediction is All You Need

Python 1,964 77 Updated Oct 24, 2024

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 5,187 461 Updated Jan 16, 2025

The Official Implementation of Ada-KV: Optimizing KV Cache Eviction by Adaptive Budget Allocation for Efficient LLM Inference

Python 54 Updated Dec 20, 2024

All Algorithms implemented in Python

Python 196,448 46,119 Updated Jan 14, 2025

Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.

Python 286,068 47,693 Updated Dec 2, 2024