Low-bit LLM inference on CPU with lookup table

C++ 689 53 Updated Jan 9, 2025

Tmux Plugin Manager

Shell 12,734 445 Updated Aug 5, 2024

Here is My Termux Terminal Emulator Setup & Packages

174 15 Updated Sep 8, 2023

An Android OTA payload dumper written in Go

Go 2,566 212 Updated Nov 20, 2024

My learning notes/codes for ML SYS.

Python 1,121 54 Updated Feb 27, 2025

internal docs

Shell 165 102 Updated Jun 3, 2021

Power Usage Monitor for Apple Silicon

Rust 150 5 Updated Sep 20, 2024

[ACL 2024] A novel QAT with Self-Distillation framework to enhance ultra low-bit LLMs.

Python 100 15 Updated May 16, 2024

Official inference framework for 1-bit LLMs

C++ 12,767 897 Updated Feb 18, 2025

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,511 364 Updated Feb 26, 2025

Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes.

217 12 Updated Dec 7, 2024

BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.

Python 528 39 Updated Feb 14, 2025

Repository for "MultiHop-RAG: A Dataset for Evaluating Retrieval-Augmented Generation Across Documents" (COLM 2024)

Python 265 19 Updated Nov 19, 2024

A library for efficient similarity search and clustering of dense vectors.

C++ 33,301 3,771 Updated Feb 27, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 10,976 1,092 Updated Feb 27, 2025

A comprehensive guide to building RAG-based LLM applications for production.

Jupyter Notebook 1,772 243 Updated Aug 2, 2024

A Python module to scrape arxiv.org for a date range and category

Python 292 53 Updated Jan 22, 2024

Contains system design materials to prepare for system design interviews 🚩👨‍💻👨‍💻👨‍💻

884 305 Updated Apr 21, 2023

Notes on books I read, talks I watch, articles I study, and papers I love

SCSS 5,765 1,236 Updated Jan 2, 2024

Official repository for paper "MagicMan: Generative Novel View Synthesis of Humans with 3D-Aware Diffusion and Iterative Refinement"

Python 288 11 Updated Sep 16, 2024

😎 Awesome list of tools and projects with the awesome LangChain framework

8,000 562 Updated Feb 21, 2025

RAGChecker: A Fine-grained Framework For Diagnosing RAG

Python 770 66 Updated Dec 13, 2024

Resources on LLM-based applications that use the RAG pattern

1,155 68 Updated Jan 22, 2025

Langchain-Chatchat (formerly Langchain-ChatGLM): RAG and Agent applications built on Langchain and language models such as ChatGLM, Qwen, and Llama | Langchain-Chatchat (formerly langchain-ChatGLM), local knowledge based LLM (like ChatGLM, Qwen and…

TypeScript 33,787 5,751 Updated Nov 29, 2024

Easy usage of Rockchip's NPUs found in RK3588 and similar chips

Shell 130 7 Updated Nov 13, 2024

Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch

Python 1,761 158 Updated Jan 27, 2025

Ubuntu for Rockchip RK35XX Devices

Shell 2,969 320 Updated Jan 24, 2025

A beautiful, simple, clean, and responsive Jekyll theme for academics

HTML 12,287 11,603 Updated Feb 24, 2025

A reference application for a local AI assistant with LLM and RAG

Python 105 18 Updated Dec 5, 2024

FlashInfer: Kernel Library for LLM Serving

Cuda 2,199 226 Updated Feb 27, 2025