  • FDU
  • shanghai&beijing

DeepEP: an efficient expert-parallel communication library

Cuda 6,812 554 Updated Mar 3, 2025
Jupyter Notebook 12 1 Updated Nov 9, 2024

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

6,262 155 Updated Mar 1, 2025

MoBA: Mixture of Block Attention for Long-Context LLMs

Python 1,580 84 Updated Feb 22, 2025

Reproduce R1 Zero on Logic Puzzle

Python 1,936 125 Updated Feb 26, 2025

A PyTorch native library for large model training

Python 3,377 295 Updated Mar 3, 2025
Python 2,268 160 Updated Feb 24, 2025

This repository serves as a comprehensive survey of LLM development, featuring numerous research papers along with their corresponding code links.

71 1 Updated Feb 18, 2025

Code for BLT research paper

Python 1,419 108 Updated Mar 1, 2025
Python 6 Updated Nov 21, 2024

HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python 8,864 729 Updated Feb 20, 2025

A method for calculating scaling laws for LLMs from publicly available models

Python 9 Updated Apr 22, 2024

Modeling, training, eval, and inference code for OLMo

Python 5,284 561 Updated Mar 1, 2025

Awesome-LLM-KV-Cache: A curated list of 📙Awesome LLM KV Cache Papers with Codes.

221 12 Updated Dec 7, 2024

Official inference repo for FLUX.1 models

Python 20,518 1,442 Updated Feb 6, 2025

🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton

Python 2,032 124 Updated Mar 2, 2025

A description of the recent long-context large language model Jamba.

14 1 Updated May 22, 2024

Integrating Mamba/SSMs with Transformer for Enhanced Long Context and High-Quality Sequence Modeling

Python 187 17 Updated Jan 27, 2025

Some preliminary explorations of Mamba's context scaling.

Python 213 11 Updated Feb 8, 2024

Doing simple retrieval from LLM models at various context lengths to measure accuracy

Jupyter Notebook 1,730 190 Updated Aug 17, 2024

PyTorch bindings for CUTLASS grouped GEMM.

Cuda 98 31 Updated Jan 2, 2025

Example UI implementing the RTVI web client

TypeScript 475 71 Updated Dec 3, 2024

GoldFinch and other hybrid transformer components

Python 44 2 Updated Jul 20, 2024

Code for exploring Based models from "Simple linear attention language models balance the recall-throughput tradeoff"

Python 222 14 Updated Feb 17, 2025

A simple and easily understandable version of RWKV

Python 15 1 Updated Aug 15, 2023
Python 42 3 Updated Mar 29, 2023

RWKV in nanoGPT style

Python 187 11 Updated Jun 9, 2024

RWKV (pronounced RwaKuv) is an RNN with great LLM performance, which can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". So it's combining the best of RN…

Python 13,244 893 Updated Feb 27, 2025

The CUDA version of the RWKV language model (https://github.com/BlinkDL/RWKV-LM)

Cuda 219 35 Updated Dec 14, 2024