- Microsoft
- Redmond, Washington
- https://renll.github.io/
Stars
nnScaler: Compiling DNN models for Parallel Training
DeepEP: an efficient expert-parallel communication library
🐳 Efficient Triton implementations for "Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention"
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
MoBA: Mixture of Block Attention for Long-Context LLMs
A prototype repo for hybrid training with pipeline parallelism and distributed data parallelism, with comments on core code snippets. Feel free to copy code and launch discussions about the problems you hav…
Pretraining code for a large-scale depth-recurrent language model
Democratizing Reinforcement Learning for LLMs
EvaByte: Efficient Byte-level Language Models at Scale
A replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data
Clean, minimal, accessible reproduction of DeepSeek R1-Zero
Training Large Language Model to Reason in a Continuous Latent Space
[ICLR 2025] Official PyTorch Implementation of Gated Delta Networks: Improving Mamba2 with Delta Rule
[NeurIPS 2024 Best Paper][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ult…
This repository contains the results and code for the AlgoPerf v0.5 benchmark.
Helpful tools and examples for working with flex-attention
A simple pip-installable Python tool to generate your own HTML citation world map from your Google Scholar ID.
Fast and memory-efficient exact attention
A fusion of a linear layer and a cross-entropy loss, written for PyTorch in Triton (an unfused reference of what this computes is sketched after this list).
A large-scale RWKV v6, v7 (World, ARWKV) inference engine. Capable of inference by combining multiple states (Pseudo MoE). Easy to deploy on Docker. Supports true multi-batch generation and dynamic State sw…
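
As a companion to the fused linear + cross-entropy entry above, here is a minimal, unfused PyTorch reference of what such a kernel computes; the function name and shapes are illustrative assumptions, not the repo's Triton implementation.

```python
# A minimal, unfused reference (an assumption, not the repo's Triton kernel) of
# what a fused linear + cross-entropy loss computes: project hidden states to
# vocabulary logits, then take cross-entropy against the target token ids.
import torch
import torch.nn.functional as F

def linear_cross_entropy_reference(hidden, weight, bias, targets):
    # hidden: (N, d), weight: (vocab, d), bias: (vocab,), targets: (N,)
    logits = F.linear(hidden, weight, bias)   # (N, vocab)
    return F.cross_entropy(logits, targets)   # mean loss over N tokens

# Illustrative shapes only.
hidden  = torch.randn(8, 256)
weight  = torch.randn(32000, 256) * 0.02
bias    = torch.zeros(32000)
targets = torch.randint(0, 32000, (8,))
print(linear_cross_entropy_reference(hidden, weight, bias, targets).item())
```

A fused Triton kernel performs both steps in one pass so the full (N, vocab) logits matrix need not be materialized, which is typically the motivation for fusing them.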