- Tsinghua University
- Beijing, China
- https://shiml20.github.io/
Stars
MoBA: Mixture of Block Attention for Long-Context LLMs
HermesFlow: Seamlessly Closing the Gap in Multimodal Understanding and Generation
Diffusion-Sharpening: Fine-tuning Diffusion Models with Denoising Trajectory Sharpening
SkyReels V1: The first and most advanced open-source human-centric video foundation model
🚀 Train a 26M-parameter visual multimodal LLM (VLM) from scratch in just 1 hour!
🚀🚀 Train a small 26M-parameter GPT completely from scratch in just 2 hours!
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
Various AI scripts. Mostly Stable Diffusion stuff.
Fully open reproduction of DeepSeek-R1
Ongoing research training transformer models at scale
[ICLR 2025] Dynamic Mixture of Experts: An Auto-Tuning Approach for Efficient Transformer Models
[Survey] Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey
The official GitHub page for the survey paper "A Survey on Mixture of Experts in Large Language Models".
A PyTorch implementation of the paper "All are Worth Words: A ViT Backbone for Diffusion Models".
Cosmos is a world model development platform that consists of world foundation models, tokenizers, and a video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs. C…
“FlowAR: Scale-wise Autoregressive Image Generation Meets Flow Matching” FlowAR employs a simple scale design and is compatible with any VAE.
Codebase for "ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing", built on Megatron-LM.
MoH: Multi-Head Attention as Mixture-of-Head Attention
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
[NeurIPS 2024] DrivingDojo Dataset: Advancing Interactive and Knowledge-Enriched Driving World Model