Nanjing University
Starred repositories
Recipes to train reward models for RLHF.
Memory-Guided Diffusion for Expressive Talking Video Generation
An open-source lightweight game generation paradigm. It includes everything from data processing to model architecture design and playability-based evaluation methods. The game runs at 20 FPS on a …
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…
YaRN: Efficient Context Window Extension of Large Language Models
Replicating O1 inference-time scaling laws
veRL: Volcano Engine Reinforcement Learning for LLM
Finetune Llama 3.3, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting…
Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging
Easy, fast, and cheap pretraining, fine-tuning, and serving for everyone
Doing simple retrieval from LLMs at various context lengths to measure accuracy
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) on 100+ datasets.
Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
TradeMaster is an open-source platform for quantitative trading empowered by reinforcement learning 🔥 ⚡ 🌈
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
A framework for few-shot evaluation of language models.
AI driven development in your terminal. Designed for large, real-world tasks.
The implementation of the AAMAS'24 paper "Disentangling Policy from Offline Task Representation Learning via Adversarial Data Augmentation"
This is a repository for Hidden-utility Self-Play.
This project enables you to use Cloudflare WARP+ through a subscription, automatically acquiring traffic.
Firefly: a training toolkit for large language models, supporting Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and more
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
📚150+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).
📖A curated list of Awesome LLM/VLM Inference Papers with codes, such as FlashAttention, PagedAttention, Parallelism, etc. 🎉🎉
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.