mansicer
🎆
coding
  • Nanjing University

Highlights

  • Pro

Organizations

@LAMDA-RL

Starred repositories

Recipes to train reward model for RLHF.

Python 1,086 75 Updated Dec 12, 2024

Memory-Guided Diffusion for Expressive Talking Video Generation

Python 574 54 Updated Dec 16, 2024

An open-source lightweight game generation paradigm. It includes everything from data processing to model architecture design and playability-based evaluation methods. The game runs at 20 FPS on a …

Jupyter Notebook 56 2 Updated Dec 5, 2024

[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…

Jupyter Notebook 6,467 431 Updated Dec 22, 2024

YaRN: Efficient Context Window Extension of Large Language Models

Python 1,376 118 Updated Apr 17, 2024

Replicating O1 inference-time scaling laws

Python 66 3 Updated Dec 1, 2024

veRL: Volcano Engine Reinforcement Learning for LLM

Python 515 36 Updated Dec 23, 2024

Finetune Llama 3.3, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory

Python 19,759 1,393 Updated Dec 29, 2024

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 3,392 317 Updated Dec 28, 2024

Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting…

Jupyter Notebook 15,758 2,334 Updated Dec 23, 2024

Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging

Python 98 8 Updated Oct 23, 2023

The LLM Evaluation Framework

Python 4,123 336 Updated Dec 28, 2024

Easy, fast, and cheap pretraining, finetuning, and serving for everyone

Python 269 39 Updated Dec 9, 2024

Doing simple retrieval from LLM models at various context lengths to measure accuracy

Jupyter Notebook 1,627 177 Updated Aug 17, 2024

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Python 4,382 464 Updated Dec 27, 2024

Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.

Python 4,607 417 Updated Dec 13, 2024

TradeMaster is an open-source platform for quantitative trading empowered by reinforcement learning 🔥 ⚡ 🌈

Jupyter Notebook 1,476 300 Updated Oct 29, 2024

Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends

Python 903 109 Updated Dec 26, 2024

A framework for few-shot evaluation of language models.

Python 7,327 1,978 Updated Dec 25, 2024

AI driven development in your terminal. Designed for large, real-world tasks.

Go 10,921 756 Updated Dec 15, 2024

The implementation of the AAMAS'24 paper "Disentangling Policy from Offline Task Representation Learning via Adversarial Data Augmentation"

Python 3 Updated Mar 14, 2024

Overcooked human-AI experiment platform

Python 29 4 Updated Dec 21, 2023

This is a repository for Hidden-utility Self-Play.

JavaScript 26 2 Updated Jul 27, 2023

This project lets you use Cloudflare WARP+ through a subscription and automatically acquires traffic.

Python 8,565 1,160 Updated Sep 4, 2024

Firefly: a large-model training tool that supports training Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and other large models

Python 5,987 533 Updated Oct 24, 2024

Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)

Python 36,899 4,545 Updated Dec 27, 2024

📚150+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).

Cuda 1,806 189 Updated Dec 29, 2024

📖A curated list of Awesome LLM/VLM Inference Papers with codes, such as FlashAttention, PagedAttention, Parallelism, etc. 🎉🎉

3,070 207 Updated Dec 27, 2024

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

Jupyter Notebook 1,586 246 Updated Dec 27, 2024