Nanjing University
Starred repositories
Recipes to train reward models for RLHF.
Memory-Guided Diffusion for Expressive Talking Video Generation
An open-source lightweight game generation paradigm. It includes everything from data processing to model architecture design and playability-based evaluation methods. The game runs at 20 FPS on a …
[NeurIPS 2024 Oral][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-sim…
YaRN: Efficient Context Window Extension of Large Language Models
Replicating O1 inference-time scaling laws
veRL: Volcano Engine Reinforcement Learning for LLM
Finetune Llama 3.3, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 70% less memory
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting…
Personalized Soups: Personalized Large Language Model Alignment via Post-hoc Parameter Merging
Easy, fast, and cheap pretraining, fine-tuning, and serving for everyone
Doing simple retrieval from LLMs at various context lengths to measure accuracy
OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMA2, Qwen, GLM, Claude, etc.) on 100+ datasets.
Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
TradeMaster is an open-source platform for quantitative trading empowered by reinforcement learning 🔥 ⚡ 🌈
Lighteval is your all-in-one toolkit for evaluating LLMs across multiple backends
A framework for few-shot evaluation of language models.
AI driven development in your terminal. Designed for large, real-world tasks.
The implementation of the AAMAS'24 paper "Disentangling Policy from Offline Task Representation Learning via Adversarial Data Augmentation"
This is a repository for Hidden-utility Self-Play.
This project enables you to use Cloudflare WARP+ through a subscription, automatically acquiring traffic.
Firefly: a training toolkit for large language models, supporting Qwen2.5, Qwen2, Yi1.5, Phi-3, Llama3, Gemma, MiniCPM, Yi, Deepseek, Orion, Xverse, Mixtral-8x7B, Zephyr, Mistral, Baichuan2, Llama2, Llama, Qwen, Baichuan, ChatGLM2, InternLM, Ziya2, Vicuna, Bloom, and more
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
📚150+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).
📖A curated list of Awesome LLM/VLM Inference Papers with codes, such as FlashAttention, PagedAttention, Parallelism, etc. 🎉🎉
An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.