Stars
The official implementation of Natural Language Fine-Tuning
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
A series of technical reports on Slow Thinking with LLM
Clean, Robust, and Unified PyTorch implementation of popular Deep Reinforcement Learning (DRL) algorithms (Q-learning, Duel DDQN, PER, C51, Noisy DQN, PPO, DDPG, TD3, SAC, ASL)
Official Repository for "DrEureka: Language Model Guided Sim-To-Real Transfer" (RSS 2024)
Codebase for the BestMan Mobile Manipulator Platform
A library for advanced large language model reasoning
Code for SDS: Quadrupedal Skill Synthesis from Single Video Demonstration
A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.
Interpretable Contrastive Monte Carlo Tree Search Reasoning
OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.
HE-Drive: Human-Like End-to-End Driving with Vision Language Models
[NeurIPS 2023] We use large language models as a commonsense world model and heuristic policy within Monte-Carlo Tree Search, enabling better-reasoned decision-making for daily task planning problems.
[NeurIPS 2024] A Generalizable World Model for Autonomous Driving
ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)
Towards Large Multimodal Models as Visual Foundation Agents
An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)
A programming framework for agentic AI. PyPi: autogen-agentchat; Discord: https://aka.ms/autogen-discord; Office Hour: https://aka.ms/autogen-officehour
Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"