Clean, Robust, and Unified PyTorch implementation of popular Deep Reinforcement Learning (DRL) algorithms (Q-learning, Duel DDQN, PER, C51, Noisy DQN, PPO, DDPG, TD3, SAC, ASL)

Python 1,822 226 Updated Oct 23, 2024

eureka-research / DrEureka

Official Repository for "DrEureka: Language Model Guided Sim-To-Real Transfer" (RSS 2024)

Python 839 66 Updated Aug 26, 2024

AutonoBot-Lab / BestMan_Pybullet

Codebase for the BestMan Mobile Manipulator Platform

Python 184 11 Updated Dec 17, 2024

zhentingqi / rStar

Python 879 103 Updated Jan 23, 2025

maitrix-org / llm-reasoners

A library for advanced large language model reasoning

Python 1,807 158 Updated Feb 6, 2025

RPL-CS-UCL / SDS

Code for SDS: Quadrupedal Skill Synthesis from Single Video Demonstration

Python 75 6 Updated Oct 21, 2024

johnzhang3 / SLoMo

Motion Imitation from Casual Videos for Legged Robots

61 1 Updated Jul 17, 2024

hijkzzz / Awesome-LLM-Strawberry

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,406 358 Updated Feb 8, 2025

SIMONLQY / RethinkMCTS

Python 21 1 Updated Oct 2, 2024

ksyang2013 / AFLOW

Automatic - FLOW for Materials Discovery

C++ 5 3 Updated Sep 14, 2024

zitian-gao / SC-MCTS

Interpretable Contrastive Monte Carlo Tree Search Reasoning

Python 43 4 Updated Nov 9, 2024

openreasoner / openr

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

Python 1,515 117 Updated Jan 17, 2025

openai / swarm

Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.

Python 18,551 1,961 Updated Oct 15, 2024

SidU / MathBlackBox

Forked from trotsky1997/MathBlackBox

Python 11 1 Updated Jul 21, 2024

jmwang0117 / HE-Drive

HE-Drive: Human-Like End-to-End Driving with Vision Language Models

Python 196 13 Updated Dec 9, 2024

GAIR-NLP / O1-Journey

O1 Replication Journey

1,934 61 Updated Jan 14, 2025

1989Ryan / llm-mcts

[NeurIPS 2023] We use large language models as commonsense world model and heuristic policy within Monte-Carlo Tree Search, enabling better-reasoned decision-making for daily task planning problems.

Python 251 21 Updated Nov 16, 2024

NumberChiffre / mcts-llm

Jupyter Notebook 89 1 Updated Dec 16, 2024

ack-sec / toyberry

Toy implementation of Strawberry

Python 30 2 Updated Sep 24, 2024

OpenDriveLab / Vista

[NeurIPS 2024] A Generalizable World Model for Autonomous Driving

Python 652 49 Updated Dec 12, 2024

THUDM / ReST-MCTS

ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)

Python 560 44 Updated Jan 20, 2025

THUDM / VisualAgentBench

Towards Large Multimodal Models as Visual Foundation Agents

Python 173 6 Updated Feb 5, 2025

InternLM / MindSearch

🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)

JavaScript 5,911 599 Updated Jan 8, 2025

microsoft / autogen

A programming framework for agentic AI 🤖 PyPi: autogen-agentchat Discord: https://aka.ms/autogen-discord Office Hour: https://aka.ms/autogen-officehour

Python 39,087 5,735 Updated Feb 9, 2025

dvlab-research / Step-DPO

Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"

Python 344 13 Updated Jan 19, 2025

Yubin Wang yubinwang11

Stars

Julia-LiuJ / NLFT

OpenRLHF / OpenRLHF

RUCAIBox / Slow_Thinking_with_LLMs

lqtrung1998 / mwp_ReFT

collaborative-mapush / MAPush

XinJingHao / DRL-Pytorch