
The official implementation of Natural Language Fine-Tuning

Python 34 3 Updated Jan 7, 2025

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 4,434 444 Updated Feb 10, 2025

A series of technical reports on Slow Thinking with LLMs

Python 376 20 Updated Jan 26, 2025
Python 459 56 Updated Jan 2, 2025
Python 26 4 Updated Nov 26, 2024

Clean, Robust, and Unified PyTorch implementation of popular Deep Reinforcement Learning (DRL) algorithms (Q-learning, Duel DDQN, PER, C51, Noisy DQN, PPO, DDPG, TD3, SAC, ASL)

Python 1,822 226 Updated Oct 23, 2024

Official Repository for "DrEureka: Language Model Guided Sim-To-Real Transfer" (RSS 2024)

Python 839 66 Updated Aug 26, 2024

Codebase for the BestMan Mobile Manipulator Platform

Python 184 11 Updated Dec 17, 2024
Python 879 103 Updated Jan 23, 2025

A library for advanced large language model reasoning

Python 1,807 158 Updated Feb 6, 2025

Code for SDS: Quadrupedal Skill Synthesis from Single Video Demonstration

Python 75 6 Updated Oct 21, 2024

Motion Imitation from Casual Videos for Legged Robots

61 1 Updated Jul 17, 2024

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,406 358 Updated Feb 8, 2025
Python 21 1 Updated Oct 2, 2024

Automatic - FLOW for Materials Discovery

C++ 5 3 Updated Sep 14, 2024

Interpretable Contrastive Monte Carlo Tree Search Reasoning

Python 43 4 Updated Nov 9, 2024

OpenR: An Open Source Framework for Advanced Reasoning with Large Language Models

Python 1,515 117 Updated Jan 17, 2025

Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.

Python 18,551 1,961 Updated Oct 15, 2024
Python 11 1 Updated Jul 21, 2024

HE-Drive: Human-Like End-to-End Driving with Vision Language Models

Python 196 13 Updated Dec 9, 2024

O1 Replication Journey

1,934 61 Updated Jan 14, 2025

[NeurIPS 2023] We use large language models as commonsense world model and heuristic policy within Monte-Carlo Tree Search, enabling better-reasoned decision-making for daily task planning problems.

Python 251 21 Updated Nov 16, 2024
Jupyter Notebook 89 1 Updated Dec 16, 2024

Toy implementation of Strawberry

Python 30 2 Updated Sep 24, 2024

[NeurIPS 2024] A Generalizable World Model for Autonomous Driving

Python 652 49 Updated Dec 12, 2024

ReST-MCTS*: LLM Self-Training via Process Reward Guided Tree Search (NeurIPS 2024)

Python 560 44 Updated Jan 20, 2025

Towards Large Multimodal Models as Visual Foundation Agents

Python 173 6 Updated Feb 5, 2025

🔍 An LLM-based Multi-agent Framework of Web Search Engine (like Perplexity.ai Pro and SearchGPT)

JavaScript 5,911 599 Updated Jan 8, 2025

A programming framework for agentic AI 🤖 PyPi: autogen-agentchat Discord: https://aka.ms/autogen-discord Office Hour: https://aka.ms/autogen-officehour

Python 39,087 5,735 Updated Feb 9, 2025

Implementation for "Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs"

Python 344 13 Updated Jan 19, 2025