Fine-tune LLM agents with online reinforcement learning

Python 1,077 49 Updated Mar 19, 2024

This repository mainly collects interview questions for large language model (LLM) algorithm engineers

1,785 122 Updated Dec 26, 2024
Python 18 2 Updated Jun 8, 2023

PyTorch implementation of Soft Actor-Critic (SAC)

Jupyter Notebook 528 106 Updated Dec 5, 2021

Codebase of ηψ-Learning algorithm that learns a non-Markovian maximum state entropy exploration policy by combining predecessor and successor representation to estimate the state visitation distrib…

Python 3 Updated Oct 23, 2023

PyTorch implementation of GAIL and AIRL based on PPO.

Python 210 33 Updated Nov 22, 2020

SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning

Python 123 30 Updated Mar 21, 2021

Code for the paper "Exploration by Random Network Distillation"

Python 892 163 Updated Oct 1, 2020

Representation Learning for RL

123 8 Updated Feb 27, 2023

An elegant PyTorch deep reinforcement learning library.

Python 8,253 1,136 Updated Mar 7, 2025

PyTorch implementations of deep reinforcement learning algorithms and environments

Python 5,730 1,201 Updated Jul 25, 2024

Reinforcement Learning in PyTorch

Python 2,243 327 Updated Jan 4, 2021

Bullet Physics SDK: real-time collision detection and multi-physics simulation for VR, games, visual effects, robotics, machine learning etc.

C++ 13,083 2,900 Updated Jan 29, 2025

DEPRECATED: Open-source software for robot simulation, integrated with OpenAI Gym.

Python 2,139 488 Updated Apr 2, 2023

This repo is intended as an extension for OpenAI Gym for auxiliary tasks (multitask learning, transfer learning, inverse reinforcement learning, etc.)

Python 215 41 Updated Jul 22, 2019

Official repo for the E3B algorithm described in the paper "Exploration via Elliptical Episodic Bonuses".

Python 82 13 Updated Mar 22, 2024

Repository for our ICLR 2023 paper: DEP-RL: Embodied Exploration for Reinforcement Learning in Overactuated and Musculoskeletal Systems

Python 262 23 Updated Oct 16, 2024

Gym reinforcement learning environment for OpenCat robots.

Jupyter Notebook 44 6 Updated Feb 4, 2024

Implements many sparse-reward algorithms in the Gym Fetch environment

Python 85 21 Updated Jul 9, 2020

A JAX Implementation of the Twin Delayed DDPG Algorithm

Jupyter Notebook 33 1 Updated Mar 12, 2020

PyTorch implementation of Soft-Actor-Critic and Prioritized Experience Replay (PER) + Emphasizing Recent Experience (ERE) + Munchausen RL + D2RL and parallel Environments.

Python 286 32 Updated Feb 24, 2021

Author's PyTorch implementation of LAP and PAL with TD3 and DDQN

Python 34 7 Updated Dec 7, 2021

Lightweight and scalable framework for Reinforcement Learning

C++ 119 51 Updated Dec 20, 2023

Public repo for HF blog posts

Jupyter Notebook 2,750 836 Updated Mar 6, 2025

TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.

Python 2,867 726 Updated Feb 7, 2025

🔥 Data science competition Baseline & Topline

139 5 Updated May 27, 2023

Official implementation for "Anti-Exploration by Random Network Distillation", ICML 2023

Python 52 5 Updated Feb 3, 2023

A lightweight reimplementation of Adversarially Trained Actor Critic

Python 18 4 Updated Sep 11, 2023

Code accompanying the paper Adversarially Trained Actor Critic for Offline Reinforcement Learning by Ching-An Cheng*, Tengyang Xie*, Nan Jiang, and Alekh Agarwal.

Python 69 6 Updated Feb 2, 2023

AGAC: Adversarially Guided Actor-Critic

Python 48 8 Updated Sep 16, 2021