A replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data

Python 2,279 170 Updated Feb 7, 2025

Clean, minimal, accessible reproduction of DeepSeek R1-Zero

Python 9,145 1,187 Updated Feb 1, 2025

Fully open reproduction of DeepSeek-R1

Python 17,713 1,467 Updated Feb 8, 2025

Bayesian learning and inference for state space models

Jupyter Notebook 595 205 Updated Aug 14, 2024
Python 9 Updated May 29, 2024

Reference implementation for Token-level Direct Preference Optimization (TDPO)

Python 126 13 Updated Jul 3, 2024

TORAX: Tokamak transport simulation in JAX

Python 395 40 Updated Feb 8, 2025

[NeurIPS-2024] The official implementation of "Instruction-Guided Visual Masking"

Jupyter Notebook 32 2 Updated Nov 15, 2024
Python 19 Updated May 27, 2024
Python 7 1 Updated Jul 19, 2023
Jupyter Notebook 59 11 Updated Mar 9, 2024

Kolmogorov Arnold Networks

Jupyter Notebook 15,363 1,446 Updated Jan 19, 2025

Humanoid-Gym: Reinforcement Learning for Humanoid Robot with Zero-Shot Sim2Real Transfer https://arxiv.org/abs/2404.05695

Python 984 159 Updated Jan 26, 2025

📄 Awesome CV is a LaTeX template for your outstanding job application

TeX 23,717 4,869 Updated Feb 6, 2025

LaTeX Thesis Template for Tsinghua University

TeX 4,700 1,093 Updated Jan 10, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 39,736 4,878 Updated Feb 8, 2025

Thanks for stopping by! This repository now has 14 stars~🌟🌟🌟

Python 14 Updated Dec 31, 2024

OpenChat: Advancing Open-source Language Models with Imperfect Data

Python 5,297 404 Updated Sep 13, 2024

JAX (Flax) implementation of algorithms for Deep Reinforcement Learning with continuous action spaces.

Jupyter Notebook 651 71 Updated Oct 26, 2022

A minimal and stable PPO.

Python 129 4 Updated Feb 9, 2024

Set of robotic environments based on PyBullet physics engine and gymnasium.

Python 606 118 Updated Jul 23, 2024

PyTorch implementation of (Deep) Reinforcement Learning (RL) algorithms

Python 22 Updated Jun 26, 2022

A unified framework for robot learning

Python 548 86 Updated Nov 26, 2024

This repo includes ChatGPT prompt curation to use ChatGPT and other LLM tools better.

HTML 119,985 16,158 Updated Feb 5, 2025

Official PyTorch implementation of CMLO from the paper "When to Update Your Model: Constrained Model-based Reinforcement Learning"

Python 10 1 Updated Nov 2, 2023

Conservative Q Learning on top of SAC

Python 122 25 Updated Oct 15, 2022

PyTorch implementation of soft actor critic

Python 852 181 Updated Nov 9, 2021

Basic constrained RL agents used in experiments for the "Benchmarking Safe Exploration in Deep Reinforcement Learning" paper.

Python 400 113 Updated Apr 2, 2023