Reinforcement Learning algorithms and use-cases, including DQN, PG, A3C, PPO etc. and RLHF, AlphaZero implementations. Designed for clarity, ease of use, and educational purposes.

ssdutyuyang199401/CleanRL

 
 


CleanRL

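Basic RL algorithms

  • q-learning
    • A basic value-based reinforcement learning algorithm. A Q-table records the value of each state-action pair, and each update estimates the target using the maximum value of the next state.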
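
The update described above can be sketched as a minimal tabular example. This is an illustrative sketch, not the repository's implementation: the 5-state chain environment, the `step` and `train` helpers, and the hyperparameter values (`alpha`, `gamma`, `epsilon`) are all assumptions chosen to keep the example self-contained.

```python
import random

N_STATES = 5      # hypothetical toy chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Toy environment (assumed): reward 1.0 only on reaching the last state."""
    if action == 0:
        next_state = max(0, state - 1)
    else:
        next_state = min(N_STATES - 1, state + 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    # Q-table: one row per state, one value per action.
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore
            else:
                best = max(q[state])          # exploit, breaking ties randomly
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # Target uses the maximum value of the next state (zero if terminal).
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = train()
# Greedy action per non-terminal state, read from the learned table.
greedy_policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
```

In this toy chain the greedy policy read from the table should move right in every non-terminal state, since only the rightmost state pays a reward. The update is off-policy: the target takes the maximum over next actions, whichever action the epsilon-greedy policy actually picks next.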


Languages

  • Python 100.0%