Stars
An OpenAI Gym interface to Super Mario Bros. & Super Mario Bros. 2 (Lost Levels) on the NES
An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)
OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.
Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"
Reference implementation for DPO (Direct Preference Optimization)
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image
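《代码随想录》 LeetCode practice guide: a recommended order for working through 200 classic problems, 600,000 characters of detailed illustrated explanations, video walkthroughs of the difficult points, and more than 50 mind maps, with versions in C++, Java, Python, Go, JavaScript, and other languages, so that learning algorithms is no longer confusing! 🔥🔥 Take a look, and you will wish you had found it sooner! 🚀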
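🔥 LeetCode solutions in any programming language | Solutions in multiple programming languages for LeetCode, 《剑指 Offer(第 2 版)》 (Coding Interviews, 2nd Edition), and 《程序员面试金典(第 6 版)》 (Cracking the Coding Interview, 6th Edition)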
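A practical interactive interface for GPT/GLM and other large language models (LLMs), specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other codebases; PDF/LaTeX paper translation and summarization; parallel queries to multiple LLMs; and local models such as chatglm3. Integrates 通义千问 (Tongyi Qianwen), deepseekcoder, 讯飞星火 (iFlytek Spark), 文心一言 (ERNIE Bot), llama2, rwkv, claude2, m…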
DSAC-v2; DSAC-T; DASC; Distributional Soft Actor-Critic
A large-scale 7B pretraining language model developed by BaiChuan-Inc.
Chinese LLaMA & Alpaca large language models (LLMs), with local CPU/GPU training and deployment
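PPO x Family DRL Tutorial Course (an introductory open course on decision intelligence: 8 lessons that help you sort out the algorithm theory, untangle the code logic, and master practical decision AI applications)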
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
Instruct-tune LLaMA on consumer hardware
Implementation of Muse: Text-to-Image Generation via Masked Generative Transformers, in Pytorch
A latent text-to-image diffusion model
Implementation of Imagen, Google's Text-to-Image Neural Network, in Pytorch
Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
A playbook for systematically maximizing the performance of deep learning models.
OpenAI Baselines: high-quality implementations of reinforcement learning algorithms
An educational resource to help anyone learn deep reinforcement learning.
FinRL: Financial Reinforcement Learning. 🔥
An artificial intelligence platform for StarCraft II with large-scale distributed training and grand-master agents.