DQN_play_sekiro

Python 487 94 Updated Aug 31, 2024

An OpenAI Gym interface to Super Mario Bros. & Super Mario Bros. 2 (Lost Levels) on The NES

Python 709 143 Updated Aug 1, 2023

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention & RFT)

Python 4,179 406 Updated Jan 30, 2025

OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.

Python 3,206 387 Updated Jan 27, 2025

Graphic notes on Gilbert Strang's "Linear Algebra for Everyone"

PostScript 18,429 2,233 Updated Nov 13, 2024

Reference implementation for DPO (Direct Preference Optimization)

Python 2,344 195 Updated Aug 11, 2024
Jupyter Notebook 69 6 Updated Oct 17, 2024

This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."

MATLAB 4,491 586 Updated Dec 26, 2024

CLIP (Contrastive Language-Image Pretraining), Predict the most relevant text snippet given an image

Jupyter Notebook 27,157 3,419 Updated Jul 23, 2024

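《代码随想录》 LeetCode practice guide: a recommended order for working through 200 classic problems, 600,000 characters of detailed illustrated explanations, video walkthroughs of the difficult points, more than 50 mind maps, and versions in C++, Java, Python, Go, JavaScript, and other languages, so that learning algorithms is no longer confusing! 🔥🔥 Take a look, and you'll wish you had found it sooner! 🚀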

Shell 53,551 11,747 Updated Jan 27, 2025

🔥LeetCode solutions in any programming language | Solutions in multiple programming languages to LeetCode, 《剑指 Offer》 (2nd edition), and 《程序员面试金典》 (Cracking the Coding Interview, 6th edition)

Java 32,794 8,247 Updated Feb 1, 2025

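Provides a practical interactive interface for large language models (LLMs) such as GPT and GLM, specially optimized for reading, polishing, and writing papers. Modular design with support for custom shortcut buttons and function plugins; project analysis and self-explanation for Python, C++, and other codebases; PDF/LaTeX paper translation and summarization; parallel queries to multiple LLMs; and local models such as chatglm3. Integrates 通义千问 (Tongyi Qianwen), deepseekcoder, 讯飞星火 (iFLYTEK Spark), 文心一言 (ERNIE Bot), llama2, rwkv, claude2, m…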

Python 67,179 8,245 Updated Jan 29, 2025

DSAC-v2; DSAC-T; DASC; Distributional Soft Actor-Critic

Python 282 33 Updated Nov 7, 2024

A large-scale 7B pretraining language model developed by BaiChuan-Inc.

Python 5,680 507 Updated Jul 18, 2024

a lightweight LLM model inference framework

C++ 714 86 Updated Apr 7, 2024

Chinese LLaMA & Alpaca large language models, plus local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)

Python 18,664 1,883 Updated Apr 30, 2024

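PPO x Family DRL Tutorial Course (an introductory open course on decision intelligence: 8 lessons that help you sort out the algorithm theory, untangle the code logic, and put decision AI into practice)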

Python 2,060 180 Updated May 15, 2024

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Python 11,199 706 Updated Dec 17, 2024

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python 17,119 1,709 Updated Jan 29, 2025

Instruct-tune LLaMA on consumer hardware

Jupyter Notebook 18,786 2,224 Updated Jul 29, 2024

Implementation of Muse: Text-to-Image Generation via Masked Generative Transformers, in Pytorch

Python 886 81 Updated Feb 29, 2024

A latent text-to-image diffusion model

Jupyter Notebook 69,341 10,286 Updated Jun 18, 2024

Implementation of Imagen, Google's Text-to-Image Neural Network, in Pytorch

Python 8,163 773 Updated Oct 7, 2024

Implementation of DALL-E 2, OpenAI's updated text-to-image synthesis neural network, in Pytorch

Python 11,206 1,088 Updated May 11, 2024

Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.

Python 35,121 5,971 Updated Feb 1, 2025

A playbook for systematically maximizing the performance of deep learning models.

27,920 2,300 Updated Jun 18, 2024

OpenAI Baselines: high-quality implementations of reinforcement learning algorithms

Python 15,982 4,882 Updated Aug 1, 2024

An educational resource to help anyone learn deep reinforcement learning.

Python 10,421 2,266 Updated Aug 5, 2024

FinRL: Financial Reinforcement Learning. 🔥

Jupyter Notebook 10,536 2,513 Updated Jan 29, 2025

An artificial intelligence platform for the StarCraft II with large-scale distributed training and grand-master agents.

Python 1,248 117 Updated May 6, 2024