add cn readme (alibaba#5)

Co-authored-by: xianyan.xianyanjia <[email protected]>
lwmlyy · Aug 29, 2023 · 318192a
1 parent 7e9ab35
Showing 6 changed files with 113 additions and 36 deletions.
diff --git a/README.md b/README.md
@@ -0,0 +1,70 @@
+<p align="center">
+  <picture>
+    <img alt="ChatLearn" src="docs/images/logo.png" width=30%>
+  </picture>
+</p>
+
+<h3 align="center">
+A flexible, easy-to-use, and efficient framework for large-scale RLHF training.
+</h3>
+
+
+---
+
+*Latest News* 🔥
+- [2023/08] ChatLearn is officially open-sourced. For more details, see our [documentation](docs/zh/chatlearn.md).
+
+---
+
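+ChatLearn is a flexible, easy-to-use, and efficient framework for large-scale RLHF training.
+By abstracting the model computation logic, ChatLearn decouples the model from the compute backend and the distributed strategy, and provides a flexible resource scheduling mechanism that supports flexible resource allocation and parallel scheduling strategies.
+As a result:
+1. ChatLearn supports arbitrary combinations of parallel strategies, including Data Parallel, Tensor Parallel, Sequence Parallel, Pipeline Parallel, and ZeRO.
+2. ChatLearn supports flexible resource scheduling: each model can use resources exclusively or share them, and the system scheduling policy enables efficient serial or parallel execution.
+3. Users can build models with different compute backends, such as Megatron-LM and DeepSpeed.
+4. Users only need to focus on programming a single model; the system handles resource scheduling, data flow transfer, control flow transfer, distributed execution, and so on.
+5. Compared with current SOTA systems, ChatLearn is 29%-68% faster at the 7B to 30B scale. ChatLearn also supports larger-scale RLHF training (175B Policy + 175B Reward).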
+
+
+# Quick Start
+
+See the [documentation](https://chatlearn.readthedocs.io/zh/latest/) to get started quickly.
+
+1. [Environment and Code Setup](docs/zh/installation.md)
+2. [End-to-End Training Tutorial Based on the LLaMA Model](docs/zh/tutorial.md)
+
+# Supported Models
+
+ChatLearn currently supports RLHF training of GPT/LLaMA models of any size.
+
+| Model Type |
+|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| GPT (GPT-series models of various sizes) |
+| LLaMA (`lmsys/vicuna-13b-v1.3`, `decapoda-research/llama-7b-hf`, `decapoda-research/llama-13b-hf`, `decapoda-research/llama-30b-hf`, `decapoda-research/llama-65b-hf`, etc.) |
+| LLaMA2 (`meta-llama/Llama-2-7b-hf`, `meta-llama/Llama-2-13b-hf`)                                                                                                             |
+
+Note: the current performance benchmarks are all based on GPT-series models.
+
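+# Performance
+
+We compared RLHF training throughput across models of different parameter sizes, using an N+N configuration in which the Policy model and the Reward model have the same number of parameters. The tests ran on A800-80GB GPUs, with 8 GPUs per node and 800Gb RDMA interconnect between nodes. We compared ChatLearn with DeepSpeed-Chat on model configurations from 7B to 66B, with LoRA disabled and enabled. ChatLearn is 29% to 68% faster across these scales. At larger scales, DeepSpeed-Chat runs out of memory (OOM) on 30B+30B with 32 GPUs when LoRA is disabled, and on 66B+66B with 32 GPUs regardless of whether LoRA is enabled, whereas ChatLearn can train larger model configurations on the same number of machines. With seq_len=2048, DeepSpeed-Chat hit a kernel error.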
+
+![Compare ChatLearn with DeepSpeed-Chat](docs/images/gpt-perf-cmp.png)
+
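+In addition, we evaluated performance at larger scales and with different sequence length configurations. The figure below shows RLHF training performance for 66B+66B and 175B+175B.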
+
+![ChatLearn 66B 175B](docs/images/gpt-perf-66-175.png)
+
+# Roadmap
+
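+ChatLearn plans to support the following features:
+- [ ] Support more models
+- [ ] Integrate DeepSpeed as a training backend
+- [ ] Automatic tuning of parallel strategies
+- [ ] Support efficient inference engines such as vLLM
+- [ ] Support more RL algorithms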
+
+<br><br>
+We welcome community members to join us and collaborate on development.
+
+
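Reviewer note: the README lists Data/Tensor/Pipeline Parallel combinations and a 32-GPU, 8-GPUs-per-node benchmark, but does not say how the GPUs were split. The sketch below only illustrates the usual Megatron-LM-style arithmetic, where the data-parallel size is whatever remains after tensor and pipeline parallelism. The `TP` and `PP` values and the `dp_size` helper are assumptions for illustration, not settings taken from this commit.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Data-parallel size left over after tensor (tp) and pipeline (pp) parallelism.
# Hypothetical helper, not part of ChatLearn.
dp_size() {
  local world="$1" tp="$2" pp="$3"
  local model_parallel=$(( tp * pp ))
  if (( world % model_parallel != 0 )); then
    echo "world size $world is not divisible by tp*pp=$model_parallel" >&2
    return 1
  fi
  echo $(( world / model_parallel ))
}

GPUS=32          # from the benchmark description
GPUS_PER_NODE=8  # from the benchmark description
TP=8             # assumed tensor-parallel size
PP=2             # assumed pipeline-parallel size

echo "nodes: $(( GPUS / GPUS_PER_NODE ))"
echo "data-parallel size: $(dp_size "$GPUS" "$TP" "$PP")"
```

With these assumed values, the 32 GPUs span 4 nodes and leave a data-parallel size of 2; other splits can be checked the same way.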
diff --git a/docs/images/logo.png b/docs/images/logo.png
diff --git a/docs/zh/chatlearn.md b/docs/zh/chatlearn.md
@@ -1,4 +1,6 @@
-ChatLearn is a flexible, easy-to-use, and efficient framework for ultra-large-scale RLHF training.
+# ChatLearn
+
+ChatLearn is a flexible, easy-to-use, and efficient training framework that supports large-scale RLHF.
 
 # Overview
 
@@ -34,45 +36,20 @@ The ChatLearn Executor divides the RLHF training flow into two main modules, `Enviro
 
 # Quick Start
 
-## Environment Setup
-```python
-git clone https://github.com/alibaba/ChatLearn.git
-cd docker/ngc
-docker build -f Dockerfile.ngc22.10 .
-```
-
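-## ChatLearn Training Example
-The following shows how to build an end-to-end LLaMA training flow based on the LLaMA-13B model. Here `CHATLEARN` is the location of the ChatLearn project code. For data preparation and the detailed procedure, see the [ChatLearn Tutorial](https://aliyuque.antfin.com/pai/torchx/ntxclugo8l45vycf).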
-
-[Step1: SFT](https://aliyuque.antfin.com/pai/torchx/ntxclugo8l45vycf#Vj879)
-
-```python
-cd ${CHATLEARN}/examples/megatron/step1_sft/
-bash llama_sft.sh
-```
-
-[Step2: Reward](https://aliyuque.antfin.com/pai/torchx/ntxclugo8l45vycf#wIY63)
-
-```python
-cd ${CHATLEARN}/examples/megatron/step2_reward/
-bash llama_reward.sh
-```
-
-[Step3: RLHF](https://aliyuque.antfin.com/pai/torchx/ntxclugo8l45vycf#obkKT)
+See the [documentation](https://chatlearn.readthedocs.io/zh/latest/) to get started quickly.
 
-```python
-cd ${CHATLEARN}/examples/megatron/step3_rlhf/
-bash run_scripts/llama/run_13b_13b.sh
-```
+1. [Environment and Code Setup](installation.md)
+2. [End-to-End Training Tutorial Based on the LLaMA Model](tutorial.md)
 
 # Supported Models
 
 ChatLearn currently supports RLHF training of GPT/LLaMA models of any size.
 
-| Model Type |
-| :----: |
-| GPT |
-| LLaMA |
+| Model Type |
+|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| GPT (GPT-series models of various sizes) |
+| LLaMA (`lmsys/vicuna-13b-v1.3`, `decapoda-research/llama-7b-hf`, `decapoda-research/llama-13b-hf`, `decapoda-research/llama-30b-hf`, `decapoda-research/llama-65b-hf`, etc.) |
+| LLaMA2 (`meta-llama/Llama-2-7b-hf`, `meta-llama/Llama-2-13b-hf`)                                                                                                             |
 
 Note: the current performance benchmarks are all based on GPT-series models.
 
@@ -94,7 +71,7 @@ ChatLearn plans to support the following features:
 - [ ] Support efficient inference engines such as vLLM
 - [ ] Support more RL algorithms
 
-<br></br>
+<br><br>
 We welcome community members to join us and collaborate on development.
 
 # Reference

diff --git a/docs/zh/index.rst b/docs/zh/index.rst
@@ -4,13 +4,22 @@ ChatLearn Documentation
 
 .. toctree::
    :maxdepth: 1
-   :caption: ChatLearn: An Efficient Framework for Ultra-Large-Scale RLHF Training
+   :caption: ChatLearn: An Efficient Framework for Large-Scale RLHF Training
 
    chatlearn
 
 |
 |
 
+.. toctree::
+   :maxdepth: 1
+   :caption: Installation
+
+   installation
+
+|
+|
+
 .. toctree::
    :maxdepth: 1
    :caption: Tutorial

diff --git a/docs/zh/installation.md b/docs/zh/installation.md
@@ -0,0 +1,19 @@
+# Environment and Code Setup
+
+1. Image preparation
+
+We recommend building the image from `https://github.com/alibaba/ChatLearn/tree/master/docker/ngc/Dockerfile.ngc22.10`.
+
+2. Code preparation: download the `ChatLearn` framework code.
+
+```
+# Download the ChatLearn code
+git clone https://github.com/alibaba/ChatLearn.git
+```
+
+3. If you need to run RLHF training based on the Megatron-LM framework, also download the `Megatron-LM-extension` code, which has been modified to support ChatLearn training.
+
+```
+# Download Megatron-LM-extension
+git clone -b chatlearn-2308 https://github.com/alibaba/Megatron-LM-extension.git
+```
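Reviewer note: the two clone steps added in `installation.md` could be wrapped in a small re-runnable script. The following is a minimal sketch, not part of this commit; `WORKDIR`, `DRY_RUN`, and the `clone_once` helper are hypothetical names introduced for illustration. Only the repository URLs and the `chatlearn-2308` branch come from the diff above.

```shell
#!/usr/bin/env bash
set -euo pipefail

WORKDIR="${WORKDIR:-$PWD}"  # hypothetical checkout location
DRY_RUN="${DRY_RUN:-1}"     # 1 = only print the git commands; 0 = run them

# Clone a repository into WORKDIR unless its directory already exists.
clone_once() {
  local url="$1" branch="${2:-}"
  local dir
  dir="$WORKDIR/$(basename "$url" .git)"
  if [ -d "$dir" ]; then
    echo "skip: $dir already exists"
    return 0
  fi
  local cmd=(git clone)
  if [ -n "$branch" ]; then
    cmd+=(-b "$branch")
  fi
  cmd+=("$url" "$dir")
  if [ "$DRY_RUN" = "1" ]; then
    echo "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

# Step 2: ChatLearn framework code.
clone_once https://github.com/alibaba/ChatLearn.git
# Step 3: Megatron-LM-extension, only needed for Megatron-LM-based RLHF training.
clone_once https://github.com/alibaba/Megatron-LM-extension.git chatlearn-2308
```

By default the script only prints the commands it would run; setting `DRY_RUN=0` performs the clones, and re-running it skips repositories that are already present.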
diff --git a/docs/zh/tutorial.md b/docs/zh/tutorial.md
@@ -1,3 +1,5 @@
+# End-to-End Training Tutorial Based on the LLaMA Model
+
 This document describes the training flow based on ChatLearn, the Megatron-LM framework, and LLaMA models. It covers three training stages: SFT, Reward, and RLHF.
 
 # Setup: Image and Code Preparation