Commit

fixed docs
Signed-off-by: ftgreat <[email protected]>
ftgreat committed Jun 10, 2023
1 parent d4bb7a0 commit b0d164a
Showing 3 changed files with 9 additions and 7 deletions.
7 changes: 4 additions & 3 deletions examples/Aquila/Aquila-pretrain/README.md
```diff
@@ -15,11 +15,12 @@ We also support [Huggingface](https://huggingface.co/BAAI).
 
 | Model | State | Commercial use? | GPU |
 | :---------------- | :------- | :-- |:-- |
-| <font color=red>Aquila-7B </font> | Released || Nvidia-A100 |
-| <font color=red>Aquila-30B </font> | Coming soon | | Nvidia-A100 |
+| Aquila-7B | Released || Nvidia-A100 |
-| AquilaChat-7B |Released | | Nvidia-A100 |
 | AquilaCode-7B-NV |Released || Nvidia-A100 |
 | AquilaCode-7B-TS |Released || Tianshu-BI-V100 |
+| AquilaChat-7B |Released || Nvidia-A100 |
+| Aquila-33B | **Coming soon** || Nvidia-A100 |
+| AquilaChat-33B |**Coming soon** || Nvidia-A100 |
 
 We use a series of more efficient low-level operators to speed up model training, including an approach based on [flash-attention](https://github.com/HazyResearch/flash-attention) with some intermediate computations replaced, together with RMSNorm. On top of this, we upgraded the [BMtrain](https://github.com/OpenBMB/BMTrain) toolkit for lightweight parallel training, which applies data parallelism, ZeRO (the zero-redundancy optimizer), optimizer offloading, checkpointing, operator fusion, and communication-computation overlap to optimize the training process.
```
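The diffed paragraph mentions RMSNorm as one of the efficient operators. As a rough illustration only (a minimal NumPy sketch, not FlagAI's actual implementation), RMSNorm scales activations by their root-mean-square and, unlike LayerNorm, subtracts no mean and adds no bias, saving one reduction per call:

```python
import numpy as np

def rms_norm(x: np.ndarray, weight: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    # Normalize by the root-mean-square over the last axis; the learned
    # per-feature `weight` rescales the result. No mean subtraction, no bias.
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
    return weight * (x / rms)
```

With `weight` set to ones, the output's per-row RMS is approximately 1 regardless of the input scale.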

7 changes: 4 additions & 3 deletions examples/Aquila/README.md
```diff
@@ -14,11 +14,12 @@ We also support [Huggingface](https://huggingface.co/BAAI).
 
 | Model | State | Commercial use? | GPU |
 | :---------------- | :------- | :-- |:-- |
-| <font color=red>Aquila-7B </font> | Released || Nvidia-A100 |
-| <font color=red>Aquila-33B </font> | Coming soon | | Nvidia-A100 |
+| Aquila-7B | Released || Nvidia-A100 |
-| AquilaChat-7B |Released | | Nvidia-A100 |
 | AquilaCode-7B-NV |Released || Nvidia-A100 |
 | AquilaCode-7B-TS |Released || Tianshu-BI-V100 |
+| AquilaChat-7B |Released || Nvidia-A100 |
+| Aquila-33B | **Coming soon** || Nvidia-A100 |
+| AquilaChat-33B |**Coming soon** || Nvidia-A100 |
 
 We use a series of more efficient low-level operators to speed up model training, including an approach based on [flash-attention](https://github.com/HazyResearch/flash-attention) with some intermediate computations replaced, together with RMSNorm. On top of this, we upgraded the [BMtrain](https://github.com/OpenBMB/BMTrain) toolkit for lightweight parallel training, which applies data parallelism, ZeRO (the zero-redundancy optimizer), optimizer offloading, checkpointing, operator fusion, and communication-computation overlap to optimize the training process.
```
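The paragraph above credits part of the speedup to flash-attention-style replacement of intermediate computations. The core trick can be illustrated with an online-softmax sketch: a toy single-query NumPy version (not the fused CUDA kernel used in practice) that streams over key/value blocks, keeping only a running max, normalizer, and weighted value sum, so the full logit vector is never materialized:

```python
import numpy as np

def softmax_attention_ref(q, k, v):
    # Reference: materialize all logits, softmax, then weight the values.
    logits = k @ q
    p = np.exp(logits - logits.max())
    return (p / p.sum()) @ v

def online_softmax_attention(q, k, v, block=2):
    # Streaming version: m is the running max of seen logits, s the running
    # normalizer, acc the running exp-weighted sum of value rows.
    m, s = -np.inf, 0.0
    acc = np.zeros(v.shape[-1])
    for start in range(0, k.shape[0], block):
        kb, vb = k[start:start + block], v[start:start + block]
        logits = kb @ q
        m_new = max(m, logits.max())
        scale = np.exp(m - m_new)      # rescale earlier partial sums
        p = np.exp(logits - m_new)
        s = s * scale + p.sum()
        acc = acc * scale + p @ vb
        m = m_new
    return acc / s
```

Both functions return the same attention output; the streaming form is what makes the memory footprint independent of sequence length.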

2 changes: 1 addition & 1 deletion setup.py
```diff
@@ -26,7 +26,7 @@
 'tensorboard==2.9.0',
 'transformers==4.20.1',
 'datasets==2.0.0',
-'setuptools==59.5.0',
+'setuptools==66.0.0',
 'protobuf==3.19.6',
 'ftfy == 6.1.1',
 'Pillow >= 9.3.0',
```
