Commit

update readme

LZY-the-boys committed Jun 2, 2023
1 parent bb8ecc5 commit f6aa3ac
Showing 5 changed files with 11 additions and 9 deletions.
7 changes: 4 additions & 3 deletions README.md
@@ -17,7 +17,8 @@ The advantages of our solution are high parameter efficiency, graphics card frie
- Llama-13B instruction tuning is possible on a 3090 (24G) ([13b-instruct](https://huggingface.co/Chinese-Vicuna/Chinese-Vicuna-lora-13b-belle-and-guanaco))
- Llama 7B can be fine-tuned on a 3090 even for conversations of length 2048; about 50,000 training samples are enough for good results ([chatv1](https://huggingface.co/Chinese-Vicuna/Chinese-Vicuna-lora-7b-chatv1))
- Llama 7B fine-tuning example on [medical](https://huggingface.co/Chinese-Vicuna/Chinese-Vicuna-continue-finetune-7epoch-cMedQA2) and [legal](https://huggingface.co/Chinese-Vicuna/Chinese-Vicuna-7b-legal-lora) domains
- Easily deployable on 2080Ti/3090
- Support `qlora-4bit`, which makes it possible to train Llama 13B on a 2080Ti.
- Easily deployable on 2080Ti/3090; supports multi-GPU inference, which further reduces per-card VRAM usage.

The repo contains:
- code for fine-tuning the model
@@ -40,8 +41,8 @@ Before asking questions, take a look at this [FAQ](https://github.com/Facico/Chi

## What's New

- June, 1, 2023: support for 4bit training + inference, providing a multi-GPU inference interface (the environment is different from the original 8bit! Also provides test_tokenizers.py to check EOS token)
- **May 17, 2023: Llama 7B fine-tuning example on [legal](https://huggingface.co/Chinese-Vicuna/Chinese-Vicuna-7b-legal-lora) domains, The performance is in [here](https://github.com/Facico/Chinese-Vicuna/blob/master/docs/performance-chatv1-legal.md)**
- **June 1, 2023: support for 4-bit training + inference, with a multi-GPU inference interface (NOTE: the environment is different from the original 8-bit one! Also provides test_tokenizers.py to further check the EOS token)**
- May 17, 2023: Llama 7B fine-tuning example on the [legal](https://huggingface.co/Chinese-Vicuna/Chinese-Vicuna-7b-legal-lora) domain; performance is reported [here](https://github.com/Facico/Chinese-Vicuna/blob/master/docs/performance-chatv1-legal.md)
- May 10, 2023: Released [chatv1](https://huggingface.co/Chinese-Vicuna/Chinese-Vicuna-lora-7b-chatv1), which has better conversational ability; performance is reported [here](https://github.com/Facico/Chinese-Vicuna/blob/master/docs/performance-chatv1.md)
- May 10, 2023: Released [instruct_chat_50k.jsonl](https://huggingface.co/datasets/Chinese-Vicuna/instruct_chat_50k.jsonl), composed of 30k Chinese ShareGPT conversations and 20k samples from the [alpaca-instruction-Chinese-dataset](https://github.com/hikariming/alpaca_chinese_dataset)
- April 11, 2023: Released our continued fine-tuning on a vertical corpus of Chinese medical Q&A, [Chinese-Vicuna-medical](https://github.com/Facico/Chinese-Vicuna/blob/master/docs/performance-medical.md), providing examples of vertical-domain corpus training
7 changes: 4 additions & 3 deletions docs/readme-zh.md
@@ -15,7 +15,8 @@
- Instruction tuning of Llama-13B is possible on a single 3090 (24G) ([13b-instruct](https://huggingface.co/Chinese-Vicuna/Chinese-Vicuna-lora-13b-belle-and-guanaco))
- Even for conversations of length 2048, Llama-7B can be fine-tuned on a 3090; about 50,000 samples are enough for good results ([chatv1](https://huggingface.co/Chinese-Vicuna/Chinese-Vicuna-lora-7b-chatv1))
- Domain fine-tuning examples: medical Q&A and legal Q&A ([medical](https://huggingface.co/Chinese-Vicuna/Chinese-Vicuna-continue-finetune-7epoch-cMedQA2) and [legal](https://huggingface.co/Chinese-Vicuna/Chinese-Vicuna-7b-legal-lora))
- Easily deployable on 2080Ti/3090
- Supports `qlora-4bit`, with which a 13B model can be trained on a 2080Ti
- Easily deployable on 2080Ti/3090; supports multi-GPU inference, which further reduces VRAM usage

The project includes

@@ -36,8 +37,8 @@ https://user-images.githubusercontent.com/72137647/229739363-1b48f3a9-02a1-46ab-
Before asking a question, be sure to read this [FAQ](https://github.com/Facico/Chinese-Vicuna/blob/master/docs/notes.md) first; it covers most of the common problems.

## What's New
- June 1, 2023: support for 4-bit training + inference, with a multi-GPU inference interface (the environment is different from the original 8-bit one! test_tokenizers.py is also provided to check whether the EOS token behaves correctly)
- **May 17, 2023: released the legal Q&A model [legal](https://huggingface.co/Chinese-Vicuna/Chinese-Vicuna-7b-legal-lora); see its performance [here](https://github.com/Facico/Chinese-Vicuna/blob/master/docs/performance-chatv1-legal.md)**
- **June 1, 2023: support for 4-bit training + inference, with a multi-GPU inference interface (note that this requires an environment different from the original 8-bit one! We recommend installing a new conda environment from `requirement_4bit.txt`. test_tokenizers.py is also provided to check whether the EOS token behaves correctly)**
- May 17, 2023: released the legal Q&A model [legal](https://huggingface.co/Chinese-Vicuna/Chinese-Vicuna-7b-legal-lora); see its performance [here](https://github.com/Facico/Chinese-Vicuna/blob/master/docs/performance-chatv1-legal.md)
- May 10, 2023: released [chatv1](https://huggingface.co/Chinese-Vicuna/Chinese-Vicuna-lora-7b-chatv1), which has better conversational ability; see its performance [here](https://github.com/Facico/Chinese-Vicuna/blob/master/docs/performance-chatv1.md)
- May 10, 2023: released the fine-tuning data for the model above, [instruct_chat_50k.jsonl](https://huggingface.co/datasets/Chinese-Vicuna/instruct_chat_50k.jsonl): 30k Chinese ShareGPT conversations plus 20k samples from the [alpaca-instruction-Chinese-dataset](https://github.com/hikariming/alpaca_chinese_dataset)
- March 23, 2023: released checkpoint-4000, trained on 500k samples of belle+guanaco data
2 changes: 1 addition & 1 deletion scripts/finetune_4bit.sh
@@ -3,7 +3,7 @@ CUDAs=(${TOT_CUDA//,/ })
CUDA_NUM=${#CUDAs[@]}
PORT="12345"

DATA_PATH="/home/cciip/private/fanchenghao/dataset/instruction/merge.json" #"sample/instruct/data_sample.jsonl" #"../dataset/instruction/guanaco_non_chat_mini_52K-utf8.json" #"./sample/merge_sample.json"
DATA_PATH="sample/instruct/data_sample.jsonl"
OUTPUT_PATH="lora-Vicuna"
MODEL_PATH="/model/yahma_llama_7b"
lora_checkpoint="./lora-Vicuna/checkpoint-11600"
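The hunk above relies on a bash parameter-expansion idiom: `${TOT_CUDA//,/ }` replaces every comma in the GPU list with a space, and wrapping the result in `( ... )` builds an array whose length is the GPU count the launcher needs. A standalone sketch of that idiom (the GPU ids are illustrative):

```shell
#!/bin/bash
TOT_CUDA="0,1,2,3"            # comma-separated GPU ids, as in the script

# ${VAR//pattern/replacement} substitutes every match, so this
# turns "0,1,2,3" into "0 1 2 3"; ( ... ) word-splits it into an array
CUDAs=(${TOT_CUDA//,/ })

# ${#CUDAs[@]} is the array length, i.e. the number of GPUs
CUDA_NUM=${#CUDAs[@]}

echo "launching $CUDA_NUM processes on GPUs: ${CUDAs[*]}"
# prints: launching 4 processes on GPUs: 0 1 2 3
```

Setting `TOT_CUDA="0"` makes the same script fall back to single-GPU training with no other changes.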
2 changes: 1 addition & 1 deletion scripts/generate_4bit.sh
@@ -1,6 +1,6 @@
TOT_CUDA="0,1,2,3" #Upgrade bitsandbytes to the latest version to enable balanced loading of multiple GPUs, for example: pip install bitsandbytes==0.39.0
BASE_MODEL="decapoda-research/llama-7b-hf"
LORA_PATH="/home/cciip/private/fanchenghao/branch/Chinese-Vicuna/lora-Vicuna/checkpoint-16200" #"Chinese-Vicuna/Chinese-Vicuna-lora-7b-belle-and-guanaco" #"./lora-Vicuna/checkpoint-final"
LORA_PATH="./lora-Vicuna/checkpoint-16200" #"Chinese-Vicuna/Chinese-Vicuna-lora-7b-belle-and-guanaco" #"./lora-Vicuna/checkpoint-final"
USE_LOCAL=1 # 1: use local model, 0: use huggingface model
TYPE_WRITER=1 # whether to stream the output token by token
if [[ USE_LOCAL -eq 1 ]]
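The `USE_LOCAL` flag above switches `LORA_PATH` between a local checkpoint directory and a HuggingFace hub repo id. A minimal sketch of that branch, using the two paths shown in the script (note that inside `[[ ... ]]` with `-eq`, a bare variable name is evaluated arithmetically, which is why the script can write `USE_LOCAL` without a `$`):

```shell
#!/bin/bash
USE_LOCAL=1   # 1: use local model, 0: use huggingface model

if [[ USE_LOCAL -eq 1 ]]; then
  # local checkpoint directory produced by fine-tuning
  LORA_PATH="./lora-Vicuna/checkpoint-16200"
else
  # repo id downloaded from the HuggingFace hub at load time
  LORA_PATH="Chinese-Vicuna/Chinese-Vicuna-lora-7b-belle-and-guanaco"
fi

echo "LoRA weights: $LORA_PATH"
# prints: LoRA weights: ./lora-Vicuna/checkpoint-16200
```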
2 changes: 1 addition & 1 deletion tools/application/chitchat_finetune.py
@@ -42,7 +42,7 @@
"q_proj",
"v_proj",
]
DATA_PATH = args.data_path #"/home/cciip/private/fanchenghao/dataset/instruction/merge.json"
DATA_PATH = args.data_path
OUTPUT_DIR = args.output_path #"lora-Vicuna"

device_map = "auto"
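With `DATA_PATH` and `OUTPUT_DIR` now taken from `args` instead of a hard-coded path, the script is driven entirely by its command line. A hypothetical invocation sketch, assuming the flags are named `--data_path` and `--output_path` as the attribute accesses above suggest (the sample data path is the one used elsewhere in this commit; actually running the command needs the repo and a GPU):

```shell
#!/bin/bash
# assemble the invocation as an array; echo it rather than run it here
DATA_PATH="sample/instruct/data_sample.jsonl"
OUTPUT_PATH="lora-Vicuna"

CMD=(python tools/application/chitchat_finetune.py
     --data_path "$DATA_PATH"
     --output_path "$OUTPUT_PATH")

echo "${CMD[@]}"
# prints: python tools/application/chitchat_finetune.py --data_path sample/instruct/data_sample.jsonl --output_path lora-Vicuna
```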
