add guide to prepare vicuna

xgd · Apr 18, 2023 · 307f0ee · 307f0ee
1 parent 8718057
commit 307f0ee

Showing 2 changed files with 33 additions and 2 deletions.
diff --git a/PrepareVicuna.md b/PrepareVicuna.md
@@ -0,0 +1,30 @@
+## How to Prepare the Vicuna Weights
+Vicuna is an open-source, LLaMA-based LLM whose performance is reported to be close to that of ChatGPT.
+We currently use the v0 version of Vicuna-13B.
+
+To prepare the Vicuna weights, first download Vicuna's **delta** weights from [https://huggingface.co/lmsys/vicuna-13b-delta-v0](https://huggingface.co/lmsys/vicuna-13b-delta-v0). If you have git-lfs installed (https://git-lfs.com), you can do this with:
+
+```
+git lfs install
+git clone https://huggingface.co/lmsys/vicuna-13b-delta-v0
+```
+
+Note that these are not the working weights themselves, but the difference between the working weights and the original LLaMA-13B weights. (Due to LLaMA's license rules, we cannot distribute the LLaMA weights.)
+
+Next, obtain the original LLaMA-13B weights in the HuggingFace format, either by following the instructions provided by HuggingFace [here](https://huggingface.co/docs/transformers/main/model_doc/llama) or from another source on the Internet.
+
+Once both sets of weights are ready, use the tools from the Vicuna team to create the working weights.
+First, install the version of their library that is compatible with Vicuna v0:
+
+```
+pip install git+https://github.com/huggingface/[email protected]
+```
+
+Then run the following command to create the final working weights:
+
+```
+python -m fastchat.model.apply_delta --base /path/to/llama-13b-hf/  --target /path/to/save/working/vicuna/weight/  --delta /path/to/vicuna-13b-delta-v0/
+```
+
+Now you are good to go!
+
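
Note for readers of this diff: the delta weights described in PrepareVicuna.md are per-parameter differences, so the merge step is conceptually an element-wise addition of the delta to the base LLaMA weights. The block below is a minimal Python sketch of that idea only, not the actual `fastchat.model.apply_delta` implementation, which operates on full model checkpoints. Plain lists stand in for tensors, and the function name `add_delta` and the parameter name `layer0.weight` are hypothetical.

```python
# Toy illustration of delta weights: working = base + delta, per parameter.
# Plain Python lists stand in for tensors; all names here are hypothetical.

def add_delta(base, delta):
    """Return working weights by adding each delta entry to the matching base entry."""
    if base.keys() != delta.keys():
        raise ValueError("base and delta must contain the same parameter names")
    working = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise sum recovers the fine-tuned value of each weight.
        working[name] = [b + d for b, d in zip(base_values, delta_values)]
    return working


base = {"layer0.weight": [1.0, 2.0, 3.0]}     # stands in for LLaMA-13B
delta = {"layer0.weight": [0.5, -1.0, 0.25]}  # stands in for vicuna-13b-delta-v0
working = add_delta(base, delta)
print(working)  # {'layer0.weight': [1.5, 1.0, 3.25]}
```

Because the delta alone carries only differences, it is not usable without the matching base weights, which is why both downloads are needed before running the merge command.
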
diff --git a/README.md b/README.md
@@ -53,7 +53,8 @@ conda activate minigpt4
 **2. Prepare the pretrained Vicuna weights**
 
 The current version of MiniGPT-4 is built on the v0 versoin of Vicuna-13B.
-Please refer to their instructions [here](https://huggingface.co/lmsys/vicuna-13b-delta-v0) to obtaining the weights.
+Please refer to our instructions [here](PrepareVicuna.md)
+to prepare the Vicuna weights.
 The final weights would be in a single folder with the following structure:
 
 ```
@@ -105,7 +106,7 @@ You can change the save path in the config file
 torchrun --nproc-per-node NUM_GPU train.py --cfg-path train_configs/minigpt4_stage1_pretrain.yaml
 ```
 
-**1. Second finetuning stage**
+**2. Second finetuning stage**
 
 In the second stage, we use a small high quality image-text pair dataset created by ourselves
 and convert it to a conversation format to further align MiniGPT-4.