
add checkpoint for vicuna 7b
TsuTikgiau committed Apr 20, 2023
1 parent 6daf123 commit 446ede2
Showing 2 changed files with 19 additions and 10 deletions.
PrepareVicuna.md (13 changes: 9 additions & 4 deletions)
Vicuna is an open-source LLAMA-based LLM with performance close to ChatGPT.
We currently use the v0 version of Vicuna-13B.

To prepare Vicuna's weights, first download Vicuna's **delta** weights from [https://huggingface.co/lmsys/vicuna-13b-delta-v0](https://huggingface.co/lmsys/vicuna-13b-delta-v0).
If you have git-lfs installed (https://git-lfs.com), you can do this with

```
git lfs install
git clone https://huggingface.co/lmsys/vicuna-13b-delta-v0  # larger model, needs at least 24G of GPU memory
# or
git clone https://huggingface.co/lmsys/vicuna-7b-delta-v0   # smaller model, needs about 12G of GPU memory
```
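
If git-lfs is not an option, the same repositories can be fetched with the `huggingface_hub` Python package. A minimal sketch, assuming huggingface_hub 0.14+ is installed (the `local_dir` argument requires it):

```python
# Sketch: download the Vicuna delta weights without git-lfs.
# Assumes `pip install huggingface_hub` (0.14+ for the local_dir argument).
from huggingface_hub import snapshot_download

# Pick the repo that matches the model size you want to prepare.
snapshot_download(
    repo_id="lmsys/vicuna-7b-delta-v0",
    local_dir="vicuna-7b-delta-v0",
)
```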

Note that this is not the working weight itself, but the difference between the working weight and the original LLAMA weight. (Due to LLAMA's license, we cannot distribute LLAMA's weights.)
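
Conceptually, applying a delta is an elementwise addition of parameter tensors. The sketch below is illustrative only: it ignores the checkpoint sharding and tokenizer/vocabulary-size changes that the real tool, `fastchat.model.apply_delta` (used below), takes care of, and all file paths are hypothetical:

```python
import torch

# Illustration only: recover the working weights as base + delta, tensor by
# tensor. Real Vicuna checkpoints are sharded and add tokens to the embedding
# matrix, which fastchat.model.apply_delta (used below) handles for you.
base = torch.load("llama-13b-hf/pytorch_model.bin", map_location="cpu")
delta = torch.load("vicuna-13b-delta-v0/pytorch_model.bin", map_location="cpu")

working = {name: base[name] + delta[name] for name in delta}
torch.save(working, "vicuna-13b-working/pytorch_model.bin")
```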

Then, you need to obtain the original LLAMA-7B or LLAMA-13B weights in the HuggingFace format,
either by following the instructions provided by HuggingFace
[here](https://huggingface.co/docs/transformers/main/model_doc/llama) or from the Internet.

When these two weights are ready, we can use tools from Vicuna’s team to create the real working weight.
First, install their library that is compatible with v0 Vicuna:

```
pip install git+https://github.com/lm-sys/[email protected]
```

Then, run the following command to create the final working weight

```
python -m fastchat.model.apply_delta \
    --base /path/to/llama-13bOR7b-hf/ \
    --target /path/to/save/working/vicuna/weight/ \
    --delta /path/to/vicuna-13bOR7b-delta-v0/
```
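
Optionally, you can verify that the merged weights load cleanly before moving on. A minimal sketch with `transformers`, reusing the placeholder output path from the command above (loading the 13B model in fp16 still needs roughly 26G of CPU RAM):

```python
# Optional sanity check: confirm the merged Vicuna weights load without errors.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "/path/to/save/working/vicuna/weight/"  # placeholder from above
tokenizer = AutoTokenizer.from_pretrained(path, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.float16)
print(f"Loaded {model.num_parameters():,} parameters")
```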

Now you are good to go!
README.md (16 changes: 10 additions & 6 deletions)
Then, set the path to the vicuna weight in the model config file.

**3. Prepare the pretrained MiniGPT-4 checkpoint**

Download the pretrained checkpoint that matches the Vicuna model you prepared.

| Checkpoint Aligned with Vicuna 13B | Checkpoint Aligned with Vicuna 7B |
|:----------------------------------:|:---------------------------------:|
| [Download](https://drive.google.com/file/d/1a4zLvaiDBr-36pasffmgpvH5P7CKmpze/view?usp=share_link) | [Download](https://drive.google.com/file/d/1RY9jV0dyqLX-o38LrumkKRh6Jtaop58R/view?usp=sharing) |


Then, set the path to the pretrained checkpoint in the evaluation config file
in [eval_configs/minigpt4_eval.yaml](eval_configs/minigpt4_eval.yaml#L10) at Line 11.
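
If you prefer to script this step, a small sketch using PyYAML is below. The `model.ckpt` key name is an assumption about the config's layout, so verify it against your copy of the file; note that round-tripping the YAML this way drops comments:

```python
# Sketch: set the checkpoint path in the eval config programmatically.
# Assumes PyYAML, and that the path lives under model.ckpt (key name assumed;
# check your copy of eval_configs/minigpt4_eval.yaml).
import yaml

cfg_path = "eval_configs/minigpt4_eval.yaml"
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

cfg["model"]["ckpt"] = "/path/to/pretrained_checkpoint.pth"  # placeholder

with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f)  # rewriting drops comments and ordering
```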

Try out our demo [demo.py](demo.py) on your local machine by running

```
python demo.py --cfg-path eval_configs/minigpt4_eval.yaml --gpu-id 0
```

To save GPU memory, Vicuna is loaded in 8-bit by default, with a beam search width of 1.
This configuration requires about 23G of GPU memory for Vicuna 13B and 11.5G for Vicuna 7B.
If you have a more powerful GPU, you can run the model in 16-bit by setting `low_resource`
to `False` in the config file [minigpt4_eval.yaml](eval_configs/minigpt4_eval.yaml)
and using a larger beam search width.
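
For intuition, 8-bit weights take roughly half the memory of 16-bit weights. Below is a minimal sketch of the two loading modes with `transformers`, approximating what `low_resource` toggles rather than reproducing MiniGPT-4's exact loading code; it assumes `bitsandbytes` is installed, and the model path is a placeholder:

```python
# Sketch: 8-bit vs 16-bit loading with transformers (assumes bitsandbytes).
# This approximates what low_resource toggles; it is not MiniGPT-4's own code.
import torch
from transformers import AutoModelForCausalLM

vicuna_path = "/path/to/save/working/vicuna/weight/"  # placeholder

# low_resource: True  -> 8-bit weights, everything placed on GPU 0
model = AutoModelForCausalLM.from_pretrained(
    vicuna_path, load_in_8bit=True, device_map={"": 0}
)

# low_resource: False -> 16-bit weights (uncomment to use instead)
# model = AutoModelForCausalLM.from_pretrained(
#     vicuna_path, torch_dtype=torch.float16
# )
```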

Expand Down
