Use the CLI in the tutorials (Lightning-AI#1094)
carmocca authored and awaelchli committed Mar 15, 2024
1 parent a7f41ef commit b540bbf
Showing 13 changed files with 90 additions and 90 deletions.
10 changes: 5 additions & 5 deletions README.md
@@ -102,7 +102,7 @@ To generate text predictions, you need to download the model weights. **If you d
Run inference:

```bash
-python litgpt/generate/base.py --prompt "Hello, my name is"
+litgpt generate base --prompt "Hello, my name is"
```

This will run the 3B pretrained model and require ~7 GB of GPU memory using the `bfloat16` datatype.
@@ -112,7 +112,7 @@ This will run the 3B pretrained model and require ~7 GB of GPU memory using the
You can also chat with the model interactively:

```bash
-python litgpt/chat/base.py
+litgpt chat
```

 
@@ -131,19 +131,19 @@ For example, you can either use
Adapter ([Zhang et al. 2023](https://arxiv.org/abs/2303.16199)):

```bash
-python litgpt/finetune/adapter.py
+litgpt finetune adapter
```

or Adapter v2 ([Gao et al. 2023](https://arxiv.org/abs/2304.15010)):

```bash
-python litgpt/finetune/adapter_v2.py
+litgpt finetune adapter_v2
```

or LoRA ([Hu et al. 2021](https://arxiv.org/abs/2106.09685)):

```bash
-python litgpt/finetune/lora.py
+litgpt finetune lora
```

(Please see the [tutorials/finetune_adapter](tutorials/finetune_adapter.md) for details on the differences between the two adapter methods.)
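
As a quick orientation for the commands above: the new subcommands accept the same flags the old scripts did, so they compose with options such as `--data` and `--checkpoint_dir` in the way the tutorial diffs below show. A minimal sketch, borrowing the Alpaca data module and StableLM checkpoint path from the finetune_adapter.md changes further down (the exact combination is illustrative):

```bash
# Illustrative only: LoRA finetuning through the new CLI, reusing the
# data module and checkpoint directory shown later in this diff.
litgpt finetune lora \
  --data Alpaca \
  --checkpoint_dir checkpoints/stabilityai/stablelm-base-alpha-3b
```
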
2 changes: 1 addition & 1 deletion litgpt/utils.py
@@ -74,7 +74,7 @@ def check_valid_checkpoint_dir(checkpoint_dir: Path, lora: bool = False) -> None
error_message = (
f"--checkpoint_dir {str(checkpoint_dir.absolute())!r}{problem}."
"\nFind download instructions at https://github.com/Lightning-AI/litgpt/blob/main/tutorials\n"
-f"{extra}\nSee all download options by running:\n python litgpt/scripts/download.py"
+f"{extra}\nSee all download options by running:\n litgpt download"
)
print(error_message, file=sys.stderr)
raise SystemExit(1)
6 changes: 3 additions & 3 deletions tests/test_utils.py
@@ -47,7 +47,7 @@ def test_check_valid_checkpoint_dir(tmp_path):
Find download instructions at https://github.com/Lightning-AI/litgpt/blob/main/tutorials
See all download options by running:
-python litgpt/scripts/download.py
+litgpt download
""".strip()
assert out == expected

@@ -61,7 +61,7 @@ def test_check_valid_checkpoint_dir(tmp_path):
Find download instructions at https://github.com/Lightning-AI/litgpt/blob/main/tutorials
See all download options by running:
-python litgpt/scripts/download.py
+litgpt download
""".strip()
assert out == expected

@@ -79,7 +79,7 @@ def test_check_valid_checkpoint_dir(tmp_path):
--checkpoint_dir '{str(checkpoint_dir.absolute())}'
See all download options by running:
-python litgpt/scripts/download.py
+litgpt download
""".strip()
assert out == expected

6 changes: 3 additions & 3 deletions tutorials/convert_hf_checkpoint.md
@@ -3,7 +3,7 @@
By default, the `litgpt/scripts/download.py` script converts the downloaded HF checkpoint files into a LitGPT-compatible format. For example,

```bash
-python litgpt/scripts/download.py --repo_id EleutherAI/pythia-14m
+litgpt download --repo_id EleutherAI/pythia-14m
```

creates the following files:
@@ -28,7 +28,7 @@ To disable the automatic conversion, which is useful for development and debuggi
```bash
rm -rf checkpoints/EleutherAI/pythia-14m

-python litgpt/scripts/download.py \
+litgpt download \
--repo_id EleutherAI/pythia-14m \
--convert_checkpoint false

@@ -49,7 +49,7 @@ ls checkpoints/EleutherAI/pythia-14m
The required `lit_config.json` and `lit_model.pth` files can then be manually generated via the `litgpt/scripts/convert_hf_checkpoint.py` script:

```bash
-python litgpt/scripts/convert_hf_checkpoint.py \
+litgpt convert to_litgpt \
--checkpoint_dir checkpoints/EleutherAI/pythia-14m
```
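
For reference, the download-without-conversion step and the manual conversion step above chain together; a minimal sketch, assuming the default `checkpoints/` output location used throughout this tutorial:

```bash
# Download the raw HF files, skipping the automatic conversion
litgpt download \
  --repo_id EleutherAI/pythia-14m \
  --convert_checkpoint false

# Then generate lit_config.json and lit_model.pth manually
litgpt convert to_litgpt \
  --checkpoint_dir checkpoints/EleutherAI/pythia-14m
```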

12 changes: 6 additions & 6 deletions tutorials/convert_lit_models.md
@@ -5,7 +5,7 @@ LitGPT weights need to be converted to a format that Hugging Face understands wi
We provide a helpful script to convert LitGPT models back to their equivalent Hugging Face Transformers format:

```sh
-python litgpt/scripts/convert_lit_checkpoint.py \
+litgpt convert from_litgpt \
--checkpoint_dir checkpoint_dir \
--output_dir converted_dir
```
@@ -47,7 +47,7 @@ model = AutoModel.from_pretrained("online_repo_id", state_dict=state_dict)
Please note that if you want to convert a model that has been fine-tuned using an adapter like LoRA, these weights should be [merged](../litgpt/scripts/merge_lora.py) into the checkpoint prior to converting.

```sh
-python litgpt/scripts/merge_lora.py \
+litgpt merge_lora \
--checkpoint_dir path/to/lora/checkpoint_dir
```

@@ -73,7 +73,7 @@ by running `litgpt/scripts/download.py` without any additional arguments.
Then, we download the model we specified via `$repo_id` above:

```bash
-python litgpt/scripts/download.py --repo_id $repo_id
+litgpt download --repo_id $repo_id
```

2. Finetune the model:
@@ -82,7 +82,7 @@
```bash
export finetuned_dir=out/lit-finetuned-model

-python litgpt/finetune/lora.py \
+litgpt finetune lora \
--checkpoint_dir checkpoints/$repo_id \
--out_dir $finetuned_dir \
--train.epochs 1 \
@@ -94,15 +94,15 @@ python litgpt/finetune/lora.py \
Note that this step only applies if the model was finetuned with `lora.py` above and not when `full.py` was used for finetuning.

```bash
-python litgpt/scripts/merge_lora.py \
+litgpt merge_lora \
--checkpoint_dir $finetuned_dir/final
```


4. Convert the finetuned model back into HF format:

```bash
-python litgpt/scripts/convert_lit_checkpoint.py \
+litgpt convert from_litgpt \
--checkpoint_dir $finetuned_dir/final/ \
--output_dir out/hf-tinyllama/converted
```
26 changes: 13 additions & 13 deletions tutorials/download_model_weights.md
@@ -11,7 +11,7 @@ LitGPT supports a variety of LLM architectures with publicly available weights.
To see all supported models, run the following command without arguments:

```bash
-python litgpt/scripts/download.py
+litgpt download
```

The output is shown below:
@@ -128,7 +128,7 @@ Trelis/Llama-2-7b-chat-hf-function-calling-v2
To download the weights for a specific model, use the `--repo_id` argument. Replace `<repo_id>` with the model's repository ID. For example:

```bash
-python litgpt/scripts/download.py --repo_id <repo_id>
+litgpt download --repo_id <repo_id>
```
This command downloads the model checkpoint into the `checkpoints/` directory.

@@ -139,7 +139,7 @@ This command downloads the model checkpoint into the `checkpoints/` directory.
For more options, add the `--help` flag when running the script:

```bash
-python litgpt/scripts/download.py --help
+litgpt download --help
```

&nbsp;
@@ -148,7 +148,7 @@ python litgpt/scripts/download.py --help
After conversion, run the model with the `--checkpoint_dir` flag, adjusting `repo_id` accordingly:

```bash
-python litgpt/chat/base.py --checkpoint_dir checkpoints/<repo_id>
+litgpt chat --checkpoint_dir checkpoints/<repo_id>
```

&nbsp;
@@ -159,7 +159,7 @@ This section shows a typical end-to-end example for downloading and using TinyLl
1. List available TinyLlama checkpoints:

```bash
-python litgpt/scripts/download.py | grep Tiny
+litgpt download | grep Tiny
```

```
@@ -171,13 +171,13 @@ TinyLlama/TinyLlama-1.1B-Chat-v1.0

```bash
export repo_id=TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T
-python litgpt/scripts/download.py --repo_id $repo_id
+litgpt download --repo_id $repo_id
```

3. Use the TinyLlama model:

```bash
-python litgpt/chat/base.py --checkpoint_dir checkpoints/$repo_id
+litgpt chat --checkpoint_dir checkpoints/$repo_id
```

&nbsp;
@@ -190,7 +190,7 @@ For example, to get access to the Gemma 2B model, you can do so by following the
Once you've been granted access and obtained the access token, you need to pass the additional `--access_token` option:

```bash
-python litgpt/scripts/download.py \
+litgpt download \
--repo_id google/gemma-2b \
--access_token your_hf_token
```
@@ -203,7 +203,7 @@ The `download.py` script will automatically convert the downloaded model checkpo


```bash
-python litgpt/scripts/download.py \
+litgpt download \
--repo_id <repo_id> \
--dtype bf16-true
```
@@ -218,15 +218,15 @@ For development purposes, for example, when adding or experimenting with new mod
You can do this by passing the `--convert_checkpoint false` option to the download script:

```bash
-python litgpt/scripts/download.py \
+litgpt download \
--repo_id <repo_id> \
--convert_checkpoint false
```

and then calling the `convert_hf_checkpoint.py` script:

```bash
-python litgpt/scripts/convert_hf_checkpoint.py \
+litgpt convert to_litgpt \
--checkpoint_dir checkpoint_dir/<repo_id>
```

@@ -236,15 +236,15 @@
In some cases we don't need the model weights, for example, when we are pretraining a model from scratch instead of finetuning it. For cases like this, you can use the `--tokenizer_only` flag to download only a model's tokenizer, which can then be used in the pretraining scripts:

```bash
-python litgpt/scripts/download.py \
+litgpt download \
--repo_id TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T \
--tokenizer_only true
```

and

```bash
-python litgpt/pretrain.py \
+litgpt pretrain \
--data ... \
--model_name tiny-llama-1.1b \
--tokenizer_dir checkpoints/TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T/
20 changes: 10 additions & 10 deletions tutorials/finetune_adapter.md
@@ -22,15 +22,15 @@ For more information about dataset preparation, also see the [prepare_dataset.md
## Running the finetuning

```bash
-python litgpt/finetune/adapter.py \
+litgpt finetune adapter \
--data Alpaca \
--checkpoint_dir checkpoints/stabilityai/stablelm-base-alpha-3b
```

or for Adapter V2

```bash
-python litgpt/finetune/adapter_v2.py \
+litgpt finetune adapter_v2 \
--data Alpaca \
--checkpoint_dir checkpoints/stabilityai/stablelm-base-alpha-3b
```
@@ -49,15 +49,15 @@ For example, the following settings will let you finetune the model in under 1 h
This script will save checkpoints periodically to the `out_dir` directory. If you are finetuning different models or on your own dataset, you can specify an output directory with your preferred name:

```bash
-python litgpt/finetune/adapter.py \
+litgpt finetune adapter \
--data Alpaca \
--out_dir out/adapter/my-model-finetuned
```

or for Adapter V2

```bash
-python litgpt/finetune/adapter_v2.py \
+litgpt finetune adapter_v2 \
--data Alpaca \
--out_dir out/adapter_v2/my-model-finetuned
```
@@ -66,7 +66,7 @@ If your GPU does not support `bfloat16`, you can pass the `--precision 32-true`
For instance, to fine-tune on MPS (the GPU on modern Macs), you can run

```bash
-python litgpt/finetune/adapter.py \
+litgpt finetune adapter \
--data Alpaca \
--out_dir out/adapter/my-model-finetuned \
--precision 32-true
@@ -79,13 +79,13 @@ Note that `mps` as the accelerator will be picked up automatically by Fabric whe
Optionally, finetuning using quantization can be enabled via the `--quantize` flag, for example using the 4-bit NormalFloat data type:

```bash
-python litgpt/finetune/adapter.py --quantize "bnb.nf4"
+litgpt finetune adapter --quantize "bnb.nf4"
```

or using adapter_v2 with double-quantization:

```bash
-python litgpt/finetune/adapter_v2.py --quantize "bnb.nf4-dq"
+litgpt finetune adapter_v2 --quantize "bnb.nf4-dq"
```

For additional benchmarks and resource requirements, please see the [Resource Tables](resource-tables.md).
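
The `--quantize` flag shown above composes with the data and checkpoint options from the earlier examples; a minimal sketch (this particular combination is illustrative and not taken verbatim from the diff):

```bash
# Illustrative: Adapter v2 finetuning with 4-bit NormalFloat double quantization
litgpt finetune adapter_v2 \
  --data Alpaca \
  --checkpoint_dir checkpoints/stabilityai/stablelm-base-alpha-3b \
  --quantize "bnb.nf4-dq"
```
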
@@ -95,15 +95,15 @@ For additional benchmarks and resource requirements, please see the [Resource Ta
You can test the finetuned model with your own instructions by running:

```bash
-python litgpt/generate/adapter.py \
+litgpt generate adapter \
--prompt "Recommend a movie to watch on the weekend." \
--checkpoint_dir checkpoints/stabilityai/stablelm-base-alpha-3b
```

or for Adapter V2

```bash
-python litgpt/generate/adapter_v2.py \
+litgpt generate adapter_v2 \
--prompt "Recommend a movie to watch on the weekend." \
--checkpoint_dir checkpoints/stabilityai/stablelm-base-alpha-3b
```
@@ -138,7 +138,7 @@ You can easily train on your own instruction dataset saved in JSON format.
2. Run `litgpt/finetune/adapter.py` or `litgpt/finetune/adapter_v2.py` by passing in the location of your data (and optionally other parameters):
```bash
-python litgpt/finetune/adapter.py \
+litgpt finetune adapter \
--data JSON \
--data.json_path data/mydata.json \
--checkpoint_dir checkpoints/tiiuae/falcon-7b \