From 6c2f236ea45eacdc3739be29645b825118b45b84 Mon Sep 17 00:00:00 2001
From: sekyondaMeta <127536312+sekyondaMeta@users.noreply.github.com>
Date: Sat, 9 Sep 2023 18:08:26 -0400
Subject: [PATCH 1/5] Update README.md

Adding quick start steps

---
 README.md | 28 ++++++++++++++++++++++++++--
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index bba25525c..179b93116 100755
--- a/README.md
+++ b/README.md
@@ -28,11 +28,35 @@ We are also providing downloads on [Hugging Face](https://huggingface.co/meta-ll
 
 ## Setup
 
-In a conda env with PyTorch / CUDA available, clone the repo and run in the top-level directory:
+You can follow the steps below to quickly get up and running with Llama 2 models. These steps will let you run quick inference locally. For more examples, see the [Llama 2 recipes repository](https://github.com/facebookresearch/llama-recipes).
 
-```
+1. In a conda env with PyTorch / CUDA availableClone and download this repository
+
+2. In the top level directory run:
+```bash
 pip install -e .
 ```
+3. Visit the [Meta.AI website](https://ai.meta.com/resources/models-and-libraries/llama-downloads/) and register to download the model/s.
+
+4. Once registered, you will get an email with a URL to download the models. You will need this URL when you run the download.sh script.
+
+5. Navigate to your downloaded llama repository and run the download.sh script.
+    - Make sure to grant execution permissions to the download.sh script
+    - During this process, you will be prompted to enter the URL from the email.
+    - Do not use the “Copy Link” option but rather make sure to manually copy the link from the email.
+
+6. Once the model/s you want have been downloaded, you can run the model locally using the command below:
+```bash
+torchrun --nproc_per_node 1 example_chat_completion.py \
+    --ckpt_dir llama-2-7b-chat/ \
+    --tokenizer_path tokenizer.model \
+    --max_seq_len 512 --max_batch_size 6
+```
+**Note**
+- Replace `llama-2-7b-chat/` with the path to your checkpoint directory and `tokenizer.model` with the path to your tokenizer model.
+- The `--nproc_per_node` should be set to the [MP](#inference) value for the model you are using.
+- Adjust the `max_seq_len` and `max_batch_size` parameters as needed.
+- This example runs the example_chat_completion.py but you can change that to a different .py file.
 
 ## Inference

From d06e1e1c078eb4f1a9e5681e4d9ad0b6561437ab Mon Sep 17 00:00:00 2001
From: sekyondaMeta <127536312+sekyondaMeta@users.noreply.github.com>
Date: Sat, 9 Sep 2023 18:10:20 -0400
Subject: [PATCH 2/5] Update README.md

---
 README.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/README.md b/README.md
index 179b93116..8359a8da3 100755
--- a/README.md
+++ b/README.md
@@ -26,16 +26,16 @@ Keep in mind that the links expire after 24 hours and a certain amount of downlo
 
 We are also providing downloads on [Hugging Face](https://huggingface.co/meta-llama). You must first request a download from the Meta AI website using the same email address as your Hugging Face account. After doing so, you can request access to any of the models on Hugging Face and within 1-2 days your account will be granted access to all versions.
 
-## Setup
+## Quick Start
 
 You can follow the steps below to quickly get up and running with Llama 2 models. These steps will let you run quick inference locally. For more examples, see the [Llama 2 recipes repository](https://github.com/facebookresearch/llama-recipes).
 
 1. In a conda env with PyTorch / CUDA availableClone and download this repository
 
 2. In the top level directory run:
-```bash
-pip install -e .
-```
+   ```bash
+   pip install -e .
+   ```
 3. Visit the [Meta.AI website](https://ai.meta.com/resources/models-and-libraries/llama-downloads/) and register to download the model/s.
 
 4. Once registered, you will get an email with a URL to download the models. You will need this URL when you run the download.sh script.

From 001b67243c6dee52650b713f3dfacda2790048e7 Mon Sep 17 00:00:00 2001
From: sekyondaMeta <127536312+sekyondaMeta@users.noreply.github.com>
Date: Sat, 9 Sep 2023 18:11:32 -0400
Subject: [PATCH 3/5] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 8359a8da3..6326fe79d 100755
--- a/README.md
+++ b/README.md
@@ -30,7 +30,7 @@ We are also providing downloads on [Hugging Face](https://huggingface.co/meta-ll
 
 You can follow the steps below to quickly get up and running with Llama 2 models. These steps will let you run quick inference locally. For more examples, see the [Llama 2 recipes repository](https://github.com/facebookresearch/llama-recipes).
 
-1. In a conda env with PyTorch / CUDA availableClone and download this repository
+1. In a conda env with PyTorch / CUDA available clone and download this repository.
 
 2. In the top level directory run:
    ```bash

From f2e6eac348a718dc3f70b2b34e27ee6a9b9efd4f Mon Sep 17 00:00:00 2001
From: sekyondaMeta <127536312+sekyondaMeta@users.noreply.github.com>
Date: Sat, 9 Sep 2023 18:13:03 -0400
Subject: [PATCH 4/5] Update README.md

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 6326fe79d..a6498251e 100755
--- a/README.md
+++ b/README.md
@@ -36,11 +36,11 @@ You can follow the steps below to quickly get up and running with Llama 2 models
    ```bash
    pip install -e .
    ```
-3. Visit the [Meta.AI website](https://ai.meta.com/resources/models-and-libraries/llama-downloads/) and register to download the model/s.
+3. Visit the [Meta AI website](https://ai.meta.com/resources/models-and-libraries/llama-downloads/) and register to download the model/s.
 
 4. Once registered, you will get an email with a URL to download the models. You will need this URL when you run the download.sh script.
 
-5. Navigate to your downloaded llama repository and run the download.sh script.
+5. Once you get the email, navigate to your downloaded llama repository and run the download.sh script.
     - Make sure to grant execution permissions to the download.sh script
     - During this process, you will be prompted to enter the URL from the email.
     - Do not use the “Copy Link” option but rather make sure to manually copy the link from the email.

From ac19393aeb30c5cf54e53b688d10f8d603f053be Mon Sep 17 00:00:00 2001
From: sekyondaMeta <127536312+sekyondaMeta@users.noreply.github.com>
Date: Sat, 9 Sep 2023 18:15:32 -0400
Subject: [PATCH 5/5] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index a6498251e..8dfa62428 100755
--- a/README.md
+++ b/README.md
@@ -56,7 +56,7 @@ torchrun --nproc_per_node 1 example_chat_completion.py \
 - Replace `llama-2-7b-chat/` with the path to your checkpoint directory and `tokenizer.model` with the path to your tokenizer model.
 - The `--nproc_per_node` should be set to the [MP](#inference) value for the model you are using.
 - Adjust the `max_seq_len` and `max_batch_size` parameters as needed.
-- This example runs the example_chat_completion.py but you can change that to a different .py file.
+- This example runs the [example_chat_completion.py](example_chat_completion.py) found in this repository but you can change that to a different .py file.
 
 ## Inference
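
Taken together, the five patches replace the README's Setup section with a Quick Start that runs from install to first inference. Below is a minimal end-to-end sketch of the final flow; it assumes a conda env with PyTorch / CUDA already available, the `facebookresearch/llama` clone URL, and the 7B chat checkpoint, whose MP (model parallel) value per the README's Inference section is 1:

```bash
# Clone the repository and install it in editable mode (steps 1-2).
git clone https://github.com/facebookresearch/llama.git
cd llama
pip install -e .

# Grant execution permissions to download.sh and run it (step 5); when
# prompted, manually paste the download URL from the registration email.
chmod +x download.sh
./download.sh

# Run quick local inference (step 6). --nproc_per_node must match the
# model's MP value (1 for the 7B chat model).
torchrun --nproc_per_node 1 example_chat_completion.py \
    --ckpt_dir llama-2-7b-chat/ \
    --tokenizer_path tokenizer.model \
    --max_seq_len 512 --max_batch_size 6
```

For larger checkpoints only the parallelism and checkpoint path should need to change, e.g. `--nproc_per_node 2` with `--ckpt_dir llama-2-13b-chat/` for a 13B chat model, following the MP values listed in the Inference section.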