Merge pull request meta-llama#775 from sekyondaMeta/readmeUpdate
Readme update
sekyondaMeta authored Sep 11, 2023
2 parents 1bc5221 + ac19393 commit 46646b8
Showing 1 changed file with 28 additions and 4 deletions: README.md
Keep in mind that the links expire after 24 hours and a certain amount of downloads.

We are also providing downloads on [Hugging Face](https://huggingface.co/meta-llama). You must first request a download from the Meta AI website using the same email address as your Hugging Face account. After doing so, you can request access to any of the models on Hugging Face and within 1-2 days your account will be granted access to all versions.

## Quick Start

You can follow the steps below to quickly get up and running with Llama 2 models. These steps will let you run quick inference locally. For more examples, see the [Llama 2 recipes repository](https://github.com/facebookresearch/llama-recipes).

1. In a conda env with PyTorch / CUDA available, clone and download this repository (a minimal environment sketch appears after the notes below).

2. In the top-level directory, run:
```bash
pip install -e .
```
3. Visit the [Meta AI website](https://ai.meta.com/resources/models-and-libraries/llama-downloads/) and register to download the model(s).

4. Once registered, you will get an email with a URL to download the models. You will need this URL when you run the download.sh script.

5. Once you get the email, navigate to your downloaded llama repository and run the download.sh script.
- Make sure to grant execution permissions to the download.sh script (see the download sketch after the notes below).
- During this process, you will be prompted to enter the URL from the email.
- Do not use the “Copy Link” option; manually copy the link from the email instead.

6. Once the model(s) you want have been downloaded, you can run the model locally using the command below:
```bash
torchrun --nproc_per_node 1 example_chat_completion.py \
--ckpt_dir llama-2-7b-chat/ \
--tokenizer_path tokenizer.model \
--max_seq_len 512 --max_batch_size 6
```
**Note**
- Replace `llama-2-7b-chat/` with the path to your checkpoint directory and `tokenizer.model` with the path to your tokenizer model.
- The `--nproc_per_node` value should be set to the [MP](#inference) value for the model you are using (see the multi-GPU sketch after these notes).
- Adjust the `max_seq_len` and `max_batch_size` parameters as needed.
- This example runs the [example_chat_completion.py](example_chat_completion.py) found in this repository, but you can change it to a different .py file.
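
As a starting point for step 1, here is a minimal sketch of creating a suitable conda environment. The environment name, Python version, and PyTorch install command are assumptions; adjust them to your CUDA setup (see https://pytorch.org for the matching install command):

```bash
# Hypothetical environment name; any name works.
conda create -n llama2 python=3.10 -y
conda activate llama2

# Install PyTorch; choose the build that matches your CUDA version.
pip install torch

# Clone this repository and install it in editable mode (step 2).
git clone https://github.com/facebookresearch/llama.git
cd llama
pip install -e .
```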
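
For step 5, a minimal sketch of running the download script from the repository root; paste the URL from the email when prompted:

```bash
chmod +x download.sh   # grant execution permissions (step 5)
./download.sh          # prompts for the URL from your email
```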
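
The `--nproc_per_node` note above matters for the larger models: the value must equal the model's MP value, one process per GPU. As an illustrative sketch (the checkpoint directory name follows the naming above and is an assumption), the 13B chat model, which uses an MP value of 2, would be launched like this:

```bash
# llama-2-13b-chat uses a model-parallel (MP) value of 2,
# so torchrun spawns 2 processes, one per GPU.
torchrun --nproc_per_node 2 example_chat_completion.py \
    --ckpt_dir llama-2-13b-chat/ \
    --tokenizer_path tokenizer.model \
    --max_seq_len 512 --max_batch_size 4
```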

## Inference
