forked from meta-llama/llama
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add FAQ.md // add command line options
- Loading branch information
1 parent
76066b1
commit e6145a0
Showing
3 changed files
with
113 additions
and
10 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
# FAQ | ||
## <a name="1"></a>1. The download.sh script doesn't work on default bash in MacOS X: | ||
|
||
Please see answers from theses issues: | ||
- https://github.com/facebookresearch/llama/issues/41#issuecomment-1451290160 | ||
- https://github.com/facebookresearch/llama/issues/53#issue-1606582963 | ||
|
||
|
||
## <a name="2"></a>2. Generations are bad! | ||
|
||
Keep in mind these models are not finetuned for question answering. As such, they should be prompted so that the expected answer is the natural continuation of the prompt. | ||
|
||
Here are a few examples of prompts (from [issue#69](https://github.com/facebookresearch/llama/issues/69)) geared towards finetuned models, and how to modify them to get the expected results: | ||
- Do not prompt with "What is the meaning of life? Be concise and do not repeat yourself." but with "I believe the meaning of life is" | ||
- Do not prompt with "Explain the theory of relativity." but with "Simply put, the theory of relativity states that" | ||
- Do not prompt with "Ten easy steps to build a website..." but with "Building a website can be done in 10 simple steps:\n" | ||
|
||
To be able to directly prompt the models with questions / instructions, you can either: | ||
- Prompt it with few-shot examples so that the model understands the task you have in mind. | ||
- Finetune the models on datasets of instructions to make them more robust to input prompts. | ||
|
||
We've updated `example.py` with more sample prompts. Overall, always keep in mind that models are very sensitive to prompts (particularly when they have not been finetuned). | ||
|
||
## <a name="3"></a>3. CUDA Out of memory errors | ||
|
||
The `example.py` file pre-allocates a cache according to these settings: | ||
```python | ||
model_args: ModelArgs = ModelArgs(max_seq_len=max_seq_len, max_batch_size=max_batch_size, **params) | ||
``` | ||
|
||
Accounting for 14GB of memory for the model weights (7B model), this leaves 16GB available for the decoding cache which stores 2 * 2 * n_layers * max_batch_size * max_seq_len * n_heads * head_dim bytes. | ||
|
||
With default parameters, this cache was about 17GB (2 * 2 * 32 * 32 * 1024 * 32 * 128) for the 7B model. | ||
|
||
We've added command line options to `example.py` and changed the default `max_seq_len` to 512 which should allow decoding on 30GB GPUs. | ||
|
||
Feel free to lower these settings according to your hardware. | ||
|
||
## <a name="4"></a>4. Other languages | ||
The model was trained primarily on English, but also on a few other languages with Latin or Cyrillic alphabets. | ||
|
||
For instance, LLaMA was trained on Wikipedia for the 20 following languages: bg, ca, cs, da, de, en, es, fr, hr, hu, it, nl, pl, pt, ro, ru, sl, sr, sv, uk. | ||
|
||
LLaMA's tokenizer splits unseen characters into UTF-8 bytes, as a result, it might also be able to process other languages like Chinese or Japanese, even though they use different characters. | ||
|
||
Although the fraction of these languages in the training was negligible, LLaMA still showcases some abilities in Chinese-English translation: | ||
|
||
``` | ||
Prompt = "J'aime le chocolat = I like chocolate\n祝你一天过得愉快 =" | ||
Output = "I wish you a nice day" | ||
``` |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters