# Download Gemma weights

Google developed and publicly released the Gemma large language models (LLMs), a collection of pretrained models in 2B and 7B parameter sizes that are based on the Gemini architecture.

For more information, please see the technical report.

To see all the available checkpoints, run:

```bash
python scripts/download.py | grep gemma
```

which will print:

```
google/gemma-7b
google/gemma-2b
google/gemma-7b-it
google/gemma-2b-it
```

In the list above, `gemma-2b` and `gemma-7b` are the pretrained models, and `gemma-2b-it` and `gemma-7b-it` are the instruction-finetuned models.

In order to use a specific checkpoint, for instance `gemma-2b`, download the weights and convert the checkpoint to the lit-gpt format.

This requires that you've been granted access to the weights on the Hugging Face Hub, which you can request by following the steps at https://huggingface.co/google/gemma-2b. After access is granted, you can find your Hub access token at https://huggingface.co/settings/tokens.
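As an alternative to passing the token on the command line each time, recent versions of `huggingface_hub` also read it from the `HF_TOKEN` environment variable (check the behavior of your installed version). A minimal sketch, reusing the `your_hf_token` placeholder from the commands below:

```shell
# Export the Hub token once so that huggingface_hub downloads can pick it
# up from the environment. Replace the placeholder with your real token.
export HF_TOKEN=your_hf_token
```

This keeps the token out of your shell history for the individual download commands, though an environment variable is still visible to other processes you start.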

```bash
pip install 'huggingface_hub[hf_transfer] @ git+https://github.com/huggingface/huggingface_hub'

python scripts/download.py --repo_id google/gemma-2b --access_token your_hf_token --from_safetensors true

python scripts/convert_hf_checkpoint.py --checkpoint_dir checkpoints/google/gemma-2b
```

By default, the `convert_hf_checkpoint.py` step will use the data type of the HF checkpoint's parameters. In cases where RAM or disk size is constrained, it might be useful to pass `--dtype bfloat16` to convert all parameters into this smaller precision before continuing.
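The savings are easy to estimate with back-of-envelope arithmetic: float32 uses 4 bytes per parameter, while bfloat16 uses 2, so converting roughly halves the weight footprint. A small sketch (parameter counts are nominal; a "2B" model may have somewhat more parameters in practice, and this ignores any non-weight overhead in the checkpoint file):

```python
# Bytes per parameter for common checkpoint dtypes.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}

def checkpoint_gib(num_params: int, dtype: str) -> float:
    """Approximate raw weight size in gibibytes for a given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3

# Nominal 2B-parameter model:
print(f"{checkpoint_gib(2_000_000_000, 'float32'):.1f} GiB")   # → 7.5 GiB
print(f"{checkpoint_gib(2_000_000_000, 'bfloat16'):.1f} GiB")  # → 3.7 GiB
```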

You're done! To execute the model just run:

```bash
python chat/base.py --checkpoint_dir checkpoints/google/gemma-2b
```