# Download LongChat weights
LongChat is an open-source family of chatbots based on LLaMA with an extended context length of up to 16K tokens. The technique used to extend the context length is described in this blog post.
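The core idea behind the extension is to "condense" the rotary position embeddings: token positions in the long sequence are rescaled back into the position range the base model was pretrained on, so no position falls outside the distribution the model has seen. The sketch below illustrates the rescaling arithmetic only; the function name and defaults are illustrative, not lit-gpt's actual API.

```python
def condensed_positions(seq_len, pretrained_context=2048, extended_context=16384):
    """Map positions 0..seq_len-1 into the pretrained position range.

    Illustrative sketch of the position-condensing idea: each position is
    scaled by pretrained_context / extended_context (0.125 for 2K -> 16K).
    """
    ratio = pretrained_context / extended_context
    return [i * ratio for i in range(seq_len)]

# The last token of a 16K sequence lands just inside the original 0..2047 range:
print(condensed_positions(16384)[-1])  # 2047.875
```

With a condensing ratio of 1/8, eight adjacent tokens share roughly one "pretrained" position slot, which is why a short fine-tuning run on long conversations is still needed after the rescaling.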
To see all the available checkpoints, run:

```sh
python scripts/download.py | grep longchat
```

which will print

```text
lmsys/longchat-7b-16k
lmsys/longchat-13b-16k
```
To use a specific checkpoint, for instance `longchat-7b-16k`, download the weights and convert the checkpoint to the lit-gpt format:

```sh
pip install 'huggingface_hub[hf_transfer] @ git+https://github.com/huggingface/huggingface_hub'

python scripts/download.py --repo_id lmsys/longchat-7b-16k

python scripts/convert_hf_checkpoint.py --checkpoint_dir checkpoints/lmsys/longchat-7b-16k
```
By default, the `convert_hf_checkpoint` step will use the data type of the HF checkpoint's parameters. In cases where RAM or disk size is constrained, it might be useful to pass `--dtype bfloat16` to convert all parameters into this smaller precision before continuing.
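The savings are easy to estimate: bfloat16 stores each parameter in 2 bytes instead of float32's 4. A back-of-envelope calculation for a 7B-parameter model (the figures are rough, ignoring non-parameter overhead):

```python
# Rough memory/disk estimate for a 7B-parameter checkpoint.
params = 7e9
bytes_fp32 = params * 4  # float32: 4 bytes per parameter
bytes_bf16 = params * 2  # bfloat16: 2 bytes per parameter

print(f"fp32: {bytes_fp32 / 1e9:.0f} GB, bf16: {bytes_bf16 / 1e9:.0f} GB")
# fp32: 28 GB, bf16: 14 GB
```

If the HF checkpoint is already stored in a 16-bit type, the flag makes little difference; it mainly helps when the source weights are float32.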
You're done! To execute the model just run:

```sh
pip install sentencepiece

python chat/base.py --checkpoint_dir checkpoints/lmsys/longchat-7b-16k
```