Hackable implementation of state-of-the-art open-source LLMs based on nanoGPT. Supports flash attention, 4-bit and 8-bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.

ssghost/lit-gpt

⚡ Lit-Parrot

Hackable implementation of state-of-the-art open-source large language models, released under the Apache 2.0 license.

This implementation builds on Lit-LLaMA and nanoGPT, and it's powered by Lightning Fabric ⚡.

Weights can be downloaded following these instructions:

Design principles

This repository follows the main principle of openness through clarity.

Lit-Parrot is:

  • Simple: Single-file implementation without boilerplate.
  • Correct: Numerically equivalent to the original model.
  • Optimized: Runs on consumer hardware or at scale.
  • Open-source: No strings attached.

Avoiding code duplication is not a goal. Readability and hackability are.

Get involved!

Join our Discord to build high-performance, truly open-source models for the common benefit of the community.

 

Setup

Clone the repo

git clone https://github.com/Lightning-AI/lit-parrot
cd lit-parrot

Lit-Parrot currently relies on FlashAttention from PyTorch nightly. Until PyTorch 2.1 is released, you'll need to install the nightly build manually. Luckily, that is straightforward:

On CUDA

pip install --index-url https://download.pytorch.org/whl/nightly/cu118 --pre 'torch>=2.1.0dev'

On CPU (including Macs)

pip install --index-url https://download.pytorch.org/whl/nightly/cpu --pre 'torch>=2.1.0dev'

Then install the dependencies:

pip install -r requirements.txt

You are all set! 🎉
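To confirm that the nightly install took effect, a quick check along these lines may help. This is a minimal sketch: the helper name `meets_minimum` is hypothetical and not part of Lit-Parrot, and it only compares the major and minor version numbers.

```python
import re

def meets_minimum(version: str, minimum=(2, 1)) -> bool:
    """Return True if a PyTorch version string is at least `minimum` (major, minor)."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum

if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        status = "ok" if meets_minimum(torch.__version__) else "too old"
        print(torch.__version__, status)
        # scaled_dot_product_attention is the fused attention entry point in PyTorch 2.x
        print(hasattr(torch.nn.functional, "scaled_dot_product_attention"))
```

Nightly builds report versions such as `2.1.0.dev20230523+cu118`, which the check above accepts, while a stable `2.0.1` install is rejected.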

 

Use the model

To generate text predictions, you need to download the model weights. If you don't have them, check out our guide.

Run inference:

python generate.py --prompt "Hello, my name is"

This will run the 3B pre-trained model and require ~7 GB of GPU memory using the bfloat16 datatype.
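The ~7 GB figure is consistent with a back-of-the-envelope estimate: 3 billion parameters at 2 bytes each is about 6 GB for the weights, with the remainder presumably going to activations and runtime overhead. The sketch below covers weights only; the function name and the exact parameter count of 3e9 are assumptions for illustration.

```python
# Bytes needed to store one parameter at each precision (int4 packs two per byte).
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "int4": 0.5}

def weight_memory_gb(n_params: float, dtype: str) -> float:
    """Estimate memory for the weights alone, in GB (10**9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9

# Assumption: "3B" is treated as exactly 3e9 parameters for illustration.
for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weight_memory_gb(3e9, dtype):.1f} GB")
```

The lower-precision rows show why quantized inference fits on smaller devices, although real usage will be higher than the weights-only number.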

Full guide for generating samples from the model.

You can also chat with the model interactively:

python chat.py

Run large models on smaller consumer devices

LLM.int8 and GPTQ.int4 inference are supported; follow this guide to set them up.
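For intuition about what 8-bit quantization does to the weights, here is a toy absmax round trip in plain Python. It is a generic illustration only, not the LLM.int8 or GPTQ algorithm and not code from this repository; the function names are hypothetical.

```python
def quantize_absmax(values):
    """Map floats to integer codes in [-127, 127] using one scale per vector."""
    # Fall back to a scale of 1.0 when every value is zero.
    scale = max(abs(v) for v in values) / 127 or 1.0
    codes = [round(v / scale) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]

weights = [0.5, -1.27, 0.0, 0.9]
codes, scale = quantize_absmax(weights)
restored = dequantize(codes, scale)
print(codes)
print(restored)
```

Each restored value differs from the original by at most half a quantization step, which is the precision given up in exchange for storing one byte per weight.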

Finetune the model

We provide a simple training script finetune_adapter.py that instruction-tunes a pretrained model on the Alpaca dataset.

  1. Download the data and generate an instruction tuning dataset:

python scripts/prepare_alpaca.py

  2. Run the finetuning script:

Adapter:

python finetune_adapter.py

Finetuning requires at least one GPU with ~12 GB of memory (e.g. an RTX 3060). It assumes you have already downloaded the pretrained weights as described above. More details about each finetuning method, and how to apply it to your own data, can be found in our technical how-to guides.
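To adapt the data preparation to your own dataset, it helps to know the shape of an Alpaca record and how it becomes a prompt. The sketch below assumes the Stanford Alpaca layout (`instruction`, `input`, `output` fields) and its published prompt template; check `scripts/prepare_alpaca.py` for the exact template this repository uses. The sample record is made up.

```python
def generate_prompt(example: dict) -> str:
    """Format one Alpaca-style record into an instruction prompt."""
    if example.get("input"):
        return (
            "Below is an instruction that describes a task, paired with an input "
            "that provides further context. "
            "Write a response that appropriately completes the request.\n\n"
            f"### Instruction:\n{example['instruction']}\n\n"
            f"### Input:\n{example['input']}\n\n### Response:"
        )
    # Records with an empty "input" field use the shorter template.
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{example['instruction']}\n\n### Response:"
    )

# Hypothetical record for illustration; "output" is the target the model learns to produce.
sample = {"instruction": "Give three tips for staying healthy.", "input": "", "output": "..."}
prompt = generate_prompt(sample)
print(prompt)
```

Custom data converted into records with these three fields can then go through the same preparation path.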

Finetuning How-To Guides

These technical tutorials illustrate how to run the finetuning code.

Understanding Finetuning -- Conceptual Tutorials

Looking for conceptual tutorials and explanations? We have some additional articles below:

Pre-training

Porting from Lit-LLaMA in progress 👷

Get involved!

We are on a quest towards fully open-source AI.

Join us and start contributing, especially in the following areas:

We welcome all individual contributors, regardless of their level of experience or hardware. Your contributions are valuable, and we are excited to see what you can accomplish in this collaborative and supportive environment.

Unsure about contributing? Check out our guide, Contributing to Lit-LLaMA: A Hitchhiker’s Guide to the Quest for Fully Open-Source AI. The same guidelines apply to Lit-Parrot.

Don't forget to join our Discord!

Acknowledgements

License

Lit-Parrot is released under the Apache 2.0 license.
