Add model sizes to README (Lightning-AI#644)
Co-authored-by: Carlos Mocholí <[email protected]>
cx0 and carmocca authored Oct 17, 2023
1 parent 7c10aad commit 8641822
Showing 1 changed file with 17 additions and 17 deletions.
34 changes: 17 additions & 17 deletions README.md
@@ -26,23 +26,23 @@ Hackable [implementation](lit_gpt/model.py) of state-of-the-art open-source larg

Supports the following popular model checkpoints:

-| Model and usage | Reference |
-|---------------------------------------------------------------------|--------------------------------------------------------------------------------------------------|
-| Meta AI [Llama 2](tutorials/download_llama_2.md) | [Touvron et al. 2023](https://arxiv.org/abs/2307.09288) |
-| Stability AI [FreeWilly2](tutorials/download_freewilly_2.md) | [Stability AI 2023](https://stability.ai/blog/stable-beluga-large-instruction-fine-tuned-models) |
-| Stability AI StableCode | [Stability AI 2023](https://stability.ai/blog/stablecode-llm-generative-ai-coding) |
-| TII UAE [Falcon](tutorials/download_falcon.md) | [TII 2023](https://falconllm.tii.ae) |
-| OpenLM Research [OpenLLaMA](tutorials/download_openllama.md) | [Geng & Liu 2023](https://github.com/openlm-research/open_llama) |
-| LMSYS [Vicuna](tutorials/download_vicuna.md) | [Li et al. 2023](https://lmsys.org/blog/2023-03-30-vicuna/) |
-| LMSYS [LongChat](tutorials/download_longchat.md) | [LongChat Team 2023](https://lmsys.org/blog/2023-06-29-longchat/) |
-| Together [RedPajama-INCITE](tutorials/download_redpajama_incite.md) | [Together 2023](https://together.ai/blog/redpajama-models-v1) |
-| EleutherAI [Pythia](tutorials/download_pythia.md) | [Biderman et al. 2023](https://arxiv.org/abs/2304.01373) |
-| StabilityAI [StableLM](tutorials/download_stablelm.md) | [Stability AI 2023](https://github.com/Stability-AI/StableLM) |
-| Platypus | [Lee, Hunter, and Ruiz 2023](https://arxiv.org/abs/2308.07317) |
-| NousResearch Nous-Hermes | [Org page](https://huggingface.co/NousResearch) |
-| Meta AI [Code Llama](tutorials/download_code_llama.md) | [Rozière et al. 2023](https://arxiv.org/abs/2308.12950) |
-| Microsoft Research [phi-1.5](tutorials/download_phi15.md) | [Li et al. 2023](https://arxiv.org/abs/2309.05463) |
-| Mistral AI [Mistral](tutorials/download_mistral.md) | [Mistral website](https://mistral.ai/) |
+| Model and usage | Model size | Reference |
+|--------------------------------------------------------------------------------|------------------------------------|--------------------------------------------------------------------------------------------------|
+| Meta AI [Llama 2](tutorials/download_llama_2.md) | 7B, 13B, 70B | [Touvron et al. 2023](https://arxiv.org/abs/2307.09288) |
+| Stability AI [FreeWilly2](tutorials/download_freewilly_2.md) (Stable Beluga 2) | 70B | [Stability AI 2023](https://stability.ai/blog/stable-beluga-large-instruction-fine-tuned-models) |
+| Stability AI StableCode | 3B | [Stability AI 2023](https://stability.ai/blog/stablecode-llm-generative-ai-coding) |
+| TII UAE [Falcon](tutorials/download_falcon.md) | 7B, 40B, 180B | [TII 2023](https://falconllm.tii.ae) |
+| OpenLM Research [OpenLLaMA](tutorials/download_openllama.md) | 3B, 7B, 13B | [Geng & Liu 2023](https://github.com/openlm-research/open_llama) |
+| LMSYS [Vicuna](tutorials/download_vicuna.md) | 7B, 13B, 33B | [Li et al. 2023](https://lmsys.org/blog/2023-03-30-vicuna/) |
+| LMSYS [LongChat](tutorials/download_longchat.md) | 7B, 13B | [LongChat Team 2023](https://lmsys.org/blog/2023-06-29-longchat/) |
+| Together [RedPajama-INCITE](tutorials/download_redpajama_incite.md) | 3B, 7B | [Together 2023](https://together.ai/blog/redpajama-models-v1) |
+| EleutherAI [Pythia](tutorials/download_pythia.md) | {70,160,410}M, {1,1.4,2.8,6.9,12}B | [Biderman et al. 2023](https://arxiv.org/abs/2304.01373) |
+| StabilityAI [StableLM](tutorials/download_stablelm.md) | 3B, 7B | [Stability AI 2023](https://github.com/Stability-AI/StableLM) |
+| Platypus | 7B, 13B, 70B | [Lee, Hunter, and Ruiz 2023](https://arxiv.org/abs/2308.07317) |
+| NousResearch Nous-Hermes | 7B, 13B, 70B | [Org page](https://huggingface.co/NousResearch) |
+| Meta AI [Code Llama](tutorials/download_code_llama.md) | 7B, 13B, 34B | [Rozière et al. 2023](https://arxiv.org/abs/2308.12950) |
+| Microsoft Research [phi-1.5](tutorials/download_phi15.md) | 1.3B | [Li et al. 2023](https://arxiv.org/abs/2309.05463) |
+| Mistral AI [Mistral](tutorials/download_mistral.md) | 7B | [Mistral website](https://mistral.ai/) |

This implementation extends on [Lit-LLaMA](https://github.com/lightning-AI/lit-llama) and [nanoGPT](https://github.com/karpathy/nanoGPT), and it's **powered by [Lightning Fabric](https://lightning.ai/docs/fabric/stable/)**.
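
The "Model size" column added in this commit mostly lists plain sizes such as `7B, 13B, 70B`, but the Pythia row uses a brace shorthand, `{70,160,410}M, {1,1.4,2.8,6.9,12}B`, which appears to follow shell-style brace expansion. As a minimal sketch of how that cell could be read, the helper below expands the shorthand into individual sizes. The name `expand_sizes` is hypothetical and is not part of the repository; it only illustrates the notation.

```python
import re


def expand_sizes(spec: str) -> list[str]:
    """Expand a size cell such as '{70,160,410}M, 7B' into individual sizes.

    Hypothetical helper for illustration only; not part of lit-gpt.
    """
    sizes = []
    # Each match is either a braced group followed by a unit (M or B),
    # or a plain size such as '7B' or '1.3B'.
    pattern = r"\{([^}]*)\}([MB])|([\d.]+[MB])"
    for group, unit, plain in re.findall(pattern, spec):
        if plain:
            sizes.append(plain)
        else:
            # Attach the shared unit to every value inside the braces.
            sizes.extend(f"{value.strip()}{unit}" for value in group.split(","))
    return sizes


pythia = expand_sizes("{70,160,410}M, {1,1.4,2.8,6.9,12}B")
print(pythia)
```

Under this reading, the Pythia cell stands for eight checkpoints, from 70M up to 12B parameters, while cells without braces pass through unchanged.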
