Lit-GPT -> LitGPT (Lightning-AI#1038)
carmocca authored and awaelchli committed Mar 15, 2024
1 parent 54fece5 commit 4839345
Showing 115 changed files with 519 additions and 520 deletions.
2 changes: 1 addition & 1 deletion .gitignore
@@ -9,7 +9,7 @@ build
# data
data
datasets
-!lit_gpt/data
+!litgpt/data
!tests/data
checkpoints
out
47 changes: 23 additions & 24 deletions README.md
@@ -1,7 +1,7 @@
<div align="center">
-<img src="https://pl-public-data.s3.amazonaws.com/assets_lightning/LitStableLM_Badge.png" alt="Lit-GPT" width="128"/>
+<img src="https://pl-public-data.s3.amazonaws.com/assets_lightning/LitStableLM_Badge.png" alt="LitGPT" width="128"/>

-# Lit-GPT
+# LitGPT

<!--
<p align="center">
@@ -14,15 +14,15 @@
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/pytorch-lightning)
![cpu-tests](https://github.com/lightning-AI/lit-stablelm/actions/workflows/cpu-tests.yml/badge.svg) [![license](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://github.com/Lightning-AI/lit-stablelm/blob/master/LICENSE) [![Discord](https://img.shields.io/discord/1077906959069626439?style=plastic)](https://discord.gg/VptPCZkGNa)

-<img src="https://pl-public-data.s3.amazonaws.com/assets_lightning/LitStableLM.gif" alt="Lit-GPT and pineapple pizza" width="500px"/>
+<img src="https://pl-public-data.s3.amazonaws.com/assets_lightning/LitStableLM.gif" alt="LitGPT and pineapple pizza" width="500px"/>

</div>

&nbsp;

-# Lit-GPT
+# LitGPT

-Hackable [implementation](lit_gpt/model.py) of state-of-the-art open-source large language models released under the **Apache 2.0 license**.
+Hackable [implementation](litgpt/model.py) of state-of-the-art open-source large language models released under the **Apache 2.0 license**.

Supports the following popular model checkpoints:

@@ -57,17 +57,17 @@ This implementation extends on [Lit-LLaMA](https://github.com/lightning-AI/lit-l

**🏆 NeurIPS 2023 Large Language Model Efficiency Challenge: 1 LLM + 1 GPU + 1 Day**

-The Lit-GPT repository was the official starter kit for the [NeurIPS 2023 LLM Efficiency Challenge](https://llm-efficiency-challenge.github.io), which is a competition focused on finetuning an existing non-instruction tuned LLM for 24 hours on a single GPU.
+The LitGPT repository was the official starter kit for the [NeurIPS 2023 LLM Efficiency Challenge](https://llm-efficiency-challenge.github.io), a competition focused on finetuning an existing non-instruction-tuned LLM for 24 hours on a single GPU.

---

&nbsp;

-## Lit-GPT design principles
+## LitGPT design principles

This repository follows the main principle of **openness through clarity**.

-**Lit-GPT** is:
+**LitGPT** is:

- **Simple:** Single-file implementation without boilerplate.
- **Correct:** Numerically equivalent to the original model.
@@ -89,8 +89,8 @@ Avoiding code duplication is **not** a goal. **Readability** and **hackability**
Clone the repo:

```bash
-git clone https://github.com/Lightning-AI/lit-gpt
-cd lit-gpt
+git clone https://github.com/Lightning-AI/litgpt
+cd litgpt
```

Install with all dependencies (including CLI, quantization, tokenizers for all models, etc.):
@@ -196,14 +196,14 @@ Follow this guide to start pretraining on

## Supported datasets

-Lit-GPT includes a variety of dataset preparation scripts for finetuning and pretraining. Additional information about the datasets and dataset preparation is provided in the [Preparing Datasets](tutorials/prepare_dataset.md) tutorial.
+LitGPT includes a variety of dataset preparation scripts for finetuning and pretraining. Additional information about the datasets and dataset preparation is provided in the [Preparing Datasets](tutorials/prepare_dataset.md) tutorial.


&nbsp;

## XLA

-Lightning AI has partnered with Google to add first-class support for [Cloud TPUs](https://cloud.google.com/tpu) in [Lightning’s frameworks](https://github.com/Lightning-AI/lightning) and Lit-GPT,
+Lightning AI has partnered with Google to add first-class support for [Cloud TPUs](https://cloud.google.com/tpu) in [Lightning’s frameworks](https://github.com/Lightning-AI/lightning) and LitGPT,
helping democratize AI for millions of developers and researchers worldwide.

Using TPUs with Lightning is as straightforward as changing one line of code.
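For illustration, that one-line change in Lightning Fabric looks roughly like the sketch below (the `accelerator`/`devices` values are assumptions for this example, not part of this commit):

```python
# Illustrative sketch: switching Lightning Fabric from GPUs to TPUs.
# The accelerator/devices values are assumed for illustration.
from lightning.fabric import Fabric

fabric = Fabric(accelerator="tpu", devices=8)  # previously: accelerator="cuda"
fabric.launch()
```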
@@ -216,19 +216,18 @@ We provide scripts fully optimized for TPUs in the [XLA directory](xla)

We are on a quest towards fully open source AI.

-<img align="right" src="https://pl-public-data.s3.amazonaws.com/assets_lightning/LitStableLM_Illustration.png" alt="Lit-GPT" width="128"/>
+<img align="right" src="https://pl-public-data.s3.amazonaws.com/assets_lightning/LitStableLM_Illustration.png" alt="LitGPT" width="128"/>

Join us and start contributing, especially on the following areas:

-- [ ] [Pretraining](https://github.com/Lightning-AI/lit-gpt/labels/pre-training)
-- [ ] [Fine-tuning](https://github.com/Lightning-AI/lit-gpt/labels/fine-tuning)
-- [ ] [Quantization](https://github.com/Lightning-AI/lit-gpt/labels/quantization)
-- [ ] [Sparsification](https://github.com/Lightning-AI/lit-gpt/labels/sparsification)
+- [ ] [Pretraining](https://github.com/Lightning-AI/litgpt/labels/pre-training)
+- [ ] [Fine-tuning](https://github.com/Lightning-AI/litgpt/labels/fine-tuning)
+- [ ] [Quantization](https://github.com/Lightning-AI/litgpt/labels/quantization)
+- [ ] [Sparsification](https://github.com/Lightning-AI/litgpt/labels/sparsification)

We welcome all individual contributors, regardless of their level of experience or hardware. Your contributions are valuable, and we are excited to see what you can accomplish in this collaborative and supportive environment.

-Unsure about contributing? Check out our [How to Contribute to Lit-GPT and Lit-LLaMA](https://lightning.ai/pages/community/tutorial/how-to-contribute-to-litgpt/) guide.
+Unsure about contributing? Check out our [How to Contribute to LitGPT](https://lightning.ai/pages/community/tutorial/how-to-contribute-to-litgpt/) guide.

Don't forget to [join our Discord](https://discord.gg/VptPCZkGNa)!

@@ -246,13 +245,13 @@

## Citation

-If you use Lit-GPT in your research, please cite the following work:
+If you use LitGPT in your research, please cite the following work:

```bibtex
-@misc{lit-gpt-2023,
+@misc{litgpt-2023,
author = {Lightning AI},
-  title = {Lit-GPT},
-  howpublished = {\url{https://github.com/Lightning-AI/lit-gpt}},
+  title = {LitGPT},
+  howpublished = {\url{https://github.com/Lightning-AI/litgpt}},
year = {2023},
}
```
@@ -261,4 +260,4 @@ If you use Lit-GPT in your research, please cite the following work:

## License

-Lit-GPT is released under the [Apache 2.0](https://github.com/Lightning-AI/lit-gpt/blob/main/LICENSE) license.
+LitGPT is released under the [Apache 2.0](https://github.com/Lightning-AI/litgpt/blob/main/LICENSE) license.
8 changes: 4 additions & 4 deletions chat/base.py
@@ -14,9 +14,9 @@
sys.path.append(str(wd))

from generate.base import next_token
-from lit_gpt import GPT, Config, PromptStyle, Tokenizer
-from lit_gpt.prompts import load_prompt_style, has_prompt_style
-from lit_gpt.utils import CLI, check_valid_checkpoint_dir, get_default_supported_precision, load_checkpoint
+from litgpt import GPT, Config, PromptStyle, Tokenizer
+from litgpt.prompts import load_prompt_style, has_prompt_style
+from litgpt.utils import CLI, check_valid_checkpoint_dir, get_default_supported_precision, load_checkpoint


@torch.inference_mode()
@@ -118,7 +118,7 @@ def main(
quantize: Whether to quantize the model and using which method:
- bnb.nf4, bnb.nf4-dq, bnb.fp4, bnb.fp4-dq: 4-bit quantization from bitsandbytes
- bnb.int8: 8-bit quantization from bitsandbytes
-            for more details, see https://github.com/Lightning-AI/lit-gpt/blob/main/tutorials/quantize.md
+            for more details, see https://github.com/Lightning-AI/litgpt/blob/main/tutorials/quantize.md
precision: Indicates the Fabric precision setting to use.
compile: Whether to use compilation to speed up token generation. Will increase startup time.
"""
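The `quantize` options documented in the hunk above correspond to bitsandbytes precision modes. A minimal sketch of how such a flag could be wired into Fabric (assumed wiring for illustration, not this commit's implementation):

```python
# Illustrative sketch: mapping a "bnb.*" quantize option onto Fabric's
# bitsandbytes precision plugin. Assumed wiring, not LitGPT's actual code.
import torch
from lightning.fabric import Fabric
from lightning.fabric.plugins import BitsandbytesPrecision

quantize = "bnb.nf4"  # one of: bnb.nf4, bnb.nf4-dq, bnb.fp4, bnb.fp4-dq, bnb.int8
plugin = BitsandbytesPrecision(mode=quantize.removeprefix("bnb."), dtype=torch.bfloat16)
fabric = Fabric(devices=1, plugins=plugin)  # weights are quantized as the model is materialized
```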
2 changes: 1 addition & 1 deletion config_hub/finetune/llama-2-7b/full.yaml
@@ -3,7 +3,7 @@ devices: 4
resume: false
seed: 1337
data:
-  class_path: lit_gpt.data.AlpacaGPT4
+  class_path: litgpt.data.AlpacaGPT4
init_args:
mask_prompt: false
test_split_fraction: 0.03847
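The `class_path`/`init_args` pairs in these configs follow the jsonargparse convention: the class is imported by its dotted path and instantiated with the given arguments. A minimal sketch of that mechanism (hypothetical loader for illustration, not this repository's code):

```python
# Illustrative sketch: resolving a class_path/init_args pair into an object.
# Hypothetical loader; LitGPT relies on a jsonargparse-style CLI for this.
import importlib

spec = {
    "class_path": "litgpt.data.AlpacaGPT4",
    "init_args": {"mask_prompt": False, "test_split_fraction": 0.03847},
}

module_name, _, class_name = spec["class_path"].rpartition(".")
cls = getattr(importlib.import_module(module_name), class_name)
data = cls(**spec.get("init_args", {}))  # AlpacaGPT4(mask_prompt=False, ...)
```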
2 changes: 1 addition & 1 deletion config_hub/finetune/llama-2-7b/lora.yaml
@@ -12,7 +12,7 @@ lora_projection: false
lora_mlp: false
lora_head: false
data:
-  class_path: lit_gpt.data.AlpacaGPT4
+  class_path: litgpt.data.AlpacaGPT4
init_args:
mask_prompt: false
test_split_fraction: 0.03847
2 changes: 1 addition & 1 deletion config_hub/finetune/tiny-llama/lora.yaml
@@ -12,7 +12,7 @@ lora_projection: false
lora_mlp: false
lora_head: false
data:
-  class_path: lit_gpt.data.AlpacaGPT4
+  class_path: litgpt.data.AlpacaGPT4
init_args:
mask_prompt: false
test_split_fraction: 0.03847
2 changes: 1 addition & 1 deletion config_hub/pretrain/tinystories.yaml
@@ -21,7 +21,7 @@ resume: false
devices: 1
seed: 1337
data:
-  class_path: lit_gpt.data.TinyStories
+  class_path: litgpt.data.TinyStories
init_args:
path: data
num_workers: 8
6 changes: 3 additions & 3 deletions eval/lm_eval_harness.py
@@ -16,8 +16,8 @@
sys.path.append(str(wd))

from generate.base import generate
-from lit_gpt import GPT, Config, Tokenizer
-from lit_gpt.utils import CLI, check_valid_checkpoint_dir, get_default_supported_precision, load_checkpoint
+from litgpt import GPT, Config, Tokenizer
+from litgpt.utils import CLI, check_valid_checkpoint_dir, get_default_supported_precision, load_checkpoint


class EvalHarnessBase(BaseLM):
@@ -112,7 +112,7 @@ def pattern_match(patterns, source_list):

lm = self
if not no_cache:
-            lm = base.CachingLM(lm, "lm_cache/lit-gpt.db")
+            lm = base.CachingLM(lm, "lm_cache/litgpt.db")

results = evaluator.evaluate(
lm=lm,
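The renamed cache file above is used by a `CachingLM` wrapper, which memoizes model calls in an on-disk store so repeated evaluation requests are not recomputed. A rough sketch of the idea (illustrative only, not lm-eval-harness's implementation):

```python
# Illustrative sketch of a CachingLM-style wrapper: memoize evaluation
# requests in an on-disk store keyed by the request's contents.
import hashlib
import shelve


class CachingLM:
    def __init__(self, lm, cache_path: str):
        self.lm = lm
        self.cache = shelve.open(cache_path)

    def loglikelihood(self, requests):
        results = []
        for req in requests:
            key = hashlib.sha256(repr(req).encode()).hexdigest()
            if key not in self.cache:
                self.cache[key] = self.lm.loglikelihood([req])[0]  # compute once
            results.append(self.cache[key])
        return results
```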
12 changes: 6 additions & 6 deletions finetune/adapter.py
@@ -20,12 +20,12 @@
sys.path.append(str(wd))

from generate.base import generate
-from lit_gpt.adapter import GPT, Block, Config, adapter_filter, mark_only_adapter_as_trainable
-from lit_gpt.args import EvalArgs, TrainArgs
-from lit_gpt.data import Alpaca, LitDataModule
-from lit_gpt.prompts import save_prompt_style
-from lit_gpt.tokenizer import Tokenizer
-from lit_gpt.utils import (
+from litgpt.adapter import GPT, Block, Config, adapter_filter, mark_only_adapter_as_trainable
+from litgpt.args import EvalArgs, TrainArgs
+from litgpt.data import Alpaca, LitDataModule
+from litgpt.prompts import save_prompt_style
+from litgpt.tokenizer import Tokenizer
+from litgpt.utils import (
CLI,
check_valid_checkpoint_dir,
chunked_cross_entropy,
12 changes: 6 additions & 6 deletions finetune/adapter_v2.py
@@ -20,12 +20,12 @@
sys.path.append(str(wd))

from generate.base import generate
-from lit_gpt.adapter_v2 import GPT, Block, Config, adapter_filter, mark_only_adapter_v2_as_trainable
-from lit_gpt.args import EvalArgs, TrainArgs
-from lit_gpt.data import Alpaca, LitDataModule
-from lit_gpt.prompts import save_prompt_style
-from lit_gpt.tokenizer import Tokenizer
-from lit_gpt.utils import (
+from litgpt.adapter_v2 import GPT, Block, Config, adapter_filter, mark_only_adapter_v2_as_trainable
+from litgpt.args import EvalArgs, TrainArgs
+from litgpt.data import Alpaca, LitDataModule
+from litgpt.prompts import save_prompt_style
+from litgpt.tokenizer import Tokenizer
+from litgpt.utils import (
CLI,
check_valid_checkpoint_dir,
chunked_cross_entropy,
12 changes: 6 additions & 6 deletions finetune/full.py
@@ -21,12 +21,12 @@
sys.path.append(str(wd))

from generate.base import generate
-from lit_gpt.args import EvalArgs, TrainArgs
-from lit_gpt.model import GPT, Block, Config
-from lit_gpt.tokenizer import Tokenizer
-from lit_gpt.data import Alpaca, LitDataModule
-from lit_gpt.prompts import save_prompt_style
-from lit_gpt.utils import (
+from litgpt.args import EvalArgs, TrainArgs
+from litgpt.model import GPT, Block, Config
+from litgpt.tokenizer import Tokenizer
+from litgpt.data import Alpaca, LitDataModule
+from litgpt.prompts import save_prompt_style
+from litgpt.utils import (
CLI,
check_valid_checkpoint_dir,
chunked_cross_entropy,
12 changes: 6 additions & 6 deletions finetune/lora.py
@@ -22,12 +22,12 @@
sys.path.append(str(wd))

from generate.base import generate
-from lit_gpt.args import EvalArgs, TrainArgs
-from lit_gpt.data import LitDataModule, Alpaca
-from lit_gpt.lora import GPT, Block, Config, lora_filter, mark_only_lora_as_trainable
-from lit_gpt.prompts import save_prompt_style
-from lit_gpt.tokenizer import Tokenizer
-from lit_gpt.utils import (
+from litgpt.args import EvalArgs, TrainArgs
+from litgpt.data import LitDataModule, Alpaca
+from litgpt.lora import GPT, Block, Config, lora_filter, mark_only_lora_as_trainable
+from litgpt.prompts import save_prompt_style
+from litgpt.tokenizer import Tokenizer
+from litgpt.utils import (
CLI,
check_valid_checkpoint_dir,
chunked_cross_entropy,
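The `mark_only_lora_as_trainable` helper imported above reflects the standard LoRA recipe: freeze the base model and leave only the low-rank adapter matrices trainable. A minimal sketch of the idea (illustrative; the actual helper lives in `litgpt/lora.py`):

```python
# Illustrative sketch of the LoRA freezing step: only adapter parameters
# keep requires_grad=True. Not LitGPT's exact implementation.
import torch.nn as nn


def mark_only_lora_as_trainable(model: nn.Module) -> None:
    for name, param in model.named_parameters():
        param.requires_grad = "lora_" in name  # e.g. the lora_A / lora_B matrices
```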
10 changes: 5 additions & 5 deletions generate/adapter.py
@@ -14,10 +14,10 @@
sys.path.append(str(wd))

from generate.base import generate
-from lit_gpt import Tokenizer, PromptStyle
-from lit_gpt.adapter import GPT, Config
-from lit_gpt.prompts import load_prompt_style, has_prompt_style
-from lit_gpt.utils import CLI, check_valid_checkpoint_dir, get_default_supported_precision, lazy_load
+from litgpt import Tokenizer, PromptStyle
+from litgpt.adapter import GPT, Config
+from litgpt.prompts import load_prompt_style, has_prompt_style
+from litgpt.utils import CLI, check_valid_checkpoint_dir, get_default_supported_precision, lazy_load



@@ -45,7 +45,7 @@ def main(
quantize: Whether to quantize the model and using which method:
- bnb.nf4, bnb.nf4-dq, bnb.fp4, bnb.fp4-dq: 4-bit quantization from bitsandbytes
- bnb.int8: 8-bit quantization from bitsandbytes
-            for more details, see https://github.com/Lightning-AI/lit-gpt/blob/main/tutorials/quantize.md
+            for more details, see https://github.com/Lightning-AI/litgpt/blob/main/tutorials/quantize.md
max_new_tokens: The number of generation steps to take.
top_k: The number of top most probable tokens to consider in the sampling process.
temperature: A value controlling the randomness of the sampling process. Higher values result in more random
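The `top_k` and `temperature` parameters described in the docstring above implement standard truncated sampling. A minimal sketch of the idea (illustrative; the repository's version lives in `generate/base.py`):

```python
# Illustrative sketch of temperature + top-k sampling. Not the exact
# implementation used by the generate scripts.
import torch


def sample_next_token(logits: torch.Tensor, temperature: float = 0.8, top_k: int = 50) -> torch.Tensor:
    logits = logits / temperature                     # higher temperature -> flatter distribution
    values, indices = torch.topk(logits, k=top_k)     # keep the k most probable tokens
    probs = torch.softmax(values, dim=-1)             # renormalize over those k
    choice = torch.multinomial(probs, num_samples=1)  # draw one token id
    return indices[choice]
```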
10 changes: 5 additions & 5 deletions generate/adapter_v2.py
@@ -14,10 +14,10 @@
sys.path.append(str(wd))

from generate.base import generate
-from lit_gpt import Tokenizer, PromptStyle
-from lit_gpt.adapter_v2 import GPT, Config
-from lit_gpt.prompts import load_prompt_style, has_prompt_style
-from lit_gpt.utils import CLI, check_valid_checkpoint_dir, get_default_supported_precision, lazy_load
+from litgpt import Tokenizer, PromptStyle
+from litgpt.adapter_v2 import GPT, Config
+from litgpt.prompts import load_prompt_style, has_prompt_style
+from litgpt.utils import CLI, check_valid_checkpoint_dir, get_default_supported_precision, lazy_load


def main(
@@ -44,7 +44,7 @@ def main(
quantize: Whether to quantize the model and using which method:
- bnb.nf4, bnb.nf4-dq, bnb.fp4, bnb.fp4-dq: 4-bit quantization from bitsandbytes
- bnb.int8: 8-bit quantization from bitsandbytes
-            for more details, see https://github.com/Lightning-AI/lit-gpt/blob/main/tutorials/quantize.md
+            for more details, see https://github.com/Lightning-AI/litgpt/blob/main/tutorials/quantize.md
max_new_tokens: The number of generation steps to take.
top_k: The number of top most probable tokens to consider in the sampling process.
temperature: A value controlling the randomness of the sampling process. Higher values result in more random
8 changes: 4 additions & 4 deletions generate/base.py
@@ -15,9 +15,9 @@
wd = Path(__file__).parent.parent.resolve()
sys.path.append(str(wd))

-from lit_gpt import GPT, Config, Tokenizer, PromptStyle
-from lit_gpt.prompts import load_prompt_style, has_prompt_style
-from lit_gpt.utils import CLI, check_valid_checkpoint_dir, get_default_supported_precision, load_checkpoint
+from litgpt import GPT, Config, Tokenizer, PromptStyle
+from litgpt.prompts import load_prompt_style, has_prompt_style
+from litgpt.utils import CLI, check_valid_checkpoint_dir, get_default_supported_precision, load_checkpoint


def multinomial_num_samples_1(probs: torch.Tensor) -> torch.Tensor:
@@ -120,7 +120,7 @@ def main(
quantize: Whether to quantize the model and using which method:
- bnb.nf4, bnb.nf4-dq, bnb.fp4, bnb.fp4-dq: 4-bit quantization from bitsandbytes
- bnb.int8: 8-bit quantization from bitsandbytes
-            for more details, see https://github.com/Lightning-AI/lit-gpt/blob/main/tutorials/quantize.md
+            for more details, see https://github.com/Lightning-AI/litgpt/blob/main/tutorials/quantize.md
precision: Indicates the Fabric precision setting to use.
compile: Whether to compile the model.
"""
8 changes: 4 additions & 4 deletions generate/full.py
@@ -14,9 +14,9 @@
sys.path.append(str(wd))

from generate.base import generate
-from lit_gpt import GPT, Config, Tokenizer, PromptStyle
-from lit_gpt.prompts import load_prompt_style, has_prompt_style
-from lit_gpt.utils import CLI, check_valid_checkpoint_dir, get_default_supported_precision, load_checkpoint
+from litgpt import GPT, Config, Tokenizer, PromptStyle
+from litgpt.prompts import load_prompt_style, has_prompt_style
+from litgpt.utils import CLI, check_valid_checkpoint_dir, get_default_supported_precision, load_checkpoint


def main(
@@ -43,7 +43,7 @@ def main(
quantize: Whether to quantize the model and using which method:
- bnb.nf4, bnb.nf4-dq, bnb.fp4, bnb.fp4-dq: 4-bit quantization from bitsandbytes
- bnb.int8: 8-bit quantization from bitsandbytes
-            for more details, see https://github.com/Lightning-AI/lit-gpt/blob/main/tutorials/quantize.md
+            for more details, see https://github.com/Lightning-AI/litgpt/blob/main/tutorials/quantize.md
max_new_tokens: The number of generation steps to take.
top_k: The number of top most probable tokens to consider in the sampling process.
temperature: A value controlling the randomness of the sampling process. Higher values result in more random