This is an implementation of a decoder-only, transformer-based LLM trained with a next-token prediction objective. It uses the tokenizers library from HF, GQA (grouped-query attention), normalized-GPT, RoPE (rotary positional embeddings), and the Liger Kernel.
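As a rough illustration of how GQA and RoPE fit together inside the attention block, here is a minimal PyTorch sketch. The class name `GQAttention`, the helper names, and the default hyperparameters are assumptions made for illustration, not the repository's exact code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def rope_cache(seq_len, head_dim, base=10000.0):
    # Precompute RoPE cos/sin tables of shape (seq_len, head_dim).
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    pos = torch.arange(seq_len).float()
    freqs = torch.outer(pos, inv_freq)          # (seq_len, head_dim // 2)
    emb = torch.cat((freqs, freqs), dim=-1)     # (seq_len, head_dim)
    return emb.cos(), emb.sin()


def rotate_half(x):
    # (x1, x2) -> (-x2, x1) on the last dimension.
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)


def apply_rope(x, cos, sin):
    # RoPE is applied to queries and keys only, never to values.
    return x * cos + rotate_half(x) * sin


class GQAttention(nn.Module):
    def __init__(self, dim=512, n_heads=8, n_kv_heads=2):
        super().__init__()
        assert n_heads % n_kv_heads == 0
        self.n_heads, self.n_kv_heads = n_heads, n_kv_heads
        self.head_dim = dim // n_heads
        self.q_proj = nn.Linear(dim, n_heads * self.head_dim, bias=False)
        self.k_proj = nn.Linear(dim, n_kv_heads * self.head_dim, bias=False)
        self.v_proj = nn.Linear(dim, n_kv_heads * self.head_dim, bias=False)
        self.o_proj = nn.Linear(n_heads * self.head_dim, dim, bias=False)

    def forward(self, x, cos, sin):
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.n_heads, self.head_dim).transpose(1, 2)
        k = self.k_proj(x).view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
        v = self.v_proj(x).view(b, t, self.n_kv_heads, self.head_dim).transpose(1, 2)
        q, k = apply_rope(q, cos, sin), apply_rope(k, cos, sin)
        # Each key/value head is shared by n_heads // n_kv_heads query heads.
        k = k.repeat_interleave(self.n_heads // self.n_kv_heads, dim=1)
        v = v.repeat_interleave(self.n_heads // self.n_kv_heads, dim=1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.o_proj(out.transpose(1, 2).reshape(b, t, -1))


# Example: a batch of 2 sequences of length 16 with model dim 512.
x = torch.randn(2, 16, 512)
cos, sin = rope_cache(seq_len=16, head_dim=512 // 8)
print(GQAttention()(x, cos, sin).shape)  # torch.Size([2, 16, 512])
```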
There are 6 versions:
- Using the AdamW optimizer and lorem ipsum datasets (broken RoPE) [colab notebook]
- Using the SOAP optimizer and lorem ipsum datasets (broken RoPE) [colab notebook]
- Using the SOAP optimizer, synthetic number datasets, and a larger parameter count (broken RoPE) [colab notebook]
- Using the SOAP optimizer, synthetic number datasets, a smaller parameter count, and more epochs (broken RoPE) [colab notebook]
- Using the SOAP optimizer, harder synthetic number datasets, optimized hyperparameters, the Liger Kernel, and Fast-FFN (fixed RoPE) [colab notebook]
- Using a tuned SOAP optimizer, harder synthetic number datasets, optimized hyperparameters, the Liger Kernel, Fast-FFN, and normalized-GPT (fixed RoPE) [colab notebook]
We publish the weights from the latest version on HF: Link
Notes: There is a small mistake in the RoPE implementation: RoPE is also applied to the value embeddings, whereas it should be applied only to the queries and keys. The latest two versions fix this issue, as shown in the sketch below.
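A minimal sketch of the difference between the broken and fixed behavior (the helper and function names are assumptions for illustration; the RoPE helpers from the sketch above are repeated here so the snippet is self-contained):

```python
import torch


def rotate_half(x):
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)


def apply_rope(x, cos, sin):
    return x * cos + rotate_half(x) * sin


def rope_qkv_broken(q, k, v, cos, sin):
    # Earlier versions: RoPE was (incorrectly) also applied to the values.
    return apply_rope(q, cos, sin), apply_rope(k, cos, sin), apply_rope(v, cos, sin)


def rope_qkv_fixed(q, k, v, cos, sin):
    # Latest two versions: RoPE is applied to queries and keys only;
    # the values pass through untouched.
    return apply_rope(q, cos, sin), apply_rope(k, cos, sin), v
```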