An implementation of the Llama architecture, to instruct and delight.
Setup:

```sh
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
git submodule update --init
```
For development:

```sh
python -m venv .venv
echo 'PYTHONPATH="${PYTHONPATH:+$PYTHONPATH:}llama2_c"' >> .venv/bin/activate
source .venv/bin/activate
pip install -r requirements-dev.txt
pre-commit install --hook-type pre-push
```
Then `chmod 755 dev` and run `./dev` for tests, type-checking and formatting (see `./dev --help`).
The principles I've adopted for a "lovely" implementation:
- Everything is implemented in one file, from basic `jax.numpy` building blocks
- The shapes of tensors in a function's parameters are a) explicit and b) minimal
- The code looks like the corresponding maths (with references from the literature!); see the sketch after this list
- No optimizations
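
To give a flavour of the third principle, here is a hypothetical sketch (not code from this repo; the names `attention`, `q`, `k`, `v`, `t` and `d_head` are illustrative) of single-query, single-head attention written so that it reads like the formula softmax(q Kᵀ / √d_head) V:

```python
import jax
import jax.numpy as jnp
from jaxtyping import Array, Float


def attention(
    q: Float[Array, "d_head"],
    k: Float[Array, "t d_head"],
    v: Float[Array, "t d_head"],
) -> Float[Array, "d_head"]:
    # softmax(q K^T / sqrt(d_head)) V, for a single query vector and a single head
    scores = q @ k.T / jnp.sqrt(q.shape[-1])  # (t,)
    return jax.nn.softmax(scores) @ v  # (d_head,)
```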
These are fulfilled in practice via (points corresponding 1-to-1 with those above):
- Everything is tested for correctness against the Python implementation in karpathy's llama2.c repo, and made tidy via ruff and pyright
- a) The use of jaxtyping for shape-aware runtime type-checking, b) aggressively `vmap`ping to remove any "batching" dimensions from function parameter shapes (see the sketch after this list)
- This is made possible by the vmapping convention (no einsums required!); some variable names are made more explicit where the maths naming would be unclear
- Just don't do it
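
As an illustration of the jaxtyping-plus-`vmap` convention, here is a minimal sketch (again hypothetical, not the repo's actual code; `rms_norm` and its argument names are assumptions): a function is written for a single `d_model`-sized vector, and batch/sequence axes are added back with `jax.vmap`:

```python
import jax
import jax.numpy as jnp
from jaxtyping import Array, Float


def rms_norm(
    x: Float[Array, "d_model"], weight: Float[Array, "d_model"]
) -> Float[Array, "d_model"]:
    # RMSNorm for a single vector: no batch or sequence axes in sight.
    # (Enforcing these shapes at runtime needs jaxtyping's @jaxtyped plus a checker such as beartype.)
    return x * weight / jnp.sqrt(jnp.mean(x**2) + 1e-5)


# Batch and sequence dimensions are reintroduced by vmapping, not by einsums or reshapes.
batched_rms_norm = jax.vmap(jax.vmap(rms_norm, in_axes=(0, None)), in_axes=(0, None))

x = jnp.ones((2, 8, 16))  # (batch, sequence, d_model)
weight = jnp.ones(16)  # (d_model,)
out = batched_rms_norm(x, weight)  # (2, 8, 16)
```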
To do:
- compare model training loss to the baseline and fix any issues
- implement training and optim (while keeping training parity with the baseline)
This project is licensed under the MIT License (see `LICENSE`). It includes components that are derived from work licensed under the Apache License, Version 2.0: the `dev` script, which is derived from https://github.com/graphcore-research/unit-scaling/blob/main/dev, and `typings/jax/`, which is derived from https://github.com/google/jax/tree/main/jax/.