A port of llama2.c for the ESP32-S3 microcontroller. This project implements a lightweight version of the Llama 2 architecture optimized for embedded systems.
This project aims to create a very simple AI Dreaming Machine. The only thing this code does is generate little AI dreams. But it does it in only 512 KB of memory on a very underpowered microcontroller.
Developed by Massimo Di Leo (NuvolaProject), starting from the wonderful work of A. Karpathy and D. Bennet.
There are some minor improvements over Bennet's implementation of llama2.c on ESP32. I noticed that the original project generated more or less the same story every time, so I tweaked the code to add a bit more randomness to the seed generation. I also changed the model from TinyStories to a custom-trained version called aidreams260K. This model was trained on a dataset of 2000 AI-generated dreams. These dreams were created with llama3-8b, using custom prompts in order to get properly structured AI dreams, not human dreams.
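As a rough sketch of that reseeding idea (not the exact code in this repo), the ESP32's hardware RNG can seed a llama2.c-style xorshift sampler on every boot; `esp_random()` is a real ESP-IDF call, while `rng_state` and `seed_sampler` are illustrative names:

```c
#include <stdint.h>
#include "esp_system.h"  // esp_random(); on ESP-IDF v5.x it lives in "esp_random.h"

// 64-bit state for a llama2.c-style xorshift sampler.
static uint64_t rng_state;

static void seed_sampler(void) {
    // Two draws from the hardware RNG give a fresh 64-bit seed on
    // every boot, instead of a fixed compile-time constant.
    rng_state = ((uint64_t)esp_random() << 32) | esp_random();
    if (rng_state == 0) rng_state = 1;  // xorshift state must be nonzero
}
```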
- Runs on ESP32-S3 with minimal resources
- Custom vocabulary size of 512 tokens
- Optimized model architecture for embedded systems (mapped onto the `Config` struct sketched after this list):
  - Dimension: 64
  - Layers: 4
  - Heads: 4
  - KV Heads: 4
  - Max Sequence Length: 128
  - Multiple of: 4
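For reference, these hyperparameters correspond to the `Config` struct that llama2.c-style checkpoints carry in their header. This is the upstream llama2.c layout, not necessarily verbatim from `src/llm.h`:

```c
// Transformer hyperparameters, in llama2.c checkpoint-header order.
typedef struct {
    int dim;        // transformer dimension (64)
    int hidden_dim; // FFN hidden dimension (derived from dim and multiple_of)
    int n_layers;   // number of layers (4)
    int n_heads;    // number of query heads (4)
    int n_kv_heads; // number of key/value heads (4)
    int vocab_size; // vocabulary size (512)
    int seq_len;    // max sequence length (128)
} Config;
```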
- ESP32-S3 development board
- Minimum 2MB PSRAM
- Minimum 4MB Flash
- ESP-IDF v4.4 or later
- Python 3.7 or later (for training and tokenizer)
- Clone this repository:

```bash
git clone https://github.com/mc9625/llama2-esp32.git
cd llama2-esp32
```

- Set up the ESP-IDF environment:

```bash
. $HOME/esp/esp-idf/export.sh
```

- Configure the project:

```bash
idf.py set-target esp32s3
idf.py menuconfig
```

- Build and flash:

```bash
idf.py build
idf.py -p /dev/ttyUSB0 flash
```
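To watch the generated dreams over serial, the standard ESP-IDF monitor works (adjust the port to match your board):

```bash
idf.py -p /dev/ttyUSB0 monitor
```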
The current model uses these parameters:

```
--vocab_source=custom
--vocab_size=512
--dim=64
--n_layers=4
--n_heads=4
--n_kv_heads=4
--multiple_of=4
--max_seq_len=128
--batch_size=128
```
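Assuming the upstream llama2.c `train.py` (whose configurator accepts these flags as `--key=value` overrides), a full training invocation would look like:

```bash
python train.py \
  --vocab_source=custom --vocab_size=512 \
  --dim=64 --n_layers=4 --n_heads=4 --n_kv_heads=4 \
  --multiple_of=4 --max_seq_len=128 --batch_size=128
```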
- `src/llm.c` - Main LLM implementation
- `src/llm.h` - Header file with data structures and function declarations
- `src/main.c` - ESP32 application entry point
- `components/` - External components and dependencies
Memory usage:
- Flash: ~X MB for model weights
- PSRAM: ~Y KB for runtime buffers
- RAM: ~Z KB for stack and heap
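As an illustrative sketch (buffer names and sizes here are assumptions, not this repo's actual code), large runtime buffers can be pushed into external PSRAM with ESP-IDF's capability-aware allocator so internal RAM stays free for the stack and FreeRTOS:

```c
#include "esp_heap_caps.h"

// Hypothetical key-cache size for the model above:
// n_layers * seq_len * kv_dim = 4 * 128 * 64 floats (128 KB).
#define KV_CACHE_FLOATS (4 * 128 * 64)

float *alloc_key_cache(void) {
    // Prefer external PSRAM for the large cache.
    float *key_cache = heap_caps_malloc(KV_CACHE_FLOATS * sizeof(float),
                                        MALLOC_CAP_SPIRAM);
    if (key_cache == NULL) {
        // Fall back to any byte-addressable heap if PSRAM is exhausted.
        key_cache = heap_caps_malloc(KV_CACHE_FLOATS * sizeof(float),
                                     MALLOC_CAP_8BIT);
    }
    return key_cache;
}
```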
Current performance metrics:
- Inference speed: ~17 tokens/second
- Memory efficiency: optimized data structures, with inference running in FreeRTOS tasks
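As a minimal sketch of that task layout (the actual code in `src/main.c` may differ, and `generate_dream` is a hypothetical stand-in for the repo's generation loop), inference can run in its own FreeRTOS task with a generous stack, pinned to one core so system tasks stay on the other:

```c
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"

static void inference_task(void *arg) {
    for (;;) {
        // generate_dream();  // hypothetical entry point into src/llm.c
        vTaskDelay(pdMS_TO_TICKS(1000));  // pause between dreams
    }
}

void start_inference(void) {
    // 16 KB stack, low priority, pinned to core 1.
    xTaskCreatePinnedToCore(inference_task, "llm", 16384, NULL,
                            tskIDLE_PRIORITY + 1, NULL, 1);
}
```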
This project is licensed under the MIT License - see the LICENSE file for details.
Contributions are welcome! Please feel free to submit a Pull Request.