GitHub - kolinko/effort: An implementation of bucketMul LLM inference

An example implementation of the bucketMul algorithm - you can read about it here.

With it you can smoothly adjust, in real time, the number of calculations performed during LLM inference.

At 50% effort, it performs as fast as regular matrix multiplications on Apple Silicon chips; at 25% effort, it is twice as fast while still retaining most of the quality.
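To make the "effort" knob concrete, here is a minimal Python sketch of the general idea: compute a matrix-vector product using only a chosen fraction of the individual multiplications. This is not the bucketMul algorithm itself (the real engine is written in Swift and Metal). The function name `approx_matvec` and the rule of keeping the largest-magnitude products are assumptions made for illustration. The sketch also computes every product in order to rank them, so it shows only the accuracy trade-off, not the speed-up.

```python
import math


def approx_matvec(weights, x, effort=1.0):
    """Approximate W @ x using only a fraction of the scalar products.

    weights: list of rows (one row per output element).
    x: input vector, same length as each row.
    effort: fraction (0.0 to 1.0) of the products to keep.

    Hypothetical illustration only: the largest-magnitude products are
    kept, and the rest are dropped from the sums.
    """
    if not 0.0 <= effort <= 1.0:
        raise ValueError("effort must be between 0.0 and 1.0")

    # Every scalar product w[i][j] * x[j], tagged with its output index i.
    products = [
        (w * x[j], i)
        for i, row in enumerate(weights)
        for j, w in enumerate(row)
    ]

    # Rank by magnitude and keep only the requested share.
    keep = math.ceil(effort * len(products))
    products.sort(key=lambda p: abs(p[0]), reverse=True)

    out = [0.0] * len(weights)
    for value, i in products[:keep]:
        out[i] += value
    return out


if __name__ == "__main__":
    W = [[1.0, 0.1], [0.2, 2.0]]
    v = [1.0, 1.0]
    for e in (1.0, 0.5, 0.25):
        print(e, approx_matvec(W, v, effort=e))
```

At full effort the result equals the exact product. Lowering the effort drops the smallest contributions first, which is why the output can stay close to the exact answer even when many multiplications are skipped.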

You also have the option to skip loading the least important weights.
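As a rough illustration of skipping weights at load time, the Python sketch below keeps only a fraction of each weight row and stores it sparsely. It assumes that "least important" means smallest in magnitude, which this README does not state. The helper names `load_top_weights` and `sparse_dot` are hypothetical and are not part of the engine.

```python
import math


def load_top_weights(row, keep_fraction):
    """Keep the largest-magnitude weights of one row as {index: weight}.

    Assumption for illustration: importance is judged by absolute value.
    """
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be between 0.0 and 1.0")
    keep = math.ceil(keep_fraction * len(row))
    # Indices ordered from most to least important weight.
    ranked = sorted(range(len(row)), key=lambda j: abs(row[j]), reverse=True)
    return {j: row[j] for j in sorted(ranked[:keep])}


def sparse_dot(sparse_row, x):
    """Dot product that touches only the weights that were loaded."""
    return sum(w * x[j] for j, w in sparse_row.items())


if __name__ == "__main__":
    row = [0.5, -3.0, 0.01, 2.0]
    kept = load_top_weights(row, keep_fraction=0.5)
    print(kept)
    print(sparse_dot(kept, [1.0, 1.0, 1.0, 1.0]))
```

Dropping weights this way reduces memory use, but it also caps the accuracy that can be reached later, because the skipped weights are no longer available at any effort level.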

Getting Started

Binaries

You can quickly get started by downloading the precompiled binaries available at: Effort Engine v0.0.1

To bypass macOS Gatekeeper, hold Option while clicking to open the downloaded application for the first time.

Initial Setup

On the first run, you will be prompted to download the converted weights. A matrix multiplication benchmark then runs to demonstrate the engine's capabilities.

Source Code

The sources are in Swift & Metal.

Download and open effort.xcodeproj. It should work straight away.

Additional Resources

More Information: Visit our project page.
See it in Action: Watch a demo on Asciinema.

Updates

A ton of things to fix, looking for collaborators! :)

Folders and files (331 commits)

benchmarks
docs
effort.xcodeproj
garbage
helpers
.gitignore
LICENSE
README.md
aux.metal
bucketMul.metal
bucketMul.swift
bucketMulQ4.metal
bucketMulQ4.swift
convert.metal
convert.swift
default.metallib
effort
expertMul.swift
info.plist
loader.swift
main.swift
matrix.metal
model.swift
playground.swift
q4_convert.py
q4_draft.py
runNetwork.swift
swift-tokeniser.json
test.swift
tokenizer.json
