
dmater01/tokenizer-benchmark

 
 


Introduction

## Running the benchmark

The following command runs the tokenizer benchmark against several models:

python benchmark.py --file dataset.json --models mistralai/Mistral-7B-v0.1 gpt-4 google/gemma-7b
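The internals of `benchmark.py` are not shown here, so the following is only a minimal sketch of the kind of comparison such a benchmark could make: count how many tokens each tokenizer produces for the same text. The function names `count_tokens` and `chars_per_token` and the two stand-in tokenizers are hypothetical and chosen so the example runs without downloading any model. To try real tokenizers, one could pass something like `tiktoken.encoding_for_model("gpt-4").encode` or `AutoTokenizer.from_pretrained("google/gemma-7b").encode` from `transformers` in their place, assuming those packages are installed and you have access to the models.

```python
from typing import Callable, Dict, List

# A tokenizer here is any callable that maps a string to a list of tokens.
Tokenizer = Callable[[str], List]


def count_tokens(text: str, tokenizers: Dict[str, Tokenizer]) -> Dict[str, int]:
    """Return the number of tokens each tokenizer produces for `text`."""
    return {name: len(tokenize(text)) for name, tokenize in tokenizers.items()}


def chars_per_token(text: str, counts: Dict[str, int]) -> Dict[str, float]:
    """Average characters per token; higher means a more compact encoding."""
    return {name: (len(text) / n if n else 0.0) for name, n in counts.items()}


# Hypothetical stand-ins, NOT the tokenizers used by this repository.
stand_ins: Dict[str, Tokenizer] = {
    "whitespace": str.split,  # one token per whitespace-separated word
    "characters": list,       # one token per character
}

sample = '10 PRINT "GUESS A NUMBER"'
counts = count_tokens(sample, stand_ins)
ratios = chars_per_token(sample, counts)

# Print the results, most compact tokenizer first.
for name, n in sorted(counts.items(), key=lambda item: item[1]):
    print(f"{name:12s} {n:4d} tokens  {ratios[name]:.2f} chars/token")
```

Keeping the tokenizer as a plain callable means the same comparison code works for any library, since only the entries of the dictionary change.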

## Visualizer

python visualizer.py --file ./samples/Programming/BASIC/guess.bas --model google/gemma-7b

or, to compare several models on the same file:

python visualizer2.py --file ./samples/Programming/BASIC/guess.bas --models mistralai/Mistral-7B-v0.1 gpt-4 google/gemma-7b
python visualizer2.py --file ./samples/Text/cities.txt --models mistralai/Mistral-7B-v0.1 gpt-4 google/gemma-7b --ignore-numbers
