We provide several profiling tools to benchmark our models.
Download the dataset below, or create your own dataset in the same format.
```shell
wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json
```
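If you build your own dataset, it should follow the same ShareGPT JSON layout as the file above. A quick way to peek at the first record is sketched below; it assumes `jq` is installed, and the field names shown come from the ShareGPT release rather than from the profiling scripts, so verify them against your own data.

```shell
# Inspect the first record to see the expected layout (requires jq).
# ShareGPT records typically hold an "id" and a "conversations" list of
# {"from": "human" | "gpt", "value": "..."} turns.
jq '.[0] | {id, conversations: .conversations[:2]}' ShareGPT_V3_unfiltered_cleaned_split.json
```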
Profile your model with `profile_throughput.py`:
```shell
python profile_throughput.py \
    ShareGPT_V3_unfiltered_cleaned_split.json \
    /path/to/your/model \
    --concurrency 64
```
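Throughput usually saturates as concurrency grows, so it can be worth repeating the run at several concurrency levels. A minimal sketch reusing the arguments above (the values 16/32/64/128 are illustrative, not recommendations):

```shell
# Sweep a few concurrency levels and keep each run's console output for comparison.
for c in 16 32 64 128; do
    python profile_throughput.py \
        ShareGPT_V3_unfiltered_cleaned_split.json \
        /path/to/your/model \
        --concurrency ${c} | tee "throughput_c${c}.log"
done
```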
`profile_generation.py` performs the benchmark with dummy data:
```shell
python profile_generation.py \
    /path/to/your/model \
    --concurrency 8 --input_seqlen 0 --output_seqlen 2048
```
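Because the data is synthetic, it is easy to see how generation speed changes with the output length. A small sketch using the same flags as above (the lengths are arbitrary examples):

```shell
# Benchmark several output lengths with the same dummy input and concurrency.
for out_len in 512 1024 2048; do
    python profile_generation.py \
        /path/to/your/model \
        --concurrency 8 --input_seqlen 0 --output_seqlen ${out_len} | tee "generation_${out_len}.log"
done
```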
The tools above profile models through the Python API, while `profile_serving.py` benchmarks the serving endpoint:
```shell
wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json

python profile_serving.py \
    ${TritonServerAddress} \
    /path/to/tokenizer \
    ShareGPT_V3_unfiltered_cleaned_split.json \
    --concurrency 64
```
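`${TritonServerAddress}` is the `host:port` address of your running inference server; it is not set by the script itself. A sketch of how it might be defined before running the command above, assuming a server listening on `0.0.0.0:33337` (replace the address and port with those of your own deployment):

```shell
# Example only: the address/port of your running inference server; adjust to your deployment.
export TritonServerAddress=0.0.0.0:33337
```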