
Dynasor🦖: Efficiently Serving LLM Reasoning Programs with Certaindex

A simple extension on top of vLLM that helps you speed up reasoning models without any training.


Try our 🤖 Demo!


Quick Start

Use Dynasor:

# Install Dynasor
git clone https://github.com/hao-ai-lab/Dynasor.git 
cd Dynasor && pip install . && cd -

# (Optional) Install and setup vllm endpoint
pip install vllm
vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-7B -tp 1 --enable-prefix-caching

# Start Dynasor Chat with an endpoint
dynasor-chat --base-url http://localhost:8000/v1

What is Dynasor?

Dynasor is a tool that speeds up LLM reasoning models without training or finetuning. It combines several techniques to improve the prompt, execute it dynamically, and stop generation as soon as the LLM has enough information to commit to an answer.
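The core idea can be sketched in a few lines of Python. The snippet below is a minimal illustration of certainty-based early exit, not Dynasor's actual implementation: it assumes a hypothetical generate(prompt, max_new_tokens) callable and a hypothetical PROBE suffix, and stops once several consecutive probe answers agree (a crude stand-in for the Certaindex metric from the paper).

from typing import Callable

# Hypothetical probe suffix that nudges the model to commit to an answer.
PROBE = "\n... Final answer:"

def reason_with_early_exit(
    generate: Callable[[str, int], str],  # (prompt, max_new_tokens) -> completion
    prompt: str,
    chunk_tokens: int = 256,
    window: int = 3,
    max_chunks: int = 32,
) -> str:
    """Decode in chunks, probe for an answer after each chunk, and stop
    once the last `window` probe answers agree."""
    text, answers = prompt, []
    for _ in range(max_chunks):
        text += generate(text, chunk_tokens)                # keep reasoning
        answers.append(generate(text + PROBE, 16).strip())  # cheap probe
        if len(answers) >= window and len(set(answers[-window:])) == 1:
            break  # probe answers stabilized: treat the model as certain
    return answers[-1]

Because each probe shares a long prefix with the ongoing generation, this pattern is cheap only when the serving engine reuses that prefix, which is why the sections below recommend --enable-prefix-caching.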

Installation

Install from source

git clone https://github.com/hao-ai-lab/Dynasor.git
cd Dynasor && pip install . && cd -

How to use Dynasor

We provide three tools to launch Dynasor:

  1. dynasor-chat: a CLI chat interface for interacting with Dynasor
  2. dynasor-openai: an OpenAI-compatible proxy server
  3. dynasor-vllm: a vLLM-native server

dynasor-chat: CLI Chat Interface

Warning

We recommend enabling prefix caching, otherwise probing will be very slow.

  1. Set up a vLLM server:
vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-7B -tp 1 --enable-prefix-caching
  2. Open Dynasor Chat in the command line:
dynasor-chat

dynasor-openai: OpenAI-Compatible Server

  1. Set up a vLLM server:
vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-7B -tp 1 --enable-prefix-caching
  2. Set up the OpenAI-compatible proxy server to serve Dynasor:
dynasor-openai
  3. Use our simple client script to query it (or call the proxy directly, as shown below):
# Sample Dynasor client script to ask some questions
python examples/client.py --prompt "2+2=?"
python examples/client.py --prompt "Solve x^2 + 4x = 4"
python examples/client.py --prompt "How many nonzero points are there on x^3y + y^3z + z^3x = 0 over the finite field 𝔽_{5^{18}} up to scaling?"

dynasor-vllm: vLLM-native Server

dynasor-vllm builds Dynasor directly into vLLM as part of the vLLM OpenAI-compatible server endpoint.

  1. Set up a dynasor-vllm server:
dynasor-vllm --model deepseek-ai/DeepSeek-R1-Distill-Qwen-7B --enable-prefix-caching
  2. Use our simple client script to query it:
python examples/client-vllm.py
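Because dynasor-vllm exposes vLLM's OpenAI-compatible endpoint, standard OpenAI clients work here too. A minimal streaming sketch, assuming the server keeps vLLM's default port 8000 (adjust if you started it elsewhere):

from openai import OpenAI

# Port 8000 is vLLM's default; change it if dynasor-vllm runs elsewhere.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
stream = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",
    messages=[{"role": "user", "content": "Solve x^2 + 4x = 4"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)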

Benchmark

Token Deprivation and Applying Dynasor

To run the token deprivation experiment on the math500 dataset, first launch a vLLM server, then run the following command. Note that the current run.py script processes only 10 questions; to obtain complete results, adjust the --start and --end parameters to change the problem IDs and solve all problems in parallel (see the sketch after the command below).

bash benchmark/TokenDeprivation/run.sh
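One way to cover the full dataset is to launch several run.py processes with disjoint --start/--end ranges. A rough sketch, assuming run.py needs no other required arguments (it likely does; copy the model/endpoint flags from run.sh or check python benchmark/TokenDeprivation/run.py --help):

import subprocess

# Shard 500 problems into 5 disjoint ranges and solve them in parallel.
# --start/--end come from the description above; any other required flags
# (model, endpoint, output dir) must be copied from run.sh.
procs = [
    subprocess.Popen([
        "python", "benchmark/TokenDeprivation/run.py",
        "--start", str(start), "--end", str(start + 100),
    ])
    for start in range(0, 500, 100)
]
for p in procs:
    p.wait()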

Results Visualization

Run benchmark/TokenDeprivation/post_process.ipynb to visualize the results.

Citation

If you use Dynasor for your research, please cite our paper:

@article{fu2024efficiently,
  title={Efficiently Serving LLM Reasoning Programs with Certaindex},
  author={Fu, Yichao and Chen, Junda and Zhu, Siqi and Fu, Zheyu and Dai, Zhongdongming and Qiao, Aurick and Zhang, Hao},
  journal={arXiv preprint arXiv:2412.20993},
  year={2024}
}