Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
FLASK		FLASK
FastChat		FastChat
alpaca_eval		alpaca_eval
outputs		outputs
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
bot.py		bot.py
eval_mt_bench.py		eval_mt_bench.py
generate_for_alpaca_eval.py		generate_for_alpaca_eval.py
generate_for_flask.py		generate_for_flask.py
generate_for_mt_bench.py		generate_for_mt_bench.py
requirements.txt		requirements.txt
run_eval.sh		run_eval.sh
run_eval_flask.sh		run_eval_flask.sh
run_eval_mt_bench.sh		run_eval_mt_bench.sh
run_generation.sh		run_generation.sh
show_mt_bench_result.py		show_mt_bench_result.py
tests.py		tests.py
utils.py		utils.py

Repository files navigation

Mixture-of-Agents

Introduction

Mixture of Agents (MoA) is a novel approach that leverages the collective strengths of multiple LLMs to enhance performance, achieving state-of-the-art results. By employing a layered architecture where each layer comprises several LLM agents, MoA significantly outperforms GPT-4 Omni’s 57.5% on AlpacaEval 2.0 with a score of 65.1%, using only open-source models!

Interactive Demo

The interactive demo showcases a simple multi-turn chatbot where the response is aggregated from various reference models.

Setup

Export Your API Key:

Ensure you have your Together API key and export it as an environment variable:
```
export TOGETHER_API_KEY={your_key}
```
Install Requirements:
```
pip install -r requirements.txt
```

Running the Demo

To run the interactive demo, execute the following script with Python:

python bot.py

The script will prompt you to input instructions interactively. Here's how to use it:

Start by entering your instruction at the ">>>" prompt.
The system will process your input using the predefined reference models.
It will generate a response based on the aggregated outputs from these models.
You can continue the conversation by inputting more instructions, with the system maintaining the context of the multi-turn interaction.
enter exit to exit the chatbot.

Configuration

You can configure the demo by specifying the following parameters:

--aggregator: The primary model used for final response generation.
--reference_models: List of models used as references.
--temperature: Controls the randomness of the response generation.
--max_tokens: Maximum number of tokens in the response.
--rounds: Number of rounds to process the input for refinement. (num rounds == num of MoA layers - 1)
--num_proc: Number of processes to run in parallel for faster execution.
--multi_turn: Boolean to toggle multi-turn interaction capability.

Evaluation Benchmarks

We provide scripts to quickly recreate some of the results presented in our paper For convinence, we have included the code from AlpacaEval, MT-Bench, and FLASK, with necessary modifications. We extend our gratitude to these projects for creating the benchmarks.

Preparation:

# install requirements
pip install -r requirements.txt
cd alpaca_eval
pip install -e .
cd FastChat
pip install -e ".[model_worker,llm_judge]"
cd ..

# setup api keys
export TOGETHER_API_KEY=<TOGETHER_API_KEY>
export OPENAI_API_KEY=<OPENAI_API_KEY>

Run AlpacaEval 2

To run AlpacaEval 2, execute the following scripts:

bash run_generation.sh
bash run_eval.sh

Run MT-Bench

For a minimal example of MT-Bench evaluation, run:

bash run_eval_mt_bench.sh

Run FLASK

For a minimal example of FLASK evaluation, run:

bash run_eval_flask.sh

Please feel free to contact us if you have difficulties in reproducing the results.

Credits

Notably, this work was made possible by the collaborative spirit and contributions of active organizations in the AI field. We appreciate the efforts of Meta AI, Mistral AI, Microsoft, Alibaba Cloud, and DataBricks for developing the Llama 3, Mixtral, WizardLM 2, Qwen 1.5, and DBRX models. Additionally, we extend our gratitude to Tatsu Labs, LMSYS, and KAIST AI for developing the AlpacaEval, MT-Bench, and FLASK evaluation benchmarks.

License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.

Citation

If you find this work helpful, please consider citing:

@article{wang2024mixture,
  title={Mixture-of-Agents Enhances Large Language Model Capabilities},
  author={Wang, Junlin and Wang, Jue and Athiwaratkun, Ben and Zhang, Ce and Zou, James},
  journal={arXiv preprint arXiv:2406.04692},
  year={2024}
}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Mixture-of-Agents

Introduction

Interactive Demo

Setup

Running the Demo

Configuration

Evaluation Benchmarks

Preparation:

Run AlpacaEval 2

Run MT-Bench

Run FLASK

Credits

License

Citation

About

Releases

Packages

Languages

License

jiuu8888/MoA

Folders and files

Latest commit

History

Repository files navigation

Mixture-of-Agents

Introduction

Interactive Demo

Setup

Running the Demo

Configuration

Evaluation Benchmarks

Preparation:

Run AlpacaEval 2

Run MT-Bench

Run FLASK

Credits

License

Citation

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages