Paper: https://arxiv.org/abs/2406.12045
- Clone this repository:
git clone https://github.com/sierra-research/tau-bench && cd ./tau-bench
- Install from source (which also installs required packages):
pip install -e .
- Set up your OpenAI / Anthropic / Google / Mistral / AnyScale API keys as environment variables.
OPENAI_API_KEY=...
ANTHROPIC_API_KEY=...
GEMINI_API_KEY=...
MISTRAL_API_KEY=...
ANYSCALE_API_KEY=...
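Before launching a run, it can help to confirm which of these variables are visible to Python. The following is a minimal sketch, not part of the repository: the helper name `missing_api_keys` is hypothetical, and the variable names are copied from the list above. You only need the key for the provider whose model you plan to use.

```python
import os

# Variable names copied from the setup list above.
API_KEY_VARS = [
    "OPENAI_API_KEY",
    "ANTHROPIC_API_KEY",
    "GEMINI_API_KEY",
    "MISTRAL_API_KEY",
    "ANYSCALE_API_KEY",
]


def missing_api_keys(environ=None):
    """Return the names in API_KEY_VARS that are unset or empty.

    `environ` defaults to os.environ; pass a dict to check other mappings.
    This helper is hypothetical and only illustrates the check.
    """
    env = os.environ if environ is None else environ
    return [name for name in API_KEY_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_api_keys()
    if missing:
        print("Not set:", ", ".join(missing))
    else:
        print("All listed API keys are set.")
```

A missing key is only a problem if the model you select belongs to that provider.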
Run a function-calling agent on the τ-retail environment:
python run.py --env retail --model gpt-4o --max_concurrency 10
Set --max_concurrency according to your API rate limit.
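To illustrate what a concurrency cap does in general, here is a minimal sketch that runs independent tasks with a bounded thread pool. It is not the repository's implementation; `run_with_cap` and `make_task` are hypothetical names, and the sleep stands in for an API call.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_with_cap(tasks, max_concurrency):
    """Run zero-argument callables with at most `max_concurrency` in flight.

    Results are returned in the same order as `tasks`.
    """
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]


def make_task(index, state, lock):
    """Build a fake task that records how many tasks run at the same time."""
    def task():
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(0.01)  # stand-in for a network request
        with lock:
            state["active"] -= 1
        return index
    return task


if __name__ == "__main__":
    state = {"active": 0, "peak": 0}
    lock = threading.Lock()
    tasks = [make_task(i, state, lock) for i in range(8)]
    results = run_with_cap(tasks, max_concurrency=3)
    print("results:", results)
    print("peak concurrent tasks:", state["peak"])
```

A lower cap means fewer simultaneous requests, which reduces the chance of hitting a provider's rate limit at the cost of a longer total run time.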