A framework for building multi-agent data workflows with LLMs with interactive user feedback.
I like to think (and the sooner the better!) of a cybernetic meadow where mammals and computers live together in mutually programming harmony like pure water touching clear sky.
--Richard Brautigan
Check out our blog post announcing Meadow.
If you don't have poetry
installed, follow instructions from here. We recommend
curl -sSL https://install.python-poetry.org | python3 -
Then install Meadow with
cd meadow
make install
If you want to load our sample data (required for the benchmark) in examples/data
, you will need sqlite3
. To load the clean data, run
cd meadow/examples/data
mkdir -p database_sqlite/sales_example
sqlite3 database_sqlite/sales_example/sales_example.sqlite < sales.sql
And to load the ambiguous data (also needed for benchmark), run
cd meadow/examples/data
mkdir -p database_sqlite/sales_ambiguous_joins_example
sqlite3 database_sqlite/sales_ambiguous_joins_example/sales_ambiguous_joins_example.sqlite < sales_ambiguous_joins.sql
Meadow is a plan-then-execute framework where a planner generates a sequence of sub-tasks for a collection of agents to solve based on the user's high level goal. Each each point in the plan, Meadow will return the current sub-task output to the user for feedback. The user can enter one of three things:
- A text response of feedback for what to change.
- '<next>' string which indicates the user is happy with the output and ready to move on.
- '<end>' string which indicates the user wants to terminate the conversation.
Meadow supports auto-advance which means the feedback is always '<next>'.
To get started with a simple, interactive use case using text-to-SQL on a sqlite3 database, we have examples/demo.py
to run. This use case has three main task agents: a attribute detector, a schema cleaner, and a text-to-SQL agent. Note, you will need an OpenAI API key or have it set as an environment variable.
poetry run python examples/demo.py \
--api-key API_KEY \
--db-path examples/data/database_sqlite/sales_example/sales_example.sqlite
If you want to auto-advance instead of giving feedback, add the flag --auto-advance
.
To repo results from our custom 50 question text-to-SQL benchmark, run
poetry run experiments/text2sql/predict.py predict \
examples/data/custom_text2sql_benchmark.json \
examples/data/sales_example_schema.json \
examples/data/database_sqlite \
--output-dir experiments/text2sql/predictions \
--client-cache-path test_custom_cache.duckdb \
--use-table-schema \
--api-provider openai \
--planner-api-provider openai \
--api-key <OPEN_AI_API_KEY> \
--planner-api-key <OPEN_AI_API_KEY> \
--model gpt-4o \
--planner-model gpt-4o \
--num-run 50 \
--num-print 0 \
--add-reask \
--add-empty-table \
--add-attribute-selector \
--add-schema-cleaner \
--async-batch-size 20
make sure to setup the sqlite database in the instructions above (you will need both databases setup).
You can evaluate the results by first running
cd experiments/text2sql/
mkdir metrics
cd metrics
git clone [email protected]:ElementAI/test-suite-sql-eval.git test_suite_sql_eval
cd ../../..
to download spider evaluation code.
Then running
poetry run python3 experiments/text2sql/evaluate.py evaluate \
--gold examples/data/custom_text2sql_benchmark.json \
--db examples/data/database_sqlite \
--tables examples/data/sales_example_schema.json \
--output-dir experiments/text2sql/predictions \
--client-cache-path test_custom_cache.duckdb \
--api-key <OPEN_AI_API_KEY> \
--model gpt-4o \
--pred <PREDICTION_FILE_FROM_ABOVE>
The same script can be used to run Spider V1 eval numbers. V2 is coming soon!
To see an example of how you can build your own data agents, see new_agent.ipynb. This notebook will walk you through loading up the existing text-to-SQL pipeline, adding an agent that explains the response, and asking SQL questions.
Meadow is a framework for building multi-agent workflows to solve data tasks with LLMs. Customizable agents all share access to a data layer and converse to solve a task with optional user feedback along the way. Meadow provides:
- A shared data layer to allow agents to solve data tasks and communicate updates to downstream tasks.
- Specialized executor agents that allow for iterative debugging against the data.
- Data-aware planners that break down complex data tasks into sub-tasks based on the available agents.
- A user agent that can give feedback and control the flow of the planned exection with a simple set of keyword commands.
Meadow is a plan-then-execute framework where a planner generates a sequence of sub-tasks for a collection of agents to solve based on the user's high level goal. At any point in time, there are two agents communicating to complete a sub-task. During this two agent communication, one agent is designated the Supervisor
and is the one that initiates the task and provides feedback. The other agent is the Task Handler
that performs the task. Very often, the user is the supervisor and an LLM-backed agent is the task handler.
The core abstraction of Meadow is the Agent
that must implement three methods: to send a message to another agent, to receive a message from another agent, and to generate a reply. An agent does not need to be based on a LLM, e.g. a user agent, but it often is.
An agent can be extended in three possible ways:
PlannerAgent
: The agent must have a collection of agents it can coordinate between and must be able to move through a plan linearly. Planner agents can be hierarchical in that one planner agent can have another planner agent as one of its available agents to coordinate between.ExecutorAgent
: The agent must have an execution function that parses a message from another agent. The role of the executor is to catch bugs and throw back error messages upon mistakes.AgentWithExecutors
: This is an agent that has an associated executor agent that must execute on this agent's output. When an agent has an executor, a sub-chat is initiated between the agent and its executor so the executor and agent can debug if desired.
Agents can be extended in multiple ways, e.g., an agent can be both an executor agent and have its own executors.
There is one special non-LLM agent in Meadow called the Controller
. The controller is determinisitc and has three main functions:
- It is the mediator between any conversation between two agents. It sends and receives messages and stores the message coversation history for ease of access and observability.
- It implements and understands the
Commands
sent by the supervisor to move through a plan. - It can start new chats between two agents. E.g. it initiates the chat between an agent and its executor as a "branch" off of the main chat. Once the sub-chat terminates, the main chat can continue.
All agents have access to a Database
that is a view over the user's data. The database allows agents to retrieve information via SQL queries as well as create views over the data. Each view that is created is shared to other agents solving the task.
We support backend databases connected to SQLite or DuckDB.
We support a simple LLM wrapper client that supports sending a chat requiest to various LLMs. The client integrates with a caching layer for faster and cheaper reruns of the same inputs. We currently support OpenAI and Anthropic.
We'd like to thank the follow packages for inspiring and helping Meadow.
For any questions or comments, please leave an Issue or email [email protected]
.
To cite
@misc{orr2024meadow,
author = {Orr, Laurel and Chami, Ines},
title = {Meadow},
year = {2024},
publisher = {GitHub},
howpublished = {\url{https://github.com/NumbersStationAI/meadow}},
}