Commit

Update README.md
zer0o0ne authored Jun 18, 2024
1 parent cb94e47 commit 4fefd3c
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions README.md
@@ -5,11 +5,12 @@ AriGraph serves as the external memory architecture for large language models (L
![**Ariadne agent and his results**](img/Architecture.png?raw=True)

## Performance
-We implement five TextWorld environments for three different tasks: Treasure Hunt, Cleaning and Cooking. The first task involves navigating a maze and searching for treasure, the second entails tidying up a house by placing items in their designated spots, and the third focuses on gathering ingredients and preparing a meal. Each tested LLM agent had an identical decision-making module, and the agents differed from each other only in the implementation of memory. There is a mean normalized game scores in the following table:
+We implement five TextWorld environments for three different tasks: Treasure Hunt, Cleaning and Cooking. The first task involves navigating a maze and searching for treasure, the second entails tidying up a house by placing items in their designated spots, and the third focuses on gathering ingredients and preparing a meal. Each tested LLM agent had an identical decision-making module, and the agents differed from each other only in the implementation of memory. We reported human scores averaged across both all runs and top-3 performance runs. There is a mean normalized game scores in the following table:
Type of memory | Treasure Hunt | Cleaning | Cooking | Treasure Hunt Hard | Cooking Hard
-- | -- | -- | -- | -- | --
AriGraph (ours) | 1.0 | 0.79 | 1.0 | 1.0 | 1.0
-Human Players | 1.0 | 0.85 | 1.0 | - | -
+Human Players Top-3 | 1.0 | 0.85 | 1.0 | - | -
+Human Players All | 0.96 | 0.59 | 0.32 | - | -
Full History | 0.49 | 0.05 | 0.18 | - | -
Summary | 0.33 | 0.39 | 0.52 | 0.17 | 0.21
RAG | 0.33 | 0.35 | 0.36 | 0.17 | 0.17
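
The added sentence distinguishes human scores averaged over all runs from scores averaged over the top-3 runs. The sketch below shows one way such figures could be computed. It is an illustration only: the README does not define the normalization, so dividing by the maximum attainable score is an assumption, and the function names and the sample scores are hypothetical rather than taken from the AriGraph code or results.

```python
from statistics import mean


def normalize(score: float, max_score: float) -> float:
    """Scale a raw game score to [0, 1] (assumed: divide by the maximum attainable score)."""
    if max_score <= 0:
        raise ValueError("max_score must be positive")
    return score / max_score


def mean_all(scores: list[float]) -> float:
    """Average over every run."""
    return mean(scores)


def mean_top_k(scores: list[float], k: int = 3) -> float:
    """Average over the k best runs only."""
    best = sorted(scores, reverse=True)[:k]
    return mean(best)


# Hypothetical raw scores for five runs of one task with a maximum score of 5.
raw_runs = [2, 5, 5, 1, 4]
max_score = 5

normalized = [normalize(s, max_score) for s in raw_runs]
all_avg = mean_all(normalized)            # average across all runs
top3_avg = mean_top_k(normalized, k=3)    # average across the three best runs

print(f"all runs: {all_avg:.2f}, top-3: {top3_avg:.2f}")
```

With made-up inputs like these, the top-3 average is higher than the all-runs average, which is the same kind of gap the two human rows in the table show.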