From d56a847b7de2561e48055a81f242a751db8c63fe Mon Sep 17 00:00:00 2001 From: Maxime Chevalier-Boisvert Date: Wed, 3 Oct 2018 17:02:52 -0400 Subject: [PATCH] Split out levels documentation into separate markdown files. Documentation not yet complete for ICLR19 levels. --- README.md | 267 ++---------------------------------------- docs/bonus_levels.md | 258 ++++++++++++++++++++++++++++++++++++++++ docs/iclr19_levels.md | 49 ++++++++ 3 files changed, 314 insertions(+), 260 deletions(-) create mode 100644 docs/bonus_levels.md create mode 100644 docs/iclr19_levels.md diff --git a/README.md b/README.md index 6148f418..7a87dcf5 100644 --- a/README.md +++ b/README.md @@ -84,6 +84,13 @@ If you connect to the lab machines by ssh-ing, make sure to use `ssh -X` in orde The code does not work in conda, install everything with `pip install --user`. +### The levels + +Documentation for the ICLR19 levels can be found in +[docs/iclr19_levels.md](docs/iclr19_levels.md). +There are also older levels documented in +[docs/bonus_levels.md](docs/bonus_levels.md). + ### Troubleshooting If you run into error messages relating to OpenAI gym or PyQT, it may be that the version of those libraries that you have installed is incompatible. You can try upgrading specific libraries with pip3, eg: `pip3 install --upgrade gym`. If the problem persists, please [open an issue](https://github.com/maximecb/baby-ai-game/issues) on this repository and paste a *complete* error message, along with some information about your platform (are you running Windows, Mac, Linux? Are you running this on a Mila machine?). @@ -114,263 +121,3 @@ gestures in combination with language may be key. 
You can find here a presentation of the project: [Baby AI Summary](https://docs.google.com/document/d/1WXY0HLHizxuZl0GMGY0j3FEqLaK1oX-66v-4PyZIvdU) A work-in-progress review of related work can be found [here](https://www.overleaf.com/13480997qqsxybgstxhg#/52042269/) - -## The levels - -In naming the levels we adhere to the following convention: -- `N2`, `N3`, `N4` refers to the number of objects in the room/environment -- `S2`, `S3`, `S4` refers to the size of the room/environment -- in `Debug` levels the episode is terminated once the agent does something unnecessary or fatally bad, for example - - picks up an object which it is not supposed to pick up (unnecessary) - - open the door that it is supposed to open _after_ another one (fatal) -- in `Carrying` levels the agent starts carrying the object of interest -- in `Dist` levels distractor objects are placed to confuse the agent - -### OpenRedDoor - -- Environment: The agent is placed in a room with a door. -- Instruction: Open the red door -- Evaluate: image understanding -- Level id: `BabyAI-OpenRedDoor-v0` - -

- -### OpenDoor - -- Environment: The agent is placed in a room with 4 different doors. The environment is done when the instruction is executed in the regular mode or when a door is opened in the `debug` mode. -- Instruction: Open a door of: - - a given color or location in `OpenDoor` - - a given color in `OpenDoorColor` - - a given location in `OpenDoorLoc` -- Evaluate: image & text understanding, memory in `OpenDoor` and `OpenDoorLoc` -- Level id: - - `BabyAI-OpenDoor-v0` - - `BabyAI-OpenDoorDebug-v0` - - `BabyAI-OpenDoorColor-v0` - - `BabyAI-OpenDoorColorDebug-v0` - - `BabyAI-OpenDoorLoc-v0` - - `BabyAI-OpenDoorLocDebug-v0` - -

- -### GoToDoor - -- Environment: The agent is placed in a room with 4 different doors. -- Instruction: Go to a door of a given of a given color. -- Evaluate: image & text understanding -- Level id: `BabyAI-GoToDoor-v0` - -### GoToObjDoor - -- Environment: The agent is placed in a room with 4 different doors and 5 different objects. -- Instruction: Go to an object or a door of a given type and color -- Evaluate: image & text understanding -- Level id: `BabyAI-GoToObjDoor-v0` - -

- -### ActionObjDoor - -- Environment: The agent is placed in a room with 4 different doors and 5 different objects. -- Instruction: [Pick up an object] or [go to an object or door] or [open a door] -- Evaluate: image & text understanding -- Level id: `BabyAI-ActionObjDoor-v0` - -

- -### Unlock - -- Environment: Maze environment where the agent has to retrieve a key to open a locked door. -- Instruction: Open the door -- Evaluate: image understanding, navigation, memory. -- Level id: - - `BabyAI-Unlock-v0` - -### UnlockPickup - -- Environment: The agent is placed in a room with a key and a locked door. The door opens onto a room with a box. Rooms have either no distractors in `UnlockPickup` or 4 distractors in `UnlockPickupDist`. -- Instruction: Pick up an object of a given type and color -- Evaluate: image understanding, memory in `UnlockPickupDist` -- Level id: `BabyAI-UnlockPickup-v0`, `BabyAI-UnlockPickupDist-v0` - -

- - -

- -### BlockedUnlockPickup - -- Environment: The agent is placed in a room with a key and a locked door. The door is blocked by a ball. The door opens onto a room with a box. -- Instruction: Pick up the box -- Evaluate: image understanding -- Level id: `BabyAI-BlockedUnlockPickup-v0` - -

- -### UnlockToUnlock - -- Environment: The agent is placed in a room with a key of color A and two doors of color A and B. The door of color A opens onto a room with a key of color B. The door of color B opens onto a room with a ball. -- Instruction: Pick up the ball -- Evaluate: image understanding -- Level id: `BabyAI-UnlockToUnlock-v0` - -

- -### KeyInBox - -- Environment: The agent is placed in a room with a box containing a key and a locked door. -- Instruction: Open the door -- Evaluate: image understanding -- Level id: `BabyAI-KeyInBox-v0` - -

- -### PickupDist - -- Environment: The agent is placed in a room with 5 objects. The environment is done when the instruction is executed in the regular mode or when any object is picked in the `debug` mode. -- Instruction: Pick up an object of a given type and color -- Evaluate: image & text understanding -- Level id: - - `BabyAI-PickupDist-v0` - - `BabyAI-PickupDistDebug-v0` - -

- -### PickupAbove - -- Environment: The agent is placed in the middle room. An object is placed in the top-middle room. -- Instruction: Pick up an object of a given type and color -- Evaluate: image & text understanding, memory -- Level id: `BabyAI-PickupAbove-v0` - -

- -### OpenRedBlueDoors - -- Environment: The agent is placed in a room with a red door and a blue door facing each other. The environment is done when the instruction is executed in the regular mode or when the blue door is opened in the `debug` mode. -- Instruction: Open the red door then open the blue door -- Evaluate: image understanding, memory -- Level id: - - `BabyAI-OpenRedBlueDoors-v0` - - `BabyAI-OpenRedBlueDoorsDebug-v0` - -

- -### OpenTwoDoors - -- Environment: The agent is placed in a room with a red door and a blue door facing each other. The environment is done when the instruction is executed in the regular mode or when the second door is opened in the `debug` mode. -- Instruction: Open the door of color X then open the door of color Y -- Evaluate: image & text understanding, memory -- Level id: - - `BabyAI-OpenTwoDoors-v0` - - `BabyAI-OpenTwoDoorsDebug-v0` - -

- -### FindObj - -- Environment: The agent is placed in the middle room. An object is placed in one of the rooms. Rooms have a size of 5 in `FindObjS5`, 6 in `FindObjS6` or 7 in `FindObjS7`. -- Instruction: Pick up an object of a given type and color -- Evaluate: image understanding, memory -- Level id: - - `BabyAI-FindObjS5-v0` - - `BabyAI-FindObjS6-v0` - - `BabyAI-FindObjS7-v0` - -

- - - -

- -### FourObjs - -- Environment: The agent is placed in the middle room. 4 different objects are placed in the adjacent rooms. Rooms have a size of 5 in `FourObjsS5`, 6 in `FourObjsS6` or 7 in `FourObjsS7`. -- Instruction: Pick up an object of a given type and location -- Evaluate: image understanding, memory -- Level id: - - `BabyAI-FourObjsS5-v0` - - `BabyAI-FourObjsS6-v0` - - `BabyAI-FourObjsS7-v0` - -

- - - -

- -### KeyCorridor - -- Environment: The agent is placed in the middle of the corridor. One of the rooms is locked and contains a ball. Another room contains a key for opening the previous one. The level is split into a curriculum starting with one row of 3x3 rooms, going up to 3 rows of 6x6 rooms. -- Instruction: Pick up an object of a given type -- Evaluate: image understanding, memory -- Level ids: - - `BabyAI-KeyCorridorS3R1-v0` - - `BabyAI-KeyCorridorS3R2-v0` - - `BabyAI-KeyCorridorS3R3-v0` - - `BabyAI-KeyCorridorS4R3-v0` - - `BabyAI-KeyCorridorS5R3-v0` - - `BabyAI-KeyCorridorS6R3-v0` - -

- - - - - - -

- -### 1Room - -- Environment: The agent is placed in a room with a ball. The level is split into a curriculum with rooms of size 8, 12, 16 or 20. -- Instruction: Pick up the ball -- Evaluate: image understanding, memory -- Level ids: - - `BabyAI-1RoomS8-v0` - - `BabyAI-1RoomS12-v0` - - `BabyAI-1RoomS16-v0` - - `BabyAI-1RoomS20-v0` - -

- - - - -

- -### OpenDoorsOrder - -- Environment: There are two or four doors in a room. The agent has to open - one or two of the doors in a given order. -- Instruction: - - open the X door - - open the X door and then open the Y door - - open the X door after you open the Y door -- Level ids: - - `BabyAI-OpenDoorsOrderN2-v0` - - `BabyAI-OpenDoorsOrderN4-v0` - - `BabyAI-OpenDoorsOrderN2Debug-v0` - - `BabyAI-OpenDoorsOrderN4Debug-v0` - -### PutNext - -- Environment: Single room with multiple objects. One of the objects must be moved next to another specific object. -- Instruction: Put the X next to the Y -- Level ids: - - `BabyAI-PutNextS4N1-v0` - - `BabyAI-PutNextS5N1-v0` - - `BabyAI-PutNextS6N2-v0` - - `BabyAI-PutNextS6N3-v0` - - `BabyAI-PutNextS7N4-v0` - - `BabyAI-PutNextS6N2Carrying-v0` - - `BabyAI-PutNextS6N3Carrying-v0` - - `BabyAI-PutNextS7N4Carrying-v0` - -### MoveTwoAcross - -- Environment: Two objects must be moved so that they are next to two other objects. This task is structured to have a very large number of possible instructions. -- Instruction: Put the A next to the B and the C next to the D -- Level ids: - - `BabyAI-MoveTwoAcrossS5N2-v0` - - `BabyAI-MoveTwoAcrossS8N9-v0` diff --git a/docs/bonus_levels.md b/docs/bonus_levels.md new file mode 100644 index 00000000..fdf458d6 --- /dev/null +++ b/docs/bonus_levels.md @@ -0,0 +1,258 @@ +# Bonus Levels + +The levels described in this file were created prior to the ICLR19 publication. +We've chosen to keep these because they may be useful for curriculum learning +or for specific research projects. + +Please note that these levels are not as widely tested as the ICLR19 levels. +If you run into problems, please open an issue on this repository. 
+ +In naming the levels we adhere to the following convention: +- `N2`, `N3`, `N4` refer to the number of objects in the room/environment +- `S2`, `S3`, `S4` refer to the size of the room/environment +- in `Debug` levels the episode is terminated once the agent does something unnecessary or fatally bad, for example + - picks up an object which it is not supposed to pick up (unnecessary) + - opens the door that it is supposed to open _after_ another one (fatal) +- in `Carrying` levels the agent starts carrying the object of interest +- in `Dist` levels distractor objects are placed to confuse the agent + +## OpenRedDoor + +- Environment: The agent is placed in a room with a door. +- instruction: open the red door +- Evaluate: image understanding +- Level id: `BabyAI-OpenRedDoor-v0` + +

+ +## OpenDoor + +- Environment: The agent is placed in a room with 4 different doors. The environment is done when the instruction is executed in the regular mode or when a door is opened in the `debug` mode. +- instruction: open a door of: + - a given color or location in `OpenDoor` + - a given color in `OpenDoorColor` + - a given location in `OpenDoorLoc` +- Evaluate: image & text understanding, memory in `OpenDoor` and `OpenDoorLoc` +- Level id: + - `BabyAI-OpenDoor-v0` + - `BabyAI-OpenDoorDebug-v0` + - `BabyAI-OpenDoorColor-v0` + - `BabyAI-OpenDoorColorDebug-v0` + - `BabyAI-OpenDoorLoc-v0` + - `BabyAI-OpenDoorLocDebug-v0` + +

+ +## GoToDoor + +- Environment: The agent is placed in a room with 4 different doors. +- Instruction: Go to a door of a given color. +- Evaluate: image & text understanding +- Level id: `BabyAI-GoToDoor-v0` + +## GoToObjDoor + +- Environment: The agent is placed in a room with 4 different doors and 5 different objects. +- Instruction: Go to an object or a door of a given type and color +- Evaluate: image & text understanding +- Level id: `BabyAI-GoToObjDoor-v0` + +

+ +## ActionObjDoor + +- Environment: The agent is placed in a room with 4 different doors and 5 different objects. +- Instruction: [Pick up an object] or [go to an object or door] or [open a door] +- Evaluate: image & text understanding +- Level id: `BabyAI-ActionObjDoor-v0` + +

+ +## UnlockPickup + +- Environment: The agent is placed in a room with a key and a locked door. The door opens onto a room with a box. Rooms have either no distractors in `UnlockPickup` or 4 distractors in `UnlockPickupDist`. +- instruction: pick up an object of a given type and color +- Evaluate: image understanding, memory in `UnlockPickupDist` +- Level id: `BabyAI-UnlockPickup-v0`, `BabyAI-UnlockPickupDist-v0` + +

+ + +

+ +## BlockedUnlockPickup + +- Environment: The agent is placed in a room with a key and a locked door. The door is blocked by a ball. The door opens onto a room with a box. +- instruction: pick up the box +- Evaluate: image understanding +- Level id: `BabyAI-BlockedUnlockPickup-v0` + +

+ +## UnlockToUnlock + +- Environment: The agent is placed in a room with a key of color A and two doors of color A and B. The door of color A opens onto a room with a key of color B. The door of color B opens onto a room with a ball. +- instruction: pick up the ball +- Evaluate: image understanding +- Level id: `BabyAI-UnlockToUnlock-v0` + +

+ +## KeyInBox + +- Environment: The agent is placed in a room with a box containing a key and a locked door. +- instruction: open the door +- Evaluate: image understanding +- Level id: `BabyAI-KeyInBox-v0` + +

+ +## PickupDist + +- Environment: The agent is placed in a room with 5 objects. The environment is done when the instruction is executed in the regular mode or when any object is picked in the `debug` mode. +- instruction: pick up an object of a given type and color +- Evaluate: image & text understanding +- Level id: + - `BabyAI-PickupDist-v0` + - `BabyAI-PickupDistDebug-v0` + +

+ +## PickupAbove + +- Environment: The agent is placed in the middle room. An object is placed in the top-middle room. +- instruction: pick up an object of a given type and color +- Evaluate: image & text understanding, memory +- Level id: `BabyAI-PickupAbove-v0` + +

+ +## OpenRedBlueDoors + +- Environment: The agent is placed in a room with a red door and a blue door facing each other. The environment is done when the instruction is executed in the regular mode or when the blue door is opened in the `debug` mode. +- instruction: open the red door then open the blue door +- Evaluate: image understanding, memory +- Level id: + - `BabyAI-OpenRedBlueDoors-v0` + - `BabyAI-OpenRedBlueDoorsDebug-v0` + +

+ +## OpenTwoDoors + +- Environment: The agent is placed in a room with a red door and a blue door facing each other. The environment is done when the instruction is executed in the regular mode or when the second door is opened in the `debug` mode. +- instruction: open the door of color X then open the door of color Y +- Evaluate: image & text understanding, memory +- Level id: + - `BabyAI-OpenTwoDoors-v0` + - `BabyAI-OpenTwoDoorsDebug-v0` + +

+ +## FindObj + +- Environment: The agent is placed in the middle room. An object is placed in one of the rooms. Rooms have a size of 5 in `FindObjS5`, 6 in `FindObjS6` or 7 in `FindObjS7`. +- instruction: pick up an object of a given type and color +- Evaluate: image understanding, memory +- Level id: + - `BabyAI-FindObjS5-v0` + - `BabyAI-FindObjS6-v0` + - `BabyAI-FindObjS7-v0` + +

+ + + +

+ +## FourObjs + +- Environment: The agent is placed in the middle room. 4 different objects are placed in the adjacent rooms. Rooms have a size of 5 in `FourObjsS5`, 6 in `FourObjsS6` or 7 in `FourObjsS7`. +- instruction: pick up an object of a given type and location +- Evaluate: image understanding, memory +- Level id: + - `BabyAI-FourObjsS5-v0` + - `BabyAI-FourObjsS6-v0` + - `BabyAI-FourObjsS7-v0` + +

+ + + +

+ +## KeyCorridor + +- Environment: The agent is placed in the middle of the corridor. One of the rooms is locked and contains a ball. Another room contains a key for opening the previous one. The level is split into a curriculum starting with one row of 3x3 rooms, going up to 3 rows of 6x6 rooms. +- instruction: pick up an object of a given type +- Evaluate: image understanding, memory +- Level ids: + - `BabyAI-KeyCorridorS3R1-v0` + - `BabyAI-KeyCorridorS3R2-v0` + - `BabyAI-KeyCorridorS3R3-v0` + - `BabyAI-KeyCorridorS4R3-v0` + - `BabyAI-KeyCorridorS5R3-v0` + - `BabyAI-KeyCorridorS6R3-v0` + +

+ + + + + + +

+ +## 1Room + +- Environment: The agent is placed in a room with a ball. The level is split into a curriculum with rooms of size 8, 12, 16 or 20. +- instruction: pick up the ball +- Evaluate: image understanding, memory +- Level ids: + - `BabyAI-1RoomS8-v0` + - `BabyAI-1RoomS12-v0` + - `BabyAI-1RoomS16-v0` + - `BabyAI-1RoomS20-v0` + +

+ + + + +

+ +## OpenDoorsOrder + +- Environment: There are two or four doors in a room. The agent has to open + one or two of the doors in a given order. +- Instruction: + - open the X door + - open the X door and then open the Y door + - open the X door after you open the Y door +- Level ids: + - `BabyAI-OpenDoorsOrderN2-v0` + - `BabyAI-OpenDoorsOrderN4-v0` + - `BabyAI-OpenDoorsOrderN2Debug-v0` + - `BabyAI-OpenDoorsOrderN4Debug-v0` + +## PutNext + +- Environment: Single room with multiple objects. One of the objects must be moved next to another specific object. +- instruction: put the X next to the Y +- Level ids: + - `BabyAI-PutNextS4N1-v0` + - `BabyAI-PutNextS5N1-v0` + - `BabyAI-PutNextS6N2-v0` + - `BabyAI-PutNextS6N3-v0` + - `BabyAI-PutNextS7N4-v0` + - `BabyAI-PutNextS6N2Carrying-v0` + - `BabyAI-PutNextS6N3Carrying-v0` + - `BabyAI-PutNextS7N4Carrying-v0` + +## MoveTwoAcross + +- Environment: Two objects must be moved so that they are next to two other objects. This task is structured to have a very large number of possible instructions. +- instruction: put the A next to the B and the C next to the D +- Level ids: + - `BabyAI-MoveTwoAcrossS5N2-v0` + - `BabyAI-MoveTwoAcrossS8N9-v0` diff --git a/docs/iclr19_levels.md b/docs/iclr19_levels.md new file mode 100644 index 00000000..a08508e2 --- /dev/null +++ b/docs/iclr19_levels.md @@ -0,0 +1,49 @@ +# ICLR19 Levels + +The levels described in this file were created for the ICLR19 submission. +These form a curriculum that is subdivided according to specific competencies. + +## GoToObj + +## GoToRedBall + +## GoToRedBallGrey + +## GoToLocal + +## PutNextLocal + +## PickUpLoc + +## GoToObjMaze + +## GoTo + +## Pickup + +## PickupUnblock + +## Open + +## Unlock + +Maze environment where the agent has to retrieve a key to open a locked door. + +- Instruction: open the door +- Evaluate: image understanding, navigation, memory. 
+- Level id: + - `BabyAI-Unlock-v0` + +## PutNext + +## Synth + +## SynthLoc + +## GoToSeq + +## SynthSeq + +## GoToImpUnlock + +## BossLevel
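
Reviewer note (not part of the patch): the naming convention documented in `docs/bonus_levels.md` (`S<k>` for room size, `N<k>` for number of objects, and the `Debug`, `Carrying` and `Dist` suffixes) can be illustrated with a small parser. This is only a sketch under stated assumptions: `parse_level_id` and the keys of the dictionary it returns are hypothetical names, not part of the BabyAI code base, and the parser relies purely on the string format of the level ids listed above. Letters that the convention does not describe, such as the `R` in `KeyCorridorS3R1`, are deliberately left inside the name.

```python
import re

# Suffixes described in the naming convention (Debug, Carrying, Dist).
FLAGS = ("Debug", "Carrying", "Dist")

# A trailing "S<digits>" or "N<digits>" token at the end of the level name.
PARAM_RE = re.compile(r"^(?P<rest>.+?)(?P<key>[SN])(?P<value>\d+)$")


def parse_level_id(level_id):
    """Split an id such as 'BabyAI-PutNextS6N2Carrying-v0' into its parts.

    Hypothetical helper for illustration only; it does not query gym or
    BabyAI and only looks at the text of the id.
    """
    match = re.fullmatch(r"BabyAI-(\w+)-v(\d+)", level_id)
    if match is None:
        raise ValueError("not a BabyAI level id: %r" % level_id)
    body, version = match.group(1), int(match.group(2))

    # Strip trailing flag suffixes, repeating until none is left.
    flags = set()
    stripped = True
    while stripped:
        stripped = False
        for flag in FLAGS:
            if body.endswith(flag) and len(body) > len(flag):
                body = body[: -len(flag)]
                flags.add(flag)
                stripped = True

    # Strip trailing S<k> / N<k> tokens from right to left.
    params = {}
    while True:
        param = PARAM_RE.match(body)
        if param is None:
            break
        params[param.group("key")] = int(param.group("value"))
        body = param.group("rest")

    return {
        "name": body,
        "size": params.get("S"),         # room size, if encoded in the id
        "num_objects": params.get("N"),  # number of objects, if encoded
        "flags": sorted(flags),
        "version": version,
    }


if __name__ == "__main__":
    for example in ("BabyAI-PutNextS6N2Carrying-v0", "BabyAI-1RoomS12-v0"):
        print(example, parse_level_id(example))
```

A helper along these lines could be handy when grouping levels into a curriculum by room size or object count, though the authoritative list of registered levels remains the environment registry itself.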