MATE: the Multi-Agent Tracking Environment

This repo contains the source code of MATE, the Multi-Agent Tracking Environment. The full documentation can be found at https://mate-gym.readthedocs.io. The full list of implemented agents can be found in section Implemented Algorithms. For a detailed description, please check out our paper (PDF, bibtex).

This is an asymmetric two-team zero-sum stochastic game with partial observations, where each team has multiple agents (multiplayer). Intra-team communication is allowed, but inter-team communication is prohibited. The game is cooperative among teammates but competitive between teams (opponents).

Installation

git config --global core.symlinks true  # required on Windows
pip3 install git+https://github.com/XuehaiPan/mate.git#egg=mate

NOTE: Python 3.7+ is required; older Python versions are not supported.

It is highly recommended to create a new isolated virtual environment for MATE using conda:

git clone https://github.com/XuehaiPan/mate.git && cd mate
conda env create --no-default-packages --file conda-recipes/basic.yaml  # or full-cpu.yaml to install RLlib
conda activate mate
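
To verify the installation, construct the default environment in a Python shell. The printed representation should resemble the ones shown in Environment Configurations below:

import mate

env = mate.make('MultiAgentTracking-v0')
print(env)  # <MultiAgentTracking<MultiAgentTracking-v0>>(4 cameras, 8 targets, 9 obstacles)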

Getting Started

Make the MultiAgentTracking environment and play!

import mate

# Base environment for MultiAgentTracking
env = mate.make('MultiAgentTracking-v0')
env.seed(0)
done = False
camera_joint_observation, target_joint_observation = env.reset()
while not done:
    camera_joint_action, target_joint_action = env.action_space.sample()  # your agent here (this takes random actions)
    (
        (camera_joint_observation, target_joint_observation),
        (camera_team_reward, target_team_reward),
        done,
        (camera_infos, target_infos)
    ) = env.step((camera_joint_action, target_joint_action))
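
Before writing your own agents, you can inspect the joint observation and action spaces, and sanity-check the zero-sum structure described above. A minimal sketch (that the two team rewards cancel exactly is an assumption about the default reward definition; see the paper for the precise formulation):

import mate

env = mate.make('MultiAgentTracking-v0')
env.seed(0)
env.reset()
print(env.observation_space)  # joint observation spaces of both teams
print(env.action_space)       # joint action spaces of both teams

# In a zero-sum game, the two team rewards should (approximately) sum to zero.
# (Assumption about the default reward definition.)
_, (camera_team_reward, target_team_reward), _, _ = env.step(env.action_space.sample())
print(camera_team_reward + target_team_reward)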

Another example with a built-in single-team wrapper (see also Built-in Wrappers):

import mate

env = mate.make('MultiAgentTracking-v0')
env = mate.MultiTarget(env, camera_agent=mate.GreedyCameraAgent(seed=0))
env.seed(0)
done = False
target_joint_observation = env.reset()
while not done:
    target_joint_action = env.action_space.sample()  # your agent here (this takes random actions)
    target_joint_observation, target_team_reward, done, target_infos = env.step(target_joint_action)

Screencast
(Screencast: 4 cameras vs. 8 targets, 9 obstacles.)

Examples and Demos

mate/evaluate.py contains the example evaluation code for the MultiAgentTracking environment. Try out the following demos:

# <MultiAgentTracking<MultiAgentTracking-v0>>(4 cameras, 2 targets, 9 obstacles)
python3 -m mate.evaluate --episodes 1 --config MATE-4v2-9.yaml

# <MultiAgentTracking<MultiAgentTracking-v0>>(4 cameras, 8 targets, 9 obstacles)
python3 -m mate.evaluate --episodes 1 --config MATE-4v8-9.yaml

# <MultiAgentTracking<MultiAgentTracking-v0>>(8 cameras, 8 targets, 9 obstacles)
python3 -m mate.evaluate --episodes 1 --config MATE-8v8-9.yaml

# <MultiAgentTracking<MultiAgentTracking-v0>>(4 cameras, 8 targets, 0 obstacles)
python3 -m mate.evaluate --episodes 1 --config MATE-4v8-0.yaml

# <MultiAgentTracking<MultiAgentTracking-v0>>(0 cameras, 8 targets, 32 obstacles)
python3 -m mate.evaluate --episodes 1 --config MATE-Navigation.yaml
(Screencasts: 4 cameras vs. 2 targets (9 obstacles) · 4 cameras vs. 8 targets (9 obstacles) · 8 cameras vs. 8 targets (9 obstacles) · 4 cameras vs. 8 targets (no obstacles) · 8-target navigation (no cameras).)

You can specify the agent classes and arguments by:

python3 -m mate.evaluate --camera-agent module:class --camera-kwargs <JSON-STRING> --target-agent module:class --target-kwargs <JSON-STRING>

You can find the example code for agents in the examples directory. The full list of implemented agents can be found in section Implemented Algorithms. For example:

# Example demos in examples
python3 -m examples.naive

# Use the evaluation script
python3 -m mate.evaluate --episodes 1 --render-communication \
    --camera-agent examples.greedy:GreedyCameraAgent --camera-kwargs '{"memory_period": 20}' \
    --target-agent examples.greedy:GreedyTargetAgent \
    --config MATE-4v8-9.yaml \
    --seed 0

(Screencast: communication rendering.)

You can implement your own custom agent classes to play around. See Make Your Own Agents for more details.
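
As a starting point, a random target agent might look like the minimal sketch below. The base class name mate.TargetAgentBase, the act() signature, and the no-argument constructor are assumptions here; consult Make Your Own Agents for the actual interface:

import mate

class RandomTargetAgent(mate.TargetAgentBase):  # base class name is an assumption
    def act(self, observation, info=None, deterministic=None):
        # Sample a uniformly random action from this agent's own action space.
        return self.action_space.sample()

# Pair the custom target agents with the cameras via a single-team wrapper
# (cf. the MultiTarget example above).
env = mate.MultiCamera(mate.make('MultiAgentTracking-v0'),
                       target_agent=RandomTargetAgent())  # constructor args assumed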

Environment Configurations

The MultiAgentTracking environment accepts a Python dictionary or a configuration file in JSON or YAML format. If you want to use customized environment configurations, you can copy the default configuration file:

cp "$(python3 -m mate.assets)"/MATE-4v8-9.yaml MyEnvCfg.yaml

Then make your own modifications and use the customized environment:

env = mate.make('MultiAgentTracking-v0', config='/path/to/your/cfg/file')
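
Since a plain Python dictionary is accepted as well, you can also load a preset file, tweak it in code, and pass the result directly. A sketch, assuming PyYAML is available (the available keys follow the schema of the copied YAML file):

import mate
import yaml  # PyYAML

# Load a preset configuration and modify its entries in code before use.
with open('MyEnvCfg.yaml') as config_file:
    config = yaml.safe_load(config_file)

env = mate.make('MultiAgentTracking-v0', config=config)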

There are several preset configuration files in the mate/assets directory.

# <MultiAgentTracking<MultiAgentTracking-v0>>(4 cameras, 2 targets, 9 obstacles)
env = mate.make('MATE-4v2-9-v0')

# <MultiAgentTracking<MultiAgentTracking-v0>>(4 cameras, 8 targets, 9 obstacles)
env = mate.make('MATE-4v8-9-v0')

# <MultiAgentTracking<MultiAgentTracking-v0>>(8 cameras, 8 targets, 9 obstacles)
env = mate.make('MATE-8v8-9-v0')

# <MultiAgentTracking<MultiAgentTracking-v0>>(4 cameras, 8 targets, 0 obstacles)
env = mate.make('MATE-4v8-0-v0')

# <MultiAgentTracking<MultiAgentTracking-v0>>(0 cameras, 8 targets, 32 obstacles)
env = mate.make('MATE-Navigation-v0')

You can reinitialize the environment with a new configuration without creating a new instance:

>>> env = mate.make('MultiAgentTracking-v0', wrappers=[mate.MoreTrainingInformation])  # we support wrappers
>>> print(env)
<MoreTrainingInformation<MultiAgentTracking<MultiAgentTracking-v0>>(4 cameras, 8 targets, 9 obstacles)>

>>> env.load_config('MATE-8v8-9.yaml')
>>> print(env)
<MoreTrainingInformation<MultiAgentTracking<MultiAgentTracking-v0>>(8 cameras, 8 targets, 9 obstacles)>

In addition, we provide a script mate/assets/generator.py to generate a configuration file with reasonable camera placement:

python3 -m mate.assets.generator --path 24v48.yaml --num-cameras 24 --num-targets 48 --num-obstacles 20
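
The generated file can then be passed to the environment like any other configuration file:

env = mate.make('MultiAgentTracking-v0', config='24v48.yaml')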

See Environment Customization for more details.

Built-in Wrappers

MATE provides multiple wrappers for different settings, such as full observability, discrete action spaces, and single-team multi-agent training. See Built-in Wrappers for more details.

Type           Wrapper                        Description
observation    EnhancedObservation            Enhance the agents' observations by setting all observation masks to True.
               SharedFieldOfView              Share the field of view among agents in the same team by applying the logical OR operator over the observation masks. The target agents also share the empty status of warehouses.
               MoreTrainingInformation        Add more environment and agent information to the info field of step(), enabling full observability of the environment.
               RescaledObservation            Rescale all entity states in the observation to [-1, +1].
               RelativeCoordinates            Convert all locations of other entities in the observation to relative coordinates.
action         DiscreteCamera                 Allow cameras to use discrete actions.
               DiscreteTarget                 Allow targets to use discrete actions.
reward         AuxiliaryCameraRewards         Add auxiliary rewards for each individual camera.
               AuxiliaryTargetRewards         Add auxiliary rewards for each individual target.
single-team    MultiCamera / MultiTarget      Wrap into a single-team multi-agent environment.
               SingleCamera / SingleTarget    Wrap into a single-team single-agent environment.
communication  MessageFilter                  Filter the messages of intra-team communications.
               RandomMessageDropout           Randomly drop messages in communication channels.
               RestrictedCommunicationRange   Add a restricted communication range to channels.
               NoCommunication                Disable intra-team communications, i.e., filter out all messages.
               ExtraCommunicationDelays       Add extra message delays to communication channels.
miscellaneous  RepeatedRewardIndividualDone   Repeat the reward field and assign an individual done field in step(), similar to MPE.

You can create an environment with multiple wrappers at once. For example:

env = mate.make('MultiAgentTracking-v0',
                wrappers=[
                    mate.EnhancedObservation,
                    mate.MoreTrainingInformation,
                    mate.WrapperSpec(mate.DiscreteCamera, levels=5),
                    mate.WrapperSpec(mate.MultiCamera, target_agent=mate.GreedyTargetAgent(seed=0)),
                    mate.RepeatedRewardIndividualDone,
                    mate.WrapperSpec(mate.AuxiliaryCameraRewards,
                                     coefficients={'raw_reward': 1.0,
                                                   'coverage_rate': 1.0,
                                                   'soft_coverage_score': 1.0,
                                                   'baseline': -2.0}),
                ])
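
Bare wrapper classes take no extra arguments; mate.WrapperSpec bundles a wrapper class with its keyword arguments so that mate.make can construct the wrapper when the environment is created.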

Implemented Algorithms

The implemented algorithms can be found in the examples directory.

NOTE: all learning-based algorithms are tested with Ray 1.12.0 on Ubuntu 20.04 LTS.

Citation

If you find MATE useful, please consider citing:

@inproceedings{pan2022mate,
  title     = {{MATE}: Benchmarking Multi-Agent Reinforcement Learning in Distributed Target Coverage Control},
  author    = {Xuehai Pan and Mickel Liu and Fangwei Zhong and Yaodong Yang and Song-Chun Zhu and Yizhou Wang},
  booktitle = {Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
  year      = {2022},
  url       = {https://openreview.net/forum?id=SyoUVEyzJbE}
}

License

MIT License
