
TimeChamber: A Large Scale Self-Play Framework via Parallel Techniques

TimeChamber is a large-scale self-play framework built on parallel simulation. Self-play algorithms typically demand substantial hardware resources, especially in 3D physically simulated environments. TimeChamber provides fast training and evaluation with limited hardware resources. It is developed with the following key features:

Parallel Simulation: TimeChamber is built on Isaac Gym, a fast GPU-based simulation platform that supports running thousands of environments in parallel on a single GPU.
Parallel Evaluation: Inspired by Vectorization techniques for fast population-based training, we use vectorized models to evaluate different policies in parallel. TimeChamber can quickly calculate the Elo ratings of dozens of policies (a measure of their relative strength). It also supports multi-player Elo calculation via multi-elo.
Prioritized Fictitious Self-Play Benchmark: We implement a classic PPO self-play algorithm on top of rl_games, with a prioritized player pool to avoid cycles and improve the diversity of training policies.
Competitive Multi-Agent Tasks: We introduce some competitive multi-agent tasks (e.g., Ant Sumo and Ant Battle). These tasks test the efficiency of our self-play framework. After days of training, our agents can discover physical skills such as pulling and jumping.
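
One way to picture a prioritized player pool is as weighted opponent sampling. The sketch below is illustrative only and is not taken from the TimeChamber source: it assumes that a win rate of the current policy against each stored opponent is available, and it draws harder opponents more often using the weight (1 - win_rate)^p. The names `pfsp_weights` and `sample_opponent`, the exponent `p`, and the win rates are all hypothetical.

```python
import random


def pfsp_weights(win_rates, p=2.0, eps=1e-3):
    """Map the learner's win rate against each opponent to a sampling weight.

    Opponents the learner rarely beats get larger weights. `eps` keeps every
    opponent selectable even when the learner always wins against it.
    """
    return [(1.0 - w) ** p + eps for w in win_rates]


def sample_opponent(win_rates, rng=random, p=2.0):
    """Return the index of one opponent drawn in proportion to its weight."""
    weights = pfsp_weights(win_rates, p=p)
    return rng.choices(range(len(win_rates)), weights=weights, k=1)[0]


# learner's win rate against three stored opponents (made-up numbers)
win_rates = [0.9, 0.5, 0.1]
weights = pfsp_weights(win_rates)
counts = [0, 0, 0]
rng = random.Random(0)
for _ in range(10_000):
    counts[sample_opponent(win_rates, rng=rng)] += 1
print(counts)  # the third (hardest) opponent is drawn most often
```

Raising `p` concentrates sampling on the hardest opponents, while `p=0` reduces to uniform sampling over the pool.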

Installation

Download and follow the installation instructions of Isaac Gym: https://developer.nvidia.com/isaac-gym
Ensure that Isaac Gym works on your system by running one of the examples from the python/examples directory, like joint_monkey.py. Please follow troubleshooting steps described in the Isaac Gym Preview Release 3/4 install instructions if you have any trouble running the samples.
Then install this repo:

pip install -e .

Quick Start

Tasks

Source code for the tasks can be found in timechamber/tasks. More tasks are coming soon (contributions are welcome).

Ant Sumo

Ant Sumo is a 3D environment with simulated physics in which pairs of ant agents compete against each other. To win, an agent has to push its opponent out of the ring. If an agent's body touches the ground, its HP decreases until it dies.

Ant Battle

Ant Battle is an extension of Ant Sumo that supports more than two agents competing against each other. The radius of the battle ring shrinks, and any agent that leaves the ring is eliminated.
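
The elimination rule can be pictured with a small sketch. This is a hypothetical, plain-Python illustration rather than the Isaac Gym task code: the linear shrink schedule, the default radii, and the names `ring_radius` and `eliminated` are assumptions made for the example.

```python
import math


def ring_radius(step, start_radius=10.0, min_radius=2.0, shrink_per_step=0.01):
    """Radius of the ring at a given step, shrinking linearly to a floor."""
    return max(min_radius, start_radius - shrink_per_step * step)


def eliminated(positions, radius):
    """Return one flag per agent: True when the agent is outside the ring.

    `positions` holds (x, y) coordinates relative to the ring centre.
    """
    return [math.hypot(x, y) > radius for x, y in positions]


# three agents at distances 1, 6 and 5 from the centre
positions = [(0.0, 1.0), (6.0, 0.0), (3.0, 4.0)]
early = eliminated(positions, ring_radius(step=0))    # full-size ring
late = eliminated(positions, ring_radius(step=450))   # ring has shrunk
print(early, late)
```

With these made-up numbers, no agent is outside the ring at step 0, while the agent at distance 6 falls outside once the ring has shrunk.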

Self-Play Training

To train a policy for a task, for example:

# run self-play training for Ant Sumo task
python train.py task=MA_Ant_Sumo

# run self-play training for Ant Battle task
python train.py task=MA_Ant_Battle

Key arguments to the training script follow the IsaacGymEnvs configuration and command line arguments. Other training arguments follow the rl_games config parameters; you can change them in timechamber/tasks/train/*.yaml. There are some arguments specific to self-play training:

num_agents: Number of agents in the Ant Battle environment; it should be larger than 1.
op_checkpoint: Path to the checkpoint from which to load the initial opponent policy. If empty, the opponent uses a random policy.
update_win_rate: Win-rate threshold for adding the current policy to the opponent's player_pool.
player_pool_length: Maximum size of the player pool, which follows FIFO rules.
games_to_check: Warm-up for training; the player pool is not updated until the current policy has played this number of games.
max_update_steps: If the current policy's update iterations exceed this number, the current policy is added to the opponent's player_pool.
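
Taken together, these arguments describe a gating rule for the opponent pool. The following is a rough sketch of one way such a rule could work, not the actual TimeChamber implementation. The class `PlayerPool`, its method names, the exact comparison operators, and the choice to let `max_update_steps` override the warm-up are assumptions; only the argument names come from the list above.

```python
from collections import deque


class PlayerPool:
    """FIFO pool of opponent snapshots with a win-rate / step-count gate."""

    def __init__(self, player_pool_length, update_win_rate,
                 games_to_check, max_update_steps):
        # deque(maxlen=...) drops the oldest entry when full (FIFO)
        self.players = deque(maxlen=player_pool_length)
        self.update_win_rate = update_win_rate
        self.games_to_check = games_to_check
        self.max_update_steps = max_update_steps

    def should_add(self, wins, games, update_steps):
        """Decide whether the current policy should join the pool."""
        if update_steps > self.max_update_steps:
            return True  # too long without a snapshot: add regardless
        if games == 0 or games < self.games_to_check:
            return False  # still warming up
        return wins / games >= self.update_win_rate

    def maybe_add(self, snapshot, wins, games, update_steps):
        """Append `snapshot` if the gate passes; report whether it did."""
        if self.should_add(wins, games, update_steps):
            self.players.append(snapshot)
            return True
        return False


pool = PlayerPool(player_pool_length=2, update_win_rate=0.7,
                  games_to_check=100, max_update_steps=5000)
results = [
    pool.maybe_add("policy_a", wins=80, games=100, update_steps=10),
    pool.maybe_add("policy_b", wins=50, games=100, update_steps=10),
    pool.maybe_add("policy_c", wins=10, games=20, update_steps=5001),
    pool.maybe_add("policy_d", wins=90, games=100, update_steps=10),
]
print(results)             # [True, False, True, True]
print(list(pool.players))  # ['policy_c', 'policy_d']
```

In this example `policy_a` is evicted when `policy_d` arrives, because the pool holds at most two snapshots and the oldest one is dropped first.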

Policies Evaluation

To evaluate your policies, for example:

# run testing for Ant Sumo policy
python train.py task=MA_Ant_Sumo test=True checkpoint='runs/MA_Ant_Sumo/nn/policy_1.pth'

You can set the opponent policy using op_checkpoint. If it is empty, the opponent uses the same policy as checkpoint.
We use vectorized models to accelerate policy evaluation. Put the policies into a directory and let them compete with each other in parallel:

# run vectorized testing for all Ant Sumo policies in a directory
python train.py task=MA_Ant_Sumo test=True checkpoint='runs/MA_Ant_Sumo/nn/policy_dir' player_pool_type=vectorized
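
To convey the idea of letting several policies compete in parallel, the sketch below spreads every pairing of policies across a set of parallel environments. It is a simplified assumption about scheduling only; it does not show how TimeChamber batches models internally, and the function name `assign_matchups` and the file names are hypothetical.

```python
from itertools import combinations


def assign_matchups(policy_names, num_envs):
    """Spread every pairing of policies across the parallel environments.

    Returns a list of length `num_envs`; entry i is the (player, opponent)
    pair that environment i would simulate. Pairings repeat cyclically when
    there are more environments than pairings.
    """
    pairs = list(combinations(policy_names, 2))
    if not pairs:
        raise ValueError("need at least two policies to form a matchup")
    return [pairs[i % len(pairs)] for i in range(num_envs)]


policies = ["policy_1.pth", "policy_2.pth", "policy_3.pth"]
schedule = assign_matchups(policies, num_envs=4)
for env_id, (player, opponent) in enumerate(schedule):
    print(env_id, player, "vs", opponent)
```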

There are some arguments specific to self-play evaluation; you can change them in timechamber/tasks/train/*.yaml:

games_num: Total number of evaluation episodes.
record_elo: Set to True to record the Elo ratings of your policies. After evaluation, you can check elo.jpg in your checkpoint dir.
init_elo: Initial Elo rating of each policy.
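
For readers unfamiliar with Elo ratings, here is a minimal sketch of the standard two-player Elo update. It is the textbook formula, not necessarily the exact calculation TimeChamber performs (the multi-player case handled by multi-elo is not covered), and the K-factor of 32 and the starting value of 400 are assumptions chosen for the example.

```python
def expected_score(rating_a, rating_b):
    """Probability-like expected score of player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return the new (rating_a, rating_b) after one game.

    score_a is 1.0 if A won, 0.5 for a draw, and 0.0 if A lost.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


init_elo = 400.0  # hypothetical starting value for both policies
a, b = update_elo(init_elo, init_elo, score_a=1.0)
print(a, b)  # 416.0 384.0
```

Two equally rated policies each have an expected score of 0.5, so a win moves the winner up by half of K and the loser down by the same amount.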

Building Your Own Task

You can build your own task following IsaacGymEnvs. Make sure the obs shape is correct and info contains win, lose and draw:

import isaacgym
import timechamber
import torch

envs = timechamber.make(
    seed=0,
    task="MA_Ant_Sumo",
    num_envs=2,
    sim_device="cuda:0",
    rl_device="cuda:0",
)
# the obs shape should be (num_agents * num_envs, num_obs).
# the training agent's obs are the first num_envs rows: obs[:num_envs]
print("Observation space is", envs.observation_space)
print("Action space is", envs.action_space)
obs = envs.reset()
for _ in range(20):
    # one random action per agent per env: num_agents (2) * num_envs (2) rows
    obs, reward, done, info = envs.step(
        torch.rand((2 * 2,) + envs.action_space.shape, device="cuda:0")
    )
# info:
# {'win': tensor([Bool, Bool])
# 'lose': tensor([Bool, Bool])
# 'draw': tensor([Bool, Bool])}
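
When writing a new task, a small checker can catch layout mistakes early. The helper below is a hypothetical sketch, not part of TimeChamber: the name `check_env_outputs` is made up, and the assumption that each of `win`, `lose` and `draw` has one entry per environment is inferred from the example output above. It only relies on `len()` and a shape tuple, so plain lists stand in for tensors here.

```python
def check_env_outputs(obs_shape, info, num_agents, num_envs):
    """Raise ValueError if obs or info do not match the expected layout."""
    expected_rows = num_agents * num_envs
    if obs_shape[0] != expected_rows:
        raise ValueError(
            f"obs has {obs_shape[0]} rows, expected {expected_rows}"
        )
    for key in ("win", "lose", "draw"):
        if key not in info:
            raise ValueError(f"info is missing the '{key}' entry")
        if len(info[key]) != num_envs:
            raise ValueError(
                f"info['{key}'] has {len(info[key])} entries, "
                f"expected {num_envs}"
            )


# stand-in values; with a real task, pass tuple(obs.shape) and info
good_info = {"win": [True, False], "lose": [False, True], "draw": [False, False]}
check_env_outputs((4, 48), good_info, num_agents=2, num_envs=2)
print("layout looks consistent")
```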

Citing

If you use TimeChamber in your research, please use the following citation:

@misc{InspirAI,
  author = {ZeldaHuang and Flood Sung},
  title = {TimeChamber: A Large Scale Self-Play Framework via Parallel Techniques},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/inspirai/TimeChamber}},
}
