MindSpore Reinforcement Release Notes

View the Chinese version

MindSpore Reinforcement 0.6.0 Release Notes

Major Features and Improvements

  • [BETA] Support the GAIL (Generative Adversarial Imitation Learning, Jonathan Ho et al., 2016) algorithm. The algorithm is tuned on the HalfCheetah environment and supports CPU, GPU and Ascend backends.
  • [BETA] Support the C51 (Marc G. Bellemare et al., 2017) algorithm. The algorithm is tuned on the CartPole environment and supports CPU, GPU and Ascend backends.
  • [BETA] Support the CQL (Conservative Q-Learning, Aviral Kumar et al., 2019) algorithm. The algorithm is tuned on the Hopper environment and supports CPU, GPU and Ascend backends.
  • [BETA] Support the AWAC (Accelerating Online Reinforcement Learning with Offline Datasets, Ashvin Nair et al., 2020) algorithm. The algorithm is tuned on the Ant environment and supports CPU, GPU and Ascend backends.
  • [BETA] Support the Dreamer (Danijar Hafner et al., 2020) algorithm. The algorithm is tuned on the Walker-walk environment and supports the GPU backend.

Contributors

Thanks go to these wonderful people:

Prof. Peter, Huanzhou Zhu, Bo Zhao, Gang Chen, Weifeng Chen, Liang Shi, Yijie Chen.

MindSpore Reinforcement 0.5.0 Release Notes

Major Features and Improvements

  • [STABLE] Add a Chinese version of all existing APIs.
  • [STABLE] Add the multi-agent reinforcement learning algorithm QMIX.

Contributors

Thanks go to these wonderful people:

Prof. Peter, Huanzhou Zhu, Bo Zhao, Gang Chen, Weifeng Chen, Liang Shi, Yijie Chen.

MindSpore Reinforcement 0.3.0 Release Notes

Major Features and Improvements

  • [STABLE] Support the DDPG reinforcement learning algorithm.

API Change

Backwards Incompatible Change

Python API
  • Change the API of the following classes: Actor and Agent. Their function names change to act(self, phase, params) and get_action(self, phase, params), respectively. Some redundant functions are deleted (env_setter, act_init, evaluate, reset_collect_actor, reset_eval_actor and update in the Actor class; init and reset_all in the Agent class). The hierarchy of the configuration file also changes: ReplayBuffer is moved out of the actor directory and becomes a new top-level key in the algorithm config (Rearrange API PR !29). A sketch of the new interfaces follows this list.
  • Add a virtual base class for the Environment class. It has step and reset functions and five properties (action_space, observation_space, reward_space, done_space and config); see the second sketch below.
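
To make the renamed entry points and the config rearrangement concrete, here is a minimal sketch. The class bodies and every config field other than the new replay_buffer key are illustrative assumptions, not the library's actual code.

```python
# Minimal sketch of the renamed Actor/Agent entry points.
# Bodies are placeholders; concrete subclasses override them.
class Actor:
    def act(self, phase, params):
        """Interact with the environment in the given phase."""
        raise NotImplementedError


class Agent:
    def get_action(self, phase, params):
        """Compute an action for the given phase and parameters."""
        raise NotImplementedError


# ReplayBuffer now sits at the top level of the algorithm config rather
# than under the actor section. All field names except 'replay_buffer'
# itself are hypothetical.
algorithm_config = {
    'actor': {'number': 1},
    'learner': {'params': {}},
    'replay_buffer': {'capacity': 10000, 'sample_size': 32},
}
```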
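
The Environment virtual base class can be sketched directly from the description above; the method and property bodies below are placeholders that concrete environments are expected to override.

```python
# Sketch of the Environment virtual base class described above.
class Environment:
    def step(self, action):
        """Advance the environment by one step with the given action."""
        raise NotImplementedError

    def reset(self):
        """Reset the environment to its initial state."""
        raise NotImplementedError

    # The five properties named in the release note; concrete
    # environments return their actual space/config objects here.
    @property
    def action_space(self):
        raise NotImplementedError

    @property
    def observation_space(self):
        raise NotImplementedError

    @property
    def reward_space(self):
        raise NotImplementedError

    @property
    def done_space(self):
        raise NotImplementedError

    @property
    def config(self):
        raise NotImplementedError
```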

Contributors

Thanks go to these wonderful people:

Prof. Peter, Huanzhou Zhu, Bo Zhao, Gang Chen, Weifeng Chen, Liang Shi, Yijie Chen.

Contributions of any kind are welcome!