Commit

Evaluation worker (sjtu-marl#24)
* Update sc2 implementation

* Disambiguate the behavior of sampling

* Add qmix

* Ready for smarts

* VecEnv supports auto reset

* Use simple metric record

* Fix: exploration is not exploit

* tmp save

* update

* Update environment returns

* Fix: sequential rollout frames shifting

* Add configs for running smarts

* Update

* Update usage

* Fix bug: sequential rollout

* Identify with PID

* Fix: on adapter for smarts

* update ddpg yaml

* update

* ppo runnable

* Enable info adapter for smarts

* Apply custom metric to simulation rollout

* fix bug in dqn (action_mask is None)

* update

* update

* Fix: no info

* Apply reward shaping

* bug fix: ppo ratio and exploration

* save params

* Light controller and dataset server

* temp test for Mujoco

* Add data shapes example (gym)

* Tag logging with log level

* Depart task request handling from server

* Alternative policy pool and register collected

* Formatted

* Launched new coordinator

* Temporary collect helper

* Task cache dataclass

* Remove useless state id parameter

* Gym passed

* update

* fix resource error and upgrade ray

* upgrade dependencies

* fix: request index

* update yamls

* deprecated environment

* rename filename

* apply safe load

* Mappo+gfootball (sjtu-marl#29)

* tmp saving, starting add mappo+gfootball; now add customized rollout function

* rollout function & policy & env tests over

* use formal action mask

* format code

* mappo run gfootball 5_vs_5 against bot, runnable version; fix & more feature need to be added

* Make SubProcVecEnv support AddEnv

* Feature

1. make the number of envs in SubProcVecEnv flexible
2. add gpu for AgentInterface during training
ps. eager to add read/write limits

* add missing commits

* update ignores

* refactor rollout mechanism, and test

* test pass: google football base env

* enable wrappers for gfootball env

* support RNN state transmission

* avoid list numpy warning

* auto tensor caster for loss computing, test required

* test pass for offline dataset

* temp save

* temp save

Co-authored-by: ming <[email protected]>

* update

* make rollout/env/databackend test pass

* test mappo: in progress

* update configs and fix: episode info record

* Mappo+gfootball (sjtu-marl#30)

* tmp saving, starting add mappo+gfootball; now add customized rollout function

* rollout function & policy & env tests over

* use formal action mask

* format code

* mappo run gfootball 5_vs_5 against bot, runnable version; fix & more feature need to be added

* Make SubProcVecEnv support AddEnv

* Feature

1. make the number of envs in SubProcVecEnv flexible
2. add gpu for AgentInterface during training
ps. eager to add read/write limits

* add missing commits

* update ignores

* refactor rollout mechanism, and test

* add vtrace

* test pass: google football base env

* enable wrappers for gfootball env

* support RNN state transmission

* avoid list numpy warning

* auto tensor caster for loss computing, test required

* test pass for offline dataset

* temp save

* temp save

* tmp saving

* report problem

* test_mappo.py pass
still problems such as rnn_states never change?

next step: check data and then run

* update

* all passed

* only mappo loss has some bugs

* enable info logging

* still debug

* dict rnn state is required

* reparameterize yamls

Co-authored-by: ziyuwan <[email protected]>

* temp save

* collect tests

* make vecenv support sequential envs

* add ignored env tests

* still in progress: sequential

* test pass

* check output from ctde

* expose local buffer config

* copy next frame for sequential games

* let remote logger launch in a cluster

* fix: data frame is broken in sequential games

* tmp saving: adding algorithm tests

* add tests for mappo/dqn

* add all tests for algorithm except 'bc';
the "SAC" implementation has problems that should be fixed

* reformatted and leave fixme for sac test

* init some test suites

* temp save

* in progress: add learner and server tests

* fix some bugs in sac to pass test

* test coverage of learner is 85%

* just make qmix pass tests, but it may be totally wrong.

* temp save

* updates

* remove deprecated test & add example test

* pre merge

* add test for task handler

* fix rnn states name

* ignore deprecated funcs

* improve gfootball test

* fix bug: no validate_func method

* add tests for env agent interface

* add test for evaluation

* update settings

* add test for managers

* update algorithm tests

* adding updates

* add tests for evaluation

* reformatted tests and SC2 implementation

* fix: no rollout_worker_manger

* add test for misc and modify misc

* add test for conv in dqn; modify dqn conv's code to make it more general

* offline dataset unit test

* :( fix bugs

* add tests model ---> 94% cov

* remove deprecated file

* reformatted and enable sc2

* env cov rise

* init utils test

* tmp: simplify rollout implementation with easy actor configs

* migrate exp tools

* init worker test

* remove useless comments

* delete deprecated file

* metric_type is not required by rollout_worker

* add template generator for types

* async_rollout_worker_passed

* init parameter server tests

* parameter server test passed

* update test cases for payoff manager

* improve tests for postprocessors

* resolve task dispatching conflicts

* update

* fix: no optimization tasks executed because of wrong actorpool use

* fix: FakeStepping should remove callback

* fix: error raised when transformation in maatari

* fix and refine: tensorboard error and make ExprManager human readable

* disable verbose for test

* replace agent_reward with reward

* fix parameter list for all env test

* test examples

* remove info from env.timestep() which can be processed by np.asarray()

* fix qmix sc2, make it runnable

* make qmix tests passed

* add comments

* handle cc error and fix tests for SC2

* update settings

* update docs

* remove deprecated examples

Co-authored-by: Morning <[email protected]>
Co-authored-by: ziyuwan <[email protected]>
Co-authored-by: ziyuwan <[email protected]>
Co-authored-by: Hanjing Wang <[email protected]>
5 people authored Feb 18, 2022
1 parent 3785309 commit 53d64ba
Showing 345 changed files with 19,865 additions and 5,329 deletions.
40 changes: 40 additions & 0 deletions .coveragerc
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
[run]
branch = False
omit =
    # ignore typings and base class
    */__init__.py
    malib/agent/agent_interface.py
    malib/backend/coordinator/base_coordinator.py
    malib/envs/smarts/*
    malib/envs/env.py
    malib/rollout/base_worker.py
    malib/evaluator/base_evaluator.py
    # ignore imitation training suit
    malib/algorithm/imitation/*
    malib/algorithm/common/reward.py
    # ignore random policy, just for test
    malib/algorithm/random/*
    malib/utils/*
    malib/rpc/*
    # ignore cli
    malib/runner.py
    # no usage
    malib/rollout/sync_rollout_worker.py


[report]
skip_empty = True
omit =
    malib/envs/smarts/*
    malib/algorithm/imitation/*
    malib/settings.py
    malib/registration.py
    malib/backend/coordinator/light_server.py
    # deprecated
    malib/backend/datapool/data_array.py
    malib/algorithm/common/reward.py
    malib/utils/*
    malib/rpc/*

[html]
directory = cov_html
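To make the effect of the two `omit` lists concrete, here is a minimal sketch that reads an abbreviated copy of the configuration with `configparser` and checks a few paths against the `[run]` patterns. The helper names `omit_patterns` and `is_omitted` are hypothetical, the example paths are made up for illustration, and `fnmatchcase` is only an approximation: coverage.py applies its own matching rules, which may differ in details.

```python
import configparser
from fnmatch import fnmatchcase
from typing import List

# Abbreviated copy of the .coveragerc above (only a few omit entries kept).
COVERAGERC = """\
[run]
branch = False
omit =
    # ignore typings and base class
    */__init__.py
    malib/envs/smarts/*
    malib/utils/*
    malib/runner.py

[report]
skip_empty = True
omit =
    malib/envs/smarts/*
    malib/settings.py
"""


def omit_patterns(text: str, section: str) -> List[str]:
    """Return the non-empty omit entries of one section (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    raw = parser.get(section, "omit", fallback="")
    # Indented lines form one multi-line value; full-line comments are dropped by the parser.
    return [line.strip() for line in raw.splitlines() if line.strip()]


def is_omitted(path: str, patterns: List[str]) -> bool:
    """Approximate glob check; coverage.py's own matcher may behave differently."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


run_omit = omit_patterns(COVERAGERC, "run")
for path in (
    "malib/agent/__init__.py",
    "malib/utils/logging.py",
    "malib/rollout/rollout_worker.py",
):
    print(path, "->", "omitted" if is_omitted(path, run_omit) else "measured")
```

Note that the entries only form a list because they are indented under `omit =`; without the indentation the parser would not treat them as continuation lines.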
3 changes: 2 additions & 1 deletion .gitignore
@@ -133,4 +133,5 @@ dmypy.json
 .idea
 _build
 logs
-demos
+demos
+prof/
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "malib/envs/smarts/_env"]
	path = malib/envs/smarts/_env
	url = https://github.com/huawei-noah/SMARTS.git
39 changes: 38 additions & 1 deletion Makefile
@@ -7,7 +7,7 @@
 #
 #.PHONY: profiling
 #profiling:
-
+# run visualization for cov_html: ruby -run -ehttpd . -p8000
 .PHONY: clean
 clean:
 	rm -rf ./logs/*
@@ -24,3 +24,40 @@ docs:
 .PHONY: rm-pycache
 rm-pycache:
 	find . -type f -name '*.py[co]' -delete -o -type d -name __pycache__ -delete
+
+.PHONY: test
+test:
+	pytest --cov-config=.coveragerc --cov=malib --cov-report html --doctest-modules tests
+	rm -f .coverage.*
+
+.PHONY: test-dataset
+test-dataset:
+	pytest -v --doctest-modules tests/dataset
+
+.PHONY: test-parameter-server
+test-parameter-server:
+	pytest -v --doctest-modules tests/paramter_server
+
+.PHONY: test-coordinator
+test-coordinator:
+	pytest -v --doctest-modules tests/coordinator
+
+.PHONY: test-backend
+test-backend: test-dataset test-parameter-server test-coordinator
+
+.PHONY: test-algorithm
+test-algorithm:
+	pytest -v --doctest-modules tests/algorithm
+
+.PHONY: test-rollout
+test-rollout:
+	pytest -v --doctest-modules tests/rollout
+
+.PHONY: test-agent
+test-agent:
+	pytest --doctest-modules tests/agent
+
+.PHONY: test-env-api
+test-env-api:
+	pytest -v --doctest-modules tests/env_api
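As a rough illustration of how the test targets relate to each other, the sketch below models a subset of them in Python: leaf targets map to the pytest arguments from their recipes, and the aggregate `test-backend` target only lists prerequisites. The names `PYTEST_ARGS`, `PREREQUISITES` and `commands` are hypothetical and exist only for this example; the sketch prints command lines and does not run anything.

```python
import shlex
from typing import Dict, List

# Leaf targets and the pytest arguments their recipes pass (copied from the Makefile).
PYTEST_ARGS: Dict[str, List[str]] = {
    "test-dataset": ["-v", "--doctest-modules", "tests/dataset"],
    "test-parameter-server": ["-v", "--doctest-modules", "tests/paramter_server"],
    "test-coordinator": ["-v", "--doctest-modules", "tests/coordinator"],
    "test-rollout": ["-v", "--doctest-modules", "tests/rollout"],
    "test-agent": ["--doctest-modules", "tests/agent"],
}

# Aggregate targets have no recipe of their own, only prerequisites.
PREREQUISITES: Dict[str, List[str]] = {
    "test-backend": ["test-dataset", "test-parameter-server", "test-coordinator"],
}


def commands(target: str) -> List[str]:
    """Return the pytest command lines that `make <target>` would lead to, in order."""
    if target in PREREQUISITES:
        expanded: List[str] = []
        for dependency in PREREQUISITES[target]:
            expanded.extend(commands(dependency))
        return expanded
    return [shlex.join(["pytest", *PYTEST_ARGS[target]])]


for line in commands("test-backend"):
    print(line)
```

Declaring every target `.PHONY` matters here because none of them produces a file named after the target, so Make should always run the recipe.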

7 changes: 7 additions & 0 deletions docs/source/api/malib.agent.indepdent_irl_agent.rst
@@ -0,0 +1,7 @@
malib.agent.indepdent\_irl\_agent module
========================================

.. automodule:: malib.agent.indepdent_irl_agent
   :members:
   :undoc-members:
   :show-inheritance:
1 change: 1 addition & 0 deletions docs/source/api/malib.agent.rst
@@ -17,4 +17,5 @@ Submodules
    malib.agent.centralized_agent
    malib.agent.ctde_agent
    malib.agent.indepdent_agent
+   malib.agent.indepdent_irl_agent
    malib.agent.sync_agent
7 changes: 7 additions & 0 deletions docs/source/api/malib.algorithm.common.reward.rst
@@ -0,0 +1,7 @@
malib.algorithm.common.reward module
====================================

.. automodule:: malib.algorithm.common.reward
   :members:
   :undoc-members:
   :show-inheritance:
1 change: 1 addition & 0 deletions docs/source/api/malib.algorithm.common.rst
@@ -16,4 +16,5 @@ Submodules
    malib.algorithm.common.misc
    malib.algorithm.common.model
    malib.algorithm.common.policy
+   malib.algorithm.common.reward
    malib.algorithm.common.trainer
7 changes: 7 additions & 0 deletions docs/source/api/malib.algorithm.discrete_sac.loss.rst
@@ -0,0 +1,7 @@
malib.algorithm.discrete\_sac.loss module
=========================================

.. automodule:: malib.algorithm.discrete_sac.loss
   :members:
   :undoc-members:
   :show-inheritance:
7 changes: 7 additions & 0 deletions docs/source/api/malib.algorithm.discrete_sac.policy.rst
@@ -0,0 +1,7 @@
malib.algorithm.discrete\_sac.policy module
===========================================

.. automodule:: malib.algorithm.discrete_sac.policy
   :members:
   :undoc-members:
   :show-inheritance:
17 changes: 17 additions & 0 deletions docs/source/api/malib.algorithm.discrete_sac.rst
@@ -0,0 +1,17 @@
malib.algorithm.discrete\_sac package
=====================================

.. automodule:: malib.algorithm.discrete_sac
   :members:
   :undoc-members:
   :show-inheritance:

Submodules
----------

.. toctree::
   :maxdepth: 2

   malib.algorithm.discrete_sac.loss
   malib.algorithm.discrete_sac.policy
   malib.algorithm.discrete_sac.trainer
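The module and package stubs in this commit all follow one pattern: a title with escaped underscores, an underline of the same length, an `automodule` directive with three options, and for packages a `toctree` of submodules. The sketch below reproduces that pattern; `module_stub` and `package_stub` are hypothetical helpers written for illustration and are not part of Sphinx or MALib. The 3-space option indentation is an assumption based on common Sphinx conventions.

```python
from typing import Iterable, List


def _heading(name: str, kind: str) -> List[str]:
    """Title line with escaped underscores, plus an underline of equal length."""
    title = name.replace("_", "\\_") + " " + kind
    return [title, "=" * len(title), ""]


def _automodule(name: str) -> List[str]:
    """The automodule directive with the three options used in the stubs above."""
    return [
        f".. automodule:: {name}",
        "   :members:",
        "   :undoc-members:",
        "   :show-inheritance:",
    ]


def module_stub(name: str) -> str:
    """Seven-line stub for a single module."""
    return "\n".join(_heading(name, "module") + _automodule(name)) + "\n"


def package_stub(name: str, submodules: Iterable[str]) -> str:
    """Package stub: the module layout followed by a toctree of submodules."""
    lines = _heading(name, "package") + _automodule(name)
    lines += ["", "Submodules", "----------", "", ".. toctree::", "   :maxdepth: 2", ""]
    lines += [f"   {name}.{sub}" for sub in sorted(submodules)]
    return "\n".join(lines) + "\n"


print(package_stub("malib.algorithm.discrete_sac", ["trainer", "loss", "policy"]))
```

With three submodules the package stub comes to 17 lines and a module stub to 7, which matches the line counts reported for the corresponding files in this diff.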
7 changes: 7 additions & 0 deletions docs/source/api/malib.algorithm.discrete_sac.trainer.rst
@@ -0,0 +1,7 @@
malib.algorithm.discrete\_sac.trainer module
============================================

.. automodule:: malib.algorithm.discrete_sac.trainer
   :members:
   :undoc-members:
   :show-inheritance:
7 changes: 7 additions & 0 deletions docs/source/api/malib.algorithm.imitation.advirl.loss.rst
@@ -0,0 +1,7 @@
malib.algorithm.imitation.advirl.loss module
============================================

.. automodule:: malib.algorithm.imitation.advirl.loss
   :members:
   :undoc-members:
   :show-inheritance:
7 changes: 7 additions & 0 deletions docs/source/api/malib.algorithm.imitation.advirl.reward.rst
@@ -0,0 +1,7 @@
malib.algorithm.imitation.advirl.reward module
==============================================

.. automodule:: malib.algorithm.imitation.advirl.reward
   :members:
   :undoc-members:
   :show-inheritance:
17 changes: 17 additions & 0 deletions docs/source/api/malib.algorithm.imitation.advirl.rst
@@ -0,0 +1,17 @@
malib.algorithm.imitation.advirl package
========================================

.. automodule:: malib.algorithm.imitation.advirl
   :members:
   :undoc-members:
   :show-inheritance:

Submodules
----------

.. toctree::
   :maxdepth: 2

   malib.algorithm.imitation.advirl.loss
   malib.algorithm.imitation.advirl.reward
   malib.algorithm.imitation.advirl.trainer
7 changes: 7 additions & 0 deletions docs/source/api/malib.algorithm.imitation.advirl.trainer.rst
@@ -0,0 +1,7 @@
malib.algorithm.imitation.advirl.trainer module
===============================================

.. automodule:: malib.algorithm.imitation.advirl.trainer
   :members:
   :undoc-members:
   :show-inheritance:
7 changes: 7 additions & 0 deletions docs/source/api/malib.algorithm.imitation.bc.loss.rst
@@ -0,0 +1,7 @@
malib.algorithm.imitation.bc.loss module
========================================

.. automodule:: malib.algorithm.imitation.bc.loss
   :members:
   :undoc-members:
   :show-inheritance:
7 changes: 7 additions & 0 deletions docs/source/api/malib.algorithm.imitation.bc.policy.rst
@@ -0,0 +1,7 @@
malib.algorithm.imitation.bc.policy module
==========================================

.. automodule:: malib.algorithm.imitation.bc.policy
   :members:
   :undoc-members:
   :show-inheritance:
17 changes: 17 additions & 0 deletions docs/source/api/malib.algorithm.imitation.bc.rst
@@ -0,0 +1,17 @@
malib.algorithm.imitation.bc package
====================================

.. automodule:: malib.algorithm.imitation.bc
   :members:
   :undoc-members:
   :show-inheritance:

Submodules
----------

.. toctree::
   :maxdepth: 2

   malib.algorithm.imitation.bc.loss
   malib.algorithm.imitation.bc.policy
   malib.algorithm.imitation.bc.trainer
7 changes: 7 additions & 0 deletions docs/source/api/malib.algorithm.imitation.bc.trainer.rst
@@ -0,0 +1,7 @@
malib.algorithm.imitation.bc.trainer module
===========================================

.. automodule:: malib.algorithm.imitation.bc.trainer
   :members:
   :undoc-members:
   :show-inheritance:
@@ -0,0 +1,7 @@
malib.algorithm.imitation.imitation\_trainer module
===================================================

.. automodule:: malib.algorithm.imitation.imitation_trainer
   :members:
   :undoc-members:
   :show-inheritance:
24 changes: 24 additions & 0 deletions docs/source/api/malib.algorithm.imitation.rst
@@ -0,0 +1,24 @@
malib.algorithm.imitation package
=================================

.. automodule:: malib.algorithm.imitation
   :members:
   :undoc-members:
   :show-inheritance:

Subpackages
-----------

.. toctree::
   :maxdepth: 2

   malib.algorithm.imitation.advirl
   malib.algorithm.imitation.bc

Submodules
----------

.. toctree::
   :maxdepth: 2

   malib.algorithm.imitation.imitation_trainer
7 changes: 7 additions & 0 deletions docs/source/api/malib.algorithm.mappo.actor_critic.rst
@@ -0,0 +1,7 @@
malib.algorithm.mappo.actor\_critic module
==========================================

.. automodule:: malib.algorithm.mappo.actor_critic
   :members:
   :undoc-members:
   :show-inheritance:
7 changes: 7 additions & 0 deletions docs/source/api/malib.algorithm.mappo.data_generator.rst
@@ -0,0 +1,7 @@
malib.algorithm.mappo.data\_generator module
============================================

.. automodule:: malib.algorithm.mappo.data_generator
   :members:
   :undoc-members:
   :show-inheritance:
7 changes: 7 additions & 0 deletions docs/source/api/malib.algorithm.mappo.loss.rst
@@ -0,0 +1,7 @@
malib.algorithm.mappo.loss module
=================================

.. automodule:: malib.algorithm.mappo.loss
   :members:
   :undoc-members:
   :show-inheritance:
7 changes: 7 additions & 0 deletions docs/source/api/malib.algorithm.mappo.policy.rst
@@ -0,0 +1,7 @@
malib.algorithm.mappo.policy module
===================================

.. automodule:: malib.algorithm.mappo.policy
   :members:
   :undoc-members:
   :show-inheritance:
21 changes: 21 additions & 0 deletions docs/source/api/malib.algorithm.mappo.rst
@@ -0,0 +1,21 @@
malib.algorithm.mappo package
=============================

.. automodule:: malib.algorithm.mappo
   :members:
   :undoc-members:
   :show-inheritance:

Submodules
----------

.. toctree::
   :maxdepth: 2

   malib.algorithm.mappo.actor_critic
   malib.algorithm.mappo.data_generator
   malib.algorithm.mappo.loss
   malib.algorithm.mappo.policy
   malib.algorithm.mappo.trainer
   malib.algorithm.mappo.utils
   malib.algorithm.mappo.vtrace
7 changes: 7 additions & 0 deletions docs/source/api/malib.algorithm.mappo.trainer.rst
@@ -0,0 +1,7 @@
malib.algorithm.mappo.trainer module
====================================

.. automodule:: malib.algorithm.mappo.trainer
   :members:
   :undoc-members:
   :show-inheritance:
7 changes: 7 additions & 0 deletions docs/source/api/malib.algorithm.mappo.utils.rst
@@ -0,0 +1,7 @@
malib.algorithm.mappo.utils module
==================================

.. automodule:: malib.algorithm.mappo.utils
   :members:
   :undoc-members:
   :show-inheritance:
7 changes: 7 additions & 0 deletions docs/source/api/malib.algorithm.mappo.vtrace.rst
@@ -0,0 +1,7 @@
malib.algorithm.mappo.vtrace module
===================================

.. automodule:: malib.algorithm.mappo.vtrace
   :members:
   :undoc-members:
   :show-inheritance:
2 changes: 1 addition & 1 deletion docs/source/api/malib.algorithm.ppo.rst
@@ -14,4 +14,4 @@ Submodules

    malib.algorithm.ppo.loss
    malib.algorithm.ppo.policy
-   malib.algorithm.ppo.ppo_trainer
+   malib.algorithm.ppo.trainer
7 changes: 7 additions & 0 deletions docs/source/api/malib.algorithm.ppo.trainer.rst
@@ -0,0 +1,7 @@
malib.algorithm.ppo.trainer module
==================================

.. automodule:: malib.algorithm.ppo.trainer
   :members:
   :undoc-members:
   :show-inheritance:
7 changes: 7 additions & 0 deletions docs/source/api/malib.algorithm.qmix.q_mixer.rst
@@ -0,0 +1,7 @@
malib.algorithm.qmix.q\_mixer module
====================================

.. automodule:: malib.algorithm.qmix.q_mixer
   :members:
   :undoc-members:
   :show-inheritance:
2 changes: 1 addition & 1 deletion docs/source/api/malib.algorithm.qmix.rst
@@ -13,5 +13,5 @@ Submodules
    :maxdepth: 2

    malib.algorithm.qmix.loss
-   malib.algorithm.qmix.policy
+   malib.algorithm.qmix.q_mixer
    malib.algorithm.qmix.trainer