Squashed commit of the following:

commit f9ef5ce Author: Yuanpu <[email protected]> Date: Fri Oct 11 03:46:52 2019 +0800 update multi process doc commit 9950d88 Author: Daochen <[email protected]> Date: Thu Oct 10 10:39:51 2019 -0500 readme commit 319e8b3 Author: Daochen <[email protected]> Date: Thu Oct 10 09:43:38 2019 -0500 fix leduc bug commit 77f2614 Author: Daochen <[email protected]> Date: Thu Oct 10 08:28:43 2019 -0500 docs and models commit 3d169ed Author: Yuanpu <[email protected]> Date: Thu Oct 10 11:40:33 2019 +0800 dqn multi-process commit b93f4ba Author: songyih <[email protected]> Date: Wed Oct 9 18:03:43 2019 -0700 pip conf commit 938c72f Merge: 1768694 e681d50 Author: songyih <[email protected]> Date: Wed Oct 9 17:02:38 2019 -0700 update version commit 09f48ef Author: songyih <[email protected]> Date: Wed Oct 9 17:01:00 2019 -0700 update version commit 99a6709 Author: Daochen <[email protected]> Date: Wed Oct 9 08:57:36 2019 -0500 docs commit 103e929 Author: Daochen <[email protected]> Date: Wed Oct 9 00:48:14 2019 -0500 cfr doc commit 7fca314 Author: Daochen <[email protected]> Date: Tue Oct 8 23:34:14 2019 -0500 setup version commit b75cbff Author: songyih <[email protected]> Date: Tue Oct 8 20:05:34 2019 -0700 learning curve commit dedb835 Author: Kwei-Herng Lai <[email protected]> Date: Tue Oct 8 19:15:25 2019 -0700 mahjong (readme) commit bcf8d0a Merge: 18b65b1 d462bf3 Author: songyih <[email protected]> Date: Tue Oct 8 18:52:36 2019 -0700 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit 2b3fb46 Author: Kwei-Herng Lai <[email protected]> Date: Tue Oct 8 18:51:59 2019 -0700 mahjong (fix table) commit 3e58c7f Author: songyih <[email protected]> Date: Tue Oct 8 18:50:50 2019 -0700 setup.py for pip hosting commit c392028 Author: Kwei-Herng Lai <[email protected]> Date: Tue Oct 8 18:50:38 2019 -0700 mahjong (table) commit 9c0ed28 Author: songyih <[email protected]> Date: Tue Oct 8 18:49:38 2019 -0700 setup.py for pip hosting commit d5aacff Author: 
Kwei-Herng Lai <[email protected]> Date: Tue Oct 8 18:47:30 2019 -0700 mahjong (unit test) commit f1481e3 Merge: 36d4837 9fdbf95 Author: Kwei-Herng Lai <[email protected]> Date: Tue Oct 8 18:42:47 2019 -0700 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit 9549556 Author: Kwei-Herng Lai <[email protected]> Date: Tue Oct 8 18:42:45 2019 -0700 mahjong (unit test) commit 81e7bb3 Author: Daochen <[email protected]> Date: Tue Oct 8 20:41:13 2019 -0500 setup commit 65dded7 Author: Daochen <[email protected]> Date: Tue Oct 8 20:03:22 2019 -0500 readme commit 5567186 Merge: 976b8f3 ff4cddc Author: Kwei-Herng Lai <[email protected]> Date: Tue Oct 8 17:54:14 2019 -0700 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit 9120a8e Author: Kwei-Herng Lai <[email protected]> Date: Tue Oct 8 17:54:11 2019 -0700 mahjong (unit test) commit f63214e Author: Daochen <[email protected]> Date: Tue Oct 8 04:28:21 2019 -0500 refine docs and codes commit 14dbf0f Author: Daochen <[email protected]> Date: Tue Oct 8 04:02:35 2019 -0500 cfr commit e01e6da Author: Yuanpu <[email protected]> Date: Tue Oct 8 04:09:57 2019 +0800 fix uno test class name commit 2836790 Author: Daochen <[email protected]> Date: Mon Oct 7 14:57:38 2019 -0500 refine mahjong commit 2f592c9 Merge: 4eb87c3 b47c55b Author: Kwei-Herng Lai <[email protected]> Date: Mon Oct 7 12:43:48 2019 -0700 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit f02675f Author: Kwei-Herng Lai <[email protected]> Date: Mon Oct 7 12:43:45 2019 -0700 mahjong (fixed bug) commit a785435 Author: Daochen <[email protected]> Date: Mon Oct 7 11:40:44 2019 -0500 refine codes, docs and stepbacks commit ea2575d Author: Yuanpu <[email protected]> Date: Mon Oct 7 11:26:49 2019 +0800 init dqn multi process commit 48e45d0 Author: Daochen <[email protected]> Date: Sun Oct 6 14:07:51 2019 -0500 update docs commit b383bd5 Author: Daochen <[email protected]> Date: Sun Oct 6 08:54:48 2019 -0500 
update docs commit 813827c Author: Daochen Zha <[email protected]> Date: Sat Oct 5 22:29:40 2019 -0500 update docs commit 100ac76 Author: Daochen Zha <[email protected]> Date: Sat Oct 5 22:23:44 2019 -0500 update docs commit e249175 Author: Daochen Zha <[email protected]> Date: Sat Oct 5 21:55:32 2019 -0500 update docs commit b5d6233 Author: Daochen <[email protected]> Date: Sat Oct 5 20:30:44 2019 -0500 update docs commit 762ef6b Author: Daochen <[email protected]> Date: Sat Oct 5 20:26:41 2019 -0500 update docs commit 01e9385 Author: Kwei-Herng Lai <[email protected]> Date: Sat Oct 5 13:53:15 2019 -0700 mahjong (with comment) commit af3a194 Merge: b6dac75 49e3477 Author: Kwei-Herng Lai <[email protected]> Date: Sat Oct 5 13:48:57 2019 -0700 mahjong (with comment) commit 94b05f8 Author: Kwei-Herng Lai <[email protected]> Date: Sat Oct 5 13:48:17 2019 -0700 mahjong (with comment) commit 43c96a8 Author: Daochen <[email protected]> Date: Sat Oct 5 13:38:49 2019 -0500 refine mahjong and doc commit 89723db Author: Daochen <[email protected]> Date: Sat Oct 5 11:40:49 2019 -0500 refine codes commit 0fb3bcd Author: Daochen <[email protected]> Date: Sat Oct 5 11:32:29 2019 -0500 nice leduc interface; uno rule agent commit bea2396 Author: Ruzhe Wei <[email protected]> Date: Sat Oct 5 21:42:31 2019 +0800 Update utils.py commit 166454e Author: Ruzhe Wei <[email protected]> Date: Sat Oct 5 21:40:26 2019 +0800 refine holdem util commit 0bba3c6 Author: Ruzhe Wei <[email protected]> Date: Sat Oct 5 21:37:24 2019 +0800 Refine commit bd44a61 Author: Ruzhe Wei <[email protected]> Date: Sat Oct 5 19:58:22 2019 +0800 refine code commit 73376c3 Author: Ruzhe Wei <[email protected]> Date: Sat Oct 5 19:55:09 2019 +0800 refine code commit c51f6cf Author: Yuanpu <[email protected]> Date: Sat Oct 5 11:45:31 2019 +0800 fix uno legal action commit 4fff8ba Merge: cb7d49a a9e1b26 Author: Kwei-Herng Lai <[email protected]> Date: Fri Oct 4 20:22:29 2019 -0700 Merge branch 'dev' of 
https://github.com/datamllab/rlcard into dev commit 9afee4e Author: Kwei-Herng Lai <[email protected]> Date: Fri Oct 4 20:22:26 2019 -0700 mahjong (with comment) commit e8c6127 Author: Yuanpu <[email protected]> Date: Sat Oct 5 10:16:08 2019 +0800 multi process commit 6e395b8 Author: Yuanpu <[email protected]> Date: Sat Oct 5 10:06:23 2019 +0800 delete print commit a42985a Author: Yuanpu <[email protected]> Date: Sat Oct 5 10:02:50 2019 +0800 fix uno legal_actions commit 81f68e4 Author: Ruzhe Wei <[email protected]> Date: Sat Oct 5 01:17:31 2019 +0800 nothing happened, just for good-looking commit 17ca3e1 Merge: 4c2b7c2 41700e8 Author: Ruzhe Wei <[email protected]> Date: Sat Oct 5 00:52:44 2019 +0800 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit 76ff800 Author: Ruzhe Wei <[email protected]> Date: Sat Oct 5 00:50:44 2019 +0800 holdem util test commit 65b6418 Author: Daochen <[email protected]> Date: Fri Oct 4 10:46:35 2019 -0500 uno test interface commit 74e1944 Author: Daochen <[email protected]> Date: Fri Oct 4 09:25:28 2019 -0500 refine codes commit 0e6cc2c Merge: f4411f0 81ccbbd Author: Daochen <[email protected]> Date: Fri Oct 4 09:04:59 2019 -0500 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit e34126e Author: Daochen <[email protected]> Date: Fri Oct 4 09:04:33 2019 -0500 initilize uno rule agent and refine limit holdem env commit a284ff4 Author: Ruzhe Wei <[email protected]> Date: Fri Oct 4 21:57:50 2019 +0800 improve efficiency commit 6d207cf Author: Ruzhe Wei <[email protected]> Date: Fri Oct 4 21:29:00 2019 +0800 improve efficiency commit f0a93fe Author: Ruzhe Wei <[email protected]> Date: Fri Oct 4 21:16:34 2019 +0800 improve efficiency commit eb54c05 Author: Ruzhe Wei <[email protected]> Date: Fri Oct 4 21:01:08 2019 +0800 Improve efficiency commit c6117b0 Author: Ruzhe Wei <[email protected]> Date: Fri Oct 4 20:02:04 2019 +0800 Bug fixed commit e109f24 Author: Kwei-Herng Lai <[email protected]> 
Date: Thu Oct 3 20:47:39 2019 -0700 mahjong (finish) commit b66d115 Merge: 8403d24 b2b504d Author: Kwei-Herng Lai <[email protected]> Date: Thu Oct 3 20:40:10 2019 -0700 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit 3c7e23b Author: Daochen <[email protected]> Date: Thu Oct 3 22:02:53 2019 -0500 update doce commit 7547a7f Author: Daochen <[email protected]> Date: Thu Oct 3 21:58:00 2019 -0500 refine codes and add docs commit ada29b5 Author: Daochen <[email protected]> Date: Thu Oct 3 17:59:08 2019 -0500 refine codes commit 6bec418 Author: Daochen <[email protected]> Date: Thu Oct 3 17:46:29 2019 -0500 Add human interface and single-agent environment for Leduc commit 79fa47d Author: Yuanpu <[email protected]> Date: Thu Oct 3 22:23:04 2019 +0800 uno doc commit 4e656d0 Merge: 0fe38d9 2dc9560 Author: Ruzhe Wei <[email protected]> Date: Thu Oct 3 15:25:14 2019 +0800 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit 558280c Author: Ruzhe Wei <[email protected]> Date: Thu Oct 3 15:25:06 2019 +0800 Update utils.py commit 5f57ebe Author: Yuanpu <[email protected]> Date: Thu Oct 3 14:33:04 2019 +0800 uno docstring commit edf644b Author: Daochen <[email protected]> Date: Wed Oct 2 13:14:13 2019 -0500 modify texas holdem feature commit 133af48 Author: Daochen <[email protected]> Date: Wed Oct 2 11:43:27 2019 -0500 deepcfr test commit 9054ffe Merge: 382e196 d618551 Author: Kwei-Herng Lai <[email protected]> Date: Wed Oct 2 09:39:22 2019 -0700 deep_cfr commit cc7fccb Author: Kwei-Herng Lai <[email protected]> Date: Wed Oct 2 09:38:30 2019 -0700 deep_cfr commit 05f8b0d Merge: c661d23 e2c055d Author: Daochen <[email protected]> Date: Wed Oct 2 11:08:58 2019 -0500 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit 9f197e9 Author: Daochen <[email protected]> Date: Wed Oct 2 11:08:35 2019 -0500 deepcfr test commit 077efbf Author: Yuanpu <[email protected]> Date: Wed Oct 2 23:57:55 2019 +0800 normal uno commit 
129e96b Author: Ruzhe Wei <[email protected]> Date: Wed Oct 2 22:33:18 2019 +0800 Update utils.py commit 09a9eef Author: Ruzhe Wei <[email protected]> Date: Wed Oct 2 18:57:57 2019 +0800 issues fixed commit 885c4f1 Author: Daochen <[email protected]> Date: Tue Oct 1 21:07:45 2019 -0500 fix blackjack commit ac99154 Author: Daochen <[email protected]> Date: Tue Oct 1 20:59:45 2019 -0500 fix limit holdem commit 18fa410 Merge: e23056c 098742f Author: Daochen <[email protected]> Date: Tue Oct 1 20:51:02 2019 -0500 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit 8494ed3 Author: Daochen <[email protected]> Date: Tue Oct 1 20:50:42 2019 -0500 fix limit holdem commit 12a11ac Author: Daochen <[email protected]> Date: Tue Oct 1 18:59:27 2019 -0500 add name to Adam commit 8beb85e Author: Daochen <[email protected]> Date: Tue Oct 1 16:49:54 2019 -0500 fix limit holdem dqn example commit a2c4beb Author: Yuanpu <[email protected]> Date: Wed Oct 2 03:54:36 2019 +0800 normal uno commit 2ace052 Merge: ca4f04b cb4a983 Author: Kwei-Herng Lai <[email protected]> Date: Tue Oct 1 12:36:26 2019 -0700 deep_cfr commit 95f597b Author: Kwei-Herng Lai <[email protected]> Date: Tue Oct 1 12:35:53 2019 -0700 deep_cfr commit f70a1fa Author: Kwei-Herng Lai <[email protected]> Date: Tue Oct 1 12:35:29 2019 -0700 deep_cfr commit d92ef74 Author: Daochen <[email protected]> Date: Tue Oct 1 12:16:29 2019 -0500 leduc test commit 1f5391e Author: Daochen <[email protected]> Date: Tue Oct 1 11:47:33 2019 -0500 Change setup commit b4ca4b1 Author: Daochen <[email protected]> Date: Tue Oct 1 11:22:15 2019 -0500 leduc test commit c0110b1 Author: Yuanpu <[email protected]> Date: Tue Oct 1 06:38:21 2019 +0800 doudizhu random multi process commit e9f5d0e Author: Yuanpu <[email protected]> Date: Tue Oct 1 01:26:04 2019 +0800 uno env test commit 4d70710 Author: Daochen <[email protected]> Date: Mon Sep 30 12:16:10 2019 -0500 uno examples commit 0cc72ad Author: Daochen <[email protected]> 
Date: Mon Sep 30 11:09:26 2019 -0500 update examples commit 34bfcca Author: Daochen <[email protected]> Date: Mon Sep 30 09:03:59 2019 -0500 fix uno state_shape commit 8e51abd Author: Daochen <[email protected]> Date: Mon Sep 30 08:50:05 2019 -0500 add state space in env commit 810fdb5 Author: Yuanpu <[email protected]> Date: Mon Sep 30 16:28:46 2019 +0800 uno test commit b1514a5 Author: Daochen <[email protected]> Date: Sun Sep 29 16:51:02 2019 -0500 update docs commit a4bd9ee Author: Daochen <[email protected]> Date: Sun Sep 29 16:47:36 2019 -0500 refine codes commit 7e1b32c Author: Daochen <[email protected]> Date: Sun Sep 29 16:39:36 2019 -0500 update docs commit 44fe921 Author: Daochen <[email protected]> Date: Sun Sep 29 16:29:34 2019 -0500 update docs commit 19646a6 Author: Daochen <[email protected]> Date: Sun Sep 29 16:19:30 2019 -0500 clean codes commit 6c63ad8 Author: Daochen <[email protected]> Date: Sun Sep 29 15:51:22 2019 -0500 Accelerate Dou Dizhu commit 5b5e389 Author: Ruzhe Wei <[email protected]> Date: Sat Sep 28 18:45:49 2019 +0800 Refine Holdem Utils commit 6266310 Author: Daochen <[email protected]> Date: Fri Sep 27 14:15:21 2019 -0500 refine nolimit test commit 93893ad Merge: 41c72ba 5a498f1 Author: Kwei-Herng Lai <[email protected]> Date: Fri Sep 27 11:46:58 2019 -0700 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit 126c86b Author: Kwei-Herng Lai <[email protected]> Date: Fri Sep 27 11:46:56 2019 -0700 mahjong (debugging)) commit 9156ffe Author: JunyuGuo <[email protected]> Date: Sat Sep 28 02:03:23 2019 +0800 Add files via upload commit 1660e1f Author: JunyuGuo <[email protected]> Date: Sat Sep 28 02:01:19 2019 +0800 Delete test_nolimitholdem_game.py commit 23b7c31 Author: Daochen Zha <[email protected]> Date: Fri Sep 27 02:50:47 2019 -0500 update docs commit d2eae49 Author: Daochen <[email protected]> Date: Fri Sep 27 02:31:19 2019 -0500 add dqn and nfsp to leduc commit 767db7c Author: Songyi Huang <[email 
protected]> Date: Thu Sep 26 20:38:24 2019 -0700 game description commit 3b4e2f5 Author: JunyuGuo <[email protected]> Date: Fri Sep 27 04:22:16 2019 +0800 Add files via upload commit 784b8da Author: JunyuGuo <[email protected]> Date: Fri Sep 27 04:20:30 2019 +0800 Add files via upload commit aea7a54 Author: Daochen Zha <[email protected]> Date: Thu Sep 26 09:28:07 2019 -0500 update docs commit 264c335 Author: Daochen Zha <[email protected]> Date: Thu Sep 26 08:48:04 2019 -0500 update docs commit 230f49b Merge: bb17e8b 06e28fa Author: Kwei-Herng Lai <[email protected]> Date: Wed Sep 25 20:42:13 2019 -0700 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit b2c325f Author: Kwei-Herng Lai <[email protected]> Date: Wed Sep 25 20:42:10 2019 -0700 mahjong (debugging)) commit 13be45f Merge: dcaf830 ccc4edc Author: songyih <[email protected]> Date: Wed Sep 25 17:05:22 2019 -0700 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit 84f9e7a Author: songyih <[email protected]> Date: Wed Sep 25 17:05:11 2019 -0700 leduc holdem env commit 8da3820 Author: Daochen <[email protected]> Date: Wed Sep 25 18:49:15 2019 -0500 refine codes commit 16426db Merge: ad53e65 7dd2647 Author: songyih <[email protected]> Date: Wed Sep 25 16:42:14 2019 -0700 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit 79cf482 Author: Daochen <[email protected]> Date: Wed Sep 25 18:34:27 2019 -0500 update test commit fe365f5 Author: Daochen <[email protected]> Date: Wed Sep 25 18:23:54 2019 -0500 update tests commit 6ecee75 Author: Daochen Zha <[email protected]> Date: Wed Sep 25 16:48:00 2019 -0500 update docs commit 4ecbebf Author: Daochen Zha <[email protected]> Date: Wed Sep 25 16:33:42 2019 -0500 update docs commit f27f730 Author: Songyi Huang <[email protected]> Date: Tue Sep 24 20:05:13 2019 -0700 leducholdem commit 44d1355 Author: Daochen <[email protected]> Date: Tue Sep 24 18:10:38 2019 -0500 no-limit examples commit b17e17c Author: 
Daochen <[email protected]> Date: Tue Sep 24 12:21:23 2019 -0500 Add timesteps and refine codes commit 5024674 Author: Yuanpu <[email protected]> Date: Wed Sep 25 00:27:46 2019 +0800 test_limitholdem_env commit 7b51d5e Author: Songyi Huang <[email protected]> Date: Mon Sep 23 20:13:13 2019 -0700 init leduc holdem commit f8046a6 Author: Daochen <[email protected]> Date: Mon Sep 23 21:51:19 2019 -0500 modify setup commit 427a92f Author: Daochen <[email protected]> Date: Mon Sep 23 21:31:23 2019 -0500 fix tests commit 03ccb13 Author: Daochen <[email protected]> Date: Mon Sep 23 21:24:15 2019 -0500 refine tests commit 72e4d4b Author: Daochen <[email protected]> Date: Mon Sep 23 20:11:11 2019 -0500 refine codes commit c48ce57 Author: Ruzhe Wei <[email protected]> Date: Tue Sep 24 08:37:09 2019 +0800 limitholdem util comments commit f7b0d4b Author: Daochen <[email protected]> Date: Mon Sep 23 17:48:57 2019 -0500 refine codes commit 5bfdd52 Author: Kwei-Herng Lai <[email protected]> Date: Mon Sep 23 15:47:09 2019 -0700 deep_cfr (stable) commit 06b144a Author: Kwei-Herng Lai <[email protected]> Date: Mon Sep 23 15:04:36 2019 -0700 deep_cfr (stable) commit 2ac02fd Merge: 820d7b7 3fc95fd Author: Kwei-Herng Lai <[email protected]> Date: Mon Sep 23 15:00:18 2019 -0700 deep_cfr (stable) commit de20483 Author: Kwei-Herng Lai <[email protected]> Date: Mon Sep 23 14:58:40 2019 -0700 deep_cfr (stable) commit 147c957 Author: Yuanpu <[email protected]> Date: Tue Sep 24 05:32:11 2019 +0800 fix uno extract_state and limitholdem test commit 6448566 Author: Daochen <[email protected]> Date: Mon Sep 23 16:16:28 2019 -0500 refine codes commit c24378f Author: Daochen <[email protected]> Date: Mon Sep 23 15:50:16 2019 -0500 refine nfsp commit 29fd24e Author: Daochen <[email protected]> Date: Mon Sep 23 15:07:19 2019 -0500 nfsp commit ac90938 Author: Songyi Huang <[email protected]> Date: Mon Sep 23 12:46:42 2019 -0700 nolimit holdem env commit b641964 Author: Songyi Huang <[email protected]> 
Date: Mon Sep 23 12:46:11 2019 -0700 nolimit holdem env commit dfc09cd Author: songyih <[email protected]> Date: Mon Sep 23 11:22:56 2019 -0700 nolimitholdem env commit 4448d14 Author: songyih <[email protected]> Date: Mon Sep 23 10:34:05 2019 -0700 todo commit dbddd96 Author: Ruzhe Wei <[email protected]> Date: Mon Sep 23 22:23:46 2019 +0800 limitholdem util comments commit f6cfccb Author: Yuanpu <[email protected]> Date: Mon Sep 23 12:58:15 2019 +0800 limitholdem unit test commit f895859 Author: Songyi Huang <[email protected]> Date: Sun Sep 22 15:29:19 2019 -0700 clean unlimit holdem commit 764d3cc Merge: 7f89904 e1d6386 Author: Songyi Huang <[email protected]> Date: Sun Sep 22 15:21:39 2019 -0700 unlimit holdem commit e831027 Author: Songyi Huang <[email protected]> Date: Sun Sep 22 15:20:44 2019 -0700 unlimit holdem commit 2823756 Author: Yuanpu <[email protected]> Date: Mon Sep 23 05:49:19 2019 +0800 restructure doudizhu commit 34f9fe3 Merge: 6354fce 4c14df6 Author: Daochen <[email protected]> Date: Sun Sep 22 12:16:14 2019 -0500 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit 25691c9 Author: Daochen <[email protected]> Date: Sun Sep 22 12:15:55 2019 -0500 limit holdem commit 925c1f7 Author: Daochen <[email protected]> Date: Sun Sep 22 12:13:58 2019 -0500 limit holdem commit cd45ae5 Author: Yuanpu <[email protected]> Date: Sun Sep 22 23:08:01 2019 +0800 doudizhu state commit d82d6fe Author: Yuanpu <[email protected]> Date: Sun Sep 22 22:58:10 2019 +0800 fix doudizhu state commit 2b8d3ca Author: Daochen <[email protected]> Date: Sun Sep 22 01:50:40 2019 -0500 refine codes commit 5e9e2b6 Author: Daochen <[email protected]> Date: Sun Sep 22 01:14:25 2019 -0500 refine codes commit 638430c Author: Daochen <[email protected]> Date: Sun Sep 22 01:04:00 2019 -0500 refine codes commit 5bd7224 Author: Yuanpu <[email protected]> Date: Sun Sep 22 13:54:34 2019 +0800 fix doudizhu bug commit ec63f2d Author: Daochen <[email protected]> Date: Sun 
Sep 22 00:51:19 2019 -0500 nfsp commit 9b8d388 Author: Yuanpu <[email protected]> Date: Sun Sep 22 11:31:18 2019 +0800 test json oder commit 773e49a Author: Yuanpu <[email protected]> Date: Sun Sep 22 11:25:54 2019 +0800 json order commit fa2ddc1 Author: Songyi Huang <[email protected]> Date: Sat Sep 21 14:06:09 2019 -0700 unlimit holdem commit c661488 Merge: 562d509 131fda3 Author: Songyi Huang <[email protected]> Date: Sat Sep 21 09:57:40 2019 -0700 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit c63942b Merge: 3a09892 23f2fa8 Author: Ruzhe Wei <[email protected]> Date: Sat Sep 21 10:43:09 2019 +0800 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit 45e8335 Author: Ruzhe Wei <[email protected]> Date: Sat Sep 21 10:42:52 2019 +0800 Update utils.py commit 8def3e4 Merge: 86ebbb5 23f2fa8 Author: Songyi Huang <[email protected]> Date: Fri Sep 20 19:02:42 2019 -0700 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit 9b90670 Author: Kwei-Herng Lai <[email protected]> Date: Fri Sep 20 17:58:21 2019 -0700 deep_cfr for legal_action (no mask yet) commit da832d5 Merge: 28ff08c 019f76d Author: Kwei-Herng Lai <[email protected]> Date: Fri Sep 20 17:51:26 2019 -0700 deep_cfr for legal_action (no mask yet) commit 615c9b8 Author: Kwei-Herng Lai <[email protected]> Date: Fri Sep 20 17:49:04 2019 -0700 deep_cfr for legal_action (no mask yet) commit 581e358 Merge: 751d1b2 3902bf2 Author: Ruzhe Wei <[email protected]> Date: Sat Sep 21 08:11:01 2019 +0800 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit 29dd036 Author: Ruzhe Wei <[email protected]> Date: Sat Sep 21 08:07:53 2019 +0800 limitholdem ultils error eliminated commit 82e7877 Merge: 5c55e59 3902bf2 Author: Kwei-Herng Lai <[email protected]> Date: Fri Sep 20 15:14:20 2019 -0700 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit 57c2140 Author: Yuanpu <[email protected]> Date: Sat Sep 21 06:13:59 2019 +0800 
doudizhu dict to list commit b7f8cb1 Author: Kwei-Herng Lai <[email protected]> Date: Fri Sep 20 15:06:51 2019 -0700 add test deepCFR2 for testing legal-action games commit c3b0884 Merge: 0facf96 94d4b88 Author: Kwei-Herng Lai <[email protected]> Date: Fri Sep 20 14:25:27 2019 -0700 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit a4dc18b Author: Kwei-Herng Lai <[email protected]> Date: Fri Sep 20 14:25:23 2019 -0700 add test deepCFR2 code quality commit 6479ba6 Author: Yuanpu <[email protected]> Date: Sat Sep 21 01:32:00 2019 +0800 fix uno bug commit 3037b5d Author: Yuanpu <[email protected]> Date: Sat Sep 21 00:55:33 2019 +0800 uno env and random example commit 41a961a Author: Daochen <[email protected]> Date: Fri Sep 20 11:41:21 2019 -0500 doudizhu legal added commit d2df6ab Author: Daochen <[email protected]> Date: Fri Sep 20 11:23:21 2019 -0500 nfsp commit 1f7f6db Merge: 7836f35 dca84fa Author: Songyi Huang <[email protected]> Date: Thu Sep 19 18:36:44 2019 -0700 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit 0597936 Author: Kwei-Herng Lai <[email protected]> Date: Thu Sep 19 13:21:01 2019 -0700 refine deepCFR code quality commit 23ff235 Author: Kwei-Herng Lai <[email protected]> Date: Thu Sep 19 13:07:53 2019 -0700 refine deepCFR code qualirt commit ccfb3c1 Author: Kwei-Herng Lai <[email protected]> Date: Thu Sep 19 12:57:23 2019 -0700 sonnet MLP for deepCFR commit b8f7f3a Merge: 17d271a 43b27ac Author: Songyi Huang <[email protected]> Date: Wed Sep 18 19:51:37 2019 -0700 Merge branch 'dev' of https://github.com/datamllab/rlcard into dev commit 3a1d5c6 Author: Songyi Huang <[email protected]> Date: Wed Sep 18 19:51:31 2019 -0700 typo commit ac38aaf Author: Daochen <[email protected]> Date: Wed Sep 18 20:03:39 2019 -0500 add legal actions commit 54965de Author: Yuanpu <[email protected]> Date: Thu Sep 19 06:21:45 2019 +0800 roughly complete uno game
MyOneTaps · Oct 10, 2019 · 4f0d6df
1 parent cdab1bc
commit 4f0d6df
Showing 151 changed files with 8,566 additions and 1,557 deletions.
diff --git a/.gitignore b/.gitignore
@@ -15,3 +15,4 @@ docs/rst
 docs/sphinx
 experiments/
 newtest/
+dist/
diff --git a/.travis.yml b/.travis.yml
@@ -2,8 +2,6 @@ language: python
 install: 
   - pip install -e .
 before_script:
-  - pip install matplotlib
-  - pip install dm-sonnet
   - pip install python-coveralls
   - pip install pytest-cover
 script: 

diff --git a/README.md b/README.md
@@ -1,52 +1,110 @@
 # RLCard: A Toolkit for Reinforcement Learning in Card Games
 [![Build Status](https://travis-ci.org/datamllab/RLCard.svg?branch=master)](https://travis-ci.org/datamllab/RLCard)
 [![Codacy Badge](https://api.codacy.com/project/badge/Grade/248eb15c086748a4bcc830755f1bd798)](https://www.codacy.com/manual/daochenzha/rlcard?utm_source=github.com&amp;utm_medium=referral&amp;utm_content=datamllab/rlcard&amp;utm_campaign=Badge_Grade)
-[![Coverage Status](https://coveralls.io/repos/github/datamllab/rlcard/badge.svg?branch=master)](https://coveralls.io/github/datamllab/rlcard?branch=master)
+[![Coverage Status](https://coveralls.io/repos/github/datamllab/rlcard/badge.svg)](https://coveralls.io/github/datamllab/rlcard?branch=master)
 
-RLCard is a opensource toolkit for developing Reinforcement Learning (RL) algorithms in card games. It supports multiple challenging card game environments with common and easy-to-use interfaces. The  goal  of  the  toolkit  is  to  enable  more  people  to  study  game  AI  and  push  forward  the  research of imperfect information games. RLCard is developed by [DATA Lab](http://faculty.cs.tamu.edu/xiahu/) at Texas A&M University. **NOTE: The project is still in final testing!**
+RLCard is a toolkit for Reinforcement Learning (RL) in card games. It supports multiple card environments with easy-to-use interfaces. The goal of RLCard is to bridge reinforcement learning and imperfect information games, and push forward the research of reinforcement learning in domains with multiple agents, large state and action space, and sparse reward. RLCard is developed by [DATA Lab](http://faculty.cs.tamu.edu/xiahu/) at Texas A&M University.
+
+*   Official Website: [http://www.rlcard.org](http://www.rlcard.org)
 
 ## Installation
-Make sure that you have **Python 3.5+** and **pip** installed. You can install `rlcard` with `pip` as follow:
-```console
+Make sure that you have **Python 3.5+** and **pip** installed. We recommend installing `rlcard` with `pip` as follow:
+
+```
 git clone https://github.com/datamllab/rlcard.git
 cd rlcard
 pip install -e .
 ```
-To check whether it is intalled correctly, try the example with random agents:
-```console
-python examples/blackjack_random.py
+
+Or you can directly install the package with
+
+```
+pip install rlcard
 ```
 
-## Getting Started
-The interfaces generally follow [OpenAI gym](https://github.com/openai/gym) style. We recommend starting with the following **toy examples**.
-* [Playing with random agents](docs/toy-examples.md#playing-with-random-agents)
-* [Deep-Q learning on Blackjack](docs/toy-examples.md#deep-q-learning-on-blackjack)
-* [DeepCFR on Blackjack](docs/toy-examples.md#deepcfr-on-blackjack)
+## Examples
+Please refer to [examples/](examples). A **short example** is as below.
+
+```python
+import rlcard
+from rlcard.agents.random_agent import RandomAgent
+
+env = rlcard.make('blackjack')
+env.set_agents([RandomAgent()])
+
+trajectories, payoffs = env.run()
+```
 
-For more examples, please refer to [examples/](examples).
+We also recommend the following **toy examples**.
+
+*   [Playing with random agents](docs/toy-examples.md#playing-with-random-agents)
+*   [Deep-Q learning on Blackjack](docs/toy-examples.md#deep-q-learning-on-blackjack)
+*   [Running multiple processes](docs/toy-examples.md#running-multiple-processes)
+*   [Having fun with pretrained Leduc model](docs/toy-examples.md#having-fun-with-pretrained-leduc-model)
+*   [Leduc Hold'em as single-agent environment](docs/toy-examples.md#leduc-holdem-as-single-agent-environment)
+*   [Training CFR on Leduc Hold'em](docs/toy-examples.md#training-cfr-on-leduc-holdem)
+
+## Demo
+Run `examples/leduc_holdem_human.py` to play with the pre-trained Leduc Hold'em model:
+
+```
+>> Leduc Hold'em pre-trained model
+
+>> Start a new game!
+>> Agent 1 chooses raise
+
+=============== Community Card ===============
+┌─────────┐
+│░░░░░░░░░│
+│░░░░░░░░░│
+│░░░░░░░░░│
+│░░░░░░░░░│
+│░░░░░░░░░│
+│░░░░░░░░░│
+│░░░░░░░░░│
+└─────────┘
+===============   Your Hand    ===============
+┌─────────┐
+│J        │
+│         │
+│         │
+│    ♥    │
+│         │
+│         │
+│        J│
+└─────────┘
+===============     Chips      ===============
+Yours:   +
+Agent 1: +++
+=========== Actions You Can Choose ===========
+0: call, 1: raise, 2: fold
+
+>> You choose action (integer):
+```
 
 ## Documents
-Please refer to the [Documents](docs/README.md) for general concepts introduction. API documents are available at our [github page](https://rlcard.github.io/index.html).
+Please refer to the [Documents](docs/README.md) for general introductions. API documents are available at our [website](http://www.rlcard.org).
 
 ## Available Environments
-The table below shows the environments that are (or will be soon) available in RLCard. We provide a complexity estimation for the games on several aspects. **InfoSet Number:** the number of information set; **Avg. InfoSet Size:** the average number of states in a single information set; **Action Size:** the size of the action space. For some of the complex card games, we can only provide a range of estimation. **Name** is the name that should be passed to `env.make` to create the game environment.
-
-| Game                                                                                                                                                                                           | InfoSet Number  | Avg. InfoSet Size | Action Size | Name            | Status    |
-| :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | :-------------: | :---------------: | :---------: | :-------------: | :-------: |
-| Blackjack ([wiki](https://en.wikipedia.org/wiki/Blackjack), [baike](https://baike.baidu.com/item/21%E7%82%B9/5481683?fr=aladdin))                                                              | 10^3            | 10^1              | 10^0        | blackjack       | Available |
-| Limit Texas Hold'em ([wiki](https://en.wikipedia.org/wiki/Texas_hold_%27em), [baike](https://baike.baidu.com/item/%E5%BE%B7%E5%85%8B%E8%90%A8%E6%96%AF%E6%89%91%E5%85%8B/83440?fr=aladdin))    | 10^14           | 10^3              | 10^0        | limit-holdem    | Available |
-| Dou Dizhu ([wiki](https://en.wikipedia.org/wiki/Dou_dizhu), [baike](https://baike.baidu.com/item/%E6%96%97%E5%9C%B0%E4%B8%BB/177997?fr=aladdin))                                               | 10^53 ~ 10^83   | 10^23             | 10^4        | doudizhu        | Available |
-| Mahjong ([wiki](https://en.wikipedia.org/wiki/Competition_Mahjong_scoring_rules), [baike](https://baike.baidu.com/item/%E9%BA%BB%E5%B0%86/215))                                                | 10^121          | 10^48             | 10^2        | -               | Come soon | 
-| No-limit Texas Hold'em ([wiki](https://en.wikipedia.org/wiki/Texas_hold_%27em), [baike](https://baike.baidu.com/item/%E5%BE%B7%E5%85%8B%E8%90%A8%E6%96%AF%E6%89%91%E5%85%8B/83440?fr=aladdin)) | 10^162          | 10^3              | 10^4        | no-limit-holdem | Available |
-| UNO ([wiki](https://en.wikipedia.org/wiki/Uno_\(card_game), [baike](https://baike.baidu.com/item/UNO%E7%89%8C/2249587))                                                                        |  10^163         | 10^10             | 10^1        | -               | Come soon |
-| Sheng Ji ([wiki](https://en.wikipedia.org/wiki/Sheng_ji), [baike](https://baike.baidu.com/item/%E5%8D%87%E7%BA%A7/3563150))                                                                    | 10^157 ~ 10^165 | 10^61             | 10^13       | -               | Come soon |
+We estimate the complexity of each game along several dimensions. **InfoSet Number:** the number of information sets; **Avg. InfoSet Size:** the average number of states in a single information set; **Action Size:** the size of the action space; **Name:** the name that should be passed to `rlcard.make` to create the game environment.
+
+| Game                                                                                                                                                                                           | InfoSet Number  | Avg. InfoSet Size | Action Size | Name            | Status     |
+| :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | :-------------: | :---------------: | :---------: | :-------------: | :--------: |
+| Blackjack ([wiki](https://en.wikipedia.org/wiki/Blackjack), [baike](https://baike.baidu.com/item/21%E7%82%B9/5481683?fr=aladdin))                                                              | 10^3            | 10^1              | 10^0        | blackjack       | Available  |
+| Leduc Hold’em                                                                                                                                                                                  | 10^2            | 10^2              | 10^0        | leduc-holdem    | Available  |
+| Limit Texas Hold'em ([wiki](https://en.wikipedia.org/wiki/Texas_hold_%27em), [baike](https://baike.baidu.com/item/%E5%BE%B7%E5%85%8B%E8%90%A8%E6%96%AF%E6%89%91%E5%85%8B/83440?fr=aladdin))    | 10^14           | 10^3              | 10^0        | limit-holdem    | Available  |
+| Dou Dizhu ([wiki](https://en.wikipedia.org/wiki/Dou_dizhu), [baike](https://baike.baidu.com/item/%E6%96%97%E5%9C%B0%E4%B8%BB/177997?fr=aladdin))                                               | 10^53 ~ 10^83   | 10^23             | 10^4        | doudizhu        | Available  |
+| Mahjong ([wiki](https://en.wikipedia.org/wiki/Competition_Mahjong_scoring_rules), [baike](https://baike.baidu.com/item/%E9%BA%BB%E5%B0%86/215))                                                | 10^121          | 10^48             | 10^2        | mahjong         | Available  | 
+| No-limit Texas Hold'em ([wiki](https://en.wikipedia.org/wiki/Texas_hold_%27em), [baike](https://baike.baidu.com/item/%E5%BE%B7%E5%85%8B%E8%90%A8%E6%96%AF%E6%89%91%E5%85%8B/83440?fr=aladdin)) | 10^162          | 10^3              | 10^4        | no-limit-holdem | Available  |
+| UNO ([wiki](https://en.wikipedia.org/wiki/Uno_\(card_game\)), [baike](https://baike.baidu.com/item/UNO%E7%89%8C/2249587))                                                                       | 10^163          | 10^10             | 10^1        | uno             | Available  |
+| Sheng Ji ([wiki](https://en.wikipedia.org/wiki/Sheng_ji), [baike](https://baike.baidu.com/item/%E5%8D%87%E7%BA%A7/3563150))                                                                    | 10^157 ~ 10^165 | 10^61             | 10^11       | -               | Developing |
 
 ## Evaluation
-We wrap a `Logger` that conveniently saves/plots the results. Example outputs are as follows:
-![Learning Curves](docs/imgs/curves.png "Learning Curves")
+Performance is measured by winning rates in tournaments. Example outputs are as follows:
+![Learning Curves](http://rlcard.org/imgs/curves.png "Learning Curves")
 
-## Disclaimer
-Please note that this is a **pre-release** version of the RLCard. The toolkit is provided "**as is**," without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and noninfringement.
+## Contributing
+Contributions to this project are greatly appreciated! Please create an issue for feedback or bug reports. If you want to contribute code, please contact [[email protected]](mailto:[email protected]) or [[email protected]]([email protected]).
 
 ## Acknowledgements
-We would like to thank JJ World Network Technology Co.,LTD for technical the support.
+We would like to thank JJ World Network Technology Co., LTD for the generous support.
diff --git a/docs/README.md b/docs/README.md
@@ -1,20 +1,24 @@
-# Overview
-The toolkit wraps each game by `Env` with easy-to-use interfaces. The goal of this toolkit is to enable the users to focus on algorithm design on challenging card games instead of developping game engines. The following design principles are applied:
-* **Simple.** We make the interfaces straightforward and simple. Users can easily run one game and obtain the statistics of the game.
-* **Consistent.** All the games are implemented following the same logical pattern. The main classes/functions of each game share the same class/function name. Users can easily understand each game and modify the rules for research purpose.
-* **Reproducible.** The results can be seeded for reproducibility purpose.
-* **Minimum Dependency.** We minimize the dependencies used in the toolkit so that the codes are easy to modify or migrate.
-* **Scalable.** New card environments can be added conveniently into RLCard with the above design principles.
+# Documents of RLCard
 
-# User Guide
-* [Toy examples](toy-examples.md)
-* [RLCard high-level design](high-level-design.md)
-* [Games in RLCard](games.md)
-* [Algorithms in RLCard](algorithms.md)
-* [Developping new algorithms](developping-algorithms.md)
+## Overview
+The toolkit wraps each game in an `Env` class with easy-to-use interfaces. The goal of the toolkit is to let users focus on algorithm development without worrying about the environment. The following design principles are applied when developing the toolkit:
+*   **Reproducible.** Results on the environments can be reproduced. The same result should be obtained with the same random seed in different runs.
+*   **Accessible.** Experiences are collected and well organized after each game with easy-to-use interfaces. Users can conveniently configure state representation, action encoding, reward design, or even the game rules.
+*   **Scalable.** New card environments can be added conveniently to the toolkit with the above design principles. We also try to minimize the dependencies of the toolkit so that the code can be easily maintained.
 
-# Developer Guide
-* [Adding new environments](adding-new-environments.md)
+## User Guide
 
-# Application Programming Interface (API)
-The API documents are and available in [github page](https://rlcard.github.io/index.html).
+*   [Toy examples](toy-examples.md)
+*   [RLCard high-level design](high-level-design.md)
+*   [Games in RLCard](games.md)
+*   [Algorithms in RLCard](algorithms.md)
+
+## Developer Guide
+
+*   [Developing new algorithms](developping-algorithms.md)
+*   [Adding new environments](adding-new-environments.md)
+*   [Customizing environments](customizing-environments.md)
+*   [Adding pre-trained/rule-based models](adding-models.md)
+
+## Application Programming Interface (API)
+The API documents are available at the [Official Website](http://www.rlcard.org).
diff --git a/docs/adding-models.md b/docs/adding-models.md
@@ -0,0 +1,7 @@
+# Adding Pre-trained/Rule-based models
+You can add your own pre-trained/rule-based models to the toolkit by following several steps:
+
+*   **Develop models.** You can either design a rule-based model or save a neural network model. For each game, you need to develop models for all the players at the same time. You need to wrap each model as a class and make sure that `step` and `eval_step` work correctly.
+*   **Wrap models.** You need to inherit the `Model` class in `rlcard/models/model.py`. Then put all the models for the players into a list. Override the `get_agent` function to return this list.
+*   **Register the model.** Register the model in `rlcard/models/__init__.py`.
+*   **Load the model in environment.** To load the model, modify `load_pretrained_models` in the corresponding game environment in `rlcard/envs`. Use the registered name to load the model.
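
The steps listed for `docs/adding-models.md` can be sketched in a self-contained way as follows. This is only an illustration under stated assumptions: the `Model` base class, `MODEL_REGISTRY`, `register`, and `load` below are stand-ins defined in the snippet itself, not the toolkit's actual code, and the state format (a dict with a `legal_actions` list) is hypothetical.

```python
import random


class Model:
    """Stand-in for the toolkit's `Model` base class (assumed interface)."""

    def get_agent(self):
        raise NotImplementedError


class RandomRuleAgent:
    """Hypothetical rule-based agent: picks uniformly among legal actions."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)

    def step(self, state):
        # Assumes `state` is a dict carrying a list of legal action ids.
        return self.rng.choice(state['legal_actions'])

    def eval_step(self, state):
        # A rule-based agent has no exploration to switch off,
        # so evaluation behaves exactly like `step`.
        return self.step(state)


class TwoPlayerRuleModel(Model):
    """Wraps one agent per player and returns them as a list."""

    def __init__(self, player_num=2):
        self.agents = [RandomRuleAgent(seed=i) for i in range(player_num)]

    def get_agent(self):
        return self.agents


# Hypothetical registry mapping a registered name to a model class.
MODEL_REGISTRY = {}


def register(name, model_cls):
    if name in MODEL_REGISTRY:
        raise ValueError('model already registered: ' + name)
    MODEL_REGISTRY[name] = model_cls


def load(name):
    # Instantiate the model class stored under the registered name.
    return MODEL_REGISTRY[name]()


register('toy-rule-v1', TwoPlayerRuleModel)
agents = load('toy-rule-v1').get_agent()
action = agents[0].eval_step({'legal_actions': [0, 1, 2]})
```

An environment would then call something like `load` with the registered name inside its `load_pretrained_models` and assign the returned agents to the opponent seats.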
diff --git a/docs/adding-new-environments.md b/docs/adding-new-environments.md
@@ -1,11 +1,11 @@
 # Adding New Environments
 To add a new environment to the toolkit, generally you should take the following steps:
-* **Implement a game.** Card games usually have similar structures so that they can be implemented with `Game`, `Round`, `Dealer`, `Judger`, `Player` as in existing games. The easiest way is to inherit the classed in [rlcard/core.py](rlcard/core.py) and implement the functions.
-* **Wrap the game with an environment.** The easiest way is to inherit `Env` in [rlcard/envs/env.py](rlcard/env/env.py). You need to implement `extract_state` which encodes the state, `decode_action` which decode actions from the id to the text string, and `get_payoffs` which calculate payoffs of the players.
-* **Register the game.** Now it is time to tell the toolkit where to locate the new environment. Go to [rlcard/envs/__init__.py](rlcard/envs/__init__.py), and indicate the name of the game and its entry point.
+*   **Implement a game.** Card games usually have similar structures, so they can be implemented with `Game`, `Round`, `Dealer`, `Judger`, and `Player` classes, as in existing games. The easiest way is to inherit the classes in [rlcard/core.py](../rlcard/core.py) and implement the functions.
+*   **Wrap the game with an environment.** The easiest way is to inherit `Env` in [rlcard/envs/env.py](../rlcard/envs/env.py). You need to implement `extract_state`, which encodes the state, `decode_action`, which decodes actions from the id to the text string, and `get_payoffs`, which calculates the payoffs of the players.
+*   **Register the game.** Now it is time to tell the toolkit where to locate the new environment. Go to [rlcard/envs/\_\_init\_\_.py](../rlcard/envs/__init__.py), and indicate the name of the game and its entry point.
 
 To test whether the new environment is set up successfully:
 ```python
 import rlcard
-env.make(#the new evironment#)
+rlcard.make('new-env-name')  # replace with the registered name of the new environment
 ```
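
The wrap-and-register steps in `docs/adding-new-environments.md` might look roughly like the sketch below. It is self-contained and does not import the toolkit: the `Env` base class, `ENV_REGISTRY`, `register`, and `make` are stand-ins defined here, and `HighCardEnv` is a hypothetical toy game invented for illustration. For simplicity the registry stores the class directly rather than an entry-point string.

```python
class Env:
    """Stand-in for the toolkit's `Env` base class (assumed interface)."""

    def extract_state(self, state):
        raise NotImplementedError

    def decode_action(self, action_id):
        raise NotImplementedError

    def get_payoffs(self):
        raise NotImplementedError


class HighCardEnv(Env):
    """Hypothetical toy game: each player holds one card, higher rank wins."""

    ACTIONS = ['call', 'fold']

    def __init__(self, hands):
        self.hands = hands  # one card rank (2..14) per player, e.g. [7, 11]

    def extract_state(self, state):
        # Encode the raw state as a one-hot vector over the 13 ranks 2..14.
        obs = [0] * 13
        obs[state['rank'] - 2] = 1
        return obs

    def decode_action(self, action_id):
        # Map an action id back to its text form.
        return self.ACTIONS[action_id]

    def get_payoffs(self):
        # Zero-sum payoffs: +1 to the higher card, -1 to the lower, 0 on a tie.
        a, b = self.hands
        if a == b:
            return [0, 0]
        return [1, -1] if a > b else [-1, 1]


# Hypothetical registry mapping an environment name to its class.
ENV_REGISTRY = {}


def register(name, env_cls):
    ENV_REGISTRY[name] = env_cls


def make(name, **kwargs):
    if name not in ENV_REGISTRY:
        raise KeyError('unknown environment: ' + name)
    return ENV_REGISTRY[name](**kwargs)


register('high-card', HighCardEnv)
env = make('high-card', hands=[7, 11])
```

Keeping the three methods separate means the encoding of states, the action vocabulary, and the reward design can each be changed without touching the game logic.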