Squashed commit of the following:
commit f9ef5ce
Author: Yuanpu <[email protected]>
Date:   Fri Oct 11 03:46:52 2019 +0800

    update multi process doc

commit 9950d88
Author: Daochen <[email protected]>
Date:   Thu Oct 10 10:39:51 2019 -0500

    readme

commit 319e8b3
Author: Daochen <[email protected]>
Date:   Thu Oct 10 09:43:38 2019 -0500

    fix leduc bug

commit 77f2614
Author: Daochen <[email protected]>
Date:   Thu Oct 10 08:28:43 2019 -0500

    docs and models

commit 3d169ed
Author: Yuanpu <[email protected]>
Date:   Thu Oct 10 11:40:33 2019 +0800

    dqn multi-process

commit b93f4ba
Author: songyih <[email protected]>
Date:   Wed Oct 9 18:03:43 2019 -0700

    pip conf

commit 938c72f
Merge: 1768694 e681d50
Author: songyih <[email protected]>
Date:   Wed Oct 9 17:02:38 2019 -0700

    update version

commit 09f48ef
Author: songyih <[email protected]>
Date:   Wed Oct 9 17:01:00 2019 -0700

    update version

commit 99a6709
Author: Daochen <[email protected]>
Date:   Wed Oct 9 08:57:36 2019 -0500

    docs

commit 103e929
Author: Daochen <[email protected]>
Date:   Wed Oct 9 00:48:14 2019 -0500

    cfr doc

commit 7fca314
Author: Daochen <[email protected]>
Date:   Tue Oct 8 23:34:14 2019 -0500

    setup version

commit b75cbff
Author: songyih <[email protected]>
Date:   Tue Oct 8 20:05:34 2019 -0700

    learning curve

commit dedb835
Author: Kwei-Herng Lai <[email protected]>
Date:   Tue Oct 8 19:15:25 2019 -0700

    mahjong (readme)

commit bcf8d0a
Merge: 18b65b1 d462bf3
Author: songyih <[email protected]>
Date:   Tue Oct 8 18:52:36 2019 -0700

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit 2b3fb46
Author: Kwei-Herng Lai <[email protected]>
Date:   Tue Oct 8 18:51:59 2019 -0700

    mahjong (fix table)

commit 3e58c7f
Author: songyih <[email protected]>
Date:   Tue Oct 8 18:50:50 2019 -0700

    setup.py for pip hosting

commit c392028
Author: Kwei-Herng Lai <[email protected]>
Date:   Tue Oct 8 18:50:38 2019 -0700

    mahjong (table)

commit 9c0ed28
Author: songyih <[email protected]>
Date:   Tue Oct 8 18:49:38 2019 -0700

    setup.py for pip hosting

commit d5aacff
Author: Kwei-Herng Lai <[email protected]>
Date:   Tue Oct 8 18:47:30 2019 -0700

    mahjong (unit test)

commit f1481e3
Merge: 36d4837 9fdbf95
Author: Kwei-Herng Lai <[email protected]>
Date:   Tue Oct 8 18:42:47 2019 -0700

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit 9549556
Author: Kwei-Herng Lai <[email protected]>
Date:   Tue Oct 8 18:42:45 2019 -0700

    mahjong (unit test)

commit 81e7bb3
Author: Daochen <[email protected]>
Date:   Tue Oct 8 20:41:13 2019 -0500

    setup

commit 65dded7
Author: Daochen <[email protected]>
Date:   Tue Oct 8 20:03:22 2019 -0500

    readme

commit 5567186
Merge: 976b8f3 ff4cddc
Author: Kwei-Herng Lai <[email protected]>
Date:   Tue Oct 8 17:54:14 2019 -0700

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit 9120a8e
Author: Kwei-Herng Lai <[email protected]>
Date:   Tue Oct 8 17:54:11 2019 -0700

    mahjong (unit test)

commit f63214e
Author: Daochen <[email protected]>
Date:   Tue Oct 8 04:28:21 2019 -0500

    refine docs and codes

commit 14dbf0f
Author: Daochen <[email protected]>
Date:   Tue Oct 8 04:02:35 2019 -0500

    cfr

commit e01e6da
Author: Yuanpu <[email protected]>
Date:   Tue Oct 8 04:09:57 2019 +0800

    fix uno test class name

commit 2836790
Author: Daochen <[email protected]>
Date:   Mon Oct 7 14:57:38 2019 -0500

    refine mahjong

commit 2f592c9
Merge: 4eb87c3 b47c55b
Author: Kwei-Herng Lai <[email protected]>
Date:   Mon Oct 7 12:43:48 2019 -0700

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit f02675f
Author: Kwei-Herng Lai <[email protected]>
Date:   Mon Oct 7 12:43:45 2019 -0700

    mahjong (fixed bug)

commit a785435
Author: Daochen <[email protected]>
Date:   Mon Oct 7 11:40:44 2019 -0500

    refine codes, docs and stepbacks

commit ea2575d
Author: Yuanpu <[email protected]>
Date:   Mon Oct 7 11:26:49 2019 +0800

    init dqn multi process

commit 48e45d0
Author: Daochen <[email protected]>
Date:   Sun Oct 6 14:07:51 2019 -0500

    update docs

commit b383bd5
Author: Daochen <[email protected]>
Date:   Sun Oct 6 08:54:48 2019 -0500

    update docs

commit 813827c
Author: Daochen Zha <[email protected]>
Date:   Sat Oct 5 22:29:40 2019 -0500

    update docs

commit 100ac76
Author: Daochen Zha <[email protected]>
Date:   Sat Oct 5 22:23:44 2019 -0500

    update docs

commit e249175
Author: Daochen Zha <[email protected]>
Date:   Sat Oct 5 21:55:32 2019 -0500

    update docs

commit b5d6233
Author: Daochen <[email protected]>
Date:   Sat Oct 5 20:30:44 2019 -0500

    update docs

commit 762ef6b
Author: Daochen <[email protected]>
Date:   Sat Oct 5 20:26:41 2019 -0500

    update docs

commit 01e9385
Author: Kwei-Herng Lai <[email protected]>
Date:   Sat Oct 5 13:53:15 2019 -0700

    mahjong (with comment)

commit af3a194
Merge: b6dac75 49e3477
Author: Kwei-Herng Lai <[email protected]>
Date:   Sat Oct 5 13:48:57 2019 -0700

    mahjong (with comment)

commit 94b05f8
Author: Kwei-Herng Lai <[email protected]>
Date:   Sat Oct 5 13:48:17 2019 -0700

    mahjong (with comment)

commit 43c96a8
Author: Daochen <[email protected]>
Date:   Sat Oct 5 13:38:49 2019 -0500

    refine mahjong and doc

commit 89723db
Author: Daochen <[email protected]>
Date:   Sat Oct 5 11:40:49 2019 -0500

    refine codes

commit 0fb3bcd
Author: Daochen <[email protected]>
Date:   Sat Oct 5 11:32:29 2019 -0500

    nice leduc interface; uno rule agent

commit bea2396
Author: Ruzhe Wei <[email protected]>
Date:   Sat Oct 5 21:42:31 2019 +0800

    Update utils.py

commit 166454e
Author: Ruzhe Wei <[email protected]>
Date:   Sat Oct 5 21:40:26 2019 +0800

    refine holdem util

commit 0bba3c6
Author: Ruzhe Wei <[email protected]>
Date:   Sat Oct 5 21:37:24 2019 +0800

    Refine

commit bd44a61
Author: Ruzhe Wei <[email protected]>
Date:   Sat Oct 5 19:58:22 2019 +0800

    refine code

commit 73376c3
Author: Ruzhe Wei <[email protected]>
Date:   Sat Oct 5 19:55:09 2019 +0800

    refine code

commit c51f6cf
Author: Yuanpu <[email protected]>
Date:   Sat Oct 5 11:45:31 2019 +0800

    fix uno legal action

commit 4fff8ba
Merge: cb7d49a a9e1b26
Author: Kwei-Herng Lai <[email protected]>
Date:   Fri Oct 4 20:22:29 2019 -0700

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit 9afee4e
Author: Kwei-Herng Lai <[email protected]>
Date:   Fri Oct 4 20:22:26 2019 -0700

    mahjong (with comment)

commit e8c6127
Author: Yuanpu <[email protected]>
Date:   Sat Oct 5 10:16:08 2019 +0800

    multi process

commit 6e395b8
Author: Yuanpu <[email protected]>
Date:   Sat Oct 5 10:06:23 2019 +0800

    delete print

commit a42985a
Author: Yuanpu <[email protected]>
Date:   Sat Oct 5 10:02:50 2019 +0800

    fix uno legal_actions

commit 81f68e4
Author: Ruzhe Wei <[email protected]>
Date:   Sat Oct 5 01:17:31 2019 +0800

    nothing happened, just for good-looking

commit 17ca3e1
Merge: 4c2b7c2 41700e8
Author: Ruzhe Wei <[email protected]>
Date:   Sat Oct 5 00:52:44 2019 +0800

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit 76ff800
Author: Ruzhe Wei <[email protected]>
Date:   Sat Oct 5 00:50:44 2019 +0800

    holdem util test

commit 65b6418
Author: Daochen <[email protected]>
Date:   Fri Oct 4 10:46:35 2019 -0500

    uno test interface

commit 74e1944
Author: Daochen <[email protected]>
Date:   Fri Oct 4 09:25:28 2019 -0500

    refine codes

commit 0e6cc2c
Merge: f4411f0 81ccbbd
Author: Daochen <[email protected]>
Date:   Fri Oct 4 09:04:59 2019 -0500

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit e34126e
Author: Daochen <[email protected]>
Date:   Fri Oct 4 09:04:33 2019 -0500

    initilize uno rule agent and refine limit holdem env

commit a284ff4
Author: Ruzhe Wei <[email protected]>
Date:   Fri Oct 4 21:57:50 2019 +0800

    improve efficiency

commit 6d207cf
Author: Ruzhe Wei <[email protected]>
Date:   Fri Oct 4 21:29:00 2019 +0800

    improve efficiency

commit f0a93fe
Author: Ruzhe Wei <[email protected]>
Date:   Fri Oct 4 21:16:34 2019 +0800

    improve efficiency

commit eb54c05
Author: Ruzhe Wei <[email protected]>
Date:   Fri Oct 4 21:01:08 2019 +0800

    Improve efficiency

commit c6117b0
Author: Ruzhe Wei <[email protected]>
Date:   Fri Oct 4 20:02:04 2019 +0800

    Bug fixed

commit e109f24
Author: Kwei-Herng Lai <[email protected]>
Date:   Thu Oct 3 20:47:39 2019 -0700

    mahjong (finish)

commit b66d115
Merge: 8403d24 b2b504d
Author: Kwei-Herng Lai <[email protected]>
Date:   Thu Oct 3 20:40:10 2019 -0700

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit 3c7e23b
Author: Daochen <[email protected]>
Date:   Thu Oct 3 22:02:53 2019 -0500

    update doce

commit 7547a7f
Author: Daochen <[email protected]>
Date:   Thu Oct 3 21:58:00 2019 -0500

    refine codes and add docs

commit ada29b5
Author: Daochen <[email protected]>
Date:   Thu Oct 3 17:59:08 2019 -0500

    refine codes

commit 6bec418
Author: Daochen <[email protected]>
Date:   Thu Oct 3 17:46:29 2019 -0500

    Add human interface and single-agent environment for Leduc

commit 79fa47d
Author: Yuanpu <[email protected]>
Date:   Thu Oct 3 22:23:04 2019 +0800

    uno doc

commit 4e656d0
Merge: 0fe38d9 2dc9560
Author: Ruzhe Wei <[email protected]>
Date:   Thu Oct 3 15:25:14 2019 +0800

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit 558280c
Author: Ruzhe Wei <[email protected]>
Date:   Thu Oct 3 15:25:06 2019 +0800

    Update utils.py

commit 5f57ebe
Author: Yuanpu <[email protected]>
Date:   Thu Oct 3 14:33:04 2019 +0800

    uno docstring

commit edf644b
Author: Daochen <[email protected]>
Date:   Wed Oct 2 13:14:13 2019 -0500

    modify texas holdem feature

commit 133af48
Author: Daochen <[email protected]>
Date:   Wed Oct 2 11:43:27 2019 -0500

    deepcfr test

commit 9054ffe
Merge: 382e196 d618551
Author: Kwei-Herng Lai <[email protected]>
Date:   Wed Oct 2 09:39:22 2019 -0700

    deep_cfr

commit cc7fccb
Author: Kwei-Herng Lai <[email protected]>
Date:   Wed Oct 2 09:38:30 2019 -0700

    deep_cfr

commit 05f8b0d
Merge: c661d23 e2c055d
Author: Daochen <[email protected]>
Date:   Wed Oct 2 11:08:58 2019 -0500

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit 9f197e9
Author: Daochen <[email protected]>
Date:   Wed Oct 2 11:08:35 2019 -0500

    deepcfr test

commit 077efbf
Author: Yuanpu <[email protected]>
Date:   Wed Oct 2 23:57:55 2019 +0800

    normal uno

commit 129e96b
Author: Ruzhe Wei <[email protected]>
Date:   Wed Oct 2 22:33:18 2019 +0800

    Update utils.py

commit 09a9eef
Author: Ruzhe Wei <[email protected]>
Date:   Wed Oct 2 18:57:57 2019 +0800

    issues fixed

commit 885c4f1
Author: Daochen <[email protected]>
Date:   Tue Oct 1 21:07:45 2019 -0500

    fix blackjack

commit ac99154
Author: Daochen <[email protected]>
Date:   Tue Oct 1 20:59:45 2019 -0500

    fix limit holdem

commit 18fa410
Merge: e23056c 098742f
Author: Daochen <[email protected]>
Date:   Tue Oct 1 20:51:02 2019 -0500

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit 8494ed3
Author: Daochen <[email protected]>
Date:   Tue Oct 1 20:50:42 2019 -0500

    fix limit holdem

commit 12a11ac
Author: Daochen <[email protected]>
Date:   Tue Oct 1 18:59:27 2019 -0500

    add name to Adam

commit 8beb85e
Author: Daochen <[email protected]>
Date:   Tue Oct 1 16:49:54 2019 -0500

    fix limit holdem dqn example

commit a2c4beb
Author: Yuanpu <[email protected]>
Date:   Wed Oct 2 03:54:36 2019 +0800

    normal uno

commit 2ace052
Merge: ca4f04b cb4a983
Author: Kwei-Herng Lai <[email protected]>
Date:   Tue Oct 1 12:36:26 2019 -0700

    deep_cfr

commit 95f597b
Author: Kwei-Herng Lai <[email protected]>
Date:   Tue Oct 1 12:35:53 2019 -0700

    deep_cfr

commit f70a1fa
Author: Kwei-Herng Lai <[email protected]>
Date:   Tue Oct 1 12:35:29 2019 -0700

    deep_cfr

commit d92ef74
Author: Daochen <[email protected]>
Date:   Tue Oct 1 12:16:29 2019 -0500

    leduc test

commit 1f5391e
Author: Daochen <[email protected]>
Date:   Tue Oct 1 11:47:33 2019 -0500

    Change setup

commit b4ca4b1
Author: Daochen <[email protected]>
Date:   Tue Oct 1 11:22:15 2019 -0500

    leduc test

commit c0110b1
Author: Yuanpu <[email protected]>
Date:   Tue Oct 1 06:38:21 2019 +0800

    doudizhu random multi process

commit e9f5d0e
Author: Yuanpu <[email protected]>
Date:   Tue Oct 1 01:26:04 2019 +0800

    uno env test

commit 4d70710
Author: Daochen <[email protected]>
Date:   Mon Sep 30 12:16:10 2019 -0500

    uno examples

commit 0cc72ad
Author: Daochen <[email protected]>
Date:   Mon Sep 30 11:09:26 2019 -0500

    update examples

commit 34bfcca
Author: Daochen <[email protected]>
Date:   Mon Sep 30 09:03:59 2019 -0500

    fix uno state_shape

commit 8e51abd
Author: Daochen <[email protected]>
Date:   Mon Sep 30 08:50:05 2019 -0500

    add state space in env

commit 810fdb5
Author: Yuanpu <[email protected]>
Date:   Mon Sep 30 16:28:46 2019 +0800

    uno test

commit b1514a5
Author: Daochen <[email protected]>
Date:   Sun Sep 29 16:51:02 2019 -0500

    update docs

commit a4bd9ee
Author: Daochen <[email protected]>
Date:   Sun Sep 29 16:47:36 2019 -0500

    refine codes

commit 7e1b32c
Author: Daochen <[email protected]>
Date:   Sun Sep 29 16:39:36 2019 -0500

    update docs

commit 44fe921
Author: Daochen <[email protected]>
Date:   Sun Sep 29 16:29:34 2019 -0500

    update docs

commit 19646a6
Author: Daochen <[email protected]>
Date:   Sun Sep 29 16:19:30 2019 -0500

    clean codes

commit 6c63ad8
Author: Daochen <[email protected]>
Date:   Sun Sep 29 15:51:22 2019 -0500

    Accelerate Dou Dizhu

commit 5b5e389
Author: Ruzhe Wei <[email protected]>
Date:   Sat Sep 28 18:45:49 2019 +0800

    Refine Holdem Utils

commit 6266310
Author: Daochen <[email protected]>
Date:   Fri Sep 27 14:15:21 2019 -0500

    refine nolimit test

commit 93893ad
Merge: 41c72ba 5a498f1
Author: Kwei-Herng Lai <[email protected]>
Date:   Fri Sep 27 11:46:58 2019 -0700

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit 126c86b
Author: Kwei-Herng Lai <[email protected]>
Date:   Fri Sep 27 11:46:56 2019 -0700

    mahjong (debugging))

commit 9156ffe
Author: JunyuGuo <[email protected]>
Date:   Sat Sep 28 02:03:23 2019 +0800

    Add files via upload

commit 1660e1f
Author: JunyuGuo <[email protected]>
Date:   Sat Sep 28 02:01:19 2019 +0800

    Delete test_nolimitholdem_game.py

commit 23b7c31
Author: Daochen Zha <[email protected]>
Date:   Fri Sep 27 02:50:47 2019 -0500

    update docs

commit d2eae49
Author: Daochen <[email protected]>
Date:   Fri Sep 27 02:31:19 2019 -0500

    add dqn and nfsp to leduc

commit 767db7c
Author: Songyi Huang <[email protected]>
Date:   Thu Sep 26 20:38:24 2019 -0700

    game description

commit 3b4e2f5
Author: JunyuGuo <[email protected]>
Date:   Fri Sep 27 04:22:16 2019 +0800

    Add files via upload

commit 784b8da
Author: JunyuGuo <[email protected]>
Date:   Fri Sep 27 04:20:30 2019 +0800

    Add files via upload

commit aea7a54
Author: Daochen Zha <[email protected]>
Date:   Thu Sep 26 09:28:07 2019 -0500

    update docs

commit 264c335
Author: Daochen Zha <[email protected]>
Date:   Thu Sep 26 08:48:04 2019 -0500

    update docs

commit 230f49b
Merge: bb17e8b 06e28fa
Author: Kwei-Herng Lai <[email protected]>
Date:   Wed Sep 25 20:42:13 2019 -0700

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit b2c325f
Author: Kwei-Herng Lai <[email protected]>
Date:   Wed Sep 25 20:42:10 2019 -0700

    mahjong (debugging))

commit 13be45f
Merge: dcaf830 ccc4edc
Author: songyih <[email protected]>
Date:   Wed Sep 25 17:05:22 2019 -0700

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit 84f9e7a
Author: songyih <[email protected]>
Date:   Wed Sep 25 17:05:11 2019 -0700

    leduc holdem env

commit 8da3820
Author: Daochen <[email protected]>
Date:   Wed Sep 25 18:49:15 2019 -0500

    refine codes

commit 16426db
Merge: ad53e65 7dd2647
Author: songyih <[email protected]>
Date:   Wed Sep 25 16:42:14 2019 -0700

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit 79cf482
Author: Daochen <[email protected]>
Date:   Wed Sep 25 18:34:27 2019 -0500

    update test

commit fe365f5
Author: Daochen <[email protected]>
Date:   Wed Sep 25 18:23:54 2019 -0500

    update tests

commit 6ecee75
Author: Daochen Zha <[email protected]>
Date:   Wed Sep 25 16:48:00 2019 -0500

    update docs

commit 4ecbebf
Author: Daochen Zha <[email protected]>
Date:   Wed Sep 25 16:33:42 2019 -0500

    update docs

commit f27f730
Author: Songyi Huang <[email protected]>
Date:   Tue Sep 24 20:05:13 2019 -0700

    leducholdem

commit 44d1355
Author: Daochen <[email protected]>
Date:   Tue Sep 24 18:10:38 2019 -0500

    no-limit examples

commit b17e17c
Author: Daochen <[email protected]>
Date:   Tue Sep 24 12:21:23 2019 -0500

    Add timesteps and refine codes

commit 5024674
Author: Yuanpu <[email protected]>
Date:   Wed Sep 25 00:27:46 2019 +0800

    test_limitholdem_env

commit 7b51d5e
Author: Songyi Huang <[email protected]>
Date:   Mon Sep 23 20:13:13 2019 -0700

    init leduc holdem

commit f8046a6
Author: Daochen <[email protected]>
Date:   Mon Sep 23 21:51:19 2019 -0500

    modify setup

commit 427a92f
Author: Daochen <[email protected]>
Date:   Mon Sep 23 21:31:23 2019 -0500

    fix tests

commit 03ccb13
Author: Daochen <[email protected]>
Date:   Mon Sep 23 21:24:15 2019 -0500

    refine tests

commit 72e4d4b
Author: Daochen <[email protected]>
Date:   Mon Sep 23 20:11:11 2019 -0500

    refine codes

commit c48ce57
Author: Ruzhe Wei <[email protected]>
Date:   Tue Sep 24 08:37:09 2019 +0800

    limitholdem util comments

commit f7b0d4b
Author: Daochen <[email protected]>
Date:   Mon Sep 23 17:48:57 2019 -0500

    refine codes

commit 5bfdd52
Author: Kwei-Herng Lai <[email protected]>
Date:   Mon Sep 23 15:47:09 2019 -0700

    deep_cfr (stable)

commit 06b144a
Author: Kwei-Herng Lai <[email protected]>
Date:   Mon Sep 23 15:04:36 2019 -0700

    deep_cfr (stable)

commit 2ac02fd
Merge: 820d7b7 3fc95fd
Author: Kwei-Herng Lai <[email protected]>
Date:   Mon Sep 23 15:00:18 2019 -0700

    deep_cfr (stable)

commit de20483
Author: Kwei-Herng Lai <[email protected]>
Date:   Mon Sep 23 14:58:40 2019 -0700

    deep_cfr (stable)

commit 147c957
Author: Yuanpu <[email protected]>
Date:   Tue Sep 24 05:32:11 2019 +0800

    fix uno extract_state and limitholdem test

commit 6448566
Author: Daochen <[email protected]>
Date:   Mon Sep 23 16:16:28 2019 -0500

    refine codes

commit c24378f
Author: Daochen <[email protected]>
Date:   Mon Sep 23 15:50:16 2019 -0500

    refine nfsp

commit 29fd24e
Author: Daochen <[email protected]>
Date:   Mon Sep 23 15:07:19 2019 -0500

    nfsp

commit ac90938
Author: Songyi Huang <[email protected]>
Date:   Mon Sep 23 12:46:42 2019 -0700

    nolimit holdem env

commit b641964
Author: Songyi Huang <[email protected]>
Date:   Mon Sep 23 12:46:11 2019 -0700

    nolimit holdem env

commit dfc09cd
Author: songyih <[email protected]>
Date:   Mon Sep 23 11:22:56 2019 -0700

    nolimitholdem env

commit 4448d14
Author: songyih <[email protected]>
Date:   Mon Sep 23 10:34:05 2019 -0700

    todo

commit dbddd96
Author: Ruzhe Wei <[email protected]>
Date:   Mon Sep 23 22:23:46 2019 +0800

    limitholdem util comments

commit f6cfccb
Author: Yuanpu <[email protected]>
Date:   Mon Sep 23 12:58:15 2019 +0800

    limitholdem unit test

commit f895859
Author: Songyi Huang <[email protected]>
Date:   Sun Sep 22 15:29:19 2019 -0700

    clean unlimit holdem

commit 764d3cc
Merge: 7f89904 e1d6386
Author: Songyi Huang <[email protected]>
Date:   Sun Sep 22 15:21:39 2019 -0700

    unlimit holdem

commit e831027
Author: Songyi Huang <[email protected]>
Date:   Sun Sep 22 15:20:44 2019 -0700

    unlimit holdem

commit 2823756
Author: Yuanpu <[email protected]>
Date:   Mon Sep 23 05:49:19 2019 +0800

    restructure doudizhu

commit 34f9fe3
Merge: 6354fce 4c14df6
Author: Daochen <[email protected]>
Date:   Sun Sep 22 12:16:14 2019 -0500

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit 25691c9
Author: Daochen <[email protected]>
Date:   Sun Sep 22 12:15:55 2019 -0500

    limit holdem

commit 925c1f7
Author: Daochen <[email protected]>
Date:   Sun Sep 22 12:13:58 2019 -0500

    limit holdem

commit cd45ae5
Author: Yuanpu <[email protected]>
Date:   Sun Sep 22 23:08:01 2019 +0800

    doudizhu state

commit d82d6fe
Author: Yuanpu <[email protected]>
Date:   Sun Sep 22 22:58:10 2019 +0800

    fix doudizhu state

commit 2b8d3ca
Author: Daochen <[email protected]>
Date:   Sun Sep 22 01:50:40 2019 -0500

    refine codes

commit 5e9e2b6
Author: Daochen <[email protected]>
Date:   Sun Sep 22 01:14:25 2019 -0500

    refine codes

commit 638430c
Author: Daochen <[email protected]>
Date:   Sun Sep 22 01:04:00 2019 -0500

    refine codes

commit 5bd7224
Author: Yuanpu <[email protected]>
Date:   Sun Sep 22 13:54:34 2019 +0800

    fix doudizhu bug

commit ec63f2d
Author: Daochen <[email protected]>
Date:   Sun Sep 22 00:51:19 2019 -0500

    nfsp

commit 9b8d388
Author: Yuanpu <[email protected]>
Date:   Sun Sep 22 11:31:18 2019 +0800

    test json oder

commit 773e49a
Author: Yuanpu <[email protected]>
Date:   Sun Sep 22 11:25:54 2019 +0800

    json order

commit fa2ddc1
Author: Songyi Huang <[email protected]>
Date:   Sat Sep 21 14:06:09 2019 -0700

    unlimit holdem

commit c661488
Merge: 562d509 131fda3
Author: Songyi Huang <[email protected]>
Date:   Sat Sep 21 09:57:40 2019 -0700

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit c63942b
Merge: 3a09892 23f2fa8
Author: Ruzhe Wei <[email protected]>
Date:   Sat Sep 21 10:43:09 2019 +0800

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit 45e8335
Author: Ruzhe Wei <[email protected]>
Date:   Sat Sep 21 10:42:52 2019 +0800

    Update utils.py

commit 8def3e4
Merge: 86ebbb5 23f2fa8
Author: Songyi Huang <[email protected]>
Date:   Fri Sep 20 19:02:42 2019 -0700

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit 9b90670
Author: Kwei-Herng Lai <[email protected]>
Date:   Fri Sep 20 17:58:21 2019 -0700

    deep_cfr for legal_action (no mask yet)

commit da832d5
Merge: 28ff08c 019f76d
Author: Kwei-Herng Lai <[email protected]>
Date:   Fri Sep 20 17:51:26 2019 -0700

    deep_cfr for legal_action (no mask yet)

commit 615c9b8
Author: Kwei-Herng Lai <[email protected]>
Date:   Fri Sep 20 17:49:04 2019 -0700

    deep_cfr for legal_action (no mask yet)

commit 581e358
Merge: 751d1b2 3902bf2
Author: Ruzhe Wei <[email protected]>
Date:   Sat Sep 21 08:11:01 2019 +0800

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit 29dd036
Author: Ruzhe Wei <[email protected]>
Date:   Sat Sep 21 08:07:53 2019 +0800

    limitholdem ultils error eliminated

commit 82e7877
Merge: 5c55e59 3902bf2
Author: Kwei-Herng Lai <[email protected]>
Date:   Fri Sep 20 15:14:20 2019 -0700

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit 57c2140
Author: Yuanpu <[email protected]>
Date:   Sat Sep 21 06:13:59 2019 +0800

    doudizhu dict to list

commit b7f8cb1
Author: Kwei-Herng Lai <[email protected]>
Date:   Fri Sep 20 15:06:51 2019 -0700

    add test deepCFR2 for testing legal-action games

commit c3b0884
Merge: 0facf96 94d4b88
Author: Kwei-Herng Lai <[email protected]>
Date:   Fri Sep 20 14:25:27 2019 -0700

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit a4dc18b
Author: Kwei-Herng Lai <[email protected]>
Date:   Fri Sep 20 14:25:23 2019 -0700

    add test deepCFR2 code quality

commit 6479ba6
Author: Yuanpu <[email protected]>
Date:   Sat Sep 21 01:32:00 2019 +0800

    fix uno bug

commit 3037b5d
Author: Yuanpu <[email protected]>
Date:   Sat Sep 21 00:55:33 2019 +0800

    uno env and random example

commit 41a961a
Author: Daochen <[email protected]>
Date:   Fri Sep 20 11:41:21 2019 -0500

    doudizhu legal added

commit d2df6ab
Author: Daochen <[email protected]>
Date:   Fri Sep 20 11:23:21 2019 -0500

    nfsp

commit 1f7f6db
Merge: 7836f35 dca84fa
Author: Songyi Huang <[email protected]>
Date:   Thu Sep 19 18:36:44 2019 -0700

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit 0597936
Author: Kwei-Herng Lai <[email protected]>
Date:   Thu Sep 19 13:21:01 2019 -0700

    refine deepCFR code quality

commit 23ff235
Author: Kwei-Herng Lai <[email protected]>
Date:   Thu Sep 19 13:07:53 2019 -0700

    refine deepCFR code qualirt

commit ccfb3c1
Author: Kwei-Herng Lai <[email protected]>
Date:   Thu Sep 19 12:57:23 2019 -0700

    sonnet MLP for deepCFR

commit b8f7f3a
Merge: 17d271a 43b27ac
Author: Songyi Huang <[email protected]>
Date:   Wed Sep 18 19:51:37 2019 -0700

    Merge branch 'dev' of https://github.com/datamllab/rlcard into dev

commit 3a1d5c6
Author: Songyi Huang <[email protected]>
Date:   Wed Sep 18 19:51:31 2019 -0700

    typo

commit ac38aaf
Author: Daochen <[email protected]>
Date:   Wed Sep 18 20:03:39 2019 -0500

    add legal actions

commit 54965de
Author: Yuanpu <[email protected]>
Date:   Thu Sep 19 06:21:45 2019 +0800

    roughly complete uno game
Daochen committed Oct 10, 2019
1 parent cdab1bc commit 4f0d6df
Showing 151 changed files with 8,566 additions and 1,557 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -15,3 +15,4 @@ docs/rst
docs/sphinx
experiments/
newtest/
dist/
2 changes: 0 additions & 2 deletions .travis.yml
@@ -2,8 +2,6 @@ language: python
install:
- pip install -e .
before_script:
- pip install matplotlib
- pip install dm-sonnet
- pip install python-coveralls
- pip install pytest-cover
script:
118 changes: 88 additions & 30 deletions README.md
@@ -1,52 +1,110 @@
# RLCard: A Toolkit for Reinforcement Learning in Card Games
[![Build Status](https://travis-ci.org/datamllab/RLCard.svg?branch=master)](https://travis-ci.org/datamllab/RLCard)
[![Codacy Badge](https://api.codacy.com/project/badge/Grade/248eb15c086748a4bcc830755f1bd798)](https://www.codacy.com/manual/daochenzha/rlcard?utm_source=github.com&amp;utm_medium=referral&amp;utm_content=datamllab/rlcard&amp;utm_campaign=Badge_Grade)
[![Coverage Status](https://coveralls.io/repos/github/datamllab/rlcard/badge.svg?branch=master)](https://coveralls.io/github/datamllab/rlcard?branch=master)
[![Coverage Status](https://coveralls.io/repos/github/datamllab/rlcard/badge.svg)](https://coveralls.io/github/datamllab/rlcard?branch=master)

RLCard is a opensource toolkit for developing Reinforcement Learning (RL) algorithms in card games. It supports multiple challenging card game environments with common and easy-to-use interfaces. The goal of the toolkit is to enable more people to study game AI and push forward the research of imperfect information games. RLCard is developed by [DATA Lab](http://faculty.cs.tamu.edu/xiahu/) at Texas A&M University. **NOTE: The project is still in final testing!**
RLCard is a toolkit for Reinforcement Learning (RL) in card games. It supports multiple card environments with easy-to-use interfaces. The goal of RLCard is to bridge reinforcement learning and imperfect information games, and push forward the research of reinforcement learning in domains with multiple agents, large state and action space, and sparse reward. RLCard is developed by [DATA Lab](http://faculty.cs.tamu.edu/xiahu/) at Texas A&M University.

* Official Website: [http://www.rlcard.org](http://www.rlcard.org)

## Installation
Make sure that you have **Python 3.5+** and **pip** installed. You can install `rlcard` with `pip` as follow:
```console
Make sure that you have **Python 3.5+** and **pip** installed. We recommend installing `rlcard` with `pip` as follow:

```
git clone https://github.com/datamllab/rlcard.git
cd rlcard
pip install -e .
```
To check whether it is intalled correctly, try the example with random agents:
```console
python examples/blackjack_random.py

Or you can directly install the package with

```
pip install rlcard
```

## Getting Started
The interfaces generally follow [OpenAI gym](https://github.com/openai/gym) style. We recommend starting with the following **toy examples**.
* [Playing with random agents](docs/toy-examples.md#playing-with-random-agents)
* [Deep-Q learning on Blackjack](docs/toy-examples.md#deep-q-learning-on-blackjack)
* [DeepCFR on Blackjack](docs/toy-examples.md#deepcfr-on-blackjack)
## Examples
Please refer to [examples/](examples). A **short example** is as below.

```python
import rlcard
from rlcard.agents.random_agent import RandomAgent

env = rlcard.make('blackjack')
env.set_agents([RandomAgent()])

trajectories, payoffs = env.run()
```

For more examples, please refer to [examples/](examples).
We also recommend the following **toy examples**.

* [Playing with random agents](docs/toy-examples.md#playing-with-random-agents)
* [Deep-Q learning on Blackjack](docs/toy-examples.md#deep-q-learning-on-blackjack)
* [Running multiple processes](docs/toy-examples.md#running-multiple-processes)
* [Having fun with pretrained Leduc model](docs/toy-examples.md#having-fun-with-pretrained-leduc-model)
* [Leduc Hold'em as single-agent environment](docs/toy-examples.md#leduc-holdem-as-single-agent-environment)
* [Training CFR on Leduc Hold'em](docs/toy-examples.md#training-cfr-on-leduc-holdem)

## Demo
Run `examples/leduc_holdem_human.py` to play with the pre-trained Leduc Hold'em model:

```
>> Leduc Hold'em pre-trained model
>> Start a new game!
>> Agent 1 chooses raise
=============== Community Card ===============
┌─────────┐
│░░░░░░░░░│
│░░░░░░░░░│
│░░░░░░░░░│
│░░░░░░░░░│
│░░░░░░░░░│
│░░░░░░░░░│
│░░░░░░░░░│
└─────────┘
=============== Your Hand ===============
┌─────────┐
│J │
│ │
│ │
│ ♥ │
│ │
│ │
│ J│
└─────────┘
=============== Chips ===============
Yours: +
Agent 1: +++
=========== Actions You Can Choose ===========
0: call, 1: raise, 2: fold
>> You choose action (integer):
```

## Documents
Please refer to the [Documents](docs/README.md) for general concepts introduction. API documents are available at our [github page](https://rlcard.github.io/index.html).
Please refer to the [Documents](docs/README.md) for general introductions. API documents are available at our [website](http://www.rlcard.org).

## Available Environments
The table below shows the environments that are (or will be soon) available in RLCard. We provide a complexity estimation for the games on several aspects. **InfoSet Number:** the number of information set; **Avg. InfoSet Size:** the average number of states in a single information set; **Action Size:** the size of the action space. For some of the complex card games, we can only provide a range of estimation. **Name** is the name that should be passed to `env.make` to create the game environment.

| Game | InfoSet Number | Avg. InfoSet Size | Action Size | Name | Status |
| :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | :-------------: | :---------------: | :---------: | :-------------: | :-------: |
| Blackjack ([wiki](https://en.wikipedia.org/wiki/Blackjack), [baike](https://baike.baidu.com/item/21%E7%82%B9/5481683?fr=aladdin)) | 10^3 | 10^1 | 10^0 | blackjack | Available |
| Limit Texas Hold'em ([wiki](https://en.wikipedia.org/wiki/Texas_hold_%27em), [baike](https://baike.baidu.com/item/%E5%BE%B7%E5%85%8B%E8%90%A8%E6%96%AF%E6%89%91%E5%85%8B/83440?fr=aladdin)) | 10^14 | 10^3 | 10^0 | limit-holdem | Available |
| Dou Dizhu ([wiki](https://en.wikipedia.org/wiki/Dou_dizhu), [baike](https://baike.baidu.com/item/%E6%96%97%E5%9C%B0%E4%B8%BB/177997?fr=aladdin)) | 10^53 ~ 10^83 | 10^23 | 10^4 | doudizhu | Available |
| Mahjong ([wiki](https://en.wikipedia.org/wiki/Competition_Mahjong_scoring_rules), [baike](https://baike.baidu.com/item/%E9%BA%BB%E5%B0%86/215)) | 10^121 | 10^48 | 10^2 | - | Come soon |
| No-limit Texas Hold'em ([wiki](https://en.wikipedia.org/wiki/Texas_hold_%27em), [baike](https://baike.baidu.com/item/%E5%BE%B7%E5%85%8B%E8%90%A8%E6%96%AF%E6%89%91%E5%85%8B/83440?fr=aladdin)) | 10^162 | 10^3 | 10^4 | no-limit-holdem | Available |
| UNO ([wiki](https://en.wikipedia.org/wiki/Uno_\(card_game), [baike](https://baike.baidu.com/item/UNO%E7%89%8C/2249587)) | 10^163 | 10^10 | 10^1 | - | Come soon |
| Sheng Ji ([wiki](https://en.wikipedia.org/wiki/Sheng_ji), [baike](https://baike.baidu.com/item/%E5%8D%87%E7%BA%A7/3563150)) | 10^157 ~ 10^165 | 10^61 | 10^13 | - | Come soon |
We provide a complexity estimation for the games on several aspects. **InfoSet Number:** the number of information sets; **Avg. InfoSet Size:** the average number of states in a single information set; **Action Size:** the size of the action space. **Name:** the name that should be passed to `env.make` to create the game environment.

| Game | InfoSet Number | Avg. InfoSet Size | Action Size | Name | Status |
| :--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------: | :-------------: | :---------------: | :---------: | :-------------: | :--------: |
| Blackjack ([wiki](https://en.wikipedia.org/wiki/Blackjack), [baike](https://baike.baidu.com/item/21%E7%82%B9/5481683?fr=aladdin)) | 10^3 | 10^1 | 10^0 | blackjack | Available |
| Leduc Hold’em | 10^2 | 10^2 | 10^0 | leduc-holdem | Available |
| Limit Texas Hold'em ([wiki](https://en.wikipedia.org/wiki/Texas_hold_%27em), [baike](https://baike.baidu.com/item/%E5%BE%B7%E5%85%8B%E8%90%A8%E6%96%AF%E6%89%91%E5%85%8B/83440?fr=aladdin)) | 10^14 | 10^3 | 10^0 | limit-holdem | Available |
| Dou Dizhu ([wiki](https://en.wikipedia.org/wiki/Dou_dizhu), [baike](https://baike.baidu.com/item/%E6%96%97%E5%9C%B0%E4%B8%BB/177997?fr=aladdin)) | 10^53 ~ 10^83 | 10^23 | 10^4 | doudizhu | Available |
| Mahjong ([wiki](https://en.wikipedia.org/wiki/Competition_Mahjong_scoring_rules), [baike](https://baike.baidu.com/item/%E9%BA%BB%E5%B0%86/215)) | 10^121 | 10^48 | 10^2 | mahjong | Available |
| No-limit Texas Hold'em ([wiki](https://en.wikipedia.org/wiki/Texas_hold_%27em), [baike](https://baike.baidu.com/item/%E5%BE%B7%E5%85%8B%E8%90%A8%E6%96%AF%E6%89%91%E5%85%8B/83440?fr=aladdin)) | 10^162 | 10^3 | 10^4 | no-limit-holdem | Available |
| UNO ([wiki](https://en.wikipedia.org/wiki/Uno_\(card_game), [baike](https://baike.baidu.com/item/UNO%E7%89%8C/2249587)) | 10^163 | 10^10 | 10^1 | uno | Available |
| Sheng Ji ([wiki](https://en.wikipedia.org/wiki/Sheng_ji), [baike](https://baike.baidu.com/item/%E5%8D%87%E7%BA%A7/3563150)) | 10^157 ~ 10^165 | 10^61 | 10^11 | - | Developing |
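
As a rough reading aid for the table, the estimated number of game states is the number of information sets times the average information set size, so the base-10 exponents simply add. The sketch below copies the exponents from the table above; the dictionary and helper names are illustrative and not part of RLCard.

```python
# Base-10 exponents copied from the table: ((low, high) InfoSet Number, Avg. InfoSet Size).
# A range such as 10^53 ~ 10^83 is stored as a (low, high) pair of exponents.
COMPLEXITY = {
    "blackjack": ((3, 3), 1),
    "leduc-holdem": ((2, 2), 2),
    "limit-holdem": ((14, 14), 3),
    "doudizhu": ((53, 83), 23),
    "mahjong": ((121, 121), 48),
    "no-limit-holdem": ((162, 162), 3),
    "uno": ((163, 163), 10),
}


def state_count_exponents(name):
    """Return (low, high) base-10 exponents of the estimated number of states."""
    (low, high), avg_size = COMPLEXITY[name]
    # Multiplying powers of ten adds their exponents.
    return low + avg_size, high + avg_size


for name in COMPLEXITY:
    low, high = state_count_exponents(name)
    label = f"10^{low}" if low == high else f"10^{low} ~ 10^{high}"
    print(f"{name}: about {label} states")
```

These are order-of-magnitude figures only; they inherit whatever uncertainty the table's estimates carry.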

## Evaluation
We wrap a `Logger` that conveniently saves/plots the results. Example outputs are as follows:
![Learning Curves](docs/imgs/curves.png "Learning Curves")
The performance is measured by winning rates through tournaments. Example outputs are as follows:
![Learning Curves](http://rlcard.org/imgs/curves.png "Learning Curves")

## Disclaimer
Please note that this is a **pre-release** version of the RLCard. The toolkit is provided "**as is**," without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and noninfringement.
## Contributing
Contributions to this project are greatly appreciated! Please create an issue for feedback or bugs. If you want to contribute code, please contact [[email protected]](mailto:[email protected]) or [[email protected]]([email protected]).

## Acknowledgements
We would like to thank JJ World Network Technology Co.,LTD for the technical support.
We would like to thank JJ World Network Technology Co.,LTD for the generous support.
38 changes: 21 additions & 17 deletions docs/README.md
@@ -1,20 +1,24 @@
# Overview
The toolkit wraps each game by `Env` with easy-to-use interfaces. The goal of this toolkit is to enable the users to focus on algorithm design on challenging card games instead of developing game engines. The following design principles are applied:
* **Simple.** We make the interfaces straightforward and simple. Users can easily run one game and obtain the statistics of the game.
* **Consistent.** All the games are implemented following the same logical pattern. The main classes/functions of each game share the same class/function name. Users can easily understand each game and modify the rules for research purpose.
* **Reproducible.** The results can be seeded for reproducibility purpose.
* **Minimum Dependency.** We minimize the dependencies used in the toolkit so that the codes are easy to modify or migrate.
* **Scalable.** New card environments can be added conveniently into RLCard with the above design principles.
# Documents of RLCard

# User Guide
* [Toy examples](toy-examples.md)
* [RLCard high-level design](high-level-design.md)
* [Games in RLCard](games.md)
* [Algorithms in RLCard](algorithms.md)
* [Developing new algorithms](developping-algorithms.md)
## Overview
The toolkit wraps each game by `Env` class with easy-to-use interfaces. The goal of this toolkit is to enable the users to focus on algorithm development without caring about the environment. The following design principles are applied when developing the toolkit:
* **Reproducible.** Results on the environments can be reproduced. The same result should be obtained with the same random seed in different runs.
* **Accessible.** The experiences are collected and well organized after each game with easy-to-use interfaces. Users can conveniently configure state representation, action encoding, reward design, or even the game rules.
* **Scalable.** New card environments can be added conveniently into the toolkit with the above design principles. We also try to minimize the dependencies in the toolkit so that the codes can be easily maintained.
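
The reproducibility principle can be illustrated with a minimal sketch using only Python's standard `random` module. This is not RLCard's own seeding mechanism; the `deal_hand` helper and the six-card deck are assumptions made for illustration.

```python
import random


def deal_hand(seed, hand_size=2):
    """Shuffle a small deck with a private RNG and deal the top cards."""
    # A private generator keeps the global random state untouched.
    rng = random.Random(seed)
    deck = [rank + suit for rank in "JQK" for suit in "SH"]
    rng.shuffle(deck)
    return deck[:hand_size]


first = deal_hand(seed=0)
second = deal_hand(seed=0)
# The same seed gives the same shuffle, hence the same deal.
print(first == second)
```

Passing the generator explicitly, rather than relying on global state, is one common way to keep runs repeatable when several components draw random numbers.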

# Developer Guide
* [Adding new environments](adding-new-environments.md)
## User Guide

# Application Programming Interface (API)
The API documents are available on the [GitHub page](https://rlcard.github.io/index.html).
* [Toy examples](toy-examples.md)
* [RLCard high-level design](high-level-design.md)
* [Games in RLCard](games.md)
* [Algorithms in RLCard](algorithms.md)

## Developer Guide

* [Developing new algorithms](developping-algorithms.md)
* [Adding new environments](adding-new-environments.md)
* [Customizing environments](customizing-environments.md)
* [Adding pre-trained/rule-based models](adding-models.md)

## Application Programming Interface (API)
The API documents are available at the [official website](http://www.rlcard.org).
7 changes: 7 additions & 0 deletions docs/adding-models.md
@@ -0,0 +1,7 @@
# Adding Pre-trained/Rule-based models
You can add your own pre-trained/rule-based models to the toolkit by following several steps:

* **Develop models.** You can either design a rule-based model or save a neural network model. For each game, you need to develop models for all the players at the same time. You need to wrap each model as a class and make sure that `step` and `eval_step` work correctly.
* **Wrap models.** You need to inherit the `Model` class in `rlcard/models/model.py`. Then put all the models for the players into a list. Rewrite the `get_agent` function to return this list.
* **Register the model.** Register the model in `rlcard/models/__init__.py`.
* **Load the model in environment.** To load the model, modify `load_pretrained_models` in the corresponding game environment in `rlcard/envs`. Use the registered name to load the model.
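
The steps above can be sketched as follows. This is a self-contained illustration, not the toolkit's code: the `Model` class here is a stand-in for the one in RLCard, and `RandomRuleAgent`, `ToyRuleModel`, `MODEL_REGISTRY`, `load`, and the `legal_actions` state key are all hypothetical names chosen for the example.

```python
import random


class Model:
    """Stand-in for the toolkit's `Model` base class (hypothetical)."""

    def get_agent(self):
        raise NotImplementedError


class RandomRuleAgent:
    """Toy rule-based agent that picks uniformly among the legal actions."""

    def __init__(self, seed=None):
        self.rng = random.Random(seed)

    def step(self, state):
        # Assumption: `state` is a dict with a 'legal_actions' list.
        return self.rng.choice(state["legal_actions"])

    def eval_step(self, state):
        # There is no exploration to switch off, so evaluation reuses `step`.
        return self.step(state)


class ToyRuleModel(Model):
    """Wraps one agent per player and returns them as a list."""

    def __init__(self, player_num=2):
        self.agents = [RandomRuleAgent(seed=i) for i in range(player_num)]

    def get_agent(self):
        return self.agents


# Hypothetical registry mapping a registered name to a model class.
MODEL_REGISTRY = {"toy-rule": ToyRuleModel}


def load(name):
    """Instantiate the model registered under `name`."""
    return MODEL_REGISTRY[name]()


agents = load("toy-rule").get_agent()
print(len(agents), agents[0].eval_step({"legal_actions": [0, 1, 2]}))
```

Keeping one agent per player in a single list mirrors the requirement that models for all players are developed and loaded together.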
8 changes: 4 additions & 4 deletions docs/adding-new-environments.md
@@ -1,11 +1,11 @@
# Adding New Environments
To add a new environment to the toolkit, generally you should take the following steps:
* **Implement a game.** Card games usually have similar structures so that they can be implemented with `Game`, `Round`, `Dealer`, `Judger`, `Player` as in existing games. The easiest way is to inherit the classes in [rlcard/core.py](rlcard/core.py) and implement the functions.
* **Wrap the game with an environment.** The easiest way is to inherit `Env` in [rlcard/envs/env.py](rlcard/env/env.py). You need to implement `extract_state` which encodes the state, `decode_action` which decode actions from the id to the text string, and `get_payoffs` which calculate payoffs of the players.
* **Register the game.** Now it is time to tell the toolkit where to locate the new environment. Go to [rlcard/envs/__init__.py](rlcard/envs/__init__.py), and indicate the name of the game and its entry point.
* **Implement a game.** Card games usually have similar structures so that they can be implemented with `Game`, `Round`, `Dealer`, `Judger`, `Player`, as in existing games. The easiest way is to inherit the classes in [rlcard/core.py](../rlcard/core.py) and implement the functions.
* **Wrap the game with an environment.** The easiest way is to inherit `Env` in [rlcard/envs/env.py](../rlcard/env/env.py). You need to implement `extract_state` which encodes the state, `decode_action` which decodes actions from the id to the text string, and `get_payoffs` which calculates payoffs of the players.
* **Register the game.** Now it is time to tell the toolkit where to locate the new environment. Go to [rlcard/envs/\_\_init\_\_.py](../rlcard/envs/__init__.py), and indicate the name of the game and its entry point.
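
The wrap-and-register steps can be sketched with a toy game. Everything below is a stand-in under stated assumptions: the `Env` base class, the `HighCardEnv` game, the method signatures, and the `register`/`make` helpers are hypothetical and only mimic the pattern described above, not RLCard's actual implementation.

```python
class Env:
    """Stand-in for the toolkit's `Env` base class (hypothetical)."""


class HighCardEnv(Env):
    """Toy two-player game: each player holds one card, the higher rank wins."""

    ACTIONS = ["call", "fold"]
    RANKS = "JQK"

    def __init__(self, hands=("K", "J")):
        self.hands = hands

    def extract_state(self, player_id):
        # One-hot encode the rank of the player's own card.
        return [1 if rank == self.hands[player_id] else 0 for rank in self.RANKS]

    def decode_action(self, action_id):
        # Map an action id to its text string.
        return self.ACTIONS[action_id]

    def get_payoffs(self):
        # +1 for the higher card, -1 for the lower, 0 each on a tie.
        first, second = (self.RANKS.index(card) for card in self.hands)
        if first == second:
            return [0, 0]
        return [1, -1] if first > second else [-1, 1]


# Hypothetical registry: game name -> entry point (the environment class).
ENV_REGISTRY = {}


def register(name, entry_point):
    ENV_REGISTRY[name] = entry_point


def make(name):
    """Create the environment registered under `name`."""
    return ENV_REGISTRY[name]()


register("high-card", HighCardEnv)
env = make("high-card")
print(env.extract_state(0), env.decode_action(1), env.get_payoffs())
```

Looking environments up by name keeps user code independent of where each game class lives.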

To test whether the new environment is set up successfully:
```python
import rlcard
env.make(#the new environment#)
rlcard.make(#the new environment#)
```