We add a second game round to the curling environment, with the throwing order switched: in round one the purple agent throws first and the green agent throws last, while in round two the green agent throws first and the purple agent throws last.
At the end of each round, a team scores one game point for each of its own stones that is closer to the center than every stone of the opposing team, so only one team can score in a given round. After the two rounds, the game points are totaled and the team with the higher total wins.
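The scoring rule above can be sketched in a few lines of Python. This is a minimal illustration, not the engine's actual implementation: the function names, the `(x, y)` stone format, the target center at the origin, and the handling of ties and empty sides are all assumptions made for the example.

```python
import math

def score_round(purple, green, center=(0.0, 0.0)):
    """Return (purple_points, green_points) for one round.

    purple / green are lists of (x, y) stone positions (hypothetical format).
    """
    def dist(p):
        return math.hypot(p[0] - center[0], p[1] - center[1])

    d_purple = [dist(p) for p in purple]
    d_green = [dist(p) for p in green]
    # Assumption: a side with no stones counts as infinitely far away.
    best_purple = min(d_purple, default=math.inf)
    best_green = min(d_green, default=math.inf)

    # Count each side's stones that beat the opponent's closest stone.
    # At most one of the two counts can be non-zero; an exact tie gives (0, 0).
    purple_points = sum(d < best_green for d in d_purple)
    green_points = sum(d < best_purple for d in d_green)
    return purple_points, green_points

def match_winner(rounds):
    """Sum game points over all rounds and name the winner (or a draw)."""
    purple_total = green_total = 0
    for purple, green in rounds:
        p, g = score_round(purple, green)
        purple_total += p
        green_total += g
    if purple_total == green_total:
        return "draw", purple_total, green_total  # draw handling is assumed
    winner = "purple" if purple_total > green_total else "green"
    return winner, purple_total, green_total

# Two rounds of made-up stone positions.
rounds = [
    ([(1, 0), (0, 2), (5, 5)], [(3, 0), (0, 4)]),
    ([(4, 0)], [(1, 1), (0, 2), (3, 0)]),
]
print(match_winner(rounds))
```

In the first made-up round purple has two stones closer than green's best stone; in the second, all three green stones beat purple's only stone, so green takes the match on total points.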
For details, see the Jidi Competition page "RLChina2022智能体竞赛" (RLChina 2022 agent competition).
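Tags: partial observation; continuous action space; continuous state space
Environment overview: agents take part in the Olympic Games. In this series of competitions, two agents compete in curling, and the goal is to push the ball to the center of the target.
Environment rules:
- Each side controls four elastic ball agents of equal mass and radius;
- The two sides take turns throwing a ball toward the target point at the center of the field, and each side has four throws;
- After the four turns, the side whose thrown ball is closest to the target point wins;
- Agents can collide with each other and with the walls;
- Each agent's field of view is limited to a 30*30 matrix region in front of it, in the direction it is facing;
- The environment ends when the round ends.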
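Action space: continuous, two-dimensional. The two dimensions are the applied force and the steering angle.
Observation: at every step the environment returns a 30x30 two-dimensional matrix; see the */olympics_engine* folder for details.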
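As a rough sketch of the interface described above, the snippet below samples a two-dimensional continuous action and checks the shape of an observation. The numeric bounds `FORCE_RANGE` and `ANGLE_RANGE` and all function names are placeholders invented for this example; the real limits and data types are defined in the */olympics_engine* folder and should be checked there.

```python
import random

# Placeholder bounds, NOT taken from this document; check olympics_engine.
FORCE_RANGE = (-100.0, 200.0)
ANGLE_RANGE = (-30.0, 30.0)

def random_action(rng=random):
    """Sample one [force, steering_angle] action uniformly within the bounds."""
    return [rng.uniform(*FORCE_RANGE), rng.uniform(*ANGLE_RANGE)]

def clip_action(action):
    """Clamp a [force, steering_angle] pair into the assumed bounds."""
    force, angle = action
    force = min(max(force, FORCE_RANGE[0]), FORCE_RANGE[1])
    angle = min(max(angle, ANGLE_RANGE[0]), ANGLE_RANGE[1])
    return [force, angle]

def is_valid_observation(obs):
    """Check that an observation is a 30x30 two-dimensional matrix."""
    return len(obs) == 30 and all(len(row) == 30 for row in obs)

# Example: an all-zero view and a random action for it.
obs = [[0] * 30 for _ in range(30)]
if is_valid_observation(obs):
    print(random_action(random.Random(0)))
```

A learned policy would replace `random_action`, and `clip_action` is one simple way to keep its output inside whatever bounds the engine actually enforces.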
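Reward function: the side closer to the target point receives 100 points; the other side receives 0.
Termination condition: the environment ends when the round ends.
Evaluation: this environment is a zero-sum game. Leaderboard points are computed and ranked with the ELO algorithm, and opponents or teammates are paired using the Swiss system. During validation and evaluation, the platform runs user code on a single CPU core (GPUs are not yet supported); each step must return an action within 1 s, and memory usage must not exceed 500 MB.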
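For readers unfamiliar with Elo ratings, here is a minimal sketch of the textbook update rule. The platform's actual parameters are not stated in this document, so the K-factor of 32, the 400-point scale, and the function names are assumptions used only for illustration.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return the new (rating_a, rating_b) after one match.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k is an assumed K-factor; the platform may use a different value.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    # Zero-sum update: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta

# Two equally rated agents; A wins.
print(elo_update(1500.0, 1500.0, 1.0))
```

With equal ratings the expected score is 0.5, so under these assumed parameters the winner gains 16 points and the loser drops by the same amount.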
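Registration: visit the Jidi ("及第") platform ( www.jidiai.cn ) and, on the Arena ("擂台") page, select "RLChina 智能体挑战赛 - 壬寅年春赛季" (RLChina Agent Challenge, Renyin Year Spring Season) to sign up. To join the competition discussion group, reply "智能体竞赛" to the RLCN WeChat official account.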
This is a simulated POMDP environment for 2D sports games in which the athletes are spheres with a continuous action space (torque and steering). The observation is a 30*30 array covering the agent's limited field of view. We model collisions and agent fatigue, so that no torque is applied once an agent runs out of energy.
This is currently a beta version, and we intend to add more sports scenarios. Stay tuned :)
conda create -n olympics python=3.8.5
conda activate olympics
pip install -r requirements.txt
python olympics_engine/main.py
python rl_trainer/main.py
You can also locally evaluate your trained model by executing:
python evaluation_local.py --my_ai rl --opponent random --episode=50
You can test your submission locally. On the Jidi platform, submissions are evaluated in the same way as in run_log.py.
For example,
python run_log.py --my_ai "rl" --opponent "random"
Here you control agent 1, which is the green agent.
- Random policy --> agents/random/submission.py
- RL policy --> all files in agents/rl