Abebual Zerihun Demilew and Rachel Fisher
Image from Kohls.com
In this project we developed a Monte-Carlo Reinforcement Learning agent that learns an optimal strategy for winning a simplified version of the UNO card game. The project is structured as follows:
- Game Environment
- Monte-Carlo Reinforcement Learning Agent
- Model Training and Performance
- Experiments
environment.py defines class objects for the card, deck, player, turn, and game. In main.ipynb, these classes are imported as a module to run simulations. state_action_reward.py contains the state, action, and reward class objects.
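As a rough illustration, the snippet below shows how the environment classes might be imported and used to simulate a single game from main.ipynb; the class names, constructor arguments, and `play()` method are assumptions rather than the project's actual API.

```python
from environment import Deck, Player, Game

# Build a two-player game from a freshly shuffled deck and play it to the end.
# (Hypothetical API: the real constructors and method names may differ.)
deck = Deck()
players = [Player("agent"), Player("opponent")]
game = Game(players=players, deck=deck)

winner = game.play()  # run turns until one player empties their hand
print(f"Winner: {winner}")
```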
The Monte-Carlo reinforcement learning algorithm is defined in the agents.py module.
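For readers unfamiliar with the method, here is a minimal sketch of tabular first-visit Monte-Carlo control with an epsilon-greedy policy, the general technique described above. It is illustrative only and does not reproduce agents.py; the method names and the `(state, action)` encoding are assumptions.

```python
import random
from collections import defaultdict


class MonteCarloAgent:
    """Tabular first-visit Monte-Carlo control with an epsilon-greedy policy (illustrative sketch)."""

    def __init__(self, epsilon=0.1, gamma=1.0):
        self.epsilon = epsilon          # exploration rate
        self.gamma = gamma              # discount factor
        self.q = defaultdict(float)     # Q(state, action) estimates
        self.visits = defaultdict(int)  # visit counts per (state, action)

    def choose_action(self, state, legal_actions):
        # Explore with probability epsilon, otherwise play the highest-valued action.
        if random.random() < self.epsilon:
            return random.choice(legal_actions)
        return max(legal_actions, key=lambda a: self.q[(state, a)])

    def update(self, episode):
        """Update Q-values from one finished game given as [(state, action, reward), ...]."""
        # Index of the first occurrence of each (state, action) pair.
        first_visit = {}
        for t, (state, action, _) in enumerate(episode):
            first_visit.setdefault((state, action), t)

        g = 0.0
        # Walk the episode backwards, accumulating the discounted return G_t.
        for t in range(len(episode) - 1, -1, -1):
            state, action, reward = episode[t]
            g = self.gamma * g + reward
            if first_visit[(state, action)] == t:
                # First-visit update: running average of observed returns.
                self.visits[(state, action)] += 1
                self.q[(state, action)] += (g - self.q[(state, action)]) / self.visits[(state, action)]
```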
Results from model training are included in the project dataset folder, [Rachel Fisher]-[Abebual Demilew]--dataset.zip. Monte-Carlo RL model outputs such as Q-tables, Q-value coverage, win rates, turns, and state-action pairs with visit counts are included in the dataset folder.
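The exported Q-tables could be inspected with pandas along these lines; the file name `dataset/q_table.csv` and the `q_value` column are hypothetical placeholders for whatever the dataset folder actually contains.

```python
import pandas as pd

# Hypothetical file and column names; adjust to the actual dataset contents.
q_table = pd.read_csv("dataset/q_table.csv")
print(q_table.sort_values("q_value", ascending=False).head())
```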
Model performance results from the best-performing Monte-Carlo RL agent:
Frequency of playable cards available:
Additional experiments were conducted to:
Understand if there is a first player advantage.
Understand the exploration vs. exploitation tradeoff (see the sketch after this list).
Understand if any card has a playable advantage.
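As a self-contained illustration of the exploration vs. exploitation tradeoff, the toy snippet below counts how often an epsilon-greedy policy deviates from the greedy action for two epsilon values; the state, action names, and Q-values are made up for the example and are not taken from the experiments above.

```python
import random
from collections import Counter


def epsilon_greedy(q, state, legal_actions, epsilon):
    """Pick a random legal action with probability epsilon, otherwise the greedy one."""
    if random.random() < epsilon:
        return random.choice(legal_actions)
    return max(legal_actions, key=lambda a: q.get((state, a), 0.0))


# Toy Q-values that favour "skip": a small epsilon almost always exploits it,
# while a large epsilon spreads play across all actions.
q = {("s0", "skip"): 1.0, ("s0", "draw"): 0.2, ("s0", "number"): 0.1}
for eps in (0.05, 0.5):
    counts = Counter(epsilon_greedy(q, "s0", ["skip", "draw", "number"], eps)
                     for _ in range(10_000))
    print(f"epsilon={eps}: {dict(counts)}")
```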
Python Version: 3.7
Packages: pandas, numpy, random, itertools, time, tqdm, sys, os, matplotlib, seaborn, ipywidgets