This is a class project for USU CS 5640 (Reinforcement Learning Applications).
This project uses a DQN with dense policy and target networks to learn effective traffic signal control.
It uses the environment, default reward functions, and files representing a 2-way intersection from SUMO-RL.
Several reward functions from the SUMO-RL environment are implemented, and each is tested to compare the different objectives they optimize.
Final video presentation link: Final Presentation
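As a rough illustration of the agent described above, here is a minimal sketch of the dense policy and target networks (PyTorch assumed; the class name, layer sizes, and hidden width are illustrative, not the project's exact architecture):

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Dense (fully connected) Q-network: 21-dim observation in, one Q-value per action out."""
    def __init__(self, obs_dim: int = 21, n_actions: int = 4, hidden: int = 64):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.layers(obs)

policy_net = QNetwork()
target_net = QNetwork()
target_net.load_state_dict(policy_net.state_dict())  # target starts as a copy of the policy
target_net.eval()  # target net stays frozen between periodic syncs; used only for bootstrap targets
```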
- dqn_learning/
  - Python files used to generate models and plots (main.py)
  - Python file used to run a GUI-enabled simulation of a specific generated model
- nets/: network files used to run the simulation
- sumo_rl/: environment files (modified from the original to include some additional metrics)
- Final report PDF
- Final presentation slides in PDF format
This simulation only has the option for a 2-way intersection, so the observation space (obs) is as follows:
- obs[0:4] is a one-hot encoded vector indicating the current active green phase
- obs[4] is a binary variable indicating whether min_green seconds have already passed in the current phase
- obs[5:13] is the density of each lane: the number of vehicles in a lane divided by the lane's total capacity
- obs[13:21] is the queue of each lane: the number of queued vehicles in a lane divided by the lane's total capacity
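For reference, here is a small helper that unpacks an observation by the slices above (describe_obs is an illustrative name, not part of the SUMO-RL API):

```python
import numpy as np

def describe_obs(obs: np.ndarray) -> dict:
    """Split the 21-dim observation into its named parts (illustrative helper)."""
    return {
        "green_phase": int(np.argmax(obs[0:4])),  # index of the one-hot active green phase
        "min_green_passed": bool(obs[4]),         # has min_green elapsed in this phase?
        "lane_density": obs[5:13],                # vehicles in each lane / lane capacity
        "lane_queue": obs[13:21],                 # queued vehicles in each lane / lane capacity
    }
```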
The action space (act) is 4 discrete actions, each representing a different green phase:
- act[0] Green Signals: North-South straight/right turn
- act[1] Green Signals: North-South left turn
- act[2] Green Signals: East-West straight/right turn
- act[3] Green Signals: East-West left turn
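To make the spaces concrete, here is a minimal sketch of creating the environment and taking one step. The file paths are assumed (SUMO-RL ships similar files for its 2-way single intersection), and the reset/step signatures may differ slightly across SUMO-RL/gymnasium versions:

```python
import sumo_rl

# Assumed paths to the 2-way single-intersection network and route files.
env = sumo_rl.SumoEnvironment(
    net_file="nets/2way-single-intersection/single-intersection.net.xml",
    route_file="nets/2way-single-intersection/single-intersection.rou.xml",
    use_gui=False,
    num_seconds=3600,
    single_agent=True,
)

obs, info = env.reset()             # older versions return only obs
action = env.action_space.sample()  # Discrete(4): one of the green phases above
obs, reward, terminated, truncated, info = env.step(action)
```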
There are four reward functions:
- Difference in waiting time between steps
- Average speed
- Queue
- Pressure
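In SUMO-RL these are selected with the environment's reward_fn argument; the string keys below are, to the best of my knowledge, the library's built-in names for the four rewards above (verify against your installed version):

```python
import sumo_rl

# Built-in reward keys assumed to correspond to the four rewards listed above.
for reward_name in ("diff-waiting-time", "average-speed", "queue", "pressure"):
    env = sumo_rl.SumoEnvironment(
        net_file="nets/2way-single-intersection/single-intersection.net.xml",
        route_file="nets/2way-single-intersection/single-intersection.rou.xml",
        single_agent=True,
        reward_fn=reward_name,
    )
    # ... train and evaluate one agent per reward function ...
```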
After each iteration of training and testing an agent, that agent's policy NN and info about the episode are saved. The saved info includes:
- Total number of vehicles moved through the intersection
- Average speed at each step
- Vehicles in the intersection at each step
- Average and total wait times at each step
- Vehicles queued at each step
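A minimal sketch of what that save step might look like (the function name, file layout, and metric keys are hypothetical, not the project's actual code):

```python
import csv
import torch

def save_episode(policy_net: torch.nn.Module, metrics: dict, run_id: int) -> None:
    """Persist the trained policy and the per-step episode metrics (hypothetical helper)."""
    torch.save(policy_net.state_dict(), f"models/policy_run{run_id}.pt")

    # metrics maps a name (e.g. "avg_speed", "queued") to a list with one value per step.
    with open(f"results/metrics_run{run_id}.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(metrics.keys())           # header: one column per metric
        writer.writerows(zip(*metrics.values()))  # one row per simulation step
```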