A custom gridworld and environment demo of ship route planning with reinforcement learning. Q-learning is implemented, Q-tables can be saved, training sessions are logged, and result graphs can be displayed.

shiningxy/shipRouteRL

shipRouteRL

Note

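  • Custom sea areas are supported
  • Custom start and end points are supported
  • A custom safe water depth for the ship is supported
  • The VisualDL visualization and analysis tool is supported
  • Training reward history, time cost, and plotted results are saved automatically
  • The Q-learning algorithm is supported
  • A DQN algorithm is planned
  • Planned: use significant wave height from ERA5 meteorological data as the weight of each grid cell. The higher the waves, the larger the cell weight and the smaller the reward, so the agent steers away from heavy seas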
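
The planned wave-height weighting could be sketched roughly as follows. This is only an illustration of the idea stated above (higher waves, larger weight, smaller reward), not code from this repository: the function name `step_reward` and the parameters `base_cost` and `wave_weight`, including their default values, are hypothetical.

```python
def step_reward(wave_height_m, base_cost=1.0, wave_weight=0.5):
    """Return the (negative) reward for entering one grid cell.

    Hypothetical sketch: every step costs `base_cost`, and the cost grows
    linearly with the significant wave height of the cell being entered.
    """
    if wave_height_m < 0:
        raise ValueError("significant wave height cannot be negative")
    return -(base_cost + wave_weight * wave_height_m)


# A calm cell is penalized less than a rough one, so the agent prefers it.
calm = step_reward(0.5)
rough = step_reward(4.0)
```

A linear penalty is the simplest choice; the weighting actually adopted may differ once ERA5 data is integrated.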

Download

Download ETOPO1_Bed_c_gmt4.grd.gz and place it in the repository root directory

Install

conda create -n shiprl python=3.7
conda activate shiprl
pip install parl -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install visualdl -i https://mirror.baidu.com/pypi/simple
pip install gym==0.26.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install numpy -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install netCDF4 -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install pygame -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install pandas -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install seaborn -i https://pypi.tuna.tsinghua.edu.cn/simple

TODO

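The nine parameters below, which appear in gridworld.py & shiproute.py, must first be chosen and iteratively tuned with init_position.py, and then edited by hand in both files

First fix the grid extent, then hover the mouse over the Python Figure window to read off the index coordinates of the start and end points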

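# Real-world latitude/longitude bounds; later code automatically converts them to indices in the nc data
latstart = 37
latend = 37.5
lonstart = 122.5
lonend = 123
# x/y index coordinates of the start and end points, found by adjusting with the mouse in init_position.py
self.xStartIndex = 2
self.yStartIndex = 23
self.xEndIndex = 3
self.yEndIndex = 2
# Ship draught requirement
self.shipDraught = 5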
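
For orientation, the conversion from real-world coordinates to grid indices mentioned in the comment above might look like the sketch below. It is not taken from gridworld.py or shiproute.py: the helpers `nearest_index`, `bbox_to_index_ranges`, and `is_navigable` are hypothetical, the toy axes use 0.25 degree spacing instead of the 1 arc-minute spacing of ETOPO1, and the depth test assumes the grid stores elevation in metres (negative below sea level) and that the draught is given in metres.

```python
def nearest_index(axis_values, target):
    """Index of the axis value closest to target (ties go to the lower index)."""
    return min(range(len(axis_values)), key=lambda i: abs(axis_values[i] - target))


def bbox_to_index_ranges(lats, lons, latstart, latend, lonstart, lonend):
    """Map a latitude/longitude bounding box to (low, high) index pairs."""
    lat_i = sorted((nearest_index(lats, latstart), nearest_index(lats, latend)))
    lon_i = sorted((nearest_index(lons, lonstart), nearest_index(lons, lonend)))
    return (lat_i[0], lat_i[1]), (lon_i[0], lon_i[1])


def is_navigable(elevation_m, ship_draught_m):
    """Assumed rule: a cell is navigable if the water is deeper than the draught."""
    return -elevation_m > ship_draught_m


# Toy coordinate axes standing in for the lat/lon variables of the nc file.
lats = [36.5 + 0.25 * i for i in range(8)]    # 36.5 .. 38.25
lons = [122.0 + 0.25 * i for i in range(8)]   # 122.0 .. 123.75
lat_range, lon_range = bbox_to_index_ranges(lats, lons, 37, 37.5, 122.5, 123)
```

With real data the axes would be read from the downloaded grid file, and the resulting index ranges would slice the depth array to the chosen sea area.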

Structure

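main.py -> Main entry point: runs training and testing, saves the training results, and plots the result figures

init_position.py -> Used to determine the grid extent and the index coordinates of the start and end points

gridworld.py -> Subclasses gym.Wrapper and builds the grid; it can be run on its own to check whether the render window size is appropriate

shiproute.py -> Subclasses gym.Env, builds the environment, and defines the action space

agent.py -> Defines the Q-learning agent

utils.py -> Defines the functions that store training results and the plotting functions

VisualDL可视化分析工具使用介绍.ipynb -> Guide to VisualDL, a toolkit for visualizing the training process; similar to TensorBoard, with a more polished look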
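
As a reminder of what a tabular Q-learning agent involves, here is a minimal, self-contained sketch. It is not the contents of agent.py: the class name `QLearningSketch`, its method names, and the hyperparameter defaults are assumptions chosen for illustration, and states are assumed to be flattened into integer indices.

```python
import random


class QLearningSketch:
    """Tabular Q-learning with epsilon-greedy exploration (illustrative only)."""

    def __init__(self, n_states, n_actions, lr=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.lr = lr              # learning rate
        self.gamma = gamma        # discount factor
        self.epsilon = epsilon    # exploration probability
        self.rng = random.Random(seed)
        self.q = [[0.0] * n_actions for _ in range(n_states)]

    def predict(self, state):
        """Greedy action; ties between equal Q-values are broken at random."""
        row = self.q[state]
        best = max(row)
        return self.rng.choice([a for a, v in enumerate(row) if v == best])

    def sample(self, state):
        """Epsilon-greedy action used during training."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        return self.predict(state)

    def learn(self, state, action, reward, next_state, done):
        """Q(s,a) += lr * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
        target = reward if done else reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.lr * (target - self.q[state][action])


agent = QLearningSketch(n_states=4, n_actions=2)
agent.learn(state=0, action=1, reward=-1.0, next_state=1, done=False)
```

Because the Q-table here is a plain list of lists, saving it between sessions can be as simple as serializing it to disk; the format the repository uses for saved Q-tables is not shown in this README.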

Result

Animated demonstration of the results
