Stage is now included in the name of the model directory.
tomasvr committed May 1, 2023
1 parent afbc0d0 commit 8fe4bd3
Showing 37 changed files with 51 additions and 46 deletions.
54 changes: 27 additions & 27 deletions README.md
@@ -196,18 +196,18 @@ You should see the gazebo GUI come up with the robot model loaded and two moving

In a second terminal run
```
-ros2 run turtlebot3_drl drl_gazebo
+ros2 run turtlebot3_drl gazebo_goals
```

In a third terminal run
```
-ros2 run turtlebot3_drl drl_environment
+ros2 run turtlebot3_drl environment
```

And lastly, in the fourth terminal run the ddpg agent
For DDPG:
```
-ros2 run turtlebot3_drl drl_agent ddpg 1
+ros2 run turtlebot3_drl train_agent ddpg
```

The first argument indicates whether we are testing or training (0 = testing, 1 = training)
@@ -218,12 +218,12 @@ The first argument indicates whether we are testing or training (0 = testing, 1

for TD3:
```
-ros2 run turtlebot3_drl drl_agent td3 1
+ros2 run turtlebot3_drl train_agent td3
```

for DQN:
```
-ros2 run turtlebot3_drl drl_agent dqn 1
+ros2 run turtlebot3_drl train_agent dqn
```

Your robot should now be moving and training progress is being printed to the terminals!
@@ -239,21 +239,16 @@ The current state of the agent (weights, parameters, replay buffer and graphs) w
In order to load a model for testing (e.g. ddpg_0 at episode 500) the following command should be used:

```
-ros2 run turtlebot3_drl drl_agent ddpg 0 "ddpg_0" 500
+ros2 run turtlebot3_drl test_agent ddpg "ddpg_0" 500
```

In order to load a model to continue training (e.g. ddpg_0 at episode 500) the following command should be used:

```
-ros2 run turtlebot3_drl drl_agent ddpg 1 "ddpg_0" 500
+ros2 run turtlebot3_drl train_agent ddpg "ddpg_0" 500
```

-**Note:** If you are loading a model on a different stage than it was trained on (e.g. for transfer learning or testing generalizabilty) you have to add a 4th argument specifying the current stage. For example, model ddpg_0 which was trained on stage 4 can be evaluated in stage 3 using the following command
-```
-ros2 run turtlebot3_drl drl_agent ddpg 0 "ddpg_0" 500 3
-```
-
-(the original training stage is specified in training logfile (e.g _train_**stage2**_*.txt)
+**Note:** You can also test (or continue training) a model on a different stage than where it was originally trained on.

### Loading one of the included example models

@@ -266,28 +261,26 @@ ros2 launch turtlebot3_gazebo turtlebot3_drl_stage9.launch.py

Terminal 2:
```
-ros2 run turtlebot3_drl drl_gazebo
+ros2 run turtlebot3_drl gazebo_goals
```

Terminal 3:
```
-ros2 run turtlebot3_drl drl_environment
+ros2 run turtlebot3_drl environment
```

Terminal 4:
For DDPG:
```
-ros2 run turtlebot3_drl drl_agent ddpg 0 'examples/ddpg_0' 8000
+ros2 run turtlebot3_drl test_agent ddpg 'examples/ddpg_0' 8000
```

Or, for TD3
```
-ros2 run turtlebot3_drl drl_agent td3 0 'examples/td3_0' 7400
+ros2 run turtlebot3_drl test_agent td3 'examples/td3_0' 7400
```

-The pretrained model should then start to navigate successfully.
-
-Note: Do not include 'examples/' in the command when running models trained on your own machine.
+You should then see the example model navigate successfully towards the goal

### Switching environments

@@ -356,15 +349,22 @@ The visual should mainly be used during evaluation as it can slow down training

## Command Specification

-**drl_agent:**
+**train_agent:**

+```ros2 run turtlebot3_drl train_agent [algorithm=dqn/ddpg/td3] [loadmodel=\path\to\model] [loadepisode=episode] ```
+
+* `algorithm`: algorithm to run, one of either: `dqn`, `ddpg`, `td3`
+* `modelpath`: path to the model to be loaded to continue training
+* `loadepisode`: is the episode to load from `modelpath`
+
+**test_agent:**
+
+```ros2 run turtlebot3_drl test_agent [algorithm=dqn/ddpg/td3] [loadmodel=\path\to\model] [loadepisode=episode] ```
-
-```ros2 run turtlebot3_drl drl_agent [algorithm=dqn/ddpg/td3] [mode=0/1] [loadmodel=\path\to\model] [loadepisode=episode] [trainingstage=stage]```
+* `algorithm`: algorithm to run, one of either: `dqn`, `ddpg`, `td3`
+* `modelpath`: path to model to be loaded for testing
+* `loadepisode`: is the episode to load from `modelpath`

-`algorithm` can be either: `dqn`, `ddpg`, `td3`
-`mode` is either: `0` (training) or `1` (evaluating)
-`modelpath` is the path to the model to load
-`loadepisode` is the episode to load from `modelpath`
-`trainingstage` is the original training stage of `modelpath` (if different from current stage)
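As a sketch, the positional layout shared by the two new commands can be captured in a tiny parser. This is purely illustrative: the helper name `parse_agent_args` is invented for this example and does not exist in the repository.

```python
# Hypothetical helper mirroring the train_agent / test_agent argument layout:
#   [algorithm] [loadmodel] [loadepisode]
# Only `algorithm` is required; the load arguments are optional.

def parse_agent_args(args):
    if not args or args[0] not in ('dqn', 'ddpg', 'td3'):
        raise ValueError("algorithm must be one of: dqn, ddpg, td3")
    return {
        'algorithm': args[0],
        'loadmodel': args[1] if len(args) > 1 else '',
        'loadepisode': int(args[2]) if len(args) > 2 else 0,
    }

print(parse_agent_args(['ddpg', 'ddpg_0', '500']))
# → {'algorithm': 'ddpg', 'loadmodel': 'ddpg_0', 'loadepisode': 500}
```

Omitting the load arguments corresponds to starting a fresh training session.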

## Physical Robot

@@ -0,0 +1,3 @@
+episode, outcome, step, episode_duration, distance, s/cw/co/t
+1, 1, 1974, 1.432716391998838, 5.1544623374938965, 1/0/0/0/0
+2, 1, 11785, 8.245651064997219, 1.9160521030426025, 2/0/0/0/0
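The rows above form a small CSV log. Here is a sketch of reading it back; note it is illustrative only — interpreting the trailing `s/cw/co/t` field as a running per-outcome tally is an assumption (and the header names four parts while the rows carry five values).

```python
import csv
import io

# Sample taken verbatim from the logfile added in this commit.
LOG = """episode, outcome, step, episode_duration, distance, s/cw/co/t
1, 1, 1974, 1.432716391998838, 5.1544623374938965, 1/0/0/0/0
2, 1, 11785, 8.245651064997219, 1.9160521030426025, 2/0/0/0/0
"""

def read_log(text):
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    rows = []
    for row in reader:
        rows.append({
            'episode': int(row['episode']),
            'outcome': int(row['outcome']),
            'step': int(row['step']),
            'duration': float(row['episode_duration']),
            'distance': float(row['distance']),
            # Assumed: slash-separated running tally of episode outcomes.
            'tally': [int(x) for x in row['s/cw/co/t'].split('/')],
        })
    return rows

print(read_log(LOG)[0]['episode'])  # → 1
```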
4 changes: 2 additions & 2 deletions src/turtlebot3_drl/turtlebot3_drl/common/settings.py
@@ -3,7 +3,7 @@
ENABLE_STACKING = False
ENABLE_VISUAL = False # Meant to be used only during evaluation/testing phase
ENABLE_TRUE_RANDOM_GOALS = False # If false, goals are taken randomly from a list of known valid goal positions
-MODEL_STORE_INTERVAL = 100 # Store the model weights every N episodes
+MODEL_STORE_INTERVAL = 3 # Store the model weights every N episodes

# DRL parameters
ACTION_SIZE = 2 # Not used for DQN, see DQN_ACTION_SIZE
@@ -15,7 +15,7 @@
LEARNING_RATE = 0.003
TAU = 0.003

-OBSERVE_STEPS = 25000 # At training start random actions are taken for N steps for better exploration
+OBSERVE_STEPS = 0 # At training start random actions are taken for N steps for better exploration
STEP_TIME = 0.01 # Delay between steps, can be set to 0
EPSILON_DECAY = 0.9995 # Epsilon decay per step
EPSILON_MINIMUM = 0.05
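The `EPSILON_DECAY` and `EPSILON_MINIMUM` settings above imply a standard per-step multiplicative decay clamped at a floor. A minimal sketch of that schedule follows; the surrounding training loop is assumed, not shown in this diff.

```python
EPSILON_DECAY = 0.9995   # per-step multiplicative decay (from settings.py)
EPSILON_MINIMUM = 0.05   # exploration floor (from settings.py)

def decay_epsilon(epsilon):
    # Shrink epsilon each step, never dropping below the minimum.
    return max(EPSILON_MINIMUM, epsilon * EPSILON_DECAY)

epsilon = 1.0
for _ in range(10000):
    epsilon = decay_epsilon(epsilon)
print(epsilon)  # → 0.05 (clamped at the floor after roughly 6000 steps)
```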
10 changes: 5 additions & 5 deletions src/turtlebot3_drl/turtlebot3_drl/common/storagemanager.py
@@ -6,7 +6,7 @@
import torch

class StorageManager:
-def __init__(self, name, stage, load_session, load_episode, device):
+def __init__(self, name, load_session, load_episode, device, stage):
if load_session and name not in load_session:
print(f"ERROR: wrong combination of command and model! make sure command is: {name}_agent")
while True:
@@ -15,18 +15,18 @@ def __init__(self, name, stage, load_session, load_episode, device):
if 'examples' in load_session:
self.machine_dir = (os.getenv('DRLNAV_BASE_PATH') + '/src/turtlebot3_drl/model/')
self.name = name
-self.stage = stage
+self.stage = load_session[-1] if load_session else stage
self.session = load_session
self.load_episode = load_episode
self.session_dir = os.path.join(self.machine_dir, self.session)
self.map_location = device

-def new_session_dir(self):
+def new_session_dir(self, stage):
i = 0
-session_dir = os.path.join(self.machine_dir, f"{self.name}_{i}")
+session_dir = os.path.join(self.machine_dir, f"{self.name}_{i}_stage{stage}")
while(os.path.exists(session_dir)):
i += 1
-session_dir = os.path.join(self.machine_dir, f"{self.name}_{i}")
+session_dir = os.path.join(self.machine_dir, f"{self.name}_{i}_stage{stage}")
self.session = f"{self.name}_{i}"
print(f"making new model dir: {self.session}")
os.makedirs(session_dir)
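The probing loop in `new_session_dir` above is easy to exercise in isolation. This sketch mirrors the new `{name}_{i}_stage{stage}` pattern with the filesystem check replaced by a set lookup; it is illustrative, not repository code.

```python
def new_session_name(name, stage, existing):
    # Find the first free index i such that "{name}_{i}_stage{stage}" is
    # unused, mirroring the directory-probing loop in new_session_dir.
    i = 0
    while f"{name}_{i}_stage{stage}" in existing:
        i += 1
    return f"{name}_{i}_stage{stage}"

print(new_session_name("ddpg", 4, {"ddpg_0_stage4", "ddpg_1_stage4"}))
# → ddpg_2_stage4
```

Note that the companion change `self.stage = load_session[-1] if load_session else stage` recovers the stage from the final character of the session name, which only works for single-digit stage numbers.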
2 changes: 1 addition & 1 deletion src/turtlebot3_drl/turtlebot3_drl/common/utilities.py
@@ -12,7 +12,7 @@
import xml.etree.ElementTree as ET

with open('/tmp/drlnav_current_stage.txt', 'r') as f:
-test_stage = int(f.read())
+stage = int(f.read())

def check_gpu():
print("gpu torch available: ", torch.cuda.is_available())
24 changes: 13 additions & 11 deletions src/turtlebot3_drl/turtlebot3_drl/drl_agent/drl_agent.py
@@ -43,19 +43,17 @@
from ..common.replaybuffer import ReplayBuffer

class DrlAgent(Node):
-def __init__(self, training, algorithm, load_session="", load_episode=0, train_stage=util.test_stage):
+def __init__(self, training, algorithm, load_session="", load_episode=0):
super().__init__(algorithm + '_agent')
self.algorithm = algorithm
self.training = int(training)
self.load_session = load_session
self.episode = int(load_episode)
-self.train_stage = train_stage
if (not self.training and not self.load_session):
-quit("ERROR no test agent specified")
+quit("Invalid command: Testing but no model to load specified (example format: ros2 run turtlebot3_drl test_agent ddpg ddpg_0_stage4 1)")
self.device = util.check_gpu()
-self.sim_speed = util.get_simulation_speed(self.train_stage)
-print(f"{'training' if (self.training) else 'testing' } on stage: {util.test_stage}")
+self.sim_speed = util.get_simulation_speed(util.stage)
+print(f"{'training' if (self.training) else 'testing' } on stage: {util.stage}")
self.total_steps = 0
self.observe_steps = OBSERVE_STEPS

@@ -66,7 +64,7 @@ def __init__(self, training, algorithm, load_session="", load_episode=0, train_s
elif self.algorithm == 'td3':
self.model = TD3(self.device, self.sim_speed)
else:
-quit(f"invalid algorithm specified: {self.algorithm}, chose one of: ddpg, td3, td3conv")
+quit(f"invalid algorithm specified: {self.algorithm}, choose one of: dqn, ddpg, td3")

self.replay_buffer = ReplayBuffer(self.model.buffer_size)
self.graph = Graph()
@@ -75,24 +73,24 @@
# Model loading #
# ===================================================================== #

-self.sm = StorageManager(self.algorithm, self.train_stage, self.load_session, self.episode, self.device)
+self.sm = StorageManager(self.algorithm, self.load_session, self.episode, self.device, util.stage)

if self.load_session:
del self.model
self.model = self.sm.load_model()
self.model.device = self.device
self.sm.load_weights(self.model.networks)
if self.training:
-self.replay_buffer.buffer = self.sm.load_replay_buffer(self.model.buffer_size, os.path.join(self.load_session, 'stage'+str(self.train_stage)+'_latest_buffer.pkl'))
+self.replay_buffer.buffer = self.sm.load_replay_buffer(self.model.buffer_size, os.path.join(self.load_session, 'stage'+str(self.sm.stage)+'_latest_buffer.pkl'))
self.total_steps = self.graph.set_graphdata(self.sm.load_graphdata(), self.episode)
print(f"global steps: {self.total_steps}")
print(f"loaded model {self.load_session} (eps {self.episode}): {self.model.get_model_parameters()}")
else:
-self.sm.new_session_dir(util.test_stage)
+self.sm.new_session_dir(util.stage)
self.sm.store_model(self.model)

self.graph.session_dir = self.sm.session_dir
-self.logger = Logger(self.training, self.sm.machine_dir, self.sm.session_dir, self.sm.session, self.model.get_model_parameters(), self.model.get_model_configuration(), str(util.test_stage), self.algorithm, self.episode)
+self.logger = Logger(self.training, self.sm.machine_dir, self.sm.session_dir, self.sm.session, self.model.get_model_parameters(), self.model.get_model_configuration(), str(util.stage), self.algorithm, self.episode)
if ENABLE_VISUAL:
self.visual = DrlVisual(self.model.state_size, self.model.hidden_size)
self.model.attach_visual(self.visual)
@@ -202,5 +200,9 @@ def main_test(args=sys.argv[1:]):
args = ['0'] + args
main(args)

+def main_real(args=sys.argv[1:]):
+args = ['0'] + args
+main(args)

if __name__ == '__main__':
main()
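The `main_test` / `main_real` entry points in the diff above share one pattern: prepend the mode flag, then delegate to a shared `main`. A minimal standalone sketch of that dispatch follows; the real `main` sets up the whole DRL agent node, the dict return here is purely illustrative, and `main_train` is an assumed counterpart prepending `'1'`.

```python
import sys

def main(args):
    # First element selects the mode: '1' = training, '0' = testing.
    training = int(args[0])
    algorithm = args[1] if len(args) > 1 else None
    return {'training': bool(training), 'algorithm': algorithm}

def main_train(args=None):
    return main(['1'] + (args if args is not None else sys.argv[1:]))

def main_test(args=None):
    return main(['0'] + (args if args is not None else sys.argv[1:]))

print(main_test(['ddpg']))  # → {'training': False, 'algorithm': 'ddpg'}
```

Keeping one `main` and several thin wrappers is what lets the commit expose `train_agent` and `test_agent` as separate console commands without duplicating the agent setup.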
