Paper website: https://arxiv.org/abs/2309.14237
Authors: Jing Zhang, Karinne Ramirez-Amaro
git clone git@github.com:jingzhang00/RL_operator.git
pip install -r requirements.txt
cd scripts/experiments
bash main.bash 42 cuda:0 stack_2
cd scripts/evaluation
bash visualize_model.bash 37 "trained_model/" "state_dict.pt" "sacx_experiment_setting.pkl" 50 5 false
bash visualize_model.bash 37 "trained_model/" "state_dict.pt" "sacx_experiment_setting.pkl" 50 5 true
To visualize another policy, change the policy index, which is the "5" just before the final true/false argument (open: 0, close: 1, reach: 2, lift: 3, move: 4, stack: 5).
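The name-to-index mapping above can be wrapped in a small shell helper so the index does not have to be looked up by hand. This is only a sketch: `policy_index` is a hypothetical function that is not part of the repository, and the example prints the resulting command instead of running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): map a policy name
# to the index listed above.
policy_index() {
  case "$1" in
    open)  echo 0 ;;
    close) echo 1 ;;
    reach) echo 2 ;;
    lift)  echo 3 ;;
    move)  echo 4 ;;
    stack) echo 5 ;;
    *) echo "unknown policy: $1" >&2; return 1 ;;
  esac
}

# Example: build the visualization command for the "lift" policy.
# The command is only printed here; drop the leading "echo" and run it
# from scripts/evaluation to actually launch the visualization.
idx="$(policy_index lift)"
echo bash visualize_model.bash 37 "trained_model/" "state_dict.pt" "sacx_experiment_setting.pkl" 50 "$idx" false
```

An unknown name makes the helper return a non-zero status, so a calling script can stop before launching the visualization with a wrong index.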
- The current hyperparameters apply only to the 2-block task; stacking more blocks requires further tuning.
- The planning success rate was evaluated during training. To view it, open the TensorBoard logs in "trained_model/tensorboard":
cd trained_model
tensorboard --logdir=tensorboard
The success rate is then shown under "evaluation_info/epoch_success_rate".