
LOPE

An implementation of Learning Online with trajectory Preference guidancE (LOPE) in PyTorch

Requirements

Create the conda environment with the following command:

conda env create -f environment.yml

How to run it

python3 train_lope.py

Note

This code uses the Hopper environment to demonstrate the performance improvement of LOPE. You can also use the other agents provided in MuJoCo, but the corresponding demonstrations are needed, analogous to "hopper_trajs.npy"; a minimal sketch of how such a file might be prepared is shown below. Moreover, some hyperparameters may need to be adjusted for better evaluation results.
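The following is a minimal sketch of saving and loading demonstration trajectories as a .npy file for another MuJoCo agent. The dictionary keys, array shapes, and the file name "walker2d_trajs.npy" are illustrative assumptions, not the format actually expected by train_lope.py; check how "hopper_trajs.npy" is produced and consumed in this repository before relying on it.

import numpy as np

# Assumed layout: each trajectory is a dict of state and action arrays
# collected from a preferred rollout; shapes here are placeholders.
trajectories = [
    {"states": np.random.randn(100, 17), "actions": np.random.randn(100, 6)},
    {"states": np.random.randn(80, 17), "actions": np.random.randn(80, 6)},
]

# Save as an object array so trajectories of different lengths can coexist.
np.save("walker2d_trajs.npy", np.array(trajectories, dtype=object))

# Loading mirrors the save call; allow_pickle is required for object arrays.
loaded = np.load("walker2d_trajs.npy", allow_pickle=True)
print(len(loaded), loaded[0]["states"].shape)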

Citing

Please use this BibTeX entry if you want to cite this repository in your publications:

@misc{wang2024preferenceguidedreinforcementlearningefficient,
      title={Preference-Guided Reinforcement Learning for Efficient Exploration}, 
      author={Guojian Wang and Faguo Wu and Xiao Zhang and Tianyuan Chen and Xuyang Chen and Lin Zhao},
      year={2024},
      eprint={2407.06503},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2407.06503}, 
}
