Skip to content

Commit

Permalink
update webpage
Browse files Browse the repository at this point in the history
  • Loading branch information
BruceGeLi committed Jan 17, 2024
1 parent 8064439 commit e3be2d5
Show file tree
Hide file tree
Showing 9 changed files with 123 additions and 33 deletions.
51 changes: 20 additions & 31 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
.gitconfig
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
Expand All @@ -20,6 +21,7 @@ parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
Expand Down Expand Up @@ -49,7 +51,6 @@ coverage.xml
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
Expand All @@ -72,7 +73,6 @@ instance/
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
Expand All @@ -83,9 +83,7 @@ profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
.python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
Expand All @@ -94,22 +92,7 @@ ipython_config.py
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
Expand Down Expand Up @@ -146,15 +129,21 @@ dmypy.json
# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/
.idea/

# Media
media/

# Numerical result
numerical_result/

# Log
log/
tmp/
/mprl/experiment/reacher5d/wandb/
/mprl/result/
**/wandb/
**/result
**/MUJOCO_LOG.TXT
26 changes: 24 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,2 +1,24 @@
# TCE_RL
Temporally Correlated Episodic Reinforcement Learning, ICLR 2024
<p align="center">
<img src='./web_assets/Metaworld.gif' width="200" />
<img src='./web_assets/Box_Pushing.gif' width="200" />
<img src='./web_assets/Table_Tennis.gif' width="200" />
</p>
<br><br>
## Abstract

Current advancements in reinforcement learning (RL) have predominantly focused on learning step-based policies that generate actions for each perceived state. While these methods efficiently leverage step information from environmental interaction, they often ignore the temporal correlation between actions, resulting in inefficient exploration and unsmooth trajectories that are challenging to implement on real hardware. Episodic RL (ERL) seeks to overcome these challenges by exploring in parameters space that capture the correlation of actions. However, these approaches typically compromise data efficiency, as they treat trajectories as opaque 'black boxes.' In this work, we introduce a novel ERL algorithm, Temporally-Correlated Episodic RL (TCE), which effectively utilizes step information in episodic policy updates, opening the 'black box' in existing ERL methods while retaining the smooth and consistent exploration in parameter space. TCE synergistically combines the advantages of step-based and episodic RL, achieving comparable performance to recent ERL methods while maintaining data efficiency akin to state-of-the-art (SoTA) step-based RL.

Our work has been accepted by ICLR 2024.


<br><br>
![TCE](./web_assets/Framework.png)
<!--- -->

## Citation
todo

<div align="center">
<br><br>
<a href='https://github.com/BruceGeLi/Temporally_Correlated_Exploration_RL'><img src='./web_assets/CodeOnGithub.png' width="300px"></a>
</div>
5 changes: 5 additions & 0 deletions _config.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
theme: 'minima'

title: 'Open the Black Box: Step-based Policy Updates for Temporally-Correlated Episodic Reinforcement Learning'
description: Page for ICLR24 submission 5877
titles_size: 0
Binary file added web_assets/Box_Pushing.gif
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added web_assets/CodeOnGithub.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
74 changes: 74 additions & 0 deletions web_assets/CodeOnGithub.svg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added web_assets/Framework.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added web_assets/Metaworld.gif
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added web_assets/Table_Tennis.gif
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.

0 comments on commit e3be2d5

Please sign in to comment.