google-research/caql at master · hjh1213/google-research

README.md
agent_policy.py
agent_policy_test.py
caql_agent.py
caql_agent_test.py
caql_network.py
caql_network_test.py
dual_ibp_method.py
dual_ibp_method_test.py
dual_method.py
dual_method_test.py
epsilon_greedy_policy.py
epsilon_greedy_policy_test.py
gaussian_noise_policy.py
gaussian_noise_policy_test.py
policy.py
replay_memory.py
replay_memory_test.py
run.sh
train_eval.py
utils.py
utils_test.py

README.md

CAQL (Continuous Action Q-Learning) is a class of algorithms for Q-learning with continuous actions. It supports several plug-and-play optimizers for the max-Q problem, that is, finding the action that maximizes the Q-value in a given state.
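
To make the "plug-and-play optimizer" idea concrete, here is a minimal sketch of a max-Q step with interchangeable solvers. It is not the code in this repository: `toy_q`, `grid_search_solver`, `gradient_ascent_solver`, and `td_target` are hypothetical names, the Q-function is a hand-written toy rather than a trained network, and the two solvers are simple stand-ins chosen only to show that the Q-learning target does not care which optimizer produced the maximum.

```python
from typing import Callable, Tuple

QFn = Callable[[float, float], float]
# A max-Q solver takes (q_fn, state) and returns (best_action, best_value).
MaxQSolver = Callable[[QFn, float], Tuple[float, float]]


def toy_q(state: float, action: float) -> float:
    """Hypothetical toy Q-function, concave in the action (assumption)."""
    return -(action - 0.5 * state) ** 2 + state


def grid_search_solver(q_fn: QFn, state: float,
                       num_points: int = 201) -> Tuple[float, float]:
    """Evaluates Q on an evenly spaced grid of actions in [-1, 1]."""
    best_action, best_value = -1.0, q_fn(state, -1.0)
    for i in range(num_points):
        action = -1.0 + 2.0 * i / (num_points - 1)
        value = q_fn(state, action)
        if value > best_value:
            best_action, best_value = action, value
    return best_action, best_value


def gradient_ascent_solver(q_fn: QFn, state: float, steps: int = 200,
                           lr: float = 0.1,
                           eps: float = 1e-4) -> Tuple[float, float]:
    """Projected gradient ascent on the action, using a finite-difference gradient."""
    action = 0.0
    for _ in range(steps):
        grad = (q_fn(state, action + eps) - q_fn(state, action - eps)) / (2 * eps)
        # Take an ascent step, then clip back into the action range [-1, 1].
        action = min(1.0, max(-1.0, action + lr * grad))
    return action, q_fn(state, action)


def td_target(reward: float, next_state: float, q_fn: QFn,
              solver: MaxQSolver, gamma: float = 0.99) -> float:
    """Q-learning target r + gamma * max_a Q(s', a), with a pluggable max-Q solver."""
    _, max_q = solver(q_fn, next_state)
    return reward + gamma * max_q


if __name__ == "__main__":
    for solver in (grid_search_solver, gradient_ascent_solver):
        print(solver.__name__, td_target(1.0, 1.0, toy_q, solver))
```

Because both solvers share one signature, swapping the optimizer is a one-argument change in `td_target`. On this toy problem both should agree closely, since the Q-function is concave in the action; with a neural-network Q-function the inner maximization is generally non-convex, which is why the choice of optimizer matters.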

NOTE: The MIP optimizer is not included in this initial version because it depends on a Google-internal library that has not been open-sourced yet. There is a plan to open-source that library, and we will update this repository as soon as it is released.

For technical details of CAQL, refer to the ICLR 2020 paper (https://openreview.net/forum?id=BkxXe0Etwr).
