Questions about the running results #1

Open
2345as opened this issue May 4, 2023 · 1 comment

Comments

2345as commented May 4, 2023

Why are the training results of policy.py all NaN? Also, the results of DP.py do not seem to be usable in the test files simulate.py and env.py.

alabatie (Owner) commented May 6, 2023

Hi, thank you for using the repo. However, it's difficult to address your issue without further information.

Are you using the dynamic programming approach or the policy network approach? Note that the two approaches are different, and mutually exclusive.

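NaNs can be a sign of numerical instability, caused either by the numerical format (for example, limited floating-point precision) or by the optimizer's hyperparameters (for example, a learning rate that is too large).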

Best,
Antoine
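
To illustrate the point about optimizer hyperparameters, here is a minimal sketch that is not taken from the repo: plain gradient descent on the toy function f(x) = x^4, using only the Python standard library. The function name `gradient_descent`, the starting point, and the two learning rates are hypothetical choices made for illustration. With a small learning rate the iterate stays finite. With a large one it overflows to infinity, and the next update computes inf minus inf, which is NaN.

```python
import math


def gradient_descent(lr, steps=100, x0=2.0):
    """Minimize f(x) = x**4 with plain gradient descent (toy example).

    Returns (final_x, first_bad_step), where first_bad_step is the index of
    the first step at which x became inf or NaN, or None if x stayed finite.
    """
    x = x0
    first_bad_step = None
    for step in range(steps):
        # Derivative of x**4. Written with * rather than ** so that float
        # overflow yields inf instead of raising OverflowError.
        grad = 4.0 * x * x * x
        x = x - lr * grad
        if first_bad_step is None and not math.isfinite(x):
            first_bad_step = step
    return x, first_bad_step


# Hypothetical learning rates, chosen only to show the two regimes.
stable_x, stable_bad = gradient_descent(lr=0.01)
unstable_x, unstable_bad = gradient_descent(lr=1.0)

print("lr=0.01 finite:", math.isfinite(stable_x), "first bad step:", stable_bad)
print("lr=1.0  NaN:   ", math.isnan(unstable_x), "first bad step:", unstable_bad)
```

A check like `math.isfinite` on the loss or the parameters after each update is a cheap way to find the step at which training first diverges. Lowering the learning rate or clipping gradients are common remedies, though whether they apply to policy.py depends on details not given in this thread.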
