[kuberay] Logging-related autoscaler stability improvement.
The autoscaler container writes logs to a directory set up by the Ray container. This PR moves the autoscaler's logging setup so that it runs only after the Ray container is ready. It also makes the autoscaler process exit after hitting 5 total exceptions, at which point Kubernetes restarts the autoscaler. The goal is to ensure the autoscaler can restart cleanly in long-running deployments of Ray.
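The two behaviors described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `wait_for_path`, `run_with_exception_budget`, `main`, and the `log_dir` and `step` parameters are all hypothetical names, and the polling interval and timeout are assumed values. Only the limit of 5 total exceptions comes from the description.

```python
import os
import sys
import time
from typing import Callable, Optional

# Limit taken from the PR description; everything else here is assumed.
MAX_TOTAL_EXCEPTIONS = 5


def wait_for_path(
    path: str,
    poll_seconds: float = 1.0,
    timeout_seconds: float = 60.0,
    exists: Callable[[str], bool] = os.path.exists,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until `path` exists. Return False if the timeout elapses first."""
    waited = 0.0
    while not exists(path):
        if waited >= timeout_seconds:
            return False
        sleep(poll_seconds)
        waited += poll_seconds
    return True


def run_with_exception_budget(
    step: Callable[[], None],
    max_exceptions: int = MAX_TOTAL_EXCEPTIONS,
    max_iterations: Optional[int] = None,
) -> int:
    """Call `step` in a loop and count every exception it raises.

    Returns the exception count once the budget is used up, or once
    `max_iterations` calls have been made (None means no iteration limit).
    """
    exceptions = 0
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        iterations += 1
        try:
            step()
        except Exception:
            exceptions += 1
            if exceptions >= max_exceptions:
                return exceptions
    return exceptions


def main(log_dir: str, step: Callable[[], None]) -> None:
    """Hypothetical entry point: wait for the log directory, then loop."""
    # Only set up file logging once the other container has created the directory.
    if not wait_for_path(log_dir):
        sys.exit(1)
    run_with_exception_budget(step)
    # A non-zero exit lets the container runtime restart the process.
    sys.exit(1)
```

Counting exceptions in total, rather than consecutively, means a process that fails intermittently over a long period still gets restarted eventually, which matches the "5 total exceptions" wording above.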