Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change default detach retry limit to 100 & enable unlimit option #185

Merged
merged 1 commit into from
Sep 26, 2021

Conversation

dkeven
Copy link
Member

@dkeven dkeven commented Sep 26, 2021

Signed-off-by: dkeven [email protected]

What type of PR is this?
/kind improvement

What this PR does / why we need it:
Currently, the detaching retry limiter(added in #160) will block any further detaching operation for a specific volume if this volume has more than --retry-times-max(defaults to 10) failed detaching operations recorded in the limiter.

But very often, when the cause of the failing detachments is fixed or self-recovered, the retry times have already exceeded the maxium, and since the retry times are stored in a map and never reset, this leaves the detaching of the volume blocked forever.

This PR:

  1. Renames the ambiguous --retry-times-max option to --retry-detach-times-max.
  2. Change the default retry times limit of detaching to 100.
  3. Allows users to lift the retry times limit of detaching completely, by setting the --retry-detach-times-max to 0.

Which issue(s) this PR fixes:

Fixes #184

endpoint = flag.String("endpoint", "unix://tmp/csi.sock", "CSI endpoint")
maxVolume = flag.Int64("maxvolume", 10, "Maximum number of volumes that controller can publish to the node.")
nodeId = flag.String("nodeid", "", "If driver cannot get instance ID from /etc/qingcloud/instance-id, we would use this flag.")
retryIntervalMax = flag.Duration("retry-interval-max", 2*time.Minute, "Maximum retry interval of failed deletion.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we remove retryIntervalMax? ie. remove the retries and hand over the control to csi sidecar?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think it should be removed, I'll do that in another PR.

Copy link
Contributor

@stoneshi-yunify stoneshi-yunify left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@stoneshi-yunify stoneshi-yunify merged commit 4971d80 into yunify:master Sep 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

about retryIntervalMax and retryTimesMax
2 participants