Tags: anuvu/cruise-control

2.8.15

fix clusterModel

2.8.14

Upgrade io.vertx to 4.4.3 due to CVE-2023-24815 (linkedin#2023)

2.8.13

fix/cruisecontrol: add partition movement timeout to executor

There is an edge case in which a partition leadership re-election occurs after a partition reassignment has been submitted to Kafka but before it finishes. This causes the reassignment to stall until there is another re-election. However, we do see cases where no re-election is triggered, leaving a partition reassignment IN_PROGRESS indefinitely and potentially causing new anomalies to be missed because the executor state stays in INTER_BROKER_REPLICA_ACTION.

By adding a max timeout, we avoid this state by cancelling such reassignments and retrying them later.

includes minor cleanup
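
The timeout idea described in this commit message can be sketched as follows. This is a minimal illustration, not the executor's actual code: the `ReassignmentTask` type, the `cancel_timed_out` function, the `max_movement_ms` parameter, and the `CANCELLED` state name are all hypothetical; only `IN_PROGRESS` comes from the message itself.

```python
from dataclasses import dataclass


@dataclass
class ReassignmentTask:
    """A submitted partition reassignment (hypothetical stand-in type)."""
    topic_partition: str
    submitted_at_ms: int
    state: str = "IN_PROGRESS"


def cancel_timed_out(tasks, now_ms, max_movement_ms):
    """Mark IN_PROGRESS tasks older than max_movement_ms as CANCELLED.

    Returns the cancelled tasks so the caller can retry them later.
    """
    cancelled = []
    for task in tasks:
        age_ms = now_ms - task.submitted_at_ms
        if task.state == "IN_PROGRESS" and age_ms > max_movement_ms:
            task.state = "CANCELLED"  # assumed state name
            cancelled.append(task)
    return cancelled
```

A caller would run a check like this on each executor polling cycle and re-queue whatever it returns, so a stalled movement no longer holds the executor in its in-progress state forever.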

2.8.12

fix/cruisecontrol: add partition movement timeout to executor

There is an edge case in which a partition leadership re-election occurs after a partition reassignment has been submitted to Kafka but before it finishes. This causes the reassignment to stall until there is another re-election. However, we do see cases where no re-election is triggered, leaving a partition reassignment IN_PROGRESS indefinitely and potentially causing new anomalies to be missed because the executor state stays in INTER_BROKER_REPLICA_ACTION.

By adding a max timeout, we avoid this state by cancelling such reassignments and retrying them later.

includes minor cleanup

2.8.11

fix/cruisecontrol: add partition movement timeout to executor

There is an edge case in which a partition leadership re-election occurs after a partition reassignment has been submitted to Kafka but before it finishes. This causes the reassignment to stall until there is another re-election. However, we do see cases where no re-election is triggered, leaving a partition reassignment IN_PROGRESS indefinitely and potentially causing new anomalies to be missed because the executor state stays in INTER_BROKER_REPLICA_ACTION.

By adding a max timeout, we avoid this state by cancelling such reassignments and retrying them later.

includes minor cleanup

2.8.10

add option to delete partition reassignments not started by CruiseControl

2.8.9

fix/LaggingReplicaReassignmentGoal: store state in clusterModel

2.8.8

Merge pull request #1 from mavemuri/reassign-lagging-replicas

feat: reassign stuck partition replicas

2.8.7

feat: cleanup stuck partitionReassignments

Sometimes, an active partition reassignment goes into a limbo state
due to the destination and source brokers going offline at the same time.
When this happens, there will be a partitionReassignment stuck in Kafka
until it is manually cleared. Because of this, CC stops reacting to any
anomalies, broker failures, etc.

This commit is for detecting and fixing such stuck active partitionReassignments.
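
One way to picture the detection step is the sketch below. It is not the commit's implementation: the function name, the input shape (a dict of partition name to `adding` and `removing` broker-id lists), and the criterion itself (no alive broker on either side of the move) are assumptions drawn only from the description above.

```python
def find_stuck_reassignments(reassignments, alive_brokers):
    """Return partitions whose active reassignment cannot make progress.

    reassignments maps a partition name to a dict with 'adding'
    (destination) and 'removing' (source) broker-id lists. A partition
    is treated as stuck when no broker on either side is alive
    (assumed criterion, not taken from the actual code).
    """
    alive = set(alive_brokers)
    stuck = []
    for partition, replicas in reassignments.items():
        destination_dead = not alive.intersection(replicas["adding"])
        source_dead = not alive.intersection(replicas["removing"])
        if destination_dead and source_dead:
            stuck.append(partition)
    return stuck
```

Partitions returned by such a check are the candidates for having their reassignment cleared, so that anomaly handling can resume.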

2.8.6

fix: allow multiple partition reassignments to be scheduled

The Kafka Admin API states that this should be safe.

It also allows us to escape stuck states when partition movements
fail due to dead brokers, similar to linkedin#664,
but with the additional case of a revert not being possible because the
original broker has also gone down.