Commit

Adds more aggressive cancelling of duplicate Build Image jobs (apache#12018)

This change adds even more aggressive cancelling of duplicate
'Build Image' jobs. It is not obvious which Build Image jobs are
duplicates, so we match them based on specially crafted
"build-info" job names. We add Event, Branch, and Repo to the job
names and assume that two runs with the same event + branch + repo
are duplicates.
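
The matching rule described above can be sketched in Python. This is
only an illustration under stated assumptions, not the code of the
potiuk/cancel-workflow-runs action: the function name
`find_duplicate_runs`, the `(run_id, job_name)` input shape, and the
rule that a higher run id means a newer run are all hypothetical. The
pattern mirrors the `jobNameRegexps` value used in the workflow.

```python
import re
from collections import defaultdict

# Same shape as the jobNameRegexps pattern in the workflow, with capture
# groups added so that Event, Repo and Branch can be used as a grouping key.
JOB_NAME_RE = re.compile(r"Event: (\S*) Repo: (\S*) Branch: (\S*) ")


def find_duplicate_runs(runs):
    """Return the run ids that would be cancelled as duplicates.

    `runs` is an iterable of (run_id, job_name) pairs (hypothetical shape).
    Runs are grouped by (event, repo, branch) parsed from the job name, and
    every run except the newest one in each group is reported.
    Assumption: a higher run id means a newer run.
    """
    groups = defaultdict(list)
    for run_id, job_name in runs:
        match = JOB_NAME_RE.search(job_name)
        if match:  # jobs without the crafted name are ignored
            groups[match.groups()].append(run_id)

    to_cancel = []
    for run_ids in groups.values():
        newest = max(run_ids)
        to_cancel.extend(r for r in run_ids if r != newest)
    return sorted(to_cancel)
```

Note that the sketch does not exempt the calling run, so the returned
list may contain the caller's own run id when a newer duplicate exists.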

It also disables self-preservation for this step because
it is perfectly ok for the run to cancel itself when there is a
newer in-progress Build Image job.

Unfortunately, even this will not work perfectly. Job names are
resolved only for jobs that are running, not for queued ones, so
if several duplicates of the same Build Image job are in the
queue, they will not be found or cancelled. Cancelling only
happens when both duplicates are already running.
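
The limitation above can be illustrated with a short Python sketch. It
assumes a hypothetical run record with `id`, `status`, and `job_name`
fields, and it models an unresolved job name as `None`; how the GitHub
API actually reports job names for queued runs is not stated in this
commit, so treat that detail as an assumption.

```python
import re

# Same shape as the jobNameRegexps pattern in the workflow, with capture groups.
JOB_NAME_RE = re.compile(r"Event: (\S*) Repo: (\S*) Branch: (\S*) ")


def visible_keys(runs):
    """Map run id -> (event, repo, branch) for runs whose job name is resolved.

    Runs whose job name is unresolved (modelled here as None, an assumption)
    get no key, so they cannot be matched as duplicates of anything.
    """
    keys = {}
    for run in runs:
        name = run.get("job_name")
        match = JOB_NAME_RE.search(name) if name else None
        if match:
            keys[run["id"]] = match.groups()
    return keys
```

With one running and one queued duplicate, only the running run gets a
key, so no pair of duplicates is ever detected.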

This is good enough for now, and we cannot do much more until
GitHub adds a missing API feature that links a workflow_run
to the run that triggered it. The issue has been raised with
GitHub Support, and an internal engineering ticket has
apparently been opened to add this feature.

More detailed status for the missing feature is kept at apache#11294
potiuk authored Nov 1, 2020
1 parent 0314a3a commit 1d14e74
Showing 2 changed files with 26 additions and 16 deletions.
40 changes: 25 additions & 15 deletions .github/workflows/build-images-workflow-run.yml
@@ -66,7 +66,7 @@ jobs:
token: ${{ secrets.GITHUB_TOKEN }}
sourceRunId: ${{ github.event.workflow_run.id }}
- name: "Cancel duplicated 'CI Build' runs"
uses: potiuk/cancel-workflow-runs@99869d37d982384d18c79539b67df94f17557cbe # v4_1
uses: potiuk/cancel-workflow-runs@f06d03cd576a179ea5169d048dbd8c8d73757b52 # v4_4
with:
token: ${{ secrets.GITHUB_TOKEN }}
cancelMode: allDuplicates
@@ -78,12 +78,12 @@ jobs:
# https://github.community/t/how-to-set-and-access-a-workflow-variable/17335/16
echo "::set-output name=buildImages::${BUILD_IMAGES}"
- name: "Cancel duplicated 'Build Image' runs"

# We find duplicates of our own "Build Image" runs - due to a missing feature
# in GitHub Actions, we have to use Job names to match Event/Repo/Branch from the
# build-info step there to find the duplicates ¯\_(ツ)_/¯.

uses: potiuk/cancel-workflow-runs@99869d37d982384d18c79539b67df94f17557cbe # v4_1
# in GitHub Actions, we have to use Job names to match Event/Repo/Branch matching
# trick ¯\_(ツ)_/¯. We name the build-info job appropriately
# and then we try to find and cancel all the jobs with the same Event + Repo + Branch as the
# current Event/Repo/Branch combination.
uses: potiuk/cancel-workflow-runs@f06d03cd576a179ea5169d048dbd8c8d73757b52 # v4_4
with:
cancelMode: namedJobs
token: ${{ secrets.GITHUB_TOKEN }}
@@ -94,14 +94,12 @@ jobs:
Branch: ${{ steps.source-run-info.outputs.sourceHeadBranch }}.*"]
if: env.BUILD_IMAGES == 'true'
- name: "Cancel all 'CI Build' runs where some jobs failed"

# We find any of the "CI Build" workflow runs, where any of the important jobs
# failed. The important jobs are selected by the regexp array below.
# We also produce list of canceled "CI Build' runs as output, so that we
# can cancel all the matching "Build Images" workflow runs in the two following steps.
# Yeah. Adding to the complexity ¯\_(ツ)_/¯.

uses: potiuk/cancel-workflow-runs@99869d37d982384d18c79539b67df94f17557cbe # v4_1
uses: potiuk/cancel-workflow-runs@f06d03cd576a179ea5169d048dbd8c8d73757b52 # v4_4
id: cancel-failed
with:
token: ${{ secrets.GITHUB_TOKEN }}
@@ -112,7 +110,6 @@ jobs:
["^Pylint$", "^Static checks", "^Build docs$", "^Spell check docs$", "^Backport packages$",
"^Provider packages", "^Checks: Helm tests$", "^Test OpenAPI*"]
- name: "Extract canceled failed runs"

# We use this step to build regexp that will be used to match the Source Run id in
# the build-info job below. If we cancelled some "CI Build" runs in the "cancel-failed' step
# above - we want to cancel also the corresponding "Build Images" runs. Again we have
@@ -134,15 +131,15 @@ jobs:
# We take the extracted regexp array prepared in the previous step and we use
# it to cancel any jobs that have matching names containing Source Run Id:
# followed by one of the run ids. Yes I know it's super complex ¯\_(ツ)_/¯.
if: env.BUILD_IMAGES == 'true' && steps.source-run-info-failed.outputs.cancelledRuns != '[]'
uses: potiuk/cancel-workflow-runs@99869d37d982384d18c79539b67df94f17557cbe # v4_1
if: env.BUILD_IMAGES == 'true' && steps.cancel-failed.outputs.cancelledRuns != '[]'
uses: potiuk/cancel-workflow-runs@f06d03cd576a179ea5169d048dbd8c8d73757b52 # v4_4
with:
cancelMode: namedJobs
token: ${{ secrets.GITHUB_TOKEN }}
notifyPRCancel: true
jobNameRegexps: ${{ steps.extract-cancelled-failed-runs.outputs.matching-regexp }}
- name: "Cancel duplicated 'CodeQL' runs"
uses: potiuk/cancel-workflow-runs@99869d37d982384d18c79539b67df94f17557cbe # v4_1
uses: potiuk/cancel-workflow-runs@f06d03cd576a179ea5169d048dbd8c8d73757b52 # v4_4
id: cancel
with:
token: ${{ secrets.GITHUB_TOKEN }}
@@ -165,6 +162,19 @@ jobs:
else
echo "::set-output name=upgradeToLatestConstraints::false"
fi
- name: "Cancel all duplicated 'Build Image' runs"
# We find duplicates of all "Build Image" runs - due to a missing feature
# in GitHub Actions, we have to use Job names to match Event/Repo/Branch matching
# trick ¯\_(ツ)_/¯. We name the build-info job appropriately and then we try to match
# all the jobs with the same Event + Repo + Branch match and cancel all the duplicates for those
# This might cancel own run, so this is the last step in the job
uses: potiuk/cancel-workflow-runs@f06d03cd576a179ea5169d048dbd8c8d73757b52 # v4_4
with:
cancelMode: allDuplicatedNamedJobs
token: ${{ secrets.GITHUB_TOKEN }}
notifyPRCancel: true
selfPreservation: false
jobNameRegexps: '["Event: \\S* Repo: \\S* Branch: \\S* "]'

build-info:
# The name is such long because we are using it to cancel duplicated 'Build Images' runs
@@ -363,7 +373,7 @@ jobs:
needs: [build-images]
steps:
- name: "Canceling the 'CI Build' source workflow in case of failure!"
uses: potiuk/cancel-workflow-runs@99869d37d982384d18c79539b67df94f17557cbe # v4_1
uses: potiuk/cancel-workflow-runs@f06d03cd576a179ea5169d048dbd8c8d73757b52 # v4_4
with:
token: ${{ secrets.GITHUB_TOKEN }}
cancelMode: self
@@ -378,7 +388,7 @@ jobs:
needs: [build-images]
steps:
- name: "Canceling the 'CI Build' source workflow in case of failure!"
uses: potiuk/cancel-workflow-runs@99869d37d982384d18c79539b67df94f17557cbe # v4_1
uses: potiuk/cancel-workflow-runs@f06d03cd576a179ea5169d048dbd8c8d73757b52 # v4_4
with:
token: ${{ secrets.GITHUB_TOKEN }}
cancelMode: self
2 changes: 1 addition & 1 deletion .github/workflows/ci.yml
@@ -107,7 +107,7 @@ jobs:
uses: actions/checkout@v2
- name: >
Event: ${{ github.event_name }}
Repo: ${{ github.repository }}
Repo: ${{ steps.source-run-info.outputs.sourceHeadRepo }}
Branch: ${{ github.head_ref }}
Run id: ${{ github.run_id }}
Sha: ${{ github.sha }}