[ci] Clean up artifacts before/after jobs (ray-project#23463)
We sometimes end up with stale wheel uploads from previous runs on a Buildkite agent. As a result, commit wheels are overwritten by wheels from old build jobs, effectively breaking the wheel build logic.

Example:

This Agent: https://buildkite.com/organizations/ray-project/agents/4b955117-2f6c-4849-b703-3457daf69f89

- builds wheels (in post-wheels tests) for a35ebc9
- and then runs both the Ray CPP worker and the Train + Tune tests in 6746e9f
- Usually these two tests shouldn't produce artifacts at all, but they do, and the artifacts are the wheels from a35ebc9. In other words, they are uncleaned leftovers from the first build task.
- See here for proof of artifact upload: https://buildkite.com/ray-project/ray-builders-pr/builds/27622#d11bc514-ebd8-4e0c-a2ce-826b9bad27de

The solution is to always clean up the artifacts directory on the worker, i.e. `rm -rf /artifact-mount/*`.

This PR adds two such cleanup steps: one before commands are run and one after artifacts are uploaded. Either one would probably suffice, but it doesn't hurt to have both.
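
As a rough illustration of why the hooks in this commit use the pattern `{,.[!.],..?}*` rather than a plain `*`: the wheels live in a hidden `.whl` directory, which a plain `*` glob does not match under bash's default settings. The sketch below compares both patterns in a throwaway temp directory. The file names (`ray.whl`, `logs/out.txt`, `..odd`) and the helper functions are made up for the demonstration and are not part of the commit.

```shell
#!/bin/bash
# Sketch only: runs against a scratch directory, not the real /tmp/artifacts.
set -euo pipefail

scratch="$(mktemp -d)"
trap 'rm -rf "$scratch"' EXIT

populate() {
  # Hypothetical contents: one hidden dir, one regular dir, one "..name" entry.
  mkdir -p "$scratch/.whl" "$scratch/logs"
  touch "$scratch/.whl/ray.whl" "$scratch/logs/out.txt" "$scratch/..odd"
}

count_entries() {
  # Count everything left below the scratch directory (excluding itself).
  find "$scratch" -mindepth 1 | wc -l | tr -d ' '
}

populate
rm -rf "$scratch"/* || true                 # plain glob: skips dot entries
left_after_star="$(count_entries)"

populate
rm -rf "$scratch"/{,.[!.],..?}* || true     # pattern used by the hooks
left_after_hook_pattern="$(count_entries)"

echo "left after '*': $left_after_star"
echo "left after hook pattern: $left_after_hook_pattern"
```

The brace expression expands to three globs (`*`, `.[!.]*`, `..?*`), which together cover regular and hidden entries while never matching `.` or `..`.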
krfricke authored Mar 25, 2022
1 parent f5e492e commit 940c028
Showing 3 changed files with 46 additions and 1 deletion.
22 changes: 22 additions & 0 deletions .buildkite/hooks/post-artifact
@@ -0,0 +1,22 @@
#!/bin/bash
# This script is executed by Buildkite on the host machine.
# In contrast, our build jobs are run in Docker containers.
# This means that even though our build jobs write to
# `/artifact-mount`, the directory on the host machine is
# actually `/tmp/artifacts`.
# We clean up the artifacts directory before any command and
# after uploading artifacts to make sure no stale artifacts
# remain on the node when a Buildkite runner is re-used.
set -ex
if [ -d "/tmp/artifacts" ]; then
  echo "Cleaning up artifacts after upload."
  echo "Artifact directory contents before cleanup:"
  find /tmp/artifacts -print || true

  # Need sudo because artifacts were created by root
  # within the docker container
  rm -rf /tmp/artifacts/{,.[!.],..?}* || true

  echo "Artifact directory contents after cleanup:"
  find /tmp/artifacts -print || true
fi
22 changes: 22 additions & 0 deletions .buildkite/hooks/pre-command
@@ -0,0 +1,22 @@
#!/bin/bash
# This script is executed by Buildkite on the host machine.
# In contrast, our build jobs are run in Docker containers.
# This means that even though our build jobs write to
# `/artifact-mount`, the directory on the host machine is
# actually `/tmp/artifacts`.
# We clean up the artifacts directory before any command and
# after uploading artifacts to make sure no stale artifacts
# remain on the node when a Buildkite runner is re-used.
set -ex
if [ -d "/tmp/artifacts" ]; then
  echo "Cleaning up old artifacts before command."
  echo "Artifact directory contents before cleanup:"
  find /tmp/artifacts -print || true

  # Need sudo because artifacts were created by root
  # within the docker container
  rm -rf /tmp/artifacts/{,.[!.],..?}* || true

  echo "Artifact directory contents after cleanup:"
  find /tmp/artifacts -print || true
fi
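
The two hooks differ only in their log message. A minimal sketch of the shared logic, assuming it were factored into one function, might look like the following. `clean_artifact_dir`, its arguments, and the demo paths are hypothetical names for illustration; the commit itself inlines the logic in each hook and hard-codes `/tmp/artifacts`.

```shell
#!/bin/bash
# Hypothetical helper; not part of the commit.
set -euo pipefail

clean_artifact_dir() {
  local dir="$1" label="$2"
  if [ -d "$dir" ]; then
    echo "Cleaning up artifacts ${label}."
    # Remove regular and hidden entries but keep the directory itself.
    rm -rf "$dir"/{,.[!.],..?}* || true
  fi
}

# Demo on a throwaway directory with a made-up wheel file.
demo_dir="$(mktemp -d)"
trap 'rm -rf "$demo_dir"' EXIT
mkdir -p "$demo_dir/.whl"
touch "$demo_dir/.whl/example.whl"

clean_artifact_dir "$demo_dir" "before command"
clean_artifact_dir "$demo_dir/does-not-exist" "after upload"  # no-op: guard fails

remaining="$(find "$demo_dir" -mindepth 1 | wc -l | tr -d ' ')"
echo "entries remaining: $remaining"
```

Only the contents are deleted, so the directory itself stays in place for the next job, and a missing directory is silently skipped by the `-d` guard.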
3 changes: 2 additions & 1 deletion ci/travis/ci.sh
@@ -417,8 +417,9 @@ build_wheels() {
   # Sync the directory to buildkite artifacts
   rm -rf /artifact-mount/.whl || true
   cp -r .whl /artifact-mount/.whl
+  chmod -R 777 /artifact-mount/.whl
 
-  validate_wheels_commit_str
+  validate_wheels_commit_str
   fi
   ;;
   darwin*)
