Skip to content

Latest commit

 

History

History
136 lines (88 loc) · 3.96 KB

design-doc.org

File metadata and controls

136 lines (88 loc) · 3.96 KB

todo

  • [ ] check if merge incompatibility is stated in PR event, if so, don’t bother running the job
  • [ ] convert oauth-token to a secret volume
  • [ ] respond to pull_request_review events by changing review_status

notes

  • the pr system has two event sources: github and the batch system
  • batch system doesn’t have a list jobs, so let’s ignore recovering from jobs that have died
  • keep in mind that if I died previously, the batch system might inform me of the completion of a test run about which I did not know
  • actually, lets ignore state recovery completely for now even github
  • when master changes, users will not be able to see old test failures because status will be set to PENDING
  • for first pass, I ignore custom images for each PR, we just use a single image
  • docker image should have minimal gcloud permissions, it just needs to cp into a particular bucket (but the auth token I created seems to have lots of privileges, not sure why)

resolved notes

  • gotta figure out github authentication: resolved by adding a file called oauth-token which is present in the working directory of the ci server.
  • simplification: we assume everyone is merging into master, in reality I need to track target branch in prs and only cancel jobs, set statuses to pending, etc. when target branch changes: this was actually pretty easy to support

storing results

Jobs should copy build results into a google storage bucket with the commit hash.

No, what if a second job is executed? or retest is called?

one bucket per hash, whatever was there is blown away and replaced. no history for version 1

questions

if my job fails due to node loss, etc., will batch restart it? How many callbacks will I receive?

state

  • map from pr to currently executing test job + hash

sketch

when a test job completes,

  • check if it was PASS or FAIL and update github BUILD status accordingly
  • if it was PASS, and PR is APPROVED, set all other build statuses to pending, cancel all jobs, merge PR

when a PR gets a new commit (including if the PR is created),

  • cancel all jobs for that PR
  • start new job for that PR

when a PR is reviewed,

  • if the BUILD status is PASS, and PR is APPROVED, set all other build statuses to pending, cancel all jobs, merge PR

when master gets a new commit,

  • mark all BUILD statuses as pending
  • cancel all jobs
  • get list of prs
  • from approved PRs, pick one at random and kick off a new test job
  • if there are no approved PRs, pick one PR at random and kick off a new test job

when try-merge endpoint is hit,

  • if BUILD status is PASS, and PR is APPROVED, set all other build statuses to pending, cancel all jobs, merge PR

when retest endpoint is hit,

  • cancel all jobs for a PR
  • start new job for that PR

when status endpoint is hit,

  • return jsonified map from pr to currently executing test job

when ready-for-merge endpoint is hit,

  • return json array of PRs thought to be approved and passing

when review_status endpoint is hit,

  • return json array of PRs thought to be approved

when passing endpoint is hit,

  • return json array of PRs thought to be passing

running a job

execute image with command:

git clone SOURCE_REPO_URL \ –depth 1 \ –branch SOURCE_BRANCH && \ git checkout TIP_HASH && \ git remote add hi https://github.com/hail-is/hail.git && \ git fetch –depth 1 hi master && \ git merge master && \ source active hail && \ ./gradlew testAll –gradle-user-home /gradle-cache ; \ GRADLE_EXIT=$? ; \ gsutil cp \ build/reports/tests/test \ gs://hail-ci-0-1/SOURCE_BRANCH/TIP_HASH ; \ gsutil acl ch -r -u AllUsers:R gs://hail-ci-0-1/SOURCE_BRANCH/TIP_HASH ; \ exit $GRADLE_EXIT

NB: the checkout of TIP_HASH. each webhook is called for a particular “synchronize”, I want to make sure I’m building what I think I’m building

status should link to https://storage.googleapis.com/hail-ci-0-1/SOURCE_BRANCH/TIP_HASH