Python base images are stored in cache (apache#8943)
All PRs will use the cached "latest good" version of the Python
base images from our GitHub registry. The Python images in
the GitHub registry are only updated after a master
build (which pulls the latest Python image from DockerHub) builds
and passes tests correctly.

This is to avoid problems that we had recently with Python
patchlevel releases breaking our Docker builds.
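
As a minimal sketch of the switch this commit introduces: the variable name `PULL_PYTHON_BASE_IMAGES_FROM_CACHE` and the `${VAR:="true"}` default idiom come from the diff, while the helper function `python_base_image_source` is hypothetical and exists only to illustrate the decision. It reports where the Python base image would be pulled from, without running Docker.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the commit): prints the source the
# Python base image would be pulled from, based on the toggle variable.
function python_base_image_source() {
    # ${VAR:="true"} assigns "true" only when VAR is unset or empty,
    # so PR builds use the cache unless the workflow overrides the value.
    if [[ ${PULL_PYTHON_BASE_IMAGES_FROM_CACHE:="true"} == "true" ]]; then
        echo "cache"
    else
        echo "dockerhub"
    fi
}
```

With the variable unset the function prints `cache`; with the variable exported as `"false"` it prints `dockerhub`, which corresponds to pulling directly from DockerHub.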
potiuk authored May 21, 2020
1 parent 97b6cc7 commit 41481bb
Showing 3 changed files with 47 additions and 9 deletions.
1 change: 1 addition & 0 deletions .github/workflows/ci.yml
Expand Up @@ -363,6 +363,7 @@ jobs:
matrix:
python-version: [3.6, 3.7]
env:
PULL_PYTHON_BASE_IMAGES_FROM_CACHE: "false"
PYTHON_MAJOR_MINOR_VERSION: ${{ matrix.python-version }}
CI_JOB_TYPE: "Prod image"
steps:
Expand Down
27 changes: 21 additions & 6 deletions CI.rst
Expand Up @@ -76,7 +76,9 @@ purpose and context.
(container registry, code repository). This is necessary as the code in those PRs (including CI job
definition) might be modified by people who are not committers for the Apache Airflow Code Repository.
The main purpose of those jobs is to check if PR builds cleanly, if the test run properly and if
the PR is ready to review and merge.
the PR is ready to review and merge. The runs are using cached images from the Private GitHub registry -
CI, Production Images as well as base Python images that are also cached in the Private GitHub registry.

* **Direct Push/Merge Run** - Those runs are results of direct pushes done by the committers or as result
of merge of a Pull Request by the committers. Those runs execute in the context of the Apache Airflow
Code Repository and have also write permission for GitHub resources (container registry, code repository).
Expand All @@ -85,7 +87,16 @@ purpose and context.
multiple PRs might cause build and test failures after merge even if they do not fail in isolation. Also
those runs are already reviewed and confirmed by the committers so they can be used to do some housekeeping
- for now they are pushing most recent image build in the PR to the Github Private Registry - which is our
image cache for all the builds.
image cache for all the builds. Another purpose of those runs is to refresh the latest Python base images.
Python base images are refreshed with varying frequency (usually once every few months, but sometimes
several times per week) with the latest security and bug fixes. Those patch-level image releases can
occasionally break Airflow builds (specifically Docker image builds based on those images), therefore
in PRs we always use the latest "good" Python image that we store in the private GitHub cache. The direct
push/master builds do not use the registry cache to pull the Python images - they pull
the images directly from DockerHub, so they try the latest images after they are released.
If those images are fine (the CI Docker image is built and the tests pass), those jobs push the base
images to the private GitHub Registry so that they can be used by subsequent PR runs.

* **Scheduled Run** - those runs are results of (nightly) triggered job - only for ``master`` branch. The
main purpose of the job is to check if there was no impact of external dependency changes on the Apache
Airflow code (for example transitive dependencies released that fail the build). It also checks if the
Expand Down Expand Up @@ -120,11 +131,15 @@ Those jobs often have matrix run strategy which runs several different variation
+---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+
| Quarantined tests | Those are tests that are flaky and we need to fix them | Yes (if pyfiles count >0) | Yes | Yes * |
+---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+
| Requirements | Checks if requirement constraints in the code are up-to-date | Yes (fails if missing requirement) | YesFails if missing requirement | Yes (Eager dependency upgradeDoes not fail for changed requirements) |
| Requirements | Checks if requirement constraints in the code are up-to-date | Yes (fails if missing requirement) | Yes (fails missing requirement) | Yes (Eager dependency upgrade - does not fail changed requirements) |
+---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+
| Pull python from cache | Pulls Python base images from Github Private Image registry to keep the last good python image used in PRs | Yes | No | - |
+---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+
| Push python to cache | Pushes Python base images to Github Private Image registry - checks if latest image is fine and pushes if so | No | Yes | - |
+---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+
| Push Prod image | Pushes production images to GitHub Private Image RegistryThis is to cache the build images for following runs. | - | Yes | - |
| Push Prod image | Pushes production images to GitHub Private Image Registry to cache the build images for following runs | - | Yes | - |
+---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+
| Push CI image | Pushes CI images to GitHub Private Image RegistryThis is to cache the build images for following runs. | - | Yes | - |
| Push CI image | Pushes CI images to GitHub Private Image Registry to cache the build images for following runs | - | Yes | - |
+---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+
| Tag Repo nightly | Tags the repository with nightly tagIt is a lightweight tag that moves nightly | - | - | Yes.Triggers DockerHub build for public registry |
| Tag Repo nightly | Tags the repository with nightly tagIt is a lightweight tag that moves nightly | - | - | Yes. Triggers DockerHub build for public registry |
+---------------------------+----------------------------------------------------------------------------------------------------------------+------------------------------------+---------------------------------+----------------------------------------------------------------------+
28 changes: 25 additions & 3 deletions scripts/ci/_utils.sh
Expand Up @@ -96,6 +96,9 @@ function initialize_common_environment {
# --push-images flag is specified
export PUSH_IMAGES=${PUSH_IMAGES:="false"}

# Whether base python images should be pulled from cache
export PULL_PYTHON_BASE_IMAGES_FROM_CACHE=${PULL_PYTHON_BASE_IMAGES_FROM_CACHE:="true"}

# Disable writing .pyc files - slightly slower imports but not messing around when switching
# Python version and avoids problems with root-owned .pyc files in host
export PYTHONDONTWRITEBYTECODE=${PYTHONDONTWRITEBYTECODE:="true"}
Expand Down Expand Up @@ -1212,7 +1215,11 @@ function pull_ci_image_if_needed() {
Docker pulling ${PYTHON_BASE_IMAGE}.
" > "${DETECTED_TERMINAL}"
fi
verbose_docker pull "${PYTHON_BASE_IMAGE}" | tee -a "${OUTPUT_LOG}"
if [[ ${PULL_PYTHON_BASE_IMAGES_FROM_CACHE:="true"} == "true" ]]; then
pull_image_possibly_from_cache "${PYTHON_BASE_IMAGE}" "${CACHED_PYTHON_BASE_IMAGE}"
else
verbose_docker pull "${PYTHON_BASE_IMAGE}" | tee -a "${OUTPUT_LOG}"
fi
echo
fi
pull_image_possibly_from_cache "${AIRFLOW_CI_IMAGE}" "${CACHED_AIRFLOW_CI_IMAGE}"
Expand All @@ -1230,7 +1237,11 @@ function pull_prod_images_if_needed() {
echo
echo "Force pull base image ${PYTHON_BASE_IMAGE}"
echo
verbose_docker pull "${PYTHON_BASE_IMAGE}" | tee -a "${OUTPUT_LOG}"
if [[ ${PULL_PYTHON_BASE_IMAGES_FROM_CACHE:="true"} == "true" ]]; then
pull_image_possibly_from_cache "${PYTHON_BASE_IMAGE}" "${CACHED_PYTHON_BASE_IMAGE}"
else
verbose_docker pull "${PYTHON_BASE_IMAGE}" | tee -a "${OUTPUT_LOG}"
fi
echo
fi
# "Build" segment of production image
Expand Down Expand Up @@ -1469,8 +1480,10 @@ function prepare_ci_build() {
--password-stdin \
"${CACHE_REGISTRY}"
export CACHED_AIRFLOW_CI_IMAGE="${CACHE_REGISTRY}/${CACHE_IMAGE_PREFIX}/${AIRFLOW_CI_BASE_TAG}"
export CACHED_PYTHON_BASE_IMAGE="${CACHE_REGISTRY}/${CACHE_IMAGE_PREFIX}/python:${PYTHON_MAJOR_MINOR_VERSION}-slim-buster"
else
export CACHED_AIRFLOW_CI_IMAGE=""
export CACHED_PYTHON_BASE_IMAGE=""
fi
export AIRFLOW_BUILD_CI_IMAGE="${DOCKERHUB_USER}/${DOCKERHUB_REPO}/${AIRFLOW_CI_BASE_TAG}"
export AIRFLOW_CI_IMAGE_DEFAULT="${DOCKERHUB_USER}/${DOCKERHUB_REPO}:${DEFAULT_BRANCH}-ci"
Expand Down Expand Up @@ -1552,9 +1565,11 @@ function prepare_prod_build() {
"${CACHE_REGISTRY}"
export CACHED_AIRFLOW_PROD_IMAGE="${CACHE_REGISTRY}/${CACHE_IMAGE_PREFIX}/${AIRFLOW_PROD_BASE_TAG}"
export CACHED_AIRFLOW_PROD_BUILD_IMAGE="${CACHE_REGISTRY}/${CACHE_IMAGE_PREFIX}/${AIRFLOW_PROD_BASE_TAG}-build"
export CACHED_PYTHON_BASE_IMAGE="${CACHE_REGISTRY}/${CACHE_IMAGE_PREFIX}/python:${PYTHON_MAJOR_MINOR_VERSION}-slim-buster"
else
export CACHED_AIRFLOW_PROD_IMAGE=""
export CACHED_AIRFLOW_PROD_BUILD_IMAGE=""
export CACHED_PYTHON_BASE_IMAGE=""
fi

if [[ "${INSTALL_AIRFLOW_REFERENCE:=}" != "" ]]; then
Expand All @@ -1581,7 +1596,7 @@ function prepare_prod_build() {
}

# Pushes Ci image and it's manifest to the registry. In case the image was taken from cache registry
# it is also pushed to the cache, not to the main registry. Manifest is only pushed to the main registry
# it is pushed to the cache, not to the main registry. Manifest is only pushed to the main registry
function push_ci_image() {
if [[ ${CACHED_AIRFLOW_CI_IMAGE:=} != "" ]]; then
verbose_docker tag "${AIRFLOW_CI_IMAGE}" "${CACHED_AIRFLOW_CI_IMAGE}"
Expand All @@ -1598,6 +1613,11 @@ function push_ci_image() {
verbose_docker push "${DEFAULT_IMAGE}"
fi
fi
if [[ ${CACHED_PYTHON_BASE_IMAGE} != "" ]]; then
verbose_docker tag "${PYTHON_BASE_IMAGE}" "${CACHED_PYTHON_BASE_IMAGE}"
verbose_docker push "${CACHED_PYTHON_BASE_IMAGE}"
fi

}

# Pushes PROD image to the registry. In case the image was taken from cache registry
Expand All @@ -1620,6 +1640,8 @@ function push_prod_images() {
if [[ -n ${DEFAULT_IMAGE:=""} && ${CACHED_AIRFLOW_PROD_IMAGE} == "" ]]; then
verbose_docker push "${DEFAULT_IMAGE}"
fi

# we do not need to push PYTHON base image here - they are already pushed in the CI push
}

# Docker command to generate constraint requirement files.
Expand Down
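
The tag-and-push step added to `push_ci_image` can be tried as a dry run. The sketch below is an illustration under stated assumptions, not the script's actual code path: the registry, prefix, and `PYTHON_BASE_IMAGE` values are placeholders, `verbose_docker` is replaced by a stand-in that only prints the command, and the wrapper function `push_python_base_image_to_cache` is hypothetical.

```shell
#!/usr/bin/env bash

# Placeholder values (assumptions, not taken from the commit).
CACHE_REGISTRY="registry.example.com"
CACHE_IMAGE_PREFIX="example-org/example-repo"
PYTHON_MAJOR_MINOR_VERSION="3.6"
PYTHON_BASE_IMAGE="python:${PYTHON_MAJOR_MINOR_VERSION}-slim-buster"

# Same naming pattern as in prepare_ci_build and prepare_prod_build.
CACHED_PYTHON_BASE_IMAGE="${CACHE_REGISTRY}/${CACHE_IMAGE_PREFIX}/python:${PYTHON_MAJOR_MINOR_VERSION}-slim-buster"

# Dry-run stand-in for the script's verbose_docker: prints instead of running docker.
function verbose_docker() {
    echo "docker $*"
}

# Hypothetical wrapper around the block added to push_ci_image.
function push_python_base_image_to_cache() {
    # An empty cached image name means no cache registry is configured: do nothing.
    if [[ ${CACHED_PYTHON_BASE_IMAGE:=} != "" ]]; then
        verbose_docker tag "${PYTHON_BASE_IMAGE}" "${CACHED_PYTHON_BASE_IMAGE}"
        verbose_docker push "${CACHED_PYTHON_BASE_IMAGE}"
    fi
}
```

Calling `push_python_base_image_to_cache` prints the `docker tag` and `docker push` commands that would be issued. The sketch uses `${CACHED_PYTHON_BASE_IMAGE:=}` so that it also works under `set -u` when the variable was never set.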
