Skip to content

Commit

Permalink
Improve caching for multi-platform images. (apache#23562)
Browse files Browse the repository at this point in the history
This is another attempt to improve caching performance for
multi-platform images as the previous ones were undermined by a
bug in buildx multiplatform cache-to implementattion that caused
the image cache to be overwritten between platforms,
when multiple images were build.

The bug is created for the buildx behaviour at
docker/buildx#1044 and until it is fixed
we have to prpare separate caches for each platform and push them
to separate tags.

That adds a bit overhead on the building step, but for now it is
the simplest way we can workaround the bug if we do not want to
manually manipulate manifests and images.
  • Loading branch information
potiuk authored May 9, 2022
1 parent 46c1c00 commit 9a6baab
Show file tree
Hide file tree
Showing 46 changed files with 1,128 additions and 1,066 deletions.
3 changes: 1 addition & 2 deletions .github/workflows/build-images.yml
Original file line number Diff line number Diff line change
Expand Up @@ -39,7 +39,6 @@ env:
secrets.CONSTRAINTS_GITHUB_REPOSITORY || 'apache/airflow' }}
# This token is WRITE one - pull_request_target type of events always have the WRITE token
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
LOGIN_TO_GITHUB_REGISTRY: "true"
IMAGE_TAG_FOR_THE_BUILD: "${{ github.event.pull_request.head.sha || github.sha }}"

concurrency:
Expand Down Expand Up @@ -135,7 +134,7 @@ jobs:
if [[ "${{ github.event_name }}" == 'schedule' ]]; then
echo "::set-output name=cacheDirective::disabled"
else
echo "::set-output name=cacheDirective::pulled"
echo "::set-output name=cacheDirective:registry"
fi
if [[ "$SELECTIVE_CHECKS_IMAGE_BUILD" == "true" ]]; then
Expand Down
10 changes: 5 additions & 5 deletions .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -45,7 +45,6 @@ env:
secrets.CONSTRAINTS_GITHUB_REPOSITORY || 'apache/airflow' }}
# In builds from forks, this token is read-only. For scheduler/direct push it is WRITE one
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
LOGIN_TO_GITHUB_REGISTRY: "true"
ENABLE_TEST_COVERAGE: "${{ github.event_name == 'push' }}"
IMAGE_TAG_FOR_THE_BUILD: "${{ github.event.pull_request.head.sha || github.sha }}"

Expand Down Expand Up @@ -265,7 +264,7 @@ jobs:
if [[ "${{ github.event_name }}" == 'schedule' ]]; then
echo "::set-output name=cacheDirective::disabled"
else
echo "::set-output name=cacheDirective::pulled"
echo "::set-output name=cacheDirective::registry"
fi
if [[ "$SELECTIVE_CHECKS_IMAGE_BUILD" == "true" ]]; then
Expand Down Expand Up @@ -1608,9 +1607,10 @@ ${{ hashFiles('.pre-commit-config.yaml') }}"
IMAGE_TAG: ${{ env.IMAGE_TAG_FOR_THE_BUILD }}
- name: "Generate constraints"
run: |
breeze generate-constraints --run-in-parallel --generate-constraints-mode source-providers
breeze generate-constraints --run-in-parallel --generate-constraints-mode pypi-providers
breeze generate-constraints --run-in-parallel --generate-constraints-mode no-providers
breeze generate-constraints --run-in-parallel \
--airflow-constraints-mode constraints-source-providers
breeze generate-constraints --run-in-parallel --airflow-constraints-mode constraints-no-providers
breeze generate-constraints --run-in-parallel --airflow-constraints-mode constraints
env:
PYTHON_VERSIONS: ${{ needs.build-info.outputs.pythonVersionsListAsString }}
- name: "Set constraints branch name"
Expand Down
9 changes: 4 additions & 5 deletions BREEZE.rst
Original file line number Diff line number Diff line change
Expand Up @@ -1128,23 +1128,22 @@ all or selected python version and single constraint mode like this:

.. code-block:: bash
breeze generate-constraints --generate-constraints-mode pypi-providers
breeze generate-constraints --airflow-constraints-mode constraints
Constraints are generated separately for each python version and there are separate constraints modes:

* 'constraints' - those are constraints generated by matching the current airflow version from sources
and providers that are installed from PyPI. Those are constraints used by the users who want to
install airflow with pip. Use ``pypi-providers`` mode for that.
install airflow with pip.

* "constraints-source-providers" - those are constraints generated by using providers installed from
current sources. While adding new providers their dependencies might change, so this set of providers
is the current set of the constraints for airflow and providers from the current main sources.
Those providers are used by CI system to keep "stable" set of constraints. Use
``source-providers`` mode for that.
Those providers are used by CI system to keep "stable" set of constraints.

* "constraints-no-providers" - those are constraints generated from only Apache Airflow, without any
providers. If you want to manage airflow separately and then add providers individually, you can
use those. Use ``no-providers`` mode for that.
use those.

Those are all available flags of ``generate-constraints`` command:

Expand Down
11 changes: 6 additions & 5 deletions CONTRIBUTING.rst
Original file line number Diff line number Diff line change
Expand Up @@ -882,18 +882,19 @@ There are several sets of constraints we keep:
providers. If you want to manage airflow separately and then add providers individually, you can
use those. Those constraints are named ``constraints-no-providers-<PYTHON_MAJOR_MINOR_VERSION>.txt``.

We also have constraints with "source-providers" but they are used i

The first two can be used as constraints file when installing Apache Airflow in a repeatable way.
It can be done from the sources:

from the PyPI package:

.. code-block:: bash
pip install apache-airflow==2.2.5 \
pip install apache-airflow[google,amazon,async]==2.2.5 \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.2.5/constraints-3.7.txt"
The last one can be used to install Airflow in "minimal" mode - i.e when bare Airflow is installed without
extras.

When you install airflow from sources (in editable mode) you should use "constraints-source-providers"
instead (this accounts for the case when some providers have not yet been released and have conflicting
requirements).
Expand All @@ -909,14 +910,14 @@ This works also with extras - for example:
.. code-block:: bash
pip install ".[ssh]" \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints--source-providers-3.7.txt"
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-source-providers-3.7.txt"
There are different set of fixed constraint files for different python major/minor versions and you should
use the right file for the right python version.

If you want to update just airflow dependencies, without paying attention to providers, you can do it using
-no-providers constraint files as well.
``constraints-no-providers`` constraint files as well.

.. code-block:: bash
Expand Down
62 changes: 25 additions & 37 deletions Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -315,12 +315,14 @@ function install_airflow_dependencies_from_branch_tip() {
fi
# Install latest set of dependencies using constraints. In case constraints were upgraded and there
# are conflicts, this might fail, but it should be fixed in the following installation steps
set -x
pip install \
"https://github.com/${AIRFLOW_REPO}/archive/${AIRFLOW_BRANCH}.tar.gz#egg=apache-airflow[${AIRFLOW_EXTRAS}]" \
--constraint "${AIRFLOW_CONSTRAINTS_LOCATION}" || true
# make sure correct PIP version is used
pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
pip freeze | grep apache-airflow-providers | xargs pip uninstall --yes 2>/dev/null || true
set +x
echo
echo "${COLOR_BLUE}Uninstalling just airflow. Dependencies remain. Now target airflow can be reinstalled using mostly cached dependencies${COLOR_RESET}"
echo
Expand Down Expand Up @@ -384,7 +386,7 @@ function common::get_constraints_location() {
local constraints_base="https://raw.githubusercontent.com/${CONSTRAINTS_GITHUB_REPOSITORY}/${AIRFLOW_CONSTRAINTS_REFERENCE}"
local python_version
python_version="$(python --version 2>/dev/stdout | cut -d " " -f 2 | cut -d "." -f 1-2)"
AIRFLOW_CONSTRAINTS_LOCATION="${constraints_base}/${AIRFLOW_CONSTRAINTS}-${python_version}.txt"
AIRFLOW_CONSTRAINTS_LOCATION="${constraints_base}/${AIRFLOW_CONSTRAINTS_MODE}-${python_version}.txt"
fi
}

Expand Down Expand Up @@ -563,40 +565,19 @@ function install_airflow_and_providers_from_docker_context_files(){
return
fi

if [[ "${UPGRADE_TO_NEWER_DEPENDENCIES}" != "false" ]]; then
echo
echo "${COLOR_BLUE}Force re-installing airflow and providers from local files with eager upgrade${COLOR_RESET}"
echo
# force reinstall all airflow + provider package local files with eager upgrade
pip install "${pip_flags[@]}" --upgrade --upgrade-strategy eager \
${reinstalling_apache_airflow_package} ${reinstalling_apache_airflow_providers_packages} \
${EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS}
else
echo
echo "${COLOR_BLUE}Force re-installing airflow and providers from local files with constraints and upgrade if needed${COLOR_RESET}"
echo
if [[ ${AIRFLOW_CONSTRAINTS_LOCATION} == "/"* ]]; then
grep -ve '^apache-airflow' <"${AIRFLOW_CONSTRAINTS_LOCATION}" > /tmp/constraints.txt
else
# Remove provider packages from constraint files because they are locally prepared
curl -L "${AIRFLOW_CONSTRAINTS_LOCATION}" | grep -ve '^apache-airflow' > /tmp/constraints.txt
fi
# force reinstall airflow + provider package local files with constraints + upgrade if needed
pip install "${pip_flags[@]}" --force-reinstall \
${reinstalling_apache_airflow_package} ${reinstalling_apache_airflow_providers_packages} \
--constraint /tmp/constraints.txt
rm /tmp/constraints.txt
# make sure correct PIP version is used \
pip install "pip==${AIRFLOW_PIP_VERSION}"
# then upgrade if needed without using constraints to account for new limits in setup.py
pip install --upgrade --upgrade-strategy only-if-needed \
${reinstalling_apache_airflow_package} ${reinstalling_apache_airflow_providers_packages}
fi
echo
echo "${COLOR_BLUE}Force re-installing airflow and providers from local files with eager upgrade${COLOR_RESET}"
echo
# force reinstall all airflow + provider package local files with eager upgrade
set -x
pip install "${pip_flags[@]}" --upgrade --upgrade-strategy eager \
${reinstalling_apache_airflow_package} ${reinstalling_apache_airflow_providers_packages} \
${EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS}
set +x

# make sure correct PIP version is left installed
pip install "pip==${AIRFLOW_PIP_VERSION}"
pip check

}

function install_all_other_packages_from_docker_context_files() {
Expand All @@ -608,10 +589,12 @@ function install_all_other_packages_from_docker_context_files() {
# shellcheck disable=SC2010
reinstalling_other_packages=$(ls /docker-context-files/*.{whl,tar.gz} 2>/dev/null | \
grep -v apache_airflow | grep -v apache-airflow || true)
if [[ -n "${reinstalling_other_packages}" ]]; then \
if [[ -n "${reinstalling_other_packages}" ]]; then
set -x
pip install --force-reinstall --no-deps --no-index ${reinstalling_other_packages}
# make sure correct PIP version is used
pip install "pip==${AIRFLOW_PIP_VERSION}"
set -x
fi
}

Expand Down Expand Up @@ -664,9 +647,11 @@ function install_airflow() {
if [[ -n "${AIRFLOW_INSTALL_EDITABLE_FLAG}" ]]; then
# Remove airflow and reinstall it using editable flag
# We can only do it when we install airflow from sources
set -x
pip uninstall apache-airflow --yes
pip install ${AIRFLOW_INSTALL_EDITABLE_FLAG} \
"${AIRFLOW_INSTALLATION_METHOD}[${AIRFLOW_EXTRAS}]${AIRFLOW_VERSION_SPECIFICATION}"
set +x
fi

# make sure correct PIP version is used
Expand All @@ -679,6 +664,7 @@ function install_airflow() {
echo
echo "${COLOR_BLUE}Installing all packages with constraints and upgrade if needed${COLOR_RESET}"
echo
set -x
pip install ${AIRFLOW_INSTALL_EDITABLE_FLAG} \
"${AIRFLOW_INSTALLATION_METHOD}[${AIRFLOW_EXTRAS}]${AIRFLOW_VERSION_SPECIFICATION}" \
--constraint "${AIRFLOW_CONSTRAINTS_LOCATION}"
Expand All @@ -690,6 +676,7 @@ function install_airflow() {
"${AIRFLOW_INSTALLATION_METHOD}[${AIRFLOW_EXTRAS}]${AIRFLOW_VERSION_SPECIFICATION}"
# make sure correct PIP version is used
pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
set +x
echo
echo "${COLOR_BLUE}Running 'pip check'${COLOR_RESET}"
echo
Expand Down Expand Up @@ -718,18 +705,17 @@ set -euo pipefail

. "$( dirname "${BASH_SOURCE[0]}" )/common.sh"


set -x

function install_additional_dependencies() {
if [[ "${UPGRADE_TO_NEWER_DEPENDENCIES}" != "false" ]]; then
echo
echo "${COLOR_BLUE}Installing additional dependencies while upgrading to newer dependencies${COLOR_RESET}"
echo
set -x
pip install --upgrade --upgrade-strategy eager \
${ADDITIONAL_PYTHON_DEPS} ${EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS}
# make sure correct PIP version is used
pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
set +x
echo
echo "${COLOR_BLUE}Running 'pip check'${COLOR_RESET}"
echo
Expand All @@ -738,10 +724,12 @@ function install_additional_dependencies() {
echo
echo "${COLOR_BLUE}Installing additional dependencies upgrading only if needed${COLOR_RESET}"
echo
set -x
pip install --upgrade --upgrade-strategy only-if-needed \
${ADDITIONAL_PYTHON_DEPS}
# make sure correct PIP version is used
pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
set +x
echo
echo "${COLOR_BLUE}Running 'pip check'${COLOR_RESET}"
echo
Expand Down Expand Up @@ -1185,7 +1173,7 @@ ARG AIRFLOW_EXTRAS
ARG ADDITIONAL_AIRFLOW_EXTRAS=""
# Allows to override constraints source
ARG CONSTRAINTS_GITHUB_REPOSITORY="apache/airflow"
ARG AIRFLOW_CONSTRAINTS="constraints"
ARG AIRFLOW_CONSTRAINTS_MODE="constraints"
ARG AIRFLOW_CONSTRAINTS_REFERENCE=""
ARG AIRFLOW_CONSTRAINTS_LOCATION=""
ARG DEFAULT_CONSTRAINTS_BRANCH="constraints-main"
Expand Down Expand Up @@ -1275,7 +1263,7 @@ ENV AIRFLOW_PIP_VERSION=${AIRFLOW_PIP_VERSION} \
AIRFLOW_BRANCH=${AIRFLOW_BRANCH} \
AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS}${ADDITIONAL_AIRFLOW_EXTRAS:+,}${ADDITIONAL_AIRFLOW_EXTRAS} \
CONSTRAINTS_GITHUB_REPOSITORY=${CONSTRAINTS_GITHUB_REPOSITORY} \
AIRFLOW_CONSTRAINTS=${AIRFLOW_CONSTRAINTS} \
AIRFLOW_CONSTRAINTS_MODE=${AIRFLOW_CONSTRAINTS_MODE} \
AIRFLOW_CONSTRAINTS_REFERENCE=${AIRFLOW_CONSTRAINTS_REFERENCE} \
AIRFLOW_CONSTRAINTS_LOCATION=${AIRFLOW_CONSTRAINTS_LOCATION} \
DEFAULT_CONSTRAINTS_BRANCH=${DEFAULT_CONSTRAINTS_BRANCH} \
Expand Down
19 changes: 13 additions & 6 deletions Dockerfile.ci
Original file line number Diff line number Diff line change
Expand Up @@ -275,12 +275,14 @@ function install_airflow_dependencies_from_branch_tip() {
fi
# Install latest set of dependencies using constraints. In case constraints were upgraded and there
# are conflicts, this might fail, but it should be fixed in the following installation steps
set -x
pip install \
"https://github.com/${AIRFLOW_REPO}/archive/${AIRFLOW_BRANCH}.tar.gz#egg=apache-airflow[${AIRFLOW_EXTRAS}]" \
--constraint "${AIRFLOW_CONSTRAINTS_LOCATION}" || true
# make sure correct PIP version is used
pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
pip freeze | grep apache-airflow-providers | xargs pip uninstall --yes 2>/dev/null || true
set +x
echo
echo "${COLOR_BLUE}Uninstalling just airflow. Dependencies remain. Now target airflow can be reinstalled using mostly cached dependencies${COLOR_RESET}"
echo
Expand Down Expand Up @@ -344,7 +346,7 @@ function common::get_constraints_location() {
local constraints_base="https://raw.githubusercontent.com/${CONSTRAINTS_GITHUB_REPOSITORY}/${AIRFLOW_CONSTRAINTS_REFERENCE}"
local python_version
python_version="$(python --version 2>/dev/stdout | cut -d " " -f 2 | cut -d "." -f 1-2)"
AIRFLOW_CONSTRAINTS_LOCATION="${constraints_base}/${AIRFLOW_CONSTRAINTS}-${python_version}.txt"
AIRFLOW_CONSTRAINTS_LOCATION="${constraints_base}/${AIRFLOW_CONSTRAINTS_MODE}-${python_version}.txt"
fi
}

Expand Down Expand Up @@ -514,9 +516,11 @@ function install_airflow() {
if [[ -n "${AIRFLOW_INSTALL_EDITABLE_FLAG}" ]]; then
# Remove airflow and reinstall it using editable flag
# We can only do it when we install airflow from sources
set -x
pip uninstall apache-airflow --yes
pip install ${AIRFLOW_INSTALL_EDITABLE_FLAG} \
"${AIRFLOW_INSTALLATION_METHOD}[${AIRFLOW_EXTRAS}]${AIRFLOW_VERSION_SPECIFICATION}"
set +x
fi

# make sure correct PIP version is used
Expand All @@ -529,6 +533,7 @@ function install_airflow() {
echo
echo "${COLOR_BLUE}Installing all packages with constraints and upgrade if needed${COLOR_RESET}"
echo
set -x
pip install ${AIRFLOW_INSTALL_EDITABLE_FLAG} \
"${AIRFLOW_INSTALLATION_METHOD}[${AIRFLOW_EXTRAS}]${AIRFLOW_VERSION_SPECIFICATION}" \
--constraint "${AIRFLOW_CONSTRAINTS_LOCATION}"
Expand All @@ -540,6 +545,7 @@ function install_airflow() {
"${AIRFLOW_INSTALLATION_METHOD}[${AIRFLOW_EXTRAS}]${AIRFLOW_VERSION_SPECIFICATION}"
# make sure correct PIP version is used
pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
set +x
echo
echo "${COLOR_BLUE}Running 'pip check'${COLOR_RESET}"
echo
Expand Down Expand Up @@ -568,18 +574,17 @@ set -euo pipefail

. "$( dirname "${BASH_SOURCE[0]}" )/common.sh"


set -x

function install_additional_dependencies() {
if [[ "${UPGRADE_TO_NEWER_DEPENDENCIES}" != "false" ]]; then
echo
echo "${COLOR_BLUE}Installing additional dependencies while upgrading to newer dependencies${COLOR_RESET}"
echo
set -x
pip install --upgrade --upgrade-strategy eager \
${ADDITIONAL_PYTHON_DEPS} ${EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS}
# make sure correct PIP version is used
pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
set +x
echo
echo "${COLOR_BLUE}Running 'pip check'${COLOR_RESET}"
echo
Expand All @@ -588,10 +593,12 @@ function install_additional_dependencies() {
echo
echo "${COLOR_BLUE}Installing additional dependencies upgrading only if needed${COLOR_RESET}"
echo
set -x
pip install --upgrade --upgrade-strategy only-if-needed \
${ADDITIONAL_PYTHON_DEPS}
# make sure correct PIP version is used
pip install --disable-pip-version-check "pip==${AIRFLOW_PIP_VERSION}"
set +x
echo
echo "${COLOR_BLUE}Running 'pip check'${COLOR_RESET}"
echo
Expand Down Expand Up @@ -1178,7 +1185,7 @@ ARG AIRFLOW_EXTRAS="all"
ARG ADDITIONAL_AIRFLOW_EXTRAS=""
# Allows to override constraints source
ARG CONSTRAINTS_GITHUB_REPOSITORY="apache/airflow"
ARG AIRFLOW_CONSTRAINTS="constraints"
ARG AIRFLOW_CONSTRAINTS_MODE="constraints-source-providers"
ARG AIRFLOW_CONSTRAINTS_REFERENCE=""
ARG AIRFLOW_CONSTRAINTS_LOCATION=""
ARG DEFAULT_CONSTRAINTS_BRANCH="constraints-main"
Expand Down Expand Up @@ -1207,7 +1214,7 @@ ENV AIRFLOW_REPO=${AIRFLOW_REPO}\
AIRFLOW_BRANCH=${AIRFLOW_BRANCH} \
AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS}${ADDITIONAL_AIRFLOW_EXTRAS:+,}${ADDITIONAL_AIRFLOW_EXTRAS} \
CONSTRAINTS_GITHUB_REPOSITORY=${CONSTRAINTS_GITHUB_REPOSITORY} \
AIRFLOW_CONSTRAINTS=${AIRFLOW_CONSTRAINTS} \
AIRFLOW_CONSTRAINTS_MODE=${AIRFLOW_CONSTRAINTS_MODE} \
AIRFLOW_CONSTRAINTS_REFERENCE=${AIRFLOW_CONSTRAINTS_REFERENCE} \
AIRFLOW_CONSTRAINTS_LOCATION=${AIRFLOW_CONSTRAINTS_LOCATION} \
DEFAULT_CONSTRAINTS_BRANCH=${DEFAULT_CONSTRAINTS_BRANCH} \
Expand Down
Loading

0 comments on commit 9a6baab

Please sign in to comment.