Constraints and PIP packages can be installed from local sources (apache#11382)

* Constraints and PIP packages can be installed from local sources

This is the final part of implementing apache#11171, based on feedback
from enterprise customers we worked with. They want to be able to
build the image using locally available binary wheel packages
and the official Dockerfile. This means that, besides the official
APT sources, the Dockerfile build should not need GitHub or any other
external files pulled from outside, including the PIP repository.

This change also includes documentation on how to prepare a set of
such binaries, ready for inspection and review by security teams
in an enterprise environment. Such sets of "known-working-binary-whl"
files can then be separately committed, tracked and scrutinized
in the artifact repository of such an enterprise.

Fixes: apache#11171

* Update docs/production-deployment.rst
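
The workflow this commit enables can be sketched as two commands: one that collects wheels into `docker-context-files`, and one that builds the image from them. This is a minimal sketch under stated assumptions, not the project's documented procedure. The function names, the example package spec and the image tag are hypothetical, and the functions only print the commands so they can be reviewed before anything is run. `pip wheel --wheel-dir` and `--constraint` are standard pip options; the two build arguments are the ones named in this commit.

```shell
#!/usr/bin/env bash
# Sketch only: prints commands instead of running them.
# Hypothetical helper names: print_wheel_prep_cmd, print_local_build_cmd.

CONTEXT_DIR="docker-context-files"   # folder added to the Docker context by this commit

# Print a command that builds binary wheels for a package spec and its
# dependencies into the context folder, pinned by a constraints file.
print_wheel_prep_cmd() {
    local package_spec="${1}"   # assumed example: apache-airflow==1.10.12
    local constraints="${2}"    # URL or path of a constraints file
    echo "pip wheel --wheel-dir ${CONTEXT_DIR} --constraint ${constraints} ${package_spec}"
}

# Print a docker build command that installs only from the local wheels.
print_local_build_cmd() {
    local image_tag="${1}"      # hypothetical tag chosen by the caller
    echo "docker build . --build-arg AIRFLOW_LOCAL_PIP_WHEELS=true --build-arg AIRFLOW_PRE_CACHED_PIP_PACKAGES=false --tag ${image_tag}"
}
```

Printing rather than executing keeps the sketch reviewable, which matches the stated goal of letting security teams inspect the wheel set before it is used.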
potiuk authored Oct 10, 2020
1 parent 8640fb6 commit 0497390
Showing 11 changed files with 245 additions and 35 deletions.
4 changes: 4 additions & 0 deletions .dockerignore
@@ -60,6 +60,10 @@
!empty
!.pypirc

# This folder is for you if you want to add any files to the docker context when you build your own
# docker image. Most other files and any new folder you add will be excluded by default
!docker-context-files

# Avoid triggering context change on README change (new companies using Airflow)
# So please do not uncomment this line ;)
# !README.md
48 changes: 39 additions & 9 deletions BREEZE.rst
@@ -1239,9 +1239,6 @@ This is the current syntax for `./breeze <./breeze>`_:
--image-tag TAG
Additional tag in the image.
--disable-pip-cache
Disables GitHub PIP cache during the build. Useful if github is not reachable during build.
--additional-extras ADDITIONAL_EXTRAS
Additional extras to pass to build images The default is no additional extras.
@@ -1287,6 +1284,19 @@ This is the current syntax for `./breeze <./breeze>`_:
Disables installation of the mysql client which might be problematic if you are building
image in controlled environment. Only valid for production image.
--constraints-location
URL of the constraints file. For the production image it can also be a path to a
constraints file placed in the 'docker-context-files' folder, in which case it has to be
in the form '/docker-context-files/<NAME_OF_THE_FILE>'
--disable-pip-cache
Disables GitHub PIP cache during the build. Useful if github is not reachable during build.
--install-local-pip-wheels
This flag is only used in production image building. If it is used then instead of
installing Airflow from PyPI, the packages are installed from the .whl packages placed
in the 'docker-context-files' folder. It implies '--disable-pip-cache'
-C, --force-clean-images
Force build images with cache disabled. This will remove the pulled or build images
and start building images from scratch. This might take a long time.
@@ -1742,9 +1752,6 @@ This is the current syntax for `./breeze <./breeze>`_:
--image-tag TAG
Additional tag in the image.
--disable-pip-cache
Disables GitHub PIP cache during the build. Useful if github is not reachable during build.
--additional-extras ADDITIONAL_EXTRAS
Additional extras to pass to build images The default is no additional extras.
@@ -1790,6 +1797,19 @@ This is the current syntax for `./breeze <./breeze>`_:
Disables installation of the mysql client which might be problematic if you are building
image in controlled environment. Only valid for production image.
--constraints-location
URL of the constraints file. For the production image it can also be a path to a
constraints file placed in the 'docker-context-files' folder, in which case it has to be
in the form '/docker-context-files/<NAME_OF_THE_FILE>'
--disable-pip-cache
Disables GitHub PIP cache during the build. Useful if github is not reachable during build.
--install-local-pip-wheels
This flag is only used in production image building. If it is used then instead of
installing Airflow from PyPI, the packages are installed from the .whl packages placed
in the 'docker-context-files' folder. It implies '--disable-pip-cache'
-C, --force-clean-images
Force build images with cache disabled. This will remove the pulled or build images
and start building images from scratch. This might take a long time.
@@ -2195,9 +2215,6 @@ This is the current syntax for `./breeze <./breeze>`_:
--image-tag TAG
Additional tag in the image.
--disable-pip-cache
Disables GitHub PIP cache during the build. Useful if github is not reachable during build.
--additional-extras ADDITIONAL_EXTRAS
Additional extras to pass to build images The default is no additional extras.
@@ -2243,6 +2260,19 @@ This is the current syntax for `./breeze <./breeze>`_:
Disables installation of the mysql client which might be problematic if you are building
image in controlled environment. Only valid for production image.
--constraints-location
URL of the constraints file. For the production image it can also be a path to a
constraints file placed in the 'docker-context-files' folder, in which case it has to be
in the form '/docker-context-files/<NAME_OF_THE_FILE>'
--disable-pip-cache
Disables GitHub PIP cache during the build. Useful if github is not reachable during build.
--install-local-pip-wheels
This flag is only used in production image building. If it is used then instead of
installing Airflow from PyPI, the packages are installed from the .whl packages placed
in the 'docker-context-files' folder. It implies '--disable-pip-cache'
-C, --force-clean-images
Force build images with cache disabled. This will remove the pulled or build images
and start building images from scratch. This might take a long time.
28 changes: 19 additions & 9 deletions Dockerfile
@@ -140,6 +140,8 @@ ARG INSTALL_MYSQL_CLIENT="true"
ENV INSTALL_MYSQL_CLIENT=${INSTALL_MYSQL_CLIENT}

COPY scripts/docker scripts/docker
COPY docker-context-files /docker-context-files

RUN ./scripts/docker/install_mysql.sh dev

ARG AIRFLOW_REPO=apache/airflow
@@ -153,8 +155,8 @@ ARG ADDITIONAL_AIRFLOW_EXTRAS=""
ENV AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS}${ADDITIONAL_AIRFLOW_EXTRAS:+,}${ADDITIONAL_AIRFLOW_EXTRAS}

ARG AIRFLOW_CONSTRAINTS_REFERENCE="constraints-master"
ARG AIRFLOW_CONSTRAINTS_URL="https://raw.githubusercontent.com/apache/airflow/${AIRFLOW_CONSTRAINTS_REFERENCE}/constraints-${PYTHON_MAJOR_MINOR_VERSION}.txt"
ENV AIRFLOW_CONSTRAINTS_URL=${AIRFLOW_CONSTRAINTS_URL}
ARG AIRFLOW_CONSTRAINTS_LOCATION="https://raw.githubusercontent.com/apache/airflow/${AIRFLOW_CONSTRAINTS_REFERENCE}/constraints-${PYTHON_MAJOR_MINOR_VERSION}.txt"
ENV AIRFLOW_CONSTRAINTS_LOCATION=${AIRFLOW_CONSTRAINTS_LOCATION}

ENV PATH=${PATH}:/root/.local/bin
RUN mkdir -p /root/.local/bin
@@ -170,7 +172,7 @@ RUN if [[ ${AIRFLOW_PRE_CACHED_PIP_PACKAGES} == "true" ]]; then \
fi; \
pip install --user \
"https://github.com/${AIRFLOW_REPO}/archive/${AIRFLOW_BRANCH}.tar.gz#egg=apache-airflow[${AIRFLOW_EXTRAS}]" \
--constraint "${AIRFLOW_CONSTRAINTS_URL}" && pip uninstall --yes apache-airflow; \
--constraint "${AIRFLOW_CONSTRAINTS_LOCATION}" && pip uninstall --yes apache-airflow; \
fi

ARG AIRFLOW_SOURCES_FROM="."
@@ -196,6 +198,9 @@ ENV AIRFLOW_INSTALL_SOURCES=${AIRFLOW_INSTALL_SOURCES}
ARG AIRFLOW_INSTALL_VERSION=""
ENV AIRFLOW_INSTALL_VERSION=${AIRFLOW_INSTALL_VERSION}

ARG AIRFLOW_LOCAL_PIP_WHEELS=""
ENV AIRFLOW_LOCAL_PIP_WHEELS=${AIRFLOW_LOCAL_PIP_WHEELS}

ARG SLUGIFY_USES_TEXT_UNIDECODE=""
ENV SLUGIFY_USES_TEXT_UNIDECODE=${SLUGIFY_USES_TEXT_UNIDECODE}

@@ -205,12 +210,17 @@ WORKDIR /opt/airflow
RUN if [[ ${INSTALL_MYSQL_CLIENT} != "true" ]]; then \
AIRFLOW_EXTRAS=${AIRFLOW_EXTRAS/mysql,}; \
fi; \
pip install --user "${AIRFLOW_INSTALL_SOURCES}[${AIRFLOW_EXTRAS}]${AIRFLOW_INSTALL_VERSION}" \
--constraint "${AIRFLOW_CONSTRAINTS_URL}" && \
if [ -n "${ADDITIONAL_PYTHON_DEPS}" ]; then pip install --user ${ADDITIONAL_PYTHON_DEPS} \
--constraint "${AIRFLOW_CONSTRAINTS_URL}"; fi && \
find /root/.local/ -name '*.pyc' -print0 | xargs -0 rm -r && \
find /root/.local/ -type d -name '__pycache__' -print0 | xargs -0 rm -r
if [[ ${AIRFLOW_LOCAL_PIP_WHEELS} != "true" ]]; then \
pip install --user "${AIRFLOW_INSTALL_SOURCES}[${AIRFLOW_EXTRAS}]${AIRFLOW_INSTALL_VERSION}" \
--constraint "${AIRFLOW_CONSTRAINTS_LOCATION}"; \
if [ -n "${ADDITIONAL_PYTHON_DEPS}" ]; then \
pip install --user ${ADDITIONAL_PYTHON_DEPS} --constraint "${AIRFLOW_CONSTRAINTS_LOCATION}"; \
fi; \
else \
pip install --user /docker-context-files/*.whl; \
fi \
&& find /root/.local/ -name '*.pyc' -print0 | xargs -0 rm -r \
&& find /root/.local/ -type d -name '__pycache__' -print0 | xargs -0 rm -r

RUN AIRFLOW_SITE_PACKAGE="/root/.local/lib/python${PYTHON_MAJOR_MINOR_VERSION}/site-packages/airflow"; \
if [[ -f "${AIRFLOW_SITE_PACKAGE}/www_rbac/package.json" ]]; then \
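
The install step changed in the Dockerfile hunk above reduces to one decision: install from the configured sources with constraints, or install every wheel found in `/docker-context-files`. The following is a simplified dry-run sketch of that decision, not the Dockerfile's actual code. The function name `airflow_install_cmd` is hypothetical, extras and the version suffix are left out for brevity, and the command is printed rather than executed.

```shell
#!/usr/bin/env bash
# Dry-run sketch of the install branch; echoes the pip command instead of running it.
# airflow_install_cmd is a hypothetical helper, not part of the Airflow repository.

airflow_install_cmd() {
    local use_local_wheels="${1}"   # value of AIRFLOW_LOCAL_PIP_WHEELS
    local sources="${2}"            # value of AIRFLOW_INSTALL_SOURCES
    local constraints="${3}"        # value of AIRFLOW_CONSTRAINTS_LOCATION

    if [ "${use_local_wheels}" != "true" ]; then
        # Default path: install from the configured sources, pinned by constraints.
        echo "pip install --user ${sources} --constraint ${constraints}"
    else
        # Local path: install every wheel copied into the build stage.
        # The glob is printed literally because it is inside double quotes.
        echo "pip install --user /docker-context-files/*.whl"
    fi
}
```

As in the diff, the wheel branch passes no `--constraint` and does not install `ADDITIONAL_PYTHON_DEPS`, so the wheel set itself has to be complete and mutually compatible.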
25 changes: 22 additions & 3 deletions IMAGES.rst
@@ -22,8 +22,13 @@ Airflow docker images

Airflow has two images (built from Dockerfiles):

* Production image (Dockerfile) - that can be used to build your own production-ready Airflow installation
* CI image (Dockerfile.ci) - used for running tests and local development
* Production image (Dockerfile) - that can be used to build your own production-ready Airflow installation.
You can read more about building and using the production image in the
`Production Deployments <docs/production-deployment.rst>`_ document. The image is built using
`Dockerfile <Dockerfile>`_.

* CI image (Dockerfile.ci) - used for running tests and local development. The image is built using
`Dockerfile.ci <Dockerfile.ci>`_.

Image naming conventions
========================
@@ -332,7 +337,6 @@ based on example in `this comment <https://github.com/apache/airflow/issues/8605
--build-arg ADDITIONAL_RUNTIME_ENV_VARS="ACCEPT_EULA=Y" \
--tag my-image
CI image build arguments
........................

@@ -378,6 +382,21 @@ The following build arguments (``--build-arg`` in docker build command) can be u
| | | dependencies from the repository from |
| | | scratch |
+------------------------------------------+------------------------------------------+------------------------------------------+
| ``AIRFLOW_CONSTRAINTS_LOCATION`` | | If not empty, it will override the |
| | | source of the constraints with the |
| | | specified URL or file. Note that the |
| | | file has to be in docker context so |
| | | it's best to place such file in |
| | | one of the folders included in |
| | | dockerignore |
+------------------------------------------+------------------------------------------+------------------------------------------+
| ``AIRFLOW_LOCAL_PIP_WHEELS`` | ``false`` | If set to true, Airflow and its |
| | | dependencies are installed from locally |
| | | downloaded .whl files placed in the |
| | | ``docker-context-files``. Implies |
| | | ``AIRFLOW_PRE_CACHED_PIP_PACKAGES`` |
| | | to be false. |
+------------------------------------------+------------------------------------------+------------------------------------------+
| ``AIRFLOW_EXTRAS`` | ``all`` | extras to install |
+------------------------------------------+------------------------------------------+------------------------------------------+
| ``AIRFLOW_PRE_CACHED_PIP_PACKAGES`` | ``true`` | Allows to pre-cache airflow PIP packages |
40 changes: 30 additions & 10 deletions breeze
@@ -913,13 +913,6 @@ function breeze::parse_arguments() {
# if not set here, docker cached is determined later, depending on type of image to be build
shift
;;
-B | --disable-pip-cache)
echo "Disable PIP cache during build"
echo
export AIRFLOW_PRE_CACHED_PIP_PACKAGES="false"
shift
;;

-P | --force-pull-images)
echo "Force pulling images before build. Uses pulled images as cache."
echo
@@ -1007,6 +1000,23 @@ function breeze::parse_arguments() {
echo "Install MySQL client: ${INSTALL_MYSQL_CLIENT}"
shift
;;
--constraints-location)
export AIRFLOW_CONSTRAINTS_LOCATION="${2}"
echo "Constraints location: ${AIRFLOW_CONSTRAINTS_LOCATION}"
shift 2
;;
--disable-pip-cache)
echo "Disable PIP cache during build"
echo
export AIRFLOW_PRE_CACHED_PIP_PACKAGES="false"
shift
;;
--install-local-pip-wheels)
export AIRFLOW_LOCAL_PIP_WHEELS="true"
export AIRFLOW_PRE_CACHED_PIP_PACKAGES="false"
echo "Install from local wheels and disable pip cache"
shift
;;
--image-tag)
export IMAGE_TAG="${2}"
echo "Tag to add to the image: ${IMAGE_TAG}"
@@ -2291,9 +2301,6 @@ ${FORMATTED_DEFAULT_PROD_EXTRAS}
--image-tag TAG
Additional tag in the image.
--disable-pip-cache
Disables GitHub PIP cache during the build. Useful if github is not reachable during build.
--additional-extras ADDITIONAL_EXTRAS
Additional extras to pass to build images The default is no additional extras.
@@ -2339,6 +2346,19 @@ Build options:
Disables installation of the mysql client which might be problematic if you are building
image in controlled environment. Only valid for production image.
--constraints-location
URL of the constraints file. For the production image it can also be a path to a
constraints file placed in the 'docker-context-files' folder, in which case it has to be
in the form '/docker-context-files/<NAME_OF_THE_FILE>'
--disable-pip-cache
Disables GitHub PIP cache during the build. Useful if github is not reachable during build.
--install-local-pip-wheels
This flag is only used in production image building. If it is used then instead of
installing Airflow from PyPI, the packages are installed from the .whl packages placed
in the 'docker-context-files' folder. It implies '--disable-pip-cache'
-C, --force-clean-images
Force build images with cache disabled. This will remove the pulled or build images
and start building images from scratch. This might take a long time.
2 changes: 1 addition & 1 deletion breeze-complete
@@ -150,7 +150,7 @@ dockerhub-user: dockerhub-repo: github-registry github-repository: github-image-
postgres-version: mysql-version:
version-suffix-for-pypi: version-suffix-for-svn:
additional-extras: additional-python-deps: additional-dev-deps: additional-runtime-deps: image-tag:
disable-mysql-client-installation
disable-mysql-client-installation constraints-location: disable-pip-cache install-local-pip-wheels
additional-extras: additional-python-deps:
dev-apt-deps: additional-dev-apt-deps: dev-apt-command: additional-dev-apt-command: additional-dev-apt-env:
runtime-apt-deps: additional-runtime-apt-deps: runtime-apt-command: additional-runtime-apt-command: additional-runtime-apt-env:
31 changes: 31 additions & 0 deletions docker-context-files/README.md
@@ -0,0 +1,31 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

This folder is part of the Docker context.

Most other folders in Airflow are not part of the context, in order to keep the context small.

The production [Dockerfile](../Dockerfile) copies the [docker-context-files](.) folder to the "build"
stage of the production image (it is not used in the CI image), and the content of the folder is available
in the `/docker-context-files` folder inside the build image. You can store constraint files and wheel
packages there that you want to install as PyPI packages, and refer to them using the
`--constraints-location` flag for constraints or the `--install-local-pip-wheels` flag for wheels.

By default, the content of this folder is .gitignored so that any binaries and files you put here are only
used for local builds and not committed to the repository.
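
The Breeze help text in this commit says a constraints location is either a URL or a path of the form `/docker-context-files/<NAME_OF_THE_FILE>`. A small check like the one below could catch a mistyped value before a build starts. It is a hypothetical helper written for illustration, not something Breeze or the Dockerfile provides, and the category names it prints are made up for this sketch.

```shell
#!/usr/bin/env bash
# Hypothetical pre-build check; not part of breeze or the Dockerfile.
# Classifies a value intended for --constraints-location.

classify_constraints_location() {
    case "${1}" in
        http://*|https://*)
            echo "url" ;;              # remote constraints file
        /docker-context-files/?*)
            echo "context-file" ;;     # file inside the copied context folder
        *)
            echo "unsupported" ;;      # anything else, including an empty file name
    esac
}
```

For example, `classify_constraints_location /docker-context-files/constraints-3.7.txt` prints `context-file`, while a bare relative path such as `constraints.txt` prints `unsupported`, because a file outside the copied folder is not present in the build stage.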