[docker] Use ccache and Gradle cache across builds
This patch uses BuildKit's `--mount=type=cache` syntax to allow
the build's ccache and Gradle cache to be reused across Docker builds.

This speeds up the Docker build when re-building the thirdparty,
C++, and Java modules and mirrors the best practices for
non-Docker based builds.

See the BuildKit documentation for details:
https://github.com/moby/buildkit/blob/master/frontend/dockerfile/docs/experimental.md#run---mounttypecache

The timings below compare the runtime of subsequent builds of the
`thirdparty` and `kudu` targets before and after this change:

`thirdparty` target:
- Before: 1:18:13
- After:    0:17:47

`kudu` target:
- Before: 0:19:10
- After:    0:04:08
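
As a rough check on the figures above, the timings can be converted to seconds and compared. The sketch below is illustrative only; the helper names `to_seconds` and `speedup` are assumptions and not part of this patch. It suggests rebuilds of both targets are roughly 4.4x and 4.6x faster:

```python
def to_seconds(hms: str) -> int:
    """Convert an H:MM:SS string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def speedup(before: str, after: str) -> float:
    """Return how many times faster the 'after' build is than the 'before' build."""
    return to_seconds(before) / to_seconds(after)


# Timings copied from the commit message (before, after).
timings = {
    "thirdparty": ("1:18:13", "0:17:47"),
    "kudu": ("0:19:10", "0:04:08"),
}

for target, (before, after) in timings.items():
    print(f"{target}: {speedup(before, after):.1f}x faster")
```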

Change-Id: Ie21ed47983b990e9d2aac419454b9a37a0600334
Reviewed-on: http://gerrit.cloudera.org:8080/16181
Tested-by: Kudu Jenkins
Reviewed-by: Bankim Bhavsar <[email protected]>
Reviewed-by: Greg Solovyev <[email protected]>
Reviewed-by: Attila Bukor <[email protected]>
granthenke committed Jul 16, 2020
1 parent 60db4ae commit e81caa8
Showing 3 changed files with 40 additions and 13 deletions.
44 changes: 32 additions & 12 deletions docker/Dockerfile
@@ -1,3 +1,5 @@
+# syntax=docker/dockerfile:1.1.7-experimental
+
 # Licensed to the Apache Software Foundation (ASF) under one
 # or more contributor license agreements. See the NOTICE file
 # distributed with this work for additional information
@@ -105,10 +107,11 @@ LABEL org.label-schema.name="Apache Kudu Development Base" \
 #
 FROM dev AS thirdparty

-ARG UID=1000
-ARG GID=1000
 ARG BUILD_DIR="/kudu"

+ENV UID=1000
+ENV GID=1000
+
 # Setup the kudu user and create the necessary directories.
 # We do this before copying any files otherwise the image size is doubled by the chown change.
 RUN groupadd -g ${GID} kudu || groupmod -n kudu $(getent group ${GID} | cut -d: -f1) \
@@ -126,7 +129,10 @@ COPY --chown=kudu:kudu ./build-support/enable_devtoolset.sh \
   build-support/
 COPY --chown=kudu:kudu ./build-support/ccache-clang build-support/ccache-clang
 COPY --chown=kudu:kudu ./build-support/ccache-devtoolset-3 build-support/ccache-devtoolset-3
-RUN build-support/enable_devtoolset.sh \
+# We explicitly set UID/GID due to https://github.com/moby/buildkit/issues/1237
+# Hard coded UID/GID are required due to https://github.com/moby/buildkit/issues/815
+RUN --mount=type=cache,id=ccache,uid=1000,gid=1000,target=/home/kudu/.ccache \
+  build-support/enable_devtoolset.sh \
   thirdparty/build-if-necessary.sh \
   # Remove the files left behind that we don't need.
   # Remove all the source files except the hadoop, hive, postgresql, ranger, and sentry sources
@@ -177,6 +183,9 @@ ARG PARALLEL=4
 # This is a common label argument, but also used in the build invocation.
 ARG VCS_REF

+ENV UID=1000
+ENV GID=1000
+
 # Use the bash shell for all RUN commands.
 SHELL ["/bin/bash", "-c"]
 # Run the build as the kudu user.
@@ -198,9 +207,15 @@ COPY --chown=kudu:kudu ./java ${BUILD_DIR}/java

 # Build the c++ code.
 WORKDIR ${BUILD_DIR}/build/$BUILD_TYPE
+# Enable the Gradle build cache in the C++ build.
+ENV GRADLE_FLAGS="--build-cache"
 # Ensure we don't rebuild thirdparty. Instead let docker handle caching.
 ENV NO_REBUILD_THIRDPARTY=1
-RUN ../../build-support/enable_devtoolset.sh \
+# We explicitly set UID/GID due to https://github.com/moby/buildkit/issues/1237
+# Hard coded UID/GID are required due to https://github.com/moby/buildkit/issues/815
+RUN --mount=type=cache,id=ccache,uid=1000,gid=1000,target=/home/kudu/.ccache \
+  --mount=type=cache,id=gradle-cache,uid=1000,gid=1000,target=/home/kudu/.gradle \
+  ../../build-support/enable_devtoolset.sh \
   ../../thirdparty/installed/common/bin/cmake \
   -DCMAKE_BUILD_TYPE=$BUILD_TYPE \
   -DKUDU_LINK=$LINK_TYPE \
@@ -219,13 +234,15 @@ RUN ../../build-support/enable_devtoolset.sh \

 # Build the java code.
 WORKDIR ${BUILD_DIR}/java
-RUN ./gradlew jar
+RUN --mount=type=cache,id=gradle-cache,uid=1000,gid=1000,target=/home/kudu/.gradle \
+  ./gradlew jar --build-cache

 # Copy the python build source.
 COPY --chown=kudu:kudu ./python ${BUILD_DIR}/python
 # Build the python code.
 WORKDIR ${BUILD_DIR}/python
-RUN pip install --user -r requirements.txt \
+RUN --mount=type=cache,id=ccache,uid=1000,gid=1000,target=/home/kudu/.ccache \
+  pip install --user -r requirements.txt \
   && python setup.py sdist

 # Copy any remaining source files.
@@ -263,11 +280,12 @@ LABEL name="Apache Kudu Build" \
 #
 FROM runtime AS kudu-python

-ARG UID=1000
-ARG GID=1000
 ARG BUILD_DIR="/kudu"
 ARG INSTALL_DIR="/opt/kudu"

+ENV UID=1000
+ENV GID=1000
+
 # Setup the kudu user and create the necessary directories.
 # We do this before copying any files otherwise the image size is doubled by the chown change.
 RUN groupadd -g ${GID} kudu || groupmod -n kudu $(getent group ${GID} | cut -d: -f1) \
@@ -321,12 +339,13 @@ LABEL org.label-schema.name="Apache Kudu Python Client" \
 #
 FROM runtime AS kudu

-ARG UID=1000
-ARG GID=1000
 ARG BUILD_DIR="/kudu"
 ARG INSTALL_DIR="/opt/kudu"
 ARG DATA_DIR="/var/lib/kudu"

+ENV UID=1000
+ENV GID=1000
+
 # Setup the kudu user and create the necessary directories.
 # We do this before copying any files otherwise the image size is doubled by the chown change.
 RUN groupadd -g ${GID} kudu || groupmod -n kudu $(getent group ${GID} | cut -d: -f1) \
@@ -439,11 +458,12 @@ LABEL name="Apache Impala Build" \
 #
 FROM runtime AS impala

-ARG UID=1000
-ARG GID=1000
 ARG DATA_DIR="/var/lib/impala"
 ARG IMPALA_VERSION="3.3.0"

+ENV UID=1000
+ENV GID=1000
+
 ENV IMPALA_HOME="/opt/impala"
 ENV HIVE_HOME="/opt/hive"
 ENV HIVE_CONF_DIR="/etc/hive/conf"
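
The cache mounts in the Dockerfile changes above only take effect when the image is built with BuildKit. The following is a minimal sketch, not the project's build script: the helpers `cache_mount` and `buildkit_build_command` and the tag `apache/kudu:example` are hypothetical, and it assumes BuildKit is switched on through the `DOCKER_BUILDKIT=1` environment variable. It reproduces the mount flags used in the diff and assembles a matching `docker build` invocation without running it:

```python
import os
import shlex


def cache_mount(cache_id, target, uid=1000, gid=1000):
    """Render a BuildKit cache-mount flag for a Dockerfile RUN instruction."""
    return f"--mount=type=cache,id={cache_id},uid={uid},gid={gid},target={target}"


def buildkit_build_command(target, tag, dockerfile="docker/Dockerfile", context="."):
    """Return (env, argv) for a docker build with BuildKit enabled (hypothetical helper)."""
    env = dict(os.environ, DOCKER_BUILDKIT="1")
    argv = ["docker", "build", "--file", dockerfile,
            "--target", target, "--tag", tag, context]
    return env, argv


# Mounts that use the same id refer to the same cache, so later RUN steps
# can reuse the ccache contents written by earlier ones.
print(cache_mount("ccache", "/home/kudu/.ccache"))
print(cache_mount("gradle-cache", "/home/kudu/.gradle"))

# "apache/kudu:example" is a placeholder tag, not one used by the project.
env, argv = buildkit_build_command("kudu", "apache/kudu:example")
print("DOCKER_BUILDKIT=1 " + " ".join(shlex.quote(arg) for arg in argv))
# To run the build for real: subprocess.run(argv, env=env, check=True)
```

The UID and GID default to 1000 here only to mirror the hard-coded values in the diff; the commit's comments attribute that hard-coding to the linked BuildKit issues.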
7 changes: 7 additions & 0 deletions docker/README.adoc
@@ -162,6 +162,13 @@ $ TAG_PATTERN="apache/kudu:*"
 $ docker rmi $(docker images -q "$TAG_PATTERN" --format "{{.Repository}}:{{.Tag}}")
 ----

+View and remove cache mounts (ccache and Gradle cache):
+[source,bash]
+----
+$ docker system df -v | grep exec.cachemount
+$ docker builder prune --filter type=exec.cachemount
+----
+
 === Using the cache from pre-built images
 You can tell docker to consider remote or local images in your build
 as cache sources. This can be especially useful when the base or
2 changes: 1 addition & 1 deletion docker/docker-build.py
@@ -266,7 +266,7 @@ def main():

 # If this is the default OS, also tag it without the OS-specific tag.
 if base == DEFAULT_OS:
-  default_os_tag = get_full_tag(opts.repository, target, version, '')
+  default_os_tag = get_full_tag(opts.repository, target, version_tag, '')
   default_os_cmd = 'docker tag %s %s' % (full_tag, default_os_tag)
   run_command(default_os_cmd, opts)
   tags.append(default_os_tag)