forked from apache/arrow
ARROW-8222: [C++] Use bcp to make a slim boost for bundled build
This patch switches our bundled boost ep to use a slimmer version of boost, which was built with the script added at `cpp/build-support/trim-boost.sh`. It uses the official boostorg big tarball as a fallback URL in case ours is unavailable (as we've seen with other ep's, download hosts are sometimes subject to rate limiting or other downtime, so having redundancy would be good more generally).

The resulting tarball is 10 MB, much less than the 113 MB of the full boost but larger than the 800k I suggested in the ticket. This is because in order to build regex and filesystem, we need `config build boost_install headers`, which add some weight. Boost.Build also seems to require `log`, and `predef` is apparently needed by log.

In addition to slimming the boost tarball, this patch also refines the cmake logic that determines when boost is required. Due to recent-ish efforts to reduce usage of boost, we only need boost for tests and Gandiva, and for Parquet but only on gcc < 4.9. When building thrift_ep, we also need boost. By narrowing the condition involving Parquet to just gcc < 4.9, we are able to remove boost entirely from the R macOS and Windows packages.

Outstanding questions/to-dos that I'm aware of:

* [x] Put the boost bundle somewhere official. I plan to put it at https://dl.bintray.com/ursalabs/arrow-boost/ but am open to suggestion if someone has a better idea.
* [x] ~~Add a crossbow job to build the boost bundle: to update the boost version or change what's included in the bundle, edit cpp/build-support/trim-boost.sh and run an on-demand build via PR comment.~~ On further reflection, crossbow isn't appropriate because it would mean that anyone could edit the script and overwrite the bundle that gets pulled into all source builds. Instead, I added a script that one can run locally, requiring only a bintray user and token in the ursalabs organization (available to any committer on request).
* [ ] Something with [namespacing](https://issues.apache.org/jira/browse/ARROW-4286)? I'm not sure how much of a concern this is since Arrow itself doesn't really require boost anymore.

Closes apache#6734 from nealrichardson/bcp

Lead-authored-by: Neal Richardson <[email protected]>
Co-authored-by: François Saint-Jacques <[email protected]>
Co-authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Wes McKinney <[email protected]>
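
The fallback-URL behaviour described in the commit message is presumably configured in the CMake external-project setup, which is not shown on this page. Purely as an illustration of the idea, the following shell sketch tries a list of URLs in order and stops at the first successful download. The function name `fetch_with_fallback` and the `FETCH_CMD` hook are hypothetical and do not appear in the patch; the default downloader is assumed to be `curl`.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of the patch): download to DEST from the
# first URL that works. Usage: fetch_with_fallback DEST URL [URL...]
fetch_with_fallback() {
  local dest="$1"
  shift
  local url
  for url in "$@"; do
    # FETCH_CMD is an assumed override hook; by default use curl with
    # -f (fail on HTTP errors), -sS (quiet but show errors), -L (follow redirects).
    if ${FETCH_CMD:-curl -fsSL -o} "$dest" "$url"; then
      echo "$url"   # report which source was used
      return 0
    fi
    echo "download failed: $url" >&2
  done
  return 1          # every source failed
}
```

A caller would list the slim bundle first and the full upstream tarball second, so the large download only happens when the preferred host is unavailable.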
Showing 21 changed files with 197 additions and 63 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,62 @@ | ||
#!/bin/bash | ||
# | ||
# Licensed to the Apache Software Foundation (ASF) under one | ||
# or more contributor license agreements. See the NOTICE file | ||
# distributed with this work for additional information | ||
# regarding copyright ownership. The ASF licenses this file | ||
# to you under the Apache License, Version 2.0 (the | ||
# "License"); you may not use this file except in compliance | ||
# with the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, | ||
# software distributed under the License is distributed on an | ||
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
# KIND, either express or implied. See the License for the | ||
# specific language governing permissions and limitations | ||
# under the License. | ||
# | ||
|
||
# This script is used to make the subset of boost that we actually use, | ||
# so that we don't have to download the whole big boost project when we build | ||
# boost from source. | ||
# | ||
# After running this script, run upload-boost.sh to put the bundle on bintray | ||
|
||
set -eu | ||
|
||
# if version is not defined by the caller, set a default. | ||
: ${BOOST_VERSION:=1.71.0} | ||
: ${BOOST_FILE:=boost_${BOOST_VERSION//./_}} | ||
: ${BOOST_URL:=https://dl.bintray.com/boostorg/release/${BOOST_VERSION}/source/${BOOST_FILE}.tar.gz} | ||
|
||
# Arrow tests require these | ||
BOOST_LIBS="system.hpp filesystem.hpp" | ||
# Add these to be able to build those | ||
BOOST_LIBS="$BOOST_LIBS config build boost_install headers log predef" | ||
# Parquet needs this (if using gcc < 4.9) | ||
BOOST_LIBS="$BOOST_LIBS regex.hpp" | ||
# Gandiva needs these | ||
BOOST_LIBS="$BOOST_LIBS functional/hash.hpp multiprecision/cpp_int.hpp" | ||
# These are for Thrift when Thrift_SOURCE=BUNDLED | ||
BOOST_LIBS="$BOOST_LIBS algorithm/string.hpp locale.hpp noncopyable.hpp numeric/conversion/cast.hpp scope_exit.hpp scoped_array.hpp shared_array.hpp tokenizer.hpp version.hpp" | ||
|
||
if [ ! -d ${BOOST_FILE} ]; then | ||
curl -L "${BOOST_URL}" > ${BOOST_FILE}.tar.gz | ||
tar -xzf ${BOOST_FILE}.tar.gz | ||
fi | ||
|
||
pushd ${BOOST_FILE} | ||
|
||
if [ ! -f "dist/bin/bcp" ]; then | ||
./bootstrap.sh | ||
./b2 tools/bcp | ||
fi | ||
mkdir -p ${BOOST_FILE} | ||
./dist/bin/bcp ${BOOST_LIBS} ${BOOST_FILE} | ||
|
||
tar -czf ${BOOST_FILE}.tar.gz ${BOOST_FILE}/ | ||
# Resulting tarball is in ${BOOST_FILE}/${BOOST_FILE}.tar.gz | ||
|
||
popd |
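
The defaults at the top of the script rely on two bash parameter expansions: `${VAR:=default}`, which assigns only when the variable is unset or empty, and `${VAR//./_}`, which replaces every dot with an underscore. A minimal sketch of how the two combine follows. The helper `boost_file_name` and the variable `DEMO_BOOST_VERSION` are hypothetical names used only for illustration.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical helper (not in the patch): derive the tarball base name
# from a dotted version string, falling back to 1.71.0 when none is given.
boost_file_name() {
  local version="${1:-1.71.0}"
  # ${version//./_} replaces every "." with "_" (bash-specific expansion).
  echo "boost_${version//./_}"
}

# ":" is a no-op command; ${VAR:=default} assigns only when VAR is unset
# or empty, so a value exported by the caller wins over the default.
: "${DEMO_BOOST_VERSION:=1.71.0}"
boost_file_name "${DEMO_BOOST_VERSION}"
```

When `DEMO_BOOST_VERSION` is not set by the caller, this prints `boost_1_71_0`, which matches the directory name inside the upstream Boost 1.71.0 tarball that the script unpacks.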
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
#!/bin/bash | ||
# | ||
# Licensed to the Apache Software Foundation (ASF) under one | ||
# or more contributor license agreements. See the NOTICE file | ||
# distributed with this work for additional information | ||
# regarding copyright ownership. The ASF licenses this file | ||
# to you under the Apache License, Version 2.0 (the | ||
# "License"); you may not use this file except in compliance | ||
# with the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, | ||
# software distributed under the License is distributed on an | ||
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
# KIND, either express or implied. See the License for the | ||
# specific language governing permissions and limitations | ||
# under the License. | ||
# | ||
|
||
# This assumes you've just run cpp/build-support/trim-boost.sh, so the file | ||
# to upload is at ${BOOST_FILE}/${BOOST_FILE}.tar.gz | ||
# | ||
# Also, you must have a bintray account on the "ursalabs" organization and | ||
# set the BINTRAY_USER and BINTRAY_APIKEY env vars. | ||
|
||
set -eu | ||
|
||
# if version is not defined by the caller, set a default. | ||
: ${BOOST_VERSION:=1.71.0} | ||
: ${BOOST_FILE:=boost_${BOOST_VERSION//./_}} | ||
: ${DST_URL:=https://api.bintray.com/content/ursalabs/arrow-boost/arrow-boost/latest} | ||
|
||
if [ "${BINTRAY_USER:-}" = "" ]; then | ||
echo "Must set BINTRAY_USER" | ||
exit 1 | ||
fi | ||
if [ "${BINTRAY_APIKEY:-}" = "" ]; then | ||
echo "Must set BINTRAY_APIKEY" | ||
exit 1 | ||
fi | ||
|
||
upload_file() { | ||
if [ -f "$1" ]; then | ||
echo "PUT ${DST_URL}/$1?override=1&publish=1" | ||
curl -sS -u "${BINTRAY_USER}:${BINTRAY_APIKEY}" -X PUT "${DST_URL}/$1?override=1&publish=1" --data-binary "@$1" | ||
else | ||
echo "$1 not found" >&2 | ||
return 1 | ||
fi | ||
} | ||
|
||
pushd ${BOOST_FILE} | ||
upload_file ${BOOST_FILE}.tar.gz | ||
popd |
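
The upload script combines `set -eu` with explicit checks for empty credentials. Under `set -u`, expanding an unset variable aborts the script with an "unbound variable" error, so a check meant to print a friendlier message generally needs an empty default such as `${VAR:-}`. The sketch below shows one way to write such a check for several variables at once. The helper `require_env` is hypothetical and not part of the patch.

```bash
#!/usr/bin/env bash
set -eu

# Hypothetical helper (not in the patch): succeed only if every named
# environment variable is set and non-empty.
require_env() {
  local name missing=0
  for name in "$@"; do
    # ${!name:-} is bash indirect expansion with an empty default,
    # so an unset variable does not trip `set -u`.
    if [ -z "${!name:-}" ]; then
      echo "Must set ${name}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A script could then call `require_env BINTRAY_USER BINTRAY_APIKEY` once near the top. With `set -e` active, a non-zero return stops the script after all missing names have been reported.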