ARROW-14062: [Format] Initial arrow-internal specification of compute IR
See also apache#10856

Differing design decisions from the above:
- Don't special-case any `Expression`s or `Relation`s. All array functions and relations are identified by name, which may include a namespace to differentiate between extenders.
- Freely extensible without recompiling the flatbuffers schemas (at the cost of being a less "pure" flatbuffers format, since byte blobs are used liberally).
- The root type is a Plan rather than a Relation: instead of expressing a value, it specifies a side effect, which includes the destination for output rows.

Closes apache#10934 from bkietz/compute-ir

Lead-authored-by: Benjamin Kietzman <[email protected]>
Co-authored-by: Phillip Cloud <[email protected]>
Signed-off-by: Phillip Cloud <[email protected]>
bkietz and cpcloud committed Sep 21, 2021
1 parent cecca46 commit ce34ea1
Showing 19 changed files with 6,701 additions and 62 deletions.
2 changes: 1 addition & 1 deletion .gitattributes
@@ -3,4 +3,4 @@ r/R/arrowExports.R linguist-generated=true
r/src/RcppExports.cpp linguist-generated=true
r/src/arrowExports.cpp linguist-generated=true
r/man/*.Rd linguist-generated=true

cpp/src/generated/*.h linguist-generated=true
1 change: 1 addition & 0 deletions cpp/build-support/lint_exclusions.txt
@@ -1,4 +1,5 @@
*_generated*
*.grpc.fb.*
*parquet_constants.*
*parquet_types.*
*windows_compatibility.h
45 changes: 27 additions & 18 deletions cpp/build-support/update-flatbuffers.sh
@@ -20,22 +20,31 @@

 # Run this from cpp/ directory. flatc is expected to be in your path
 
+set -euo pipefail
+
 CWD="$(cd "$(dirname "${BASH_SOURCE[0]:-$0}")" && pwd)"
-SOURCE_DIR=$CWD/../src
-FORMAT_DIR=$CWD/../../format
-FLATC="flatc -c --cpp-std c++11"
-
-$FLATC -o $SOURCE_DIR/generated \
---scoped-enums \
-$FORMAT_DIR/Message.fbs \
-$FORMAT_DIR/File.fbs \
-$FORMAT_DIR/Schema.fbs \
-$FORMAT_DIR/Tensor.fbs \
-$FORMAT_DIR/SparseTensor.fbs \
-src/arrow/ipc/feather.fbs
-
-$FLATC -o $SOURCE_DIR/plasma \
---gen-object-api \
---scoped-enums \
-$SOURCE_DIR/plasma/common.fbs \
-$SOURCE_DIR/plasma/plasma.fbs
+SOURCE_DIR="$CWD/../src"
+PYTHON_SOURCE_DIR="$CWD/../../python"
+FORMAT_DIR="$CWD/../../format"
+TOP="$FORMAT_DIR/.."
+FLATC="flatc"
+
+OUT_DIR="$SOURCE_DIR/generated"
+FILES=($(find $FORMAT_DIR -name '*.fbs'))
+FILES+=("$SOURCE_DIR/arrow/ipc/feather.fbs")
+
+# add compute ir files
+FILES+=($(find "$TOP/experimental/computeir" -name '*.fbs'))
+
+$FLATC --cpp --cpp-std c++11 \
+--scoped-enums \
+-o "$OUT_DIR" \
+"${FILES[@]}"
+
+PLASMA_FBS=("$SOURCE_DIR"/plasma/{plasma,common}.fbs)
+
+$FLATC --cpp --cpp-std c++11 \
+-o "$SOURCE_DIR/plasma" \
+--gen-object-api \
+--scoped-enums \
+"${PLASMA_FBS[@]}"
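The updated script fills `FILES` with `FILES=($(find ...))`, which relies on shell word splitting: a path containing whitespace or glob characters would be split or expanded into more than one array element. The following is a minimal sketch of a more defensive way to populate the same kind of array. The helper name `collect_fbs` and the array name `FBS_FILES` are hypothetical and do not appear in the commit; the sketch assumes bash and a `find` that supports `-print0`.

```shell
#!/usr/bin/env bash
# Sketch only: collect_fbs and FBS_FILES are illustrative names,
# not part of update-flatbuffers.sh.
set -euo pipefail

FBS_FILES=()

# Append every *.fbs file found under the directory given as $1 to FBS_FILES.
collect_fbs() {
  local dir="$1" f
  # find emits NUL-terminated paths and read -d '' consumes them one at a
  # time, so each path stays a single array element even if it has spaces.
  while IFS= read -r -d '' f; do
    FBS_FILES+=("$f")
  done < <(find "$dir" -name '*.fbs' -print0)
}
```

Under these assumptions, the script could call `collect_fbs "$FORMAT_DIR"` and `collect_fbs "$TOP/experimental/computeir"` and then pass `"${FBS_FILES[@]}"` to `flatc` in the same way it currently passes `"${FILES[@]}"`.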
