forked from apache/arrow
Commit
ARROW-3282: [R] initial R functionality
* Wrapping C++ pointers to arrow objects as R6 classes holding an R external pointer.
* Factory functions for the metadata types, int32(), ...
* Factory to create schemas and structs.
* Create Array, RecordBatch, Table from R vectors and data frames. Initially only integer (int32), numeric (float64) and raw (int8) vectors are supported.
* Reading and writing record batches and Tables to files.

Author: Romain Francois <[email protected]>

Closes apache#2596 from romainfrancois/r-dev-buffer and squashes the following commits:

9ab1882 <Romain Francois> mark Roxygen and Rcpp generated files
661f370 <Romain Francois> Using FirstTimeBitmapWriter instead of BitmapWriter.
e81b72b <Romain Francois> only set null_bitmap if null_count > 0
bfe853d <Romain Francois> using 0-based indices in the tests.
b391556 <Romain Francois> Also use arrow::internal::BitmapWriter
9e60555 <Romain Francois> name fixes. Using __ consistently
bf814bb <Romain Francois> Using arrow::internal::BitmapReader
c8aa703 <Romain Francois> Also use std::shared_ptr for MemoryPool.
2aa8a5f <Romain Francois> need dev version of `vctrs`
394bd33 <Romain Francois> 🐀 + RecordBatch$Slice
de93a4f <Romain Francois> RecordBatch tests
9d208a4 <Romain Francois> +Array$RangeEquals
f860063 <Romain Francois> Move each class to their own file
a89a9a8 <Romain Francois> Move RecordBatch impl to own file
a2f9f51 <Romain Francois> correctly handling offset()
8263c0d <Romain Francois> + tests for ChunkedArray
e02e24f <Romain Francois> +chunked_array and tests
b20e4b0 <Romain Francois> More tests
d11cda0 <Romain Francois> +R6 class ChunkedArray
29af2ea <Romain Francois> license headers
2f53ebf <Romain Francois> Additional tests for read_arrow / write_arrow
4237c32 <Romain Francois> Clear the bit for non NA.
ede8e44 <Romain Francois> Handle null buffer in R <-> Array conversions
a5b8190 <Romain Francois> update README with example of reading/writing arrow::Table
d951db8 <Romain Francois> "documentation" to quiet check()
908c2ac <Romain Francois> read_arrow and write_arrow now relate to arrow::Table.
110b00d <Romain Francois> resolving conflicts
ae55f8b <Romain Francois> ..
767e9d9 <Romain Francois> more generic print method
8d8cdd1 <Romain Francois> + read_arrow / write_arrow for now
c1385a0 <Romain Francois> export Array_as_vector, +Array$ToString
23fbd01 <Romain Francois> + column names
97659ff <Romain Francois> + as_tibble.arrow::RecordBatch
fa4ee22 <Romain Francois> + read_record_batch
f27eeba <Romain Francois> - MakeArray
4977bb2 <Romain Francois> no need to make ArrayData directly
ef7cda1 <Romain Francois> class constructors only take the external pointers, logic moved to factory functions
81e059a <Romain Francois> rebasing
421e471 <Romain Francois> +macro R_ERROR_NOT_OK similar to RETURN_NOT_OK but that Rcpp::stop()s
f5e3eff <Romain Francois> attempt RecordBatch$to_file
79205fb <Romain Francois> initial stab at arrow::table(data.frame)
f6f1775 <Romain Francois> s/data/.data/
b9c215b <Romain Francois> "document" array and record_batch
edf6098 <Romain Francois> Need to install `vctrs` from github for now
6aecdce <Romain Francois> skip using rpath linker option
b8dac54 <Romain Francois> +RecordBatch$schema
1fc3cc2 <Romain Francois> no longer need this
05da931 <Romain Francois> initial stab at record_batch
f4d0a34 <Romain Francois> must include arrow_types.h first
aee2d0a <Romain Francois> initial stab at arrow::array
a6ae2f3 <Romain Francois> cleanup
e14b546 <Romain Francois> follow up from @wesm comments on apache#2489
36e9801 <Romain Francois> + installation instructions
108caf9 <Romain Francois> not checking for headers on these files
b829bdf <Romain Francois> initial R 📦 with travis setup and testthat suite, that links to arrow c++ library and calls arrow::int32()
26e712d <Romain Francois> Initial work for type metadata, with tests.
e251299 <Romain Francois> + installation instructions
a9a8bbb <Romain Francois> not checking for headers on these files
e0a7eff <Romain Francois> initial R 📦 with travis setup and testthat suite, that links to arrow c++ library and calls arrow::int32()
b1c1109 <Romain Francois> finished rebasing after initial R patch merged
887df48 <Romain Francois> skip using rpath linker option
a6de975 <Romain Francois> cleanup
8526e51 <Romain Francois> follow up from @wesm comments on apache#2489
f03a277 <Romain Francois> + installation instructions
0995ca4 <Romain Francois> not checking for headers on these files
1cb547e <Romain Francois> initial R 📦 with travis setup and testthat suite, that links to arrow c++ library and calls arrow::int32()
705c125 <Romain Francois> exclude Rd files 🐀
605e302 <Romain Francois> time32 only handles second and millisecond; time64 only handles microsecond and nanosecond
afdbae6 <Romain Francois> + licence header for R6.R file
65563f5 <Romain Francois> minimal documentation for check()
b7135c7 <Romain Francois> stop exporting everything
6aaf192 <Romain Francois> ignoring the .clang-format file
d854f2f <Romain Francois> + license headers for R files 🙊
d992b26 <Romain Francois> Initial work for type metadata, with tests.
614dd07 <Romain Francois> + installation instructions
afce06a <Romain Francois> initial R 📦 with travis setup and testthat suite, that links to arrow c++ library and calls arrow::int32()
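
Several of the squashed commits deal with null handling when converting R vectors ("only set null_bitmap if null_count > 0", "Handle null buffer in R <-> Array conversions"). The following is a minimal sketch of that idea in plain C++, without the Arrow library. It is not the code from this commit: `kNaInteger`, `ValidityResult`, `BuildValidity` and `IsValid` are hypothetical names chosen for illustration. It assumes the Arrow convention that a set bit marks a valid slot (least-significant bit first) and that R represents `NA_integer_` as `INT_MIN`.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for R's NA_integer_, which R stores as INT_MIN.
constexpr int kNaInteger = INT_MIN;

// Hypothetical result type: an empty `validity` vector means no bitmap was built.
struct ValidityResult {
  std::vector<std::uint8_t> validity;  // 1 bit per value, least-significant bit first
  std::size_t null_count = 0;
};

// Count the NAs first, and build the bitmap only when at least one is present.
ValidityResult BuildValidity(const std::vector<int>& values) {
  ValidityResult out;
  for (int v : values) {
    if (v == kNaInteger) ++out.null_count;
  }
  if (out.null_count == 0) return out;  // all values valid: skip the bitmap

  // Start with every bit cleared, then set the bit of each non-NA value.
  out.validity.assign((values.size() + 7) / 8, 0);
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (values[i] != kNaInteger) {
      out.validity[i / 8] |= static_cast<std::uint8_t>(1u << (i % 8));
    }
  }
  return out;
}

// Returns true when slot `i` holds a valid (non-null) value.
bool IsValid(const ValidityResult& r, std::size_t i) {
  if (r.validity.empty()) return true;  // no bitmap means no nulls
  assert(i / 8 < r.validity.size());
  return (r.validity[i / 8] >> (i % 8)) & 1u;
}
```

Skipping the bitmap when there are no nulls avoids an allocation for the common all-valid case; readers then have to treat a missing bitmap as "every slot is valid", which is what `IsValid` does here.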
1 parent 5167502 commit ea8940a
Showing 51 changed files with 4,714 additions and 32 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
r/R/RcppExports.R linguist-generated=true | ||
r/src/RcppExports.cpp linguist-generated=true | ||
r/man/*.Rd linguist-generated=true | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -39,3 +39,5 @@ python/.eggs/ | |
.pytest_cache/ | ||
pkgs | ||
.Rproj.user | ||
arrow.Rcheck/ | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -127,3 +127,5 @@ r/.Rbuildignore | |
r/arrow.Rproj | ||
r/README.md | ||
r/README.Rmd | ||
r/man/*.Rd | ||
.gitattributes |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,6 @@ | ||
^.*\.Rproj$ | ||
^\.Rproj\.user$ | ||
^README\.Rmd$ | ||
src/.clang-format | ||
LICENSE.md | ||
^data-raw$ |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,7 +2,7 @@ Package: arrow | |
Title: R Integration to 'Apache' 'Arrow' | ||
Version: 0.0.0.9000 | ||
Authors@R: c( | ||
person("Romain", "François", email = "[email protected]", role = c("aut", "cre")), | ||
person("Romain", "François", email = "[email protected]", role = c("aut", "cre")), | ||
person("Apache Arrow", email = "[email protected]", role = c("aut", "cph")) | ||
) | ||
Description: R Integration to 'Apache' 'Arrow'. | ||
|
@@ -11,11 +11,39 @@ License: Apache License (>= 2.0) | |
Encoding: UTF-8 | ||
LazyData: true | ||
SystemRequirements: C++11 | ||
LinkingTo: | ||
Rcpp | ||
Imports: | ||
Rcpp | ||
LinkingTo: | ||
Rcpp (>= 0.12.18) | ||
Imports: | ||
Rcpp (>= 0.12.18), | ||
rlang, | ||
purrr, | ||
assertthat, | ||
glue, | ||
R6, | ||
vctrs, | ||
fs, | ||
tibble, | ||
crayon | ||
Remotes: | ||
r-lib/vctrs | ||
Roxygen: list(markdown = TRUE) | ||
RoxygenNote: 6.0.1.9000 | ||
Suggests: | ||
RoxygenNote: 6.1.0.9000 | ||
Suggests: | ||
testthat | ||
Collate: | ||
'enums.R' | ||
'R6.R' | ||
'ArrayData.R' | ||
'ChunkedArray.R' | ||
'Column.R' | ||
'Field.R' | ||
'List.R' | ||
'RcppExports.R' | ||
'RecordBatch.R' | ||
'Schema.R' | ||
'Struct.R' | ||
'Table.R' | ||
'array.R' | ||
'memory_pool.R' | ||
'reexports-tibble.R' | ||
'zzz.R' |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,61 @@ | ||
# Generated by roxygen2: do not edit by hand | ||
|
||
S3method("!=","arrow::Object") | ||
S3method("$","arrow-enum") | ||
S3method("==","arrow::Array") | ||
S3method("==","arrow::DataType") | ||
S3method("==","arrow::Field") | ||
S3method("==","arrow::RecordBatch") | ||
S3method(as_tibble,"arrow::RecordBatch") | ||
S3method(as_tibble,"arrow::Table") | ||
S3method(length,"arrow::Array") | ||
S3method(names,"arrow::RecordBatch") | ||
S3method(print,"arrow-enum") | ||
export(DateUnit) | ||
export(StatusCode) | ||
export(TimeUnit) | ||
export(Type) | ||
export(array) | ||
export(as_tibble) | ||
export(boolean) | ||
export(chunked_array) | ||
export(date32) | ||
export(date64) | ||
export(decimal) | ||
export(float16) | ||
export(float32) | ||
export(float64) | ||
export(int16) | ||
export(int32) | ||
export(int64) | ||
export(int8) | ||
export(list_of) | ||
export(null) | ||
export(read_arrow) | ||
export(record_batch) | ||
export(schema) | ||
export(struct) | ||
export(table) | ||
export(time32) | ||
export(time64) | ||
export(timestamp) | ||
export(uint16) | ||
export(uint32) | ||
export(uint64) | ||
export(uint8) | ||
export(utf8) | ||
export(write_arrow) | ||
importFrom(R6,R6Class) | ||
importFrom(Rcpp,sourceCpp) | ||
importFrom(assertthat,assert_that) | ||
importFrom(glue,glue) | ||
importFrom(purrr,map) | ||
importFrom(purrr,map2) | ||
importFrom(purrr,map_chr) | ||
importFrom(purrr,map_int) | ||
importFrom(rlang,dots_n) | ||
importFrom(rlang,quo_name) | ||
importFrom(rlang,seq2) | ||
importFrom(rlang,set_names) | ||
importFrom(tibble,as_tibble) | ||
useDynLib(arrow, .registration = TRUE) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
# Licensed to the Apache Software Foundation (ASF) under one | ||
# or more contributor license agreements. See the NOTICE file | ||
# distributed with this work for additional information | ||
# regarding copyright ownership. The ASF licenses this file | ||
# to you under the Apache License, Version 2.0 (the | ||
# "License"); you may not use this file except in compliance | ||
# with the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, | ||
# software distributed under the License is distributed on an | ||
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
# KIND, either express or implied. See the License for the | ||
# specific language governing permissions and limitations | ||
# under the License. | ||
|
||
#' @include R6.R | ||
|
||
`arrow::ArrayData` <- R6Class("arrow::ArrayData", | ||
inherit = `arrow::Object`, | ||
active = list( | ||
type = function() `arrow::DataType`$dispatch(ArrayData__get_type(self)), | ||
length = function() ArrayData__get_length(self), | ||
null_count = function() ArrayData__get_null_count(self), | ||
offset = function() ArrayData__get_offset(self) | ||
) | ||
) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,46 @@ | ||
# Licensed to the Apache Software Foundation (ASF) under one | ||
# or more contributor license agreements. See the NOTICE file | ||
# distributed with this work for additional information | ||
# regarding copyright ownership. The ASF licenses this file | ||
# to you under the Apache License, Version 2.0 (the | ||
# "License"); you may not use this file except in compliance | ||
# with the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, | ||
# software distributed under the License is distributed on an | ||
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
# KIND, either express or implied. See the License for the | ||
# specific language governing permissions and limitations | ||
# under the License. | ||
|
||
#' @include R6.R | ||
|
||
`arrow::ChunkedArray` <- R6Class("arrow::ChunkedArray", inherit = `arrow::Object`, | ||
public = list( | ||
length = function() ChunkedArray__length(self), | ||
null_count = function() ChunkedArray__null_count(self), | ||
num_chunks = function() ChunkedArray__num_chunks(self), | ||
chunk = function(i) `arrow::Array`$new(ChunkedArray__chunk(self, i)), | ||
chunks = function() purrr::map(ChunkedArray__chunks(self), `arrow::Array`$new), | ||
type = function() `arrow::DataType`$dispatch(ChunkedArray__type(self)), | ||
as_vector = function() ChunkedArray__as_vector(self), | ||
Slice = function(offset, length = NULL){ | ||
if (is.null(length)) { | ||
`arrow::ChunkedArray`$new(ChunkArray__Slice1(self, offset)) | ||
} else { | ||
`arrow::ChunkedArray`$new(ChunkArray__Slice2(self, offset, length)) | ||
} | ||
} | ||
) | ||
) | ||
|
||
#' create an arrow::ChunkedArray from R vectors | ||
#' | ||
#' @param \dots Vectors to coerce | ||
#' | ||
#' @export | ||
chunked_array <- function(...){ | ||
`arrow::ChunkedArray`$new(ChunkedArray__from_list(rlang::list2(...))) | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,27 @@ | ||
# Licensed to the Apache Software Foundation (ASF) under one | ||
# or more contributor license agreements. See the NOTICE file | ||
# distributed with this work for additional information | ||
# regarding copyright ownership. The ASF licenses this file | ||
# to you under the Apache License, Version 2.0 (the | ||
# "License"); you may not use this file except in compliance | ||
# with the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, | ||
# software distributed under the License is distributed on an | ||
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
# KIND, either express or implied. See the License for the | ||
# specific language governing permissions and limitations | ||
# under the License. | ||
|
||
#' @include R6.R | ||
|
||
`arrow::Column` <- R6Class("arrow::Column", inherit = `arrow::Object`, | ||
public = list( | ||
length = function() Column__length(self), | ||
null_count = function() Column__null_count(self), | ||
type = function() `arrow::DataType`$dispatch(Column__type(self)), | ||
data = function() `arrow::ChunkedArray`$new(Column__data(self)) | ||
) | ||
) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
# Licensed to the Apache Software Foundation (ASF) under one | ||
# or more contributor license agreements. See the NOTICE file | ||
# distributed with this work for additional information | ||
# regarding copyright ownership. The ASF licenses this file | ||
# to you under the Apache License, Version 2.0 (the | ||
# "License"); you may not use this file except in compliance | ||
# with the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, | ||
# software distributed under the License is distributed on an | ||
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
# KIND, either express or implied. See the License for the | ||
# specific language governing permissions and limitations | ||
# under the License. | ||
|
||
#' @include R6.R | ||
|
||
`arrow::Field` <- R6Class("arrow::Field", | ||
inherit = `arrow::Object`, | ||
public = list( | ||
ToString = function() { | ||
Field__ToString(self) | ||
}, | ||
name = function() { | ||
Field__name(self) | ||
}, | ||
nullable = function() { | ||
Field__nullable(self) | ||
}, | ||
Equals = function(other) { | ||
inherits(other, "arrow::Field") && Field__Equals(self, other) | ||
} | ||
) | ||
) | ||
|
||
#' @export | ||
`==.arrow::Field` <- function(lhs, rhs){ | ||
lhs$Equals(rhs) | ||
} | ||
|
||
field <- function(name, type) { | ||
`arrow::Field`$new(Field__initialize(name, type)) | ||
} | ||
|
||
.fields <- function(.list){ | ||
assert_that( !is.null(nms <- names(.list)) ) | ||
map2(nms, .list, field) | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
# Licensed to the Apache Software Foundation (ASF) under one | ||
# or more contributor license agreements. See the NOTICE file | ||
# distributed with this work for additional information | ||
# regarding copyright ownership. The ASF licenses this file | ||
# to you under the Apache License, Version 2.0 (the | ||
# "License"); you may not use this file except in compliance | ||
# with the License. You may obtain a copy of the License at | ||
# | ||
# http://www.apache.org/licenses/LICENSE-2.0 | ||
# | ||
# Unless required by applicable law or agreed to in writing, | ||
# software distributed under the License is distributed on an | ||
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
# KIND, either express or implied. See the License for the | ||
# specific language governing permissions and limitations | ||
# under the License. | ||
|
||
#' @include R6.R | ||
|
||
`arrow::ListType` <- R6Class("arrow::ListType", | ||
inherit = `arrow::NestedType` | ||
) | ||
|
||
#' @rdname DataType | ||
#' @export | ||
list_of <- function(type) `arrow::ListType`$new(list__(type)) |