ARROW-5505: [R] Normalize file and class names, stop masking base R functions, add vignette, improve documentation

The main thrust of the changes is summarized in the new vignette:

> C++ is an object-oriented language, so the core logic of the Arrow library is encapsulated in classes and methods. In the R package, these classes are implemented as `R6` reference classes, most of which are exported from the namespace.
>
> In order to match the C++ naming conventions, the `R6` classes are in TitleCase, e.g. `RecordBatch`. This makes it easy to look up the relevant C++ implementations in the [code](https://github.com/apache/arrow/tree/master/cpp) or [documentation](https://arrow.apache.org/docs/cpp/). To simplify things in R, the C++ library namespaces are generally dropped or flattened; that is, where the C++ library has `arrow::io::FileOutputStream`, it is just `FileOutputStream` in the R package. One exception is for the file readers, where the namespace is necessary to disambiguate. So `arrow::csv::TableReader` becomes `CsvTableReader`, and `arrow::json::TableReader` becomes `JsonTableReader`.
>
> Some of these classes are not meant to be instantiated directly; they may be base classes or other kinds of helpers. For those that you should be able to create, use the `$create()` method to instantiate an object. For example, `rb <- RecordBatch$create(int = 1:10, dbl = as.numeric(1:10))` will create a `RecordBatch`. Many of these factory methods that an R user might most often encounter also have a `snake_case` alias, in order to be more familiar for contemporary R users. So `record_batch(int = 1:10, dbl = as.numeric(1:10))` would do the same as `RecordBatch$create()` above.
>
> The typical user of the `arrow` R package may never deal directly with the `R6` objects. We provide more R-friendly wrapper functions as a higher-level interface to the C++ library. An R user can call `read_parquet()` without knowing or caring that they're instantiating a `ParquetFileReader` object and calling the `$ReadFile()` method on it. The classes are there and available to the advanced programmer who wants fine-grained control over how the C++ library is used.

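The namespace-flattening rule quoted above can be sketched in a few lines of base R. This is only an illustration, not code from the package: `flatten_cpp_name()` is a hypothetical helper, and the set of namespaces that survive as a prefix (`csv` and `json`) is an assumption taken from the two examples the vignette names.

```r
# Hypothetical helper (not part of the arrow package): map a fully
# qualified C++ class name to the flattened name used on the R side.
flatten_cpp_name <- function(cpp_name) {
  parts <- strsplit(cpp_name, "::", fixed = TRUE)[[1]]
  class_name <- parts[length(parts)]

  # Namespaces sitting between the leading "arrow" and the class itself
  inner <- parts[-c(1, length(parts))]

  # Assumption: only the file-reader namespaces named in the vignette are
  # kept, because they are needed to tell the two TableReader classes apart
  keep <- inner[inner %in% c("csv", "json")]

  # TitleCase the kept namespaces and prepend them to the class name
  prefix <- paste0(toupper(substring(keep, 1, 1)), substring(keep, 2),
                   collapse = "")
  paste0(prefix, class_name)
}

flatten_cpp_name("arrow::io::FileOutputStream")
flatten_cpp_name("arrow::csv::TableReader")
flatten_cpp_name("arrow::json::TableReader")
```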
There are a few other fixes and cleanups rolled in here, named in the individual commit messages below.

I stopped short of more documentation consolidation because (1) this patch is already huge and (2) `R6` classes are really tedious to document because it's all manual. I did some searching around and found open issues from 2014 and 2015 about supporting R6 better in roxygen2.

Closes apache#5279 from nealrichardson/cleaner-class-names and squashes the following commits:

3c6f85b <Neal Richardson> 🐀
22c9d04 <Neal Richardson> More doc cleaning
01084ce <Neal Richardson> Factor out assert_is()
caf3265 <Neal Richardson> PR feedback from romain
adf1cf9 <Neal Richardson> File renaming (not case-sensitive)
35f00f5 <Neal Richardson> Rename Table.R to table.R
8bd52d7 <Neal Richardson> Rename Struct.R to struct.R
358290b <Neal Richardson> Rename Schema.R to schema.R
924edd1 <Neal Richardson> Rename List.R to list.R
0150d99 <Neal Richardson> Rename Field.R to field.R
8683f10 <Neal Richardson> Add content to vignette from blog post
e6b75f4 <Neal Richardson> Consolidate and document reader/writer classes; also fix ARROW-6449
495abf6 <Neal Richardson> Fill in documentation and standardize file naming
5fd49ef <Neal Richardson> Fix check failures
96873e1 <Neal Richardson> Factor out make_readable_file
3e4cfe7 <Neal Richardson> Clean up parquet classes and document the R6
85a8d36 <Neal Richardson> Start vignette draft explaining the class and naming conventions
71cac57 <Neal Richardson> Clean up Rd file names, experiment with documenting constructors, and start updating pkgdown
2d1b738 <Neal Richardson> Replace table() with Table()
b694511 <Neal Richardson> Remove defunct Column class
730313e <Neal Richardson> One more find/replace, esp. RecordBatch*
702a0b1 <Neal Richardson> Message
365fedc <Neal Richardson> feather
0e7877b <Neal Richardson> Drop ::ipc::
55607a6 <Neal Richardson> json
9bd708f <Neal Richardson> csv
fbebf27 <Neal Richardson> io
1711d3e <Neal Richardson> CastOptions
12031ad <Neal Richardson> Backfill some  methods
4075897 <Neal Richardson> compression
3b4b492 <Neal Richardson> ChunkedArray
bbf0799 <Neal Richardson> Buffer
3f1cd71 <Neal Richardson> Object
9fbecda <Neal Richardson> A few more backticks
8edf085 <Neal Richardson> Remove more backticks
1f6d154 <Neal Richardson> Replace array() with Array()
9f52490 <Neal Richardson> Progress commit renaming Array

Authored-by: Neal Richardson <[email protected]>
Signed-off-by: Neal Richardson <[email protected]>
nealrichardson committed Sep 10, 2019
1 parent 0fbaff6 commit 9dec79b
Showing 156 changed files with 2,510 additions and 3,169 deletions.
1 change: 1 addition & 0 deletions dev/release/rat_exclude_files.txt
@@ -213,6 +213,7 @@ r/README.md
r/README.Rmd
r/man/*.Rd
r/cran-comments.md
r/vignettes/*.Rmd
.gitattributes
ruby/red-arrow/.yardopts
rust/arrow/test/data/*.csv
32 changes: 17 additions & 15 deletions r/DESCRIPTION
@@ -36,10 +36,12 @@ Imports:
utils
Roxygen: list(markdown = TRUE)
RoxygenNote: 6.1.1
VignetteBuilder: knitr
Suggests:
covr,
fs,
hms,
knitr,
lubridate,
rmarkdown,
testthat,
@@ -49,33 +51,33 @@ Collate:
'enums.R'
'arrow-package.R'
'type.R'
'ArrayData.R'
'ChunkedArray.R'
'Column.R'
'Field.R'
'List.R'
'RecordBatch.R'
'RecordBatchReader.R'
'RecordBatchWriter.R'
'Schema.R'
'Struct.R'
'Table.R'
'array-data.R'
'array.R'
'arrowExports.R'
'buffer.R'
'chunked-array.R'
'io.R'
'compression.R'
'compute.R'
'csv.R'
'dictionary.R'
'feather.R'
'field.R'
'install-arrow.R'
'json.R'
'memory_pool.R'
'list.R'
'memory-pool.R'
'message.R'
'parquet.R'
'read_record_batch.R'
'read_table.R'
'read-record-batch.R'
'read-table.R'
'record-batch-reader.R'
'record-batch-writer.R'
'record-batch.R'
'reexports-bit64.R'
'reexports-tidyselect.R'
'write_arrow.R'
'schema.R'
'struct.R'
'table.R'
'util.R'
'write-arrow.R'
120 changes: 38 additions & 82 deletions r/NAMESPACE
@@ -1,114 +1,74 @@
# Generated by roxygen2: do not edit by hand

S3method("!=","arrow::Object")
S3method("==","arrow::Array")
S3method("==","arrow::DataType")
S3method("==","arrow::Field")
S3method("==","arrow::RecordBatch")
S3method("==","arrow::Schema")
S3method("==","arrow::ipc::Message")
S3method(BufferReader,"arrow::Buffer")
S3method(BufferReader,default)
S3method(CompressedInputStream,"arrow::io::InputStream")
S3method(CompressedInputStream,character)
S3method(CompressedOutputStream,"arrow::io::OutputStream")
S3method(CompressedOutputStream,character)
S3method(FeatherTableReader,"arrow::io::RandomAccessFile")
S3method(FeatherTableReader,"arrow::ipc::feather::TableReader")
S3method(FeatherTableReader,character)
S3method(FeatherTableReader,raw)
S3method(FeatherTableWriter,"arrow::io::OutputStream")
S3method(FixedSizeBufferWriter,"arrow::Buffer")
S3method(FixedSizeBufferWriter,default)
S3method(MessageReader,"arrow::io::InputStream")
S3method(MessageReader,default)
S3method(RecordBatchFileReader,"arrow::Buffer")
S3method(RecordBatchFileReader,"arrow::io::RandomAccessFile")
S3method(RecordBatchFileReader,character)
S3method(RecordBatchFileReader,raw)
S3method(RecordBatchFileWriter,"arrow::io::OutputStream")
S3method(RecordBatchFileWriter,character)
S3method(RecordBatchStreamReader,"arrow::Buffer")
S3method(RecordBatchStreamReader,"arrow::io::InputStream")
S3method(RecordBatchStreamReader,raw)
S3method(RecordBatchStreamWriter,"arrow::io::OutputStream")
S3method(RecordBatchStreamWriter,character)
S3method(as.data.frame,"arrow::RecordBatch")
S3method(as.data.frame,"arrow::Table")
S3method(as.raw,"arrow::Buffer")
S3method(buffer,"arrow::Buffer")
S3method(buffer,complex)
S3method(buffer,default)
S3method(buffer,integer)
S3method(buffer,numeric)
S3method(buffer,raw)
S3method(csv_table_reader,"arrow::csv::TableReader")
S3method(csv_table_reader,"arrow::io::InputStream")
S3method(csv_table_reader,character)
S3method(csv_table_reader,default)
S3method(dim,"arrow::RecordBatch")
S3method(dim,"arrow::Table")
S3method(json_table_reader,"arrow::io::InputStream")
S3method(json_table_reader,"arrow::json::TableReader")
S3method(json_table_reader,character)
S3method(json_table_reader,default)
S3method(length,"arrow::Array")
S3method(names,"arrow::RecordBatch")
S3method(parquet_file_reader,"arrow::io::RandomAccessFile")
S3method(parquet_file_reader,character)
S3method(parquet_file_reader,raw)
S3method("!=",Object)
S3method("==",Array)
S3method("==",DataType)
S3method("==",Field)
S3method("==",Message)
S3method("==",RecordBatch)
S3method("==",Schema)
S3method(as.data.frame,RecordBatch)
S3method(as.data.frame,Table)
S3method(as.raw,Buffer)
S3method(dim,RecordBatch)
S3method(dim,Table)
S3method(length,Array)
S3method(names,RecordBatch)
S3method(print,"arrow-enum")
S3method(read_message,"arrow::io::InputStream")
S3method(read_message,"arrow::ipc::MessageReader")
S3method(read_message,InputStream)
S3method(read_message,MessageReader)
S3method(read_message,default)
S3method(read_record_batch,"arrow::Buffer")
S3method(read_record_batch,"arrow::io::InputStream")
S3method(read_record_batch,"arrow::ipc::Message")
S3method(read_record_batch,Buffer)
S3method(read_record_batch,InputStream)
S3method(read_record_batch,Message)
S3method(read_record_batch,raw)
S3method(read_schema,"arrow::Buffer")
S3method(read_schema,"arrow::io::InputStream")
S3method(read_schema,"arrow::ipc::Message")
S3method(read_schema,Buffer)
S3method(read_schema,InputStream)
S3method(read_schema,Message)
S3method(read_schema,raw)
S3method(read_table,"arrow::ipc::RecordBatchFileReader")
S3method(read_table,"arrow::ipc::RecordBatchStreamReader")
S3method(read_table,RecordBatchFileReader)
S3method(read_table,RecordBatchStreamReader)
S3method(read_table,character)
S3method(read_table,raw)
S3method(type,"arrow::Array")
S3method(type,"arrow::ChunkedArray")
S3method(type,"arrow::Column")
S3method(type,Array)
S3method(type,ChunkedArray)
S3method(type,Column)
S3method(type,default)
S3method(write_arrow,"arrow::ipc::RecordBatchWriter")
S3method(write_arrow,RecordBatchWriter)
S3method(write_arrow,character)
S3method(write_arrow,raw)
S3method(write_feather,"arrow::RecordBatch")
S3method(write_feather,data.frame)
S3method(write_feather,default)
S3method(write_feather_RecordBatch,"arrow::io::OutputStream")
S3method(write_feather_RecordBatch,character)
S3method(write_feather_RecordBatch,default)
export(Array)
export(Buffer)
export(BufferOutputStream)
export(BufferReader)
export(ChunkedArray)
export(CompressedInputStream)
export(CompressedOutputStream)
export(CompressionType)
export(DateUnit)
export(FeatherTableReader)
export(FeatherTableWriter)
export(Field)
export(FileMode)
export(FileOutputStream)
export(FixedSizeBufferWriter)
export(MemoryMappedFile)
export(MessageReader)
export(MessageType)
export(MockOutputStream)
export(ParquetFileReader)
export(ParquetReaderProperties)
export(RandomAccessFile)
export(ReadableFile)
export(RecordBatchFileReader)
export(RecordBatchFileWriter)
export(RecordBatchStreamReader)
export(RecordBatchStreamWriter)
export(Schema)
export(StatusCode)
export(Table)
export(TimeUnit)
export(Type)
export(array)
export(arrow_available)
export(bool)
export(boolean)
@@ -150,8 +110,6 @@ export(mmap_open)
export(null)
export(num_range)
export(one_of)
export(parquet_arrow_reader_properties)
export(parquet_file_reader)
export(read_arrow)
export(read_csv_arrow)
export(read_delim_arrow)
@@ -168,7 +126,6 @@ export(schema)
export(starts_with)
export(string)
export(struct)
export(table)
export(time32)
export(time64)
export(timestamp)
@@ -180,7 +137,6 @@ export(uint8)
export(utf8)
export(write_arrow)
export(write_feather)
export(write_feather_RecordBatch)
export(write_parquet)
importFrom(R6,R6Class)
importFrom(Rcpp,sourceCpp)
138 changes: 0 additions & 138 deletions r/R/RecordBatchReader.R

This file was deleted.
