% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/json.R
\name{read_json_arrow}
\alias{read_json_arrow}
\title{Read a JSON file}
\usage{
read_json_arrow(
file,
col_select = NULL,
as_data_frame = TRUE,
schema = NULL,
...
)
}
\arguments{
\item{file}{A character file name or URI, literal data (either a single string or a \link{raw} vector),
an Arrow input stream, or a \code{FileSystem} with path (\code{SubTreeFileSystem}).
If a file name, a memory-mapped Arrow \link{InputStream} will be opened and
closed when finished; compression will be detected from the file extension
and handled automatically. If an input stream is provided, it will be left
open.
To be recognised as literal data, the input must be wrapped with \code{I()}.}
\item{col_select}{A character vector of column names to keep, as in the
"select" argument to \code{data.table::fread()}, or a
\link[tidyselect:eval_select]{tidy selection specification}
of columns, as used in \code{dplyr::select()}.}
\item{as_data_frame}{Should the function return a \code{data.frame} (default) or
an Arrow \link{Table}?}
\item{schema}{\link{Schema} that describes the table.}
\item{...}{Additional options passed to \code{JsonTableReader$create()}.}
}
\value{
A \code{data.frame}, or a Table if \code{as_data_frame = FALSE}.
}
\description{
Wrapper around \link{JsonTableReader} to read a newline-delimited JSON (ndjson) file into a
data frame or Arrow Table.
}
\details{
If \code{file} is a path, compression is detected from the file extension
(e.g. \code{.json.gz}) and handled automatically.
If \code{schema} is not provided, Arrow data types are inferred from the data:
\itemize{
\item JSON null values convert to the \code{\link[=null]{null()}} type, but can fall back to any other type.
\item JSON booleans convert to \code{\link[=boolean]{boolean()}}.
\item JSON numbers convert to \code{\link[=int64]{int64()}}, falling back to \code{\link[=float64]{float64()}} if a non-integer is encountered.
\item JSON strings of the kind "YYYY-MM-DD" and "YYYY-MM-DD hh:mm:ss" convert to \code{\link[=timestamp]{timestamp(unit = "s")}},
falling back to \code{\link[=utf8]{utf8()}} if a conversion error occurs.
\item JSON arrays convert to a \code{\link[=list_of]{list_of()}} type, and inference proceeds recursively on the JSON arrays' values.
\item Nested JSON objects convert to a \code{\link[=struct]{struct()}} type, and inference proceeds recursively on the JSON objects' values.
}
When \code{as_data_frame = TRUE}, Arrow types are further converted to R types.
}
\examples{
\dontshow{if (arrow_with_json()) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf}
tf <- tempfile()
on.exit(unlink(tf))
writeLines('
{ "hello": 3.5, "world": false, "yo": "thing" }
{ "hello": 3.25, "world": null }
{ "hello": 0.0, "world": true, "yo": null }
', tf, useBytes = TRUE)
read_json_arrow(tf)
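# A small sketch of `col_select`: keep only some columns, given here as a
# character vector of names (a tidy selection should work as well)
read_json_arrow(tf, col_select = c("hello", "world"))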
# Read directly from strings with `I()`
read_json_arrow(I(c('{"x": 1, "y": 2}', '{"x": 3, "y": 4}')))
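# A sketch of the inference rules described in Details: keep the result as an
# Arrow Table to inspect the inferred types. `tab` and `tab_dbl` are
# illustrative names only.
tab <- read_json_arrow(
  I(c('{"x": 1, "y": 2.5}', '{"x": 3, "y": 4}')),
  as_data_frame = FALSE
)
# `x` holds only integers; `y` contains a non-integer value
tab$schema
# Supplying `schema` is assumed here to take precedence over inference for
# the fields it names, so both columns are requested as float64()
tab_dbl <- read_json_arrow(
  I(c('{"x": 1, "y": 2}', '{"x": 3, "y": 4}')),
  schema = schema(x = float64(), y = float64()),
  as_data_frame = FALSE
)
tab_dbl$schema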
\dontshow{\}) # examplesIf}
}