-
Notifications
You must be signed in to change notification settings - Fork 7
/
README.Rmd
414 lines (318 loc) · 9.07 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
---
output: github_document
---
<!-- badges: start -->
[![Lifecycle: experimental](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://www.tidyverse.org/lifecycle/#experimental)
[![Travis build status](https://travis-ci.org/moodymudskipper/nakedpipe.svg?branch=master)](https://travis-ci.org/moodymudskipper/nakedpipe)
[![Codecov test coverage](https://codecov.io/gh/moodymudskipper/nakedpipe/branch/master/graph/badge.svg)](https://codecov.io/gh/moodymudskipper/nakedpipe?branch=master)
<!-- badges: end -->
# nakedpipe <img src='man/figures/logo.png' align="right" height="139" />
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```
Pipe into a sequence of calls without repeating the pipe symbol.
This is inspired by Stefan Bache and Hadley Wickham's *magrittr* pipe and behaves
mostly consistently.
*nakedpipe* calls are more compact, and are intended to be more readable,
though it's expected that they will look surprising to new users.
The syntax allowed the development of many additional features that cannot be
implemented as ergonomically with *magrittr*.
An instant translation addin between *magrittr* and *nakedpipe* is included.
It's not yet on *CRAN* so you should install with :
``` r
remotes::install_github("moodymudskipper/nakedpipe")
```
## General principles
A basic *{nakedpipe}* call looks a lot like a *{magrittr}* pipe chain, except that the
piping symbol is not repeated, and that we surround the calls with `{}`
*{magrittr}* syntax :
```{r}
library(magrittr)
cars %>%
subset(speed < 6) %>%
transform(time = dist/speed)
```
*{nakedpipe}* syntax :
```{r}
library(nakedpipe)
cars %.% {
subset(speed < 6)
transform(time = dist/speed)
}
```
The dot insertion rules are identical to the ones used by *{magrittr}*, and likewise
if we surround a step in `{}`, no dot will be inserted.
It plays well with left to right assignment:
```{r}
cars %.% {
subset(speed < 6)
transform(time = dist/speed)
} -> res
```
Additional features include :
* Side effects, using `~~`, similar to ```magrittr::`%T>%```
* Temporary assignments and assignments to the calling environment
* Shorthands for most common data manipulation operations, namely `subset()`,
`transform()` and grouped transformations.
* Conditional steps using `if`
* Possibility to use *{data.table}* syntax for one step
* Additional pipes to debug, assign in place, print or clock each step...
## Side effects and assignments
Use `~~` for side effects:
```{r}
cars %.% {
subset(speed < 6)
~~ message("nrow:", nrow(.))
transform(time = dist/speed)
}
```
This include assignments :
```{r}
cars %.% {
subset(speed < 6)
~~ cars_h <- . # or ~~ . -> cars_h
transform(time = dist/speed)
}
cars_h
```
To assign to a temp variable, use a dotted name:
```{r}
cars %.% {
~~ .n <- 6
subset(speed < .n)
transform(time = dist/speed)
}
exists(".n")
```
## Data manipulation shorthands
For the very common `subset()` and `transform()` operations, shorthands are
available, so that for our first example we could simply write:
```{r}
cars %.% {
speed < 6 # any call to < > <= >= == != %in% & | is interpreted as a subset call
time = dist/speed # any call to = is interpreted as a transform call
}
```
## Conditional steps
Use `if` for conditional step. if the condition is not TRUE and there is no
`else` clause the data is unchanged:
```{r}
cars %.% {
subset(speed < 6)
if(ncol(.) < 5) transform(time = dist/speed)
}
cars %.% {
subset(speed < 6)
if(ncol(.) > 5) transform(time = dist/speed)
}
```
## Use *data.table* syntax
We can use *data.table* syntax for one step by using `.dt[...]`, the output
will be of the same class of the input (the temporary conversion to *data.table*
is invisible):
```{r}
cars %.% {
speed < 8
time = dist/speed
.dt[, .(mmean_time = mean(time)), by = speed]
}
```
We can chain *data.table* brackets too:
```{r}
cars %.% {
.dt[speed < 8][, time := dist/speed][,.(mmean_time = mean(time)), by = speed]
}
```
## Additional pipes
Assign in place using `%<.%`
```{r}
cars_copy <- cars
cars_copy %<.% {
head(2)
~~ message("nrow:", nrow(.))
transform(time = dist/speed)
}
cars_copy
```
Clock each step using `%L.%`
```{r}
cars %L.% {
head(2)
~~ Sys.sleep(1)
transform(time = dist/speed)
}
```
`print()` the output of each step using `%P.%`
```{r}
cars %P.% {
head(2)
transform(time = dist/speed)
}
```
`View()` the output of each step using `%V.%`
```{r, eval = FALSE}
cars %V.% {
head(2)
transform(time = dist/speed)
}
```
`%..%` is faster at the cost of using explicit dots
```{r}
cars %..% {
head(.,2)
transform(.,time = dist/speed)
}
```
It is better suited for programming and doesn't support side effect notation
but you can do :
```{r}
cars %..% {
head(.,2)
{message("nrow:", nrow(.)); .}
transform(.,time = dist/speed)
}
```
Create a function using `%F.%` on `.`
```{r}
fun <- . %F.% {
head(.,2)
transform(.,time = dist/speed)
}
fun(cars)
```
Apply a sequence of calls on all elements using `%lapply.%`
```{r}
replicate(2, cars, simplify = FALSE) %lapply.% {
head(.,2)
transform(.,time = dist/speed)
}
```
See `?"%.%"` and `?"%lapply.%"` to see all available pipes (including variants of
the above).
## Debugging
The `%D.%` pipe allows you to step through the calls one by one.
```{r, eval = FALSE}
# Debug the pipe using `%D.%`
cars %D.% {
head(2)
transform(time = dist/speed)
}
```
You could also inster a browser() call as a side effect at a chosen step.
```{r, eval = FALSE}
# Debug the pipe using `%D.%`
cars %D.% {
head(2)
~~ browser()
transform(time = dist/speed)
}
```
## *ggplot2*
It's a little known trick that you can use *magrittr*'s pipe with *ggplot2* if you pipe
to the `+` symbol. It is convenient if you want to use the ggplot object
as the input of another function without intermediate variables of bracket
overload :
```{r, eval = FALSE}
library(ggplot2)
path <- tempfile()
cars %>%
head() %>%
ggplot(aes(speed, dist)) %>%
+ geom_point() %>%
+ ggtitle("head(cars)") %>%
saveRDS(path)
# rather than
plt <- cars %>%
head() %>%
ggplot(aes(speed, dist)) +
geom_point() +
ggtitle("head(cars)")
saveRDS(plt, path)
```
The former case above shows operators on both sides, which looks a bit complicated,
the latter requires a temporary variable and we must look at the end of the
previous line to know what kind of piping was done.
In both cases additionally if I chose to comment out the `ggtitle("head(cars)")` line, I should
also comment the last operator at the end of the previous line.
With *nakedpipe* we can write :
```{r, eval = FALSE}
cars %.% {
head()
ggplot(aes(speed, dist))
+ geom_point()
+ ggtitle("head(cars)")
saveRDS(path)
}
```
`+` signs are neatly alligned, it's obvious where the *ggplot* chain starts and
ends, and trivial to pipe it to another instruction or to comment a line.
## Conversion to magrittr syntax and back
We provide an addin to ease the conversion.
![Alt Text](https://user-images.githubusercontent.com/18351714/84393270-add85b80-abfb-11ea-8a7d-3ec4c8c59d55.gif)
## Running partial selection
It's easy with standard pipes to run only the first steps of a pipe chain, by
selecting them and running selected code. With *{nakedpipe}* the closing `}` is
missing if we do the same. The addin "nakedpipe run incomplete call" allows one
to run the selection after adding the closing `}`.
## Benchmark
We're a bit faster than *{magrittr} 1.5*, if you want to be even faster use `%..%` with
explicit dots. Note that *magrittr*'s upcoming version is much faster than both,
though keep in mind these are micro seconds and that the fastest
solution is always not to use pipes at all. See below the benchmark using
`{magrittr} 2.0`.
```{r}
library(magrittr)
bench::mark(iterations = 10000,
`%>%` = cars %>%
identity %>%
identity() %>%
identity(.) %>%
{identity(.)},
`%.%` = cars %.% {
identity
identity()
identity(.)
{identity(.)}
},
`%..%` = cars %..% {
identity(.)
identity(.)
identity(.)
{identity(.)}
},
`base` = {
. <- cars
. <- identity(.)
. <- identity(.)
. <- identity(.)
. <- identity(.)
}
)
```
## Snippets
Runing `setup_nakedpipe_snippets()` will open RStudio's snippet file so you
can add our suggested snippets there. Follow the instructions and you'll be able to type :
```{r, eval = FALSE}
cars . # + 2 time the <tab> key
```
and display :
```{r, eval = FALSE}
cars %.% {
# with the cursor conveniently placed here
}
```
(or type `..` to get the `%..%` equivalent)
## Aknowledgements and similar efforts
*{nakedpipe}* is heavily inspired by *{magrittr}* and follows the same dot insertion
rules.
The functions from `*{dplyr}*` and the *tidyverse* in general had a big
influence on `*{nakedpipe}*`.
*{data.table}* is the package behind the `.dt[...]` syntax described above.
Alternative pipes are available on *CRAN*, at the time of writing and to my knowledge,
in packages *wrapr* and *pipeR*. The latter includes a function `pipeline()` that
allows piping a sequence of calls in a similar fashion as *nakedpipe*.