major operator change (see issue #12, #13)
renkun-ken committed Jun 3, 2014
1 parent e6a0821 commit ec3193c
Showing 7 changed files with 74 additions and 69 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -9,3 +9,4 @@

# R project files
.Rproj.user

6 changes: 3 additions & 3 deletions DESCRIPTION
@@ -1,7 +1,7 @@
Package: pipeR
Type: Package
Title: Pipeline operators for R
Version: 0.2-4
Version: 0.3
Author: Kun Ren <renkun@outlook.com>
Maintainer: Kun Ren <renkun@outlook.com>
Description: Provides operators for chaining
@@ -10,9 +10,9 @@ Description: Provides operators for chaining
and lambda piping.
Depends:
R (>= 2.14)
Date: 2014-05-31
Date: 2014-06-03
Suggests:
dplyr
plyr,dplyr
Enhances: magrittr
License: MIT + file LICENSE
URL: http://renkun.me/pipeR,
2 changes: 1 addition & 1 deletion NAMESPACE
@@ -1,5 +1,5 @@
# Generated by roxygen2 (4.0.1): do not edit by hand

export("%>%")
export("%:>%")
export("%>>%")
export("%|>%")
22 changes: 11 additions & 11 deletions R/pipeR.R
@@ -1,6 +1,6 @@
#' Pipe an object forward as the first argument to a function
#'
#' The \code{\%>\%} operator evaluates the function call on the right-hand side
#' The \code{\%>>\%} operator evaluates the function call on the right-hand side
#' with the left-hand side object being the first argument.
#'
#' @param . The object to be piped as the first argument
@@ -9,17 +9,17 @@
#' @export
#' @examples
#' \dontrun{
#' rnorm(100) %>% plot
#' rnorm(100) %>>% plot
#'
#' rnorm(100) %>% plot(col="red")
#' rnorm(100) %>>% plot(col="red")
#'
#' rnorm(1000) %>% sample(size=100,replace=F) %>% hist
#' rnorm(1000) %>>% sample(size=100,replace=F) %>>% hist
#' }
`%>%` <- .pipe
`%>>%` <- .pipe

#' Pipe an object forward as `.` to an expression
#'
#' The operator \code{\%>>\%} evaluates the expression on the right-hand side
#' The operator \code{\%:>\%} evaluates the expression on the right-hand side
#' with the left-hand side object referred to as \code{.}.
#'
#' @param . The object to be piped as represented by \code{.}
@@ -28,17 +28,17 @@
#' @export
#' @examples
#' \dontrun{
#' rnorm(100) %>>% plot(.)
#' rnorm(100) %:>% plot(.)
#'
#' rnorm(100) %>>% plot(.,col="red")
#' rnorm(100) %:>% plot(.,col="red")
#'
#' rnorm(1000) %>>% sample(.,size=length(.)*0.1,replace=FALSE)
#' rnorm(1000) %:>% sample(.,size=length(.)*0.1,replace=FALSE)
#'
#' rnorm(1000) %>>%
#' rnorm(1000) %:>%
#' sample(.,length(.)*0.1,FALSE) %>>%
#' plot(.,main=sprintf("length: %d",length(.)))
#' }
`%>>%` <- .fpipe
`%:>%` <- .fpipe

#' Pipe an object by lambda expression
#'
86 changes: 45 additions & 41 deletions README.md
@@ -36,90 +36,90 @@ plot(diff(log(sample(rnorm(10000,mean=10,sd=1),size=100,replace=FALSE))),col="re

The code is neither straightforward for reading nor flexible for modification. It is because the functions in the first few steps are hiding in the nested brackets, and the written order of the functions goes against the order of logic.

pipeR borrows the idea of F# pipeline operator which allows you to write the *object* first and *pipe* it to a following *function*. This package defines three binary pipe operators that provide different types of forward-piping mechanisms: first-argument piping (`%>%`), free piping (`%>>%`), and lambda piping (`%|>%`). And the real magic of this kind of operators is chaining commands by the right order.
pipeR borrows the idea of F# pipeline operator which allows you to write the *object* first and *pipe* it to a following *function*. This package defines three binary pipe operators that provide different types of forward-piping mechanisms: first-argument piping (`%>>%`), free piping (`%:>%`), and lambda piping (`%|>%`). And the real magic of this kind of operators is chaining commands by the right order.

### First-argument piping: `%>%`
### First-argument piping: `%>>%`

The first-argument pipe operator `%>%` inserts the expression on the left-hand side to the first argument of the **function** on the right-hand side. In other words, `x %>% f(a=1)` will be transformed to and be evaluated as `f(x,a=1)`. This operator accepts both function call, e.g. `plot()` or `plot(col="red")`, and function name, e.g. `log` or `plot`.
The first-argument pipe operator `%>>%` inserts the expression on the left-hand side to the first argument of the **function** on the right-hand side. In other words, `x %>>% f(a=1)` will be transformed to and be evaluated as `f(x,a=1)`. This operator accepts both function call, e.g. `plot()` or `plot(col="red")`, and function name, e.g. `log` or `plot`.

```
rnorm(100) %>% plot
rnorm(100) %>>% plot
# plot(rnorm(100))
rnorm(100) %>% plot()
rnorm(100) %>>% plot()
# plot(rnorm(100))
rnorm(100) %>% plot(col="red")
rnorm(100) %>>% plot(col="red")
# plot(rnorm(100),col="red")
rnorm(100) %>% sample(size=100,replace=FALSE) %>% hist
rnorm(100) %>>% sample(size=100,replace=FALSE) %>>% hist
# hist(sample(rnorm(100),size=100,replace=FALSE))
```
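
The rewriting rule above can be sketched in base R. This is only an illustration under stated assumptions, not pipeR's implementation (the body of `.pipe` is not shown in this commit): the operator is redefined locally under the same name, and the right-hand side is assumed to be either a bare function name or a plain function call.

```r
# Illustrative first-argument pipe; a sketch, not pipeR's actual code.
`%>>%` <- function(., fun) {
  # Capture the right-hand side without evaluating it
  fun <- substitute(fun)
  # A bare function name such as `log` is turned into the call `log()`
  if (is.name(fun)) fun <- as.call(list(fun))
  # Splice the placeholder symbol in as the first argument: f(a=1) -> f(., a=1)
  call <- as.call(append(as.list(fun), list(quote(.)), after = 1))
  # Evaluate with `.` bound to the piped value; other names resolve in the caller
  eval(call, list(. = .), parent.frame())
}

c(1, 4, 9) %>>% sqrt %>>% sum
# evaluated as sum(sqrt(c(1, 4, 9)))
c(3, 1, 2) %>>% sort(decreasing = TRUE)
# evaluated as sort(c(3, 1, 2), decreasing = TRUE)
```

Because all `%op%` operators in R share one precedence level and associate to the left, a chain such as `a %>>% f %>>% g` is evaluated step by step from left to right.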

With the first-argument pipe operator `%>%`, you may rewrite the first example as
With the first-argument pipe operator `%>>%`, you may rewrite the first example as

```
rnorm(10000,mean=10,sd=1) %>%
sample(size=100,replace=FALSE) %>%
log %>%
diff %>%
rnorm(10000,mean=10,sd=1) %>>%
sample(size=100,replace=FALSE) %>>%
log %>>%
diff %>>%
plot(col="red",type="l")
```

### Free piping: `%>>%`
### Free piping: `%:>%`

You may not always want to pipe the object to the first argument of the next function. Then you can use free pipe operator `%>>%`, which takes `.` to represent the piped object on the left-hand side and evaluate the *expression* on the right-hand side with `.` as the piped object. In other words, you have the right to decide where the object should be piped to.
You may not always want to pipe the object to the first argument of the next function. Then you can use free pipe operator `%:>%`, which takes `.` to represent the piped object on the left-hand side and evaluate the *expression* on the right-hand side with `.` as the piped object. In other words, you have the right to decide where the object should be piped to.

```
rnorm(100) %>>% plot(.)
rnorm(100) %:>% plot(.)
# plot(rnorm(100))
rnorm(100) %>>% plot(., col="red")
rnorm(100) %:>% plot(., col="red")
# plot(rnorm(100),col="red")
rnorm(100) %>>% sample(., size=length(.)*0.5)
rnorm(100) %:>% sample(., size=length(.)*0.5)
# (`.` is piped to multiple places)
mtcars %>>% lm(mpg ~ cyl + disp, data=.) %>% summary
mtcars %:>% lm(mpg ~ cyl + disp, data=.) %>>% summary
# summary(lm(mpg ~ cyl + disp, data=mtcars))
rnorm(100) %>>%
sample(.,length(.)*0.2,FALSE) %>>%
rnorm(100) %:>%
sample(.,length(.)*0.2,FALSE) %:>%
plot(.,main=sprintf("length: %d",length(.)))
# (`.` is piped to multiple places and multiple levels)
rnorm(100) %>>% {
rnorm(100) %:>% {
par(mfrow=c(1,2))
hist(.,main="hist")
plot(.,col="red",main=sprintf("%d",length(.)))
}
# (`.` is piped to an enclosed expression)
rnorm(10000,mean=10,sd=1) %>>%
sample(.,size=length(.)/500,replace=FALSE) %>%
log %>%
diff %>>%
rnorm(10000,mean=10,sd=1) %:>%
sample(.,size=length(.)/500,replace=FALSE) %>>%
log %>>%
diff %:>%
plot(.,col="red",type="l",main=sprintf("length: %d",length(.)))
# (`%>%` and `%>>%` are used together. Be clear what they mean)
# (`%>>%` and `%:>%` are used together. Be clear what they mean)
```
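
The idea of evaluating an expression with `.` bound to the piped object can be sketched in a few lines of base R. This is a hypothetical illustration, not pipeR's implementation (the body of `.fpipe` is not shown in this commit); the operator is redefined locally under the same name.

```r
# Illustrative free pipe; a sketch, not pipeR's actual code.
`%:>%` <- function(., expr) {
  # Evaluate the unevaluated right-hand side with `.` bound to the piped value;
  # every other symbol is looked up in the caller's environment.
  eval(substitute(expr), list(. = .), parent.frame())
}

seq_len(10) %:>% sum(.[. > 5])
# `.` appears twice and both refer to seq_len(10)
c(2, 4) %:>% { s <- sum(.); . / s }
# a braced block is evaluated as a single expression
```

Since the right-hand side is an arbitrary expression, the piped object can appear anywhere in it, any number of times.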

### Lambda piping: `%|>%`

It can be confusing to see multiple `.` symbols in the same context. In some cases, they may represent different things in the same expression. Even though the expression mostly still works, it may not be a good idea to keep it in that way. Here is an example:

```
mtcars %>>%
lm(mpg ~ ., data=.) %>%
mtcars %:>%
lm(mpg ~ ., data=.) %>>%
summary
```

The code above works correctly with `%>>%` and `%>%`, even though the two dots in the second line have different meanings. `.` in formula `mpg ~ .` represents all variables other than `mpg` in data frame `mtcars`; `.` in `data=.` represents `mtcars` itself. One way to reduce ambiguity is to use *lambda expression* that names the piped object on the left of `~` and specifies the expression to evaluate on the right.
The code above works correctly with `%:>%` and `%>>%`, even though the two dots in the second line have different meanings. `.` in formula `mpg ~ .` represents all variables other than `mpg` in data frame `mtcars`; `.` in `data=.` represents `mtcars` itself. One way to reduce ambiguity is to use *lambda expression* that names the piped object on the left of `~` and specifies the expression to evaluate on the right.

A new pipe operator `%|>%` is defined, which works with lambda expression in the formula form `x ~ f(x)`. More specifically, the expression will be interpreted as *`f(x)` is evaluated with `x` being the piped object*. Therefore, the previous example can be rewritten with `%|>%` like this:

```
mtcars %|>%
(df ~ lm(mpg ~ ., data=df)) %>%
(df ~ lm(mpg ~ ., data=df)) %>>%
summary
```
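
A lambda pipe of this kind can be sketched in base R by taking the formula apart. The code below is a hypothetical illustration, not pipeR's implementation: the operator is redefined locally under the same name, and the right-hand side is assumed to be a two-sided formula whose left side is a single symbol.

```r
# Illustrative lambda pipe; a sketch, not pipeR's actual code.
`%|>%` <- function(., lambda) {
  lambda <- substitute(lambda)
  # Drop the parentheses that wrap the formula, e.g. (x ~ f(x))
  while (is.call(lambda) && identical(lambda[[1]], as.name("("))) {
    lambda <- lambda[[2]]
  }
  # Require the form `symbol ~ expression`
  stopifnot(is.call(lambda),
            identical(lambda[[1]], as.name("~")),
            length(lambda) == 3,
            is.name(lambda[[2]]))
  # Bind the piped value to the symbol on the left of `~`
  bindings <- list(.)
  names(bindings) <- as.character(lambda[[2]])
  # Evaluate the expression on the right of `~` with that binding
  eval(lambda[[3]], bindings, parent.frame())
}

seq_len(10) %|>% (x ~ sum(x[x > 5]))
# evaluated as sum(x[x > 5]) with x = seq_len(10)
mtcars %|>% (df ~ nrow(df))
# evaluated as nrow(mtcars)
```

The parentheses around the lambda expression are needed because `~` has lower precedence than `%|>%`; without them the formula would be split at the operator.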

@@ -143,14 +143,14 @@ All the pipe operators can be used together and each of them only works in their

```
mtcars %|>%
(df ~ lm(mpg ~ ., data=df)) %>%
summary %>>%
(df ~ lm(mpg ~ ., data=df)) %>>%
summary %:>%
.$fstatistic
```

### Piping with `dplyr` package

`dplyr` package provides a group of functions that make data transformation much easier. `%.%` is a built-in chain operator that pipes the previous result to the first-argument in the next function call. `%>%` is fully compatible with `dplyr` and can replace `%.%` with more consistency.
`dplyr` package provides a group of functions that make data transformation much easier. `%.%` is a built-in chain operator that pipes the previous result to the first-argument in the next function call. `%>>%` is fully compatible with `dplyr` and can replace `%.%` with more consistency.

The following code demonstrates mixed piping with `dplyr` functions.

@@ -160,14 +160,14 @@ library(hflights)
library(pipeR)
data(hflights)
hflights %>%
mutate(Speed=Distance/ActualElapsedTime) %>%
group_by(UniqueCarrier) %>%
hflights %>>%
mutate(Speed=Distance/ActualElapsedTime) %>>%
group_by(UniqueCarrier) %>>%
summarize(n=length(Speed),speed.mean=mean(Speed,na.rm = T),
speed.median=median(Speed,na.rm=T),
speed.sd=sd(Speed,na.rm=T)) %>%
mutate(speed.ssd=speed.mean/speed.sd) %>%
arrange(desc(speed.ssd)) %>>%
speed.sd=sd(Speed,na.rm=T)) %>>%
mutate(speed.ssd=speed.mean/speed.sd) %>>%
arrange(desc(speed.ssd)) %:>%
barplot(.$speed.ssd, names.arg = .$UniqueCarrier,
main=sprintf("Standardized mean of %d carriers", nrow(.)))
```
@@ -176,10 +176,14 @@ hflights %>%

The reason why the three operators are not "integrated" into one is that I want to make the functionality of each operator as clear and independent as possible, so that guessing and ambiguity could be sharply reduced. When you decide to use pipe operators to build a chain of expressions, you need to know clearly how you want to pipe your results to the next level. The following bullets are a brief summary:

1. `%>%` only pipes an object to the first-argument of the next *function*, that is, `x %>% f(...)` runs as `f(x,...)`.
2. `%>>%` only evaluates the next *expression* with `.` representing the object being piped, that is, `x %>>% f(a,.,g(.))` runs as `f(a,x,g(x))`.
1. `%>>%` only pipes an object to the first-argument of the next *function*, that is, `x %>>% f(...)` runs as `f(x,...)`.
2. `%:>%` only evaluates the next *expression* with `.` representing the object being piped, that is, `x %:>% f(a,.,g(.))` runs as `f(a,x,g(x))`.
3. `%|>%` only evaluates the *expression* on the right-hand side of `~` in the lambda expression formula with symbol on the left representing the object being piped, that is, `x %|>% (a ~ f(a,g(a)))` runs as `f(x,g(x))`.

## Performance

Since each pipe operator defined in this package specializes in its own task and is kept as simple as possible, the overhead is significantly lower than that of the peer implementation in the `magrittr` package. In general, `pipeR` is more than 3 times faster than `magrittr` and can be more than 30 times faster when the pipeline gets longer or when the data gets bigger. The detailed performance tests can be found in the issues.

## Help overview

```
12 changes: 6 additions & 6 deletions man/first-argument-piping.Rd
@@ -1,27 +1,27 @@
% Generated by roxygen2 (4.0.1): do not edit by hand
\name{first-argument piping}
\alias{\%>\%}
\alias{\%>>\%}
\alias{first-argument piping}
\title{Pipe an object forward as the first argument to a function}
\usage{
. \%>\% fun
. \%>>\% fun
}
\arguments{
\item{.}{The object to be piped as the first argument}

\item{fun}{The function call to evaluate with the piped object as the first argument.}
}
\description{
The \code{\%>\%} operator evaluates the function call on the right-hand side
The \code{\%>>\%} operator evaluates the function call on the right-hand side
with the left-hand side object being the first argument.
}
\examples{
\dontrun{
rnorm(100) \%>\% plot
rnorm(100) \%>>\% plot

rnorm(100) \%>\% plot(col="red")
rnorm(100) \%>>\% plot(col="red")

rnorm(1000) \%>\% sample(size=100,replace=F) \%>\% hist
rnorm(1000) \%>>\% sample(size=100,replace=F) \%>>\% hist
}
}

14 changes: 7 additions & 7 deletions man/free-piping.Rd
@@ -1,29 +1,29 @@
% Generated by roxygen2 (4.0.1): do not edit by hand
\name{free-piping}
\alias{\%>>\%}
\alias{\%:>\%}
\alias{free-piping}
\title{Pipe an object forward as `.` to an expression}
\usage{
. \%>>\% expr
. \%:>\% expr
}
\arguments{
\item{.}{The object to be piped as represented by \code{.}}

\item{expr}{The expression to evaluate with the piped object referred to as \code{.}}
}
\description{
The operator \code{\%>>\%} evaluates the expression on the right-hand side
The operator \code{\%:>\%} evaluates the expression on the right-hand side
with the left-hand side object referred to as \code{.}.
}
\examples{
\dontrun{
rnorm(100) \%>>\% plot(.)
rnorm(100) \%:>\% plot(.)

rnorm(100) \%>>\% plot(.,col="red")
rnorm(100) \%:>\% plot(.,col="red")

rnorm(1000) \%>>\% sample(.,size=length(.)*0.1,replace=FALSE)
rnorm(1000) \%:>\% sample(.,size=length(.)*0.1,replace=FALSE)

rnorm(1000) \%>>\%
rnorm(1000) \%:>\%
sample(.,length(.)*0.1,FALSE) \%>>\%
plot(.,main=sprintf("length: \%d",length(.)))
}