value vs magrittr? #13
Comments
See issue #12.
|
Thanks for the reminder! I'm an F# developer and was a magrittr user. I'm happy to see that the developers of magrittr brought F#'s pipeline idea to R. In the end, though, I decided to create a lighter version that does not try to combine multiple pipe mechanisms in one operator. I find that combining them adds cognitive burden and significantly lowers performance, because the operator must work out which mechanism applies every time it is used. Code in which a single operator is responsible for several pipe mechanisms can also be confusing to read. That's why I created this package, which follows different principles:
I believe that each piping operator should be as simple as possible. If you review magrittr's code, you will see that its performance cannot be very high. Although it is a bit unfair to demand performance from pipe operators, it still matters to those who use them heavily. Here is a set of benchmark tests for the two packages at their latest dev versions:
First-argument piping:
and the result is
Piping with
and the result is
Nested piping:
the result is
Lambda piping:
the result is
As you can see, pipeR is roughly 3-4 times as fast as magrittr. I know that piping is designed for writing fluent code easily, not for performance, and it is unfair to compare magrittr's toolbox-like operator with each specialized operator in pipeR. But since I use piping heavily, I want several operators that each do one simple thing I know exactly, with less overhead. That's why I created this package: to write fluent code that is readable and robust, without much overhead or ambiguity. |
In addition,
because
which is roughly equivalent, but pipeR is a bit simpler. |
Thanks for the thorough response! This is actually quite interesting/useful. |
I have run a group of performance tests on different aspects of these two packages and found that pipeR is generally 3-7 times faster than magrittr, and can be much faster still when the chain is long. I will file an issue at dplyr about this, saying that the implementation of magrittr potentially reduces overall performance when the chain is long and the data is big. More specifically, the computing time of magrittr is polynomial in chain length, while pipeR's grows slower than linearly. Here's a benchmark result from first-argument piping for 50 steps:
Here's another from dot piping for 50 steps:
Although a chaining operation of 50 steps looks ridiculous, the result really means something. |
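For anyone who wants to check how piping time grows with chain length, here is a minimal sketch of one way to measure it. It is not the benchmark code used above. The operator `%p%` and the helper `time_chain` are hypothetical names invented for this illustration, and `%p%` is a toy first-argument pipe written in base R, not the implementation from pipeR or magrittr. To test a real package, load it and replace `%p%` in the generated code with that package's operator.

```r
# Toy first-argument pipe. A hypothetical stand-in for illustration only;
# it is NOT the implementation used by pipeR or magrittr.
`%p%` <- function(lhs, rhs) {
  rhs_call <- as.list(substitute(rhs))
  # Rebuild the right-hand call with the left-hand value as its first argument.
  new_call <- as.call(c(rhs_call[1], list(quote(.lhs)), rhs_call[-1]))
  eval(new_call, list(.lhs = lhs), parent.frame())
}

# Build a chain such as `1 %p% identity() %p% identity()` with n_steps steps
# and return the average time in seconds for one evaluation of the chain.
time_chain <- function(n_steps, reps = 50) {
  steps <- rep("identity()", n_steps)
  code <- paste(c("1", steps), collapse = " %p% ")
  expr <- parse(text = code)[[1]]
  timing <- system.time(for (i in seq_len(reps)) eval(expr))
  timing[["elapsed"]] / reps
}

# Compare how the average time changes as the chain gets longer.
chain_lengths <- c(5, 10, 25, 50)
print(data.frame(steps = chain_lengths,
                 seconds = sapply(chain_lengths, time_chain)))
```

If the time per chain roughly doubles when the number of steps doubles, the operator scales about linearly. If it grows much faster than that, the overhead depends on chain length in the way described above. Timings from `system.time` are coarse, so a larger `reps` gives steadier numbers.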
Performance can be essential, and one can definitely argue that there is room for
However, the latter is rarely seen in practice. But one might do it in situations where speed matters, e.g. simulations. (again, my main concern is that |
As a potential user of pipeR, I agree that it would be better if pipeR used a different operator in place of %>%. I for one would be more likely to use it if it did not conflict with magrittr, which I use extensively. I would like to be able to use both, rather than being forced to choose (or being very careful with the way that the two packages are loaded, which is an unnecessary hassle). I like certain aspects of pipeR, but using %>% seems to create unnecessary competition, where I would prefer coexistence (as an ecologist, I know that coexistence is promoted by reducing overlap in resources). Just my two cents. |
My exact point in #12 |
Thanks for your kind replies! This issue should be only about the value of this package. Let's talk about naming operators in issue #12. |
Out of curiosity, are you familiar with the magrittr package? Considering that it already has a robust implementation of piping and is incorporated into Hadley's dplyr and ggvis packages, it seems your dev effort might be better spent rolling your additional ideas into that package. See here for a link to the package.
You could also take a look at how he handled lambdas and aliases.