diff --git a/README.Rmd b/README.Rmd
index bdbc2650..ef624402 100644
--- a/README.Rmd
+++ b/README.Rmd
@@ -23,7 +23,7 @@ knitr::opts_chunk$set(
## Overview
-dtplyr provides a [data.table](http://r-datatable.com/) backend for dplyr. The goal of dtplyr is to allow you to write dplyr code that is automatically translated to the equivalent, but usually much faster, data.table code.
+dtplyr provides a [data.table](http://r-datatable.com/) backend for dplyr. The goal of dtplyr is to allow you to write dplyr code that is automatically translated to the equivalent, but usually much faster, data.table code.
See `vignette("translation")` for details of the current translations, and [table.express](https://github.com/asardaes/table.express) and [rqdatatable](https://github.com/WinVector/rqdatatable/) for related work.
@@ -52,7 +52,7 @@ library(dtplyr)
library(dplyr, warn.conflicts = FALSE)
```
-Then use `lazy_dt()` to create a "lazy" data table that tracks the operations performed on it.
+Then use `lazy_dt()` to create a "lazy" data table that tracks the operations performed on it.
```{r}
mtcars2 <- lazy_dt(mtcars)
@@ -61,21 +61,21 @@ mtcars2 <- lazy_dt(mtcars)
You can preview the transformation (including the generated data.table code) by printing the result:
```{r}
-mtcars2 %>%
- filter(wt < 5) %>%
+mtcars2 %>%
+ filter(wt < 5) %>%
mutate(l100k = 235.21 / mpg) %>% # liters / 100 km
- group_by(cyl) %>%
+ group_by(cyl) %>%
summarise(l100k = mean(l100k))
```
But generally you should reserve this only for debugging, and use `as.data.table()`, `as.data.frame()`, or `as_tibble()` to indicate that you're done with the transformation and want to access the results:
```{r}
-mtcars2 %>%
- filter(wt < 5) %>%
+mtcars2 %>%
+ filter(wt < 5) %>%
mutate(l100k = 235.21 / mpg) %>% # liters / 100 km
- group_by(cyl) %>%
- summarise(l100k = mean(l100k)) %>%
+ group_by(cyl) %>%
+ summarise(l100k = mean(l100k)) %>%
as_tibble()
```
@@ -83,13 +83,13 @@ mtcars2 %>%
There are two primary reasons that dtplyr will always be somewhat slower than data.table:
-* Each dplyr verb must do some work to convert dplyr syntax to data.table
- syntax. This takes time proportional to the complexity of the input code,
+* Each dplyr verb must do some work to convert dplyr syntax to data.table
+ syntax. This takes time proportional to the complexity of the input code,
not the input _data_, so should be a negligible overhead for large datasets.
- [Initial benchmarks][benchmark] suggest that the overhead should be under
+ [Initial benchmarks][benchmark] suggest that the overhead should be under
1ms per dplyr call.
-* To match dplyr semantics, `mutate()` does not modify in place by default.
+* To match dplyr semantics, `mutate()` does not modify in place by default.
This means that most expressions involving `mutate()` must make a copy
that would not be necessary if you were using data.table directly.
(You can opt out of this behaviour in `lazy_dt()` with `immutable = FALSE`).
diff --git a/README.md b/README.md
index 7c7fb568..bbaa64fd 100644
--- a/README.md
+++ b/README.md
@@ -14,10 +14,10 @@ coverage](https://codecov.io/gh/tidyverse/dtplyr/branch/main/graph/badge.svg)](h
## Overview
-dtplyr provides a [data.table](http://r-datatable.com/) backend for
-dplyr. The goal of dtplyr is to allow you to write dplyr code that is
-automatically translated to the equivalent, but usually much faster,
-data.table code.
+dtplyr
+provides a [data.table](http://r-datatable.com/) backend for dplyr. The
+goal of dtplyr is to allow you to write dplyr code that is automatically
+translated to the equivalent, but usually much faster, data.table code.
See `vignette("translation")` for details of the current translations,
and [table.express](https://github.com/asardaes/table.express) and
@@ -62,10 +62,10 @@ You can preview the transformation (including the generated data.table
code) by printing the result:
``` r
-mtcars2 %>%
- filter(wt < 5) %>%
+mtcars2 %>%
+ filter(wt < 5) %>%
mutate(l100k = 235.21 / mpg) %>% # liters / 100 km
- group_by(cyl) %>%
+ group_by(cyl) %>%
summarise(l100k = mean(l100k))
#> Source: local data table [3 x 2]
#> Call: `_DT1`[wt < 5][, `:=`(l100k = 235.21/mpg)][, .(l100k = mean(l100k)),
@@ -85,11 +85,11 @@ But generally you should reserve this only for debugging, and use
you’re done with the transformation and want to access the results:
``` r
-mtcars2 %>%
- filter(wt < 5) %>%
+mtcars2 %>%
+ filter(wt < 5) %>%
mutate(l100k = 235.21 / mpg) %>% # liters / 100 km
- group_by(cyl) %>%
- summarise(l100k = mean(l100k)) %>%
+ group_by(cyl) %>%
+ summarise(l100k = mean(l100k)) %>%
as_tibble()
#> # A tibble: 3 × 2
#> cyl l100k
diff --git a/man/figures/dt-seal.png b/man/figures/dt-seal.png
new file mode 100644
index 00000000..1a1aacdb
Binary files /dev/null and b/man/figures/dt-seal.png differ