Add collector-level na support #541

Open
wants to merge 4 commits into
base: main

khusmann

This PR adds support for collector-level na args (#532). This way, different lists of missing values can be specified for each column, overriding the global na arg in the call to vroom().

Example:

vroom(
  I("a,b,c\na,foo,REFUSED\nb,REFUSED,MISSING\nOMITTED,bar,OMITTED\n"),
  col_types = cols(
    a = col_character(na = "OMITTED"),
    b = col_character(na = "REFUSED"),
    c = col_character()
  ),
  na = "MISSING"
)
#> # A tibble: 3 × 3
#>   a     b     c      
#>   <chr> <chr> <chr>  
#> 1 a     foo   REFUSED
#> 2 b     NA    NA     
#> 3 NA    bar   OMITTED
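
For comparison, here is a rough sketch of how the same values might be obtained without collector-level na, assuming readr is installed alongside vroom: read every column as character with only the global na, then re-parse the columns that need their own missing-value lists. The object name `raw` is illustrative only.

```r
# Pass 1: read everything as character; only the global na ("MISSING") applies.
raw <- vroom::vroom(
  I("a,b,c\na,foo,REFUSED\nb,REFUSED,MISSING\nOMITTED,bar,OMITTED\n"),
  col_types = vroom::cols(.default = vroom::col_character()),
  na = "MISSING"
)

# Pass 2: re-parse each column with its own list of missing values.
raw$a <- readr::parse_character(raw$a, na = "OMITTED")
raw$b <- readr::parse_character(raw$b, na = "REFUSED")
```

The second pass over already-loaded character vectors is the extra step that per-column na arguments would make unnecessary.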

Without this PR, it is very difficult to efficiently read columns with different lists of missing values. Instead, they have to be loaded as character vectors, then parsed with readr::parse_*() or readr::type_convert(). There are two problems with this:

I'm hoping you'll consider this PR for inclusion in vroom. It requires only a few changes, is 100% backwards compatible, and adds a feature that cannot otherwise be implemented in a separate package (without duplicating all of vroom's internals). Please let me know if there is anything more I can do to advocate for it. Thank you for your consideration!

@khusmann
Author

Note that this is failing the check for windows-latest (3.6) because the runner is grabbing the latest version of evaluate, which now requires R >= 4.0.0.
