should type_convert() erase the 'spec' attribute? #1032

Closed
wibeasley opened this issue Oct 9, 2019 · 6 comments · Fixed by #1033

Comments

@wibeasley
Contributor

wibeasley commented Oct 9, 2019

When readr::type_convert() is applied to a data.frame, should it update the 'spec' attribute? Currently, it does not, and potentially returns a misleading result.

library(magrittr); library(readr)
d1 <- 
  read_csv(
    readr_example("mtcars.csv"),
    col_types = cols(.default = col_character())
  )
str(d1)

After the initial creation, the data.frame's spec attribute is correct (i.e., all columns are character).

Classes ‘spec_tbl_df’, ‘tbl_df’, ‘tbl’ and 'data.frame':	32 obs. of  11 variables:
 $ mpg : chr  "21" "21" "22.8" "21.4" ...
 ...
 $ carb: chr  "4" "4" "1" "1" ...
 - attr(*, "spec")=
  .. cols(
  ..   .default = col_character(),
  ..   mpg = col_character(),
  ..   cyl = col_character(),
  ...
  ..   carb = col_character()
  .. )

Then type_convert() correctly converts everything to doubles.

d2 <- d1 %>% 
  type_convert()
Parsed with column specification:
cols(
  mpg = col_double(),
  cyl = col_double(),
  ...
  carb = col_double()
)

But spec() still reports every column as character.

spec(d2)
cols(
  .default = col_character(),
  mpg = col_character(),
  cyl = col_character(),
  ...
  carb = col_character()
)
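
In the meantime, the column types can be checked on the data frame itself rather than through spec(). Here is a rough sketch in base R so it runs without readr installed. The data frame d1_standin is a made-up stand-in for d1, and utils::type.convert() is swapped in for readr::type_convert(), so the guessed types may differ (base R can pick integer where readr reports double).

```r
# Hypothetical stand-in for d1: every column was read as character.
d1_standin <- data.frame(
  mpg = c("21", "22.8"),
  cyl = c("6", "4"),
  stringsAsFactors = FALSE
)

# Base R analogue of readr::type_convert(), used here only for illustration.
# as.is = TRUE keeps any remaining strings as character instead of factor.
d2_standin <- utils::type.convert(d1_standin, as.is = TRUE)

# Report the class of each column as it is now, independent of any 'spec' attribute.
actual_types <- vapply(d2_standin, function(x) class(x)[1], character(1))
print(actual_types)
```

With a real readr result, the same vapply() call on d2 should show the converted types even while spec(d2) still lists col_character().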

BTW, I really like this function. It was nice when I was stacking a bunch of data frames on top of each other, but some columns in some sub-data frames were empty, so the data types weren't compatible during the stack.

@jimhester
Collaborator

jimhester commented Oct 9, 2019

spec() is really meant to capture the specification of the original import; I don't think we should change it after the fact. I would be more on board with removing the spec attribute entirely after using type_convert().

@wibeasley wibeasley changed the title from "should type_convert() set the 'spec' attribute?" to "should type_convert() erase the 'spec' attribute?" Oct 9, 2019
@wibeasley
Contributor Author

@jimhester, makes sense to me. I changed the issue title to reflect that.

Do you want me to try a PR?

@jimhester
Collaborator

Sure, you can actually just change the last line to be df[] and it should drop the spec attribute.

@wibeasley
Contributor Author

@jimhester, thanks for the suggestion. I forgot that was even an option. Would you prefer instead attr(df, "spec") <- NULL? (Placed immediately before df is returned to the caller?)

Does df[] drop all attributes, including others that you may want to retain? (Even if they hadn't been programmed yet.) I'm wondering if someone might call readr::type_convert() on datasets created by something like haven, where they do want to keep those attributes.

@jimhester
Collaborator

Fair questions, I am fine with doing attr(df, "spec") <- NULL as it is more explicit.
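
To illustrate why the explicit form is the safer choice, here is a small base R sketch showing that setting a single attribute to NULL leaves the others alone. The data frame and both attribute values are placeholders invented for this example; the "label" attribute is only a hypothetical stand-in for metadata another package might attach, and this is not the code from the eventual pull request.

```r
# Hypothetical data frame carrying two attributes.
df <- data.frame(mpg = c("21", "22.8"), stringsAsFactors = FALSE)
attr(df, "spec")  <- "placeholder for a readr column specification"
attr(df, "label") <- "placeholder for metadata set by another package"

# Remove only the 'spec' attribute, right before returning df to the caller.
attr(df, "spec") <- NULL

# 'spec' is gone; 'label' and the data frame class are still there.
print(names(attributes(df)))
```

Whether df[] would also keep such attributes depends on the subsetting method for the object's class, so the explicit assignment avoids having to reason about that.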

jimhester pushed a commit that referenced this issue Oct 17, 2019
closes #1032

(I realize I'm bending the [acknowledgement style](https://style.tidyverse.org/news.html#acknowledgement) by including Jim in this bullet. But it was his idea to remove the spec attribute instead of updating it.)
@lock

lock bot commented Apr 15, 2020

This old issue has been automatically locked. If you believe you have found a related problem, please file a new issue (with reprex) and link to this issue. https://reprex.tidyverse.org/

@lock lock bot locked and limited conversation to collaborators Apr 15, 2020