coalesce() silently coerces NaN to NA #6833

Closed
thisisnic opened this issue Apr 24, 2023 · 2 comments

@thisisnic
Contributor

While fixing some Arrow test failures caused by the dev version of waldo correctly distinguishing between NA_real_ and NaN, I came across this behaviour: dplyr::coalesce() silently coerces NaN values to NA.

I'm unsure if this is a bug or just a difference in implementations/interpretations, but the second row of the cwx column in the example below (both inputs are NaN, yet dplyr returns NA) makes me think it could be a bug.

library(dplyr)
library(arrow)

df <- tibble(
  w = c(NaN, NaN, NA_real_),
  x = c(NA_real_, NaN, 3.3),
  y = c(NA_real_, 2.2, 3.3),
  z = c(1.1, 2.2, 3.3)
)

df %>%
  mutate(
    cw = coalesce(w),
    cz = coalesce(z),
    cwx = coalesce(w, x),
    cwxy = coalesce(w, x, y),
    cwxyz = coalesce(w, x, y, z)
  )
#> # A tibble: 3 × 9
#>       w     x     y     z    cw    cz   cwx  cwxy cwxyz
#>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1   NaN  NA    NA     1.1    NA   1.1  NA    NA     1.1
#> 2   NaN NaN     2.2   2.2    NA   2.2  NA     2.2   2.2
#> 3    NA   3.3   3.3   3.3    NA   3.3   3.3   3.3   3.3

df %>%
  arrow_table() %>%
  mutate(
    cw = coalesce(w),
    cz = coalesce(z),
    cwx = coalesce(w, x),
    cwxy = coalesce(w, x, y),
    cwxyz = coalesce(w, x, y, z)
  ) %>%
  collect()
#> # A tibble: 3 × 9
#>       w     x     y     z    cw    cz   cwx  cwxy cwxyz
#>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#> 1   NaN  NA    NA     1.1   NaN   1.1  NA    NA     1.1
#> 2   NaN NaN     2.2   2.2   NaN   2.2 NaN     2.2   2.2
#> 3    NA   3.3   3.3   3.3    NA   3.3   3.3   3.3   3.3

Created on 2023-04-24 with reprex v2.0.2
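
For comparison, the Arrow columns above are consistent with a rule where NaN counts as missing in every argument except the last one. The base R sketch below reproduces the Arrow cw and cwx columns under that assumed rule. `coalesce_last_nan()` is a hypothetical helper written for illustration, and whether Arrow actually works this way is an assumption that this thread does not confirm.

```r
# Hypothetical helper (not part of dplyr or arrow): NaN is treated as missing
# in all arguments except the last, which is used as the fallback as-is.
# Assumes all arguments are double vectors of the same length.
coalesce_last_nan <- function(...) {
  args <- list(...)
  n <- length(args)
  out <- args[[n]]                 # fallback: last argument, NaN preserved
  for (x in rev(args[-n])) {       # walk earlier arguments from last to first
    keep <- !is.na(x)              # is.na() is TRUE for both NA and NaN
    out[keep] <- x[keep]           # earlier arguments overwrite later ones
  }
  out
}

w <- c(NaN, NaN, NA_real_)
x <- c(NA_real_, NaN, 3.3)

coalesce_last_nan(w)     # NaN NaN NA, as in the Arrow cw column
coalesce_last_nan(w, x)  # NA NaN 3.3, as in the Arrow cwx column
```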

@thisisnic thisisnic changed the title coalesce() slient coerces NaN to NA coalesce() silently coerces NaN to NA Apr 24, 2023
@hadley
Member

hadley commented Jun 28, 2023

is.na(NaN) is TRUE so I think this is probably correct?
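
This can be checked in base R: is.na() is TRUE for both NA and NaN, while is.nan() is TRUE only for NaN. The sketch below also builds a coalesce purely on is.na(), which reproduces the dplyr cw and cwx columns from the reprex. `coalesce_sketch()` is a hypothetical helper for illustration only and is not dplyr's implementation.

```r
# is.na() does not distinguish NaN from NA; is.nan() does.
is.na(NaN)        # TRUE
is.na(NA_real_)   # TRUE
is.nan(NaN)       # TRUE
is.nan(NA_real_)  # FALSE

# Hypothetical helper (not dplyr's code): take the first value per position
# for which is.na() is FALSE, and fall back to NA when there is none.
# Assumes all arguments are double vectors of the same length.
coalesce_sketch <- function(...) {
  args <- list(...)
  out <- rep(NA_real_, length(args[[1]]))  # start with everything missing
  for (x in args) {
    fill <- is.na(out) & !is.na(x)         # still missing, and x has a value
    out[fill] <- x[fill]
  }
  out
}

w <- c(NaN, NaN, NA_real_)
x <- c(NA_real_, NaN, 3.3)

coalesce_sketch(w)     # NA NA NA: NaN is skipped as missing, so NA is returned
coalesce_sketch(w, x)  # NA NA 3.3, as in the dplyr cwx column
```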

@thisisnic
Contributor Author

At any rate, it's not causing anyone actual issues as far as I'm aware, so I'll close this.
