Description
I've encountered an error when combining max(..., na.rm = TRUE) with any(is.na(...)) in the same grouped operation. The operation fails with a type error, but works fine when na.rm = FALSE.
Expected behavior
The second operation below should work the same as the first one, just handling NAs differently via na.rm = TRUE.
Observed behavior
The operation fails with a type error suggesting column type inconsistency across groups, even though all input columns are integers.
Column 1 of result for group 3 is type 'double' but expecting type 'integer'. Column types must be consistent for each group.
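One possible explanation, offered as an assumption rather than something this report confirms: every `age` value in group 3 is NA, and base R's `max()` with `na.rm = TRUE` returns `-Inf` (a double) when no values remain, while the other groups return an integer. The sketch below uses only base R to show that type difference. Why the error appears only when `any(is.na(...))` is present is not shown here; a plausible guess is that data.table's optimized grouped `max` is bypassed in that case, which the `verbose = TRUE` output should help confirm.

```r
# Base R: max() over an all-NA integer vector with na.rm = TRUE has nothing
# left to compare, so it warns and returns -Inf, which is a double.
all_na  <- c(NA_integer_, NA_integer_, NA_integer_)  # like age in hh_id 3
some_na <- c(65L, 65L, NA_integer_)                  # like age in hh_id 1

res_all_na  <- suppressWarnings(max(all_na, na.rm = TRUE))
res_some_na <- max(some_na, na.rm = TRUE)

typeof(res_some_na)  # "integer"
typeof(res_all_na)   # "double"
```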
Minimal reproducible example
Create sample data
library(data.table)
dt_test = data.table(
  hh_id = rep(1:3, each = 3),
  age = c(65L, 65L, NA_integer_,
          45L, 45L, 45L,
          NA_integer_, NA_integer_, NA_integer_),
  disability = sample(0L:1L, 9, replace = TRUE)
)
Works (when na.rm = FALSE)
dt_test[, .(
  age = max(age, na.rm = FALSE)
  , any_disability = max(disability, na.rm = FALSE)
  , any_disabilityNA = any(is.na(disability))
), by = hh_id]
Fails (when na.rm = TRUE)
dt_test[, .(
  age = max(age, na.rm = TRUE)
  , any_disability = max(disability, na.rm = TRUE)
  , any_disabilityNA = any(is.na(disability))
), by = hh_id, verbose = TRUE]
Works (when any(...) is commented out)
dt_test[, .(
  age = max(age, na.rm = TRUE)
  , any_disability = max(disability, na.rm = TRUE)
  #, any_disabilityNA = any(is.na(disability))
), by = hh_id]
Works (when max(age) is commented out)
dt_test[, .(
  # age = max(age, na.rm = TRUE),
  any_disability = max(disability, na.rm = TRUE)
  , any_disabilityNA = any(is.na(disability))
), by = hh_id]
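Until the behavior is clarified, one way to sidestep the type mismatch might be a small wrapper that always returns an integer. This is only a sketch: `max_int_or_na` is a hypothetical helper, not part of data.table or base R, and it assumes that an all-NA group should yield `NA_integer_` rather than `-Inf`.

```r
# Hypothetical helper (not part of data.table): integer max that returns
# NA_integer_ instead of -Inf when every value in the group is NA.
max_int_or_na <- function(x) {
  x <- x[!is.na(x)]                       # drop NAs, like na.rm = TRUE
  if (length(x) == 0L) NA_integer_ else max(x)
}

max_int_or_na(c(65L, 65L, NA_integer_))          # integer result
max_int_or_na(c(NA_integer_, NA_integer_))       # NA_integer_, still integer

# Intended use in the failing call (assumes dt_test from the example above):
# dt_test[, .(
#   age = max_int_or_na(age)
#   , any_disability = max_int_or_na(disability)
#   , any_disabilityNA = any(is.na(disability))
# ), by = hh_id]
```

Because the helper returns an integer for every group, the per-group column types should stay consistent; the trade-off is that it is a plain R function, so it would not benefit from any internal optimization data.table applies to `max`.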
Output of sessionInfo()
> sessionInfo()
R version 4.4.1 (2024-06-14 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 10 x64 (build 19045)
Matrix products: default
locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8 LC_MONETARY=English_United States.utf8 LC_NUMERIC=C LC_TIME=English_United States.utf8
time zone: America/Los_Angeles
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.16.2
loaded via a namespace (and not attached):
[1] compiler_4.4.1 tools_4.4.1