At the moment, the help documentation for %between% states:

> between is equivalent to x >= lower & x <= upper when incbounds=TRUE, or x > lower & y < upper when FALSE. With a caveat that NA in lower or upper are taken as a missing bound and return TRUE not NA.
This appears to hold for numeric vectors. For character vectors, however, an NA bound is not treated as a missing bound and the result is NA, as seen below, and the documentation does not note this difference. Since lexicographic order is (theoretically) a total order, one might expect the same or similar behavior for character vectors, and the documentation implies as much by not mentioning that they are treated differently. The behavior on character vectors should probably be either (a) documented, if it is intended, or (b) changed to treat NA as a missing bound, as with numeric vectors.
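To make option (b) concrete, here is a minimal base-R sketch of the semantics I would expect. The helper name `between_na_open` is hypothetical and is not part of data.table. It only illustrates the documented rule (an NA bound means no bound on that side) applied uniformly to numeric and character input, and assumes scalar `lower` and `upper`.

```r
# Hypothetical helper, NOT data.table's implementation: a base-R sketch of
# option (b), where an NA bound is treated as "no bound on that side".
# Assumes `lower` and `upper` are length-1 values comparable with `x`.
between_na_open <- function(x, lower, upper, incbounds = TRUE) {
  if (incbounds) {
    above <- is.na(lower) | x >= lower   # TRUE | NA is TRUE, so an NA bound passes
    below <- is.na(upper) | x <= upper
  } else {
    above <- is.na(lower) | x > lower
    below <- is.na(upper) | x < upper
  }
  above & below
}

# Numeric and character input now agree when the upper bound is NA
num_res <- between_na_open(1:26, 13, NA)
chr_res <- between_na_open(letters, "m", NA_character_)
```

With this rule, both calls are FALSE for the first 12 elements and TRUE for the remaining 14, which is what the numeric case in data.table already returns. NA values in `x` itself still propagate as NA.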
# Minimal reproducible example
library(data.table)
numbers <- 1:26
numbers %between% c(13, NA)
# [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE
# [14] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
letters %between% c("m", NA_character_)
# [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE NA
# [14] NA NA NA NA NA NA NA NA NA NA NA NA NA
# Output of sessionInfo()
R version 3.6.0 (2019-04-26)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.5
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] data.table_1.12.3
loaded via a namespace (and not attached):
[1] compiler_3.6.0 tools_3.6.0
The data.table package is the most recent stable development version, pulled today using install.packages.