Strange behavior when subsetting with column names quoted with backticks in I of data.table #2931

@mt1022

Subsetting in i with a complex (non-syntactic) column name quoted with backticks throws an error. See the following reproducible example:

library(data.table)

DT <- data.table(id = letters[1:3], `counts(a>=0)` = 1:3)

DT[`counts(a>=0)` == 2] 
#> Error in `[.data.table`(DT, `counts(a>=0)` == 2): Column(s) [counts(a] not found in x

This has also been posted on Stack Overflow. The issue appears to exist in v1.11.+ (as found by David Arenburg and PKumar) but not in previous releases. I tried the development version and got the same error. David Arenburg has investigated this issue and found it might be caused by .prepareFastSubset (see the comments in the linked SO post for details), which I think would be very helpful for fixing this issue.

Some workarounds, such as DT[as.numeric(`counts(a>=0)`) == 2] and DT[(`counts(a>=0)`) == 2], suggested by Aurelien callens and Marius, work fine.
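For anyone hitting this in the meantime, here is a rough sketch of the workarounds side by side. The parenthesised form is the one reported above; the other two are my own assumptions (building the logical vector outside `[`, and renaming the column), and the names `keep`, `DT2` and `counts_nonneg` are only illustrative. I am assuming these forms do not go through the optimized subsetting path suspected above, which I have not confirmed in the source.

```r
library(data.table)

DT <- data.table(id = letters[1:3], `counts(a>=0)` = 1:3)

# Workaround from the report: wrap the backticked column in parentheses
res_paren <- DT[(`counts(a>=0)`) == 2]

# Assumption: evaluate the condition outside `[` and pass a plain
# logical vector as i, so there is no expression on the column to rewrite
keep <- DT[["counts(a>=0)"]] == 2
res_vec <- DT[keep]

# Assumption: rename to a syntactic name so no backticks are needed.
# copy() keeps the original DT untouched, since setnames() works by reference.
DT2 <- copy(DT)
setnames(DT2, "counts(a>=0)", "counts_nonneg")
res_renamed <- DT2[counts_nonneg == 2]
```

All three results should contain the single row with id "b". Renaming is the most intrusive option, so it probably only makes sense if the original column name is not needed downstream.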

Here is the session info:

R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.4 LTS

Matrix products: default
BLAS: /usr/lib/atlas-base/atlas/libblas.so.3.0
LAPACK: /usr/lib/atlas-base/atlas/liblapack.so.3.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] data.table_1.11.5

loaded via a namespace (and not attached):
[1] compiler_3.4.4 tools_3.4.4    yaml_2.1.17
