which=NA inconsistent with ?data.table #4411

@MichaelChirico

I was answering this question:

https://stackoverflow.com/q/61554066/3576984

and trying to mirror base behavior with which, i.e., to have which return NA for the NA elements of i:

DT = data.table(
  A = c(NA, 3, 5, 0, 1, 2),
  B = c("foo", "foo", "foo", "bar", "bar", "bar")
)
DT[A > 1, which = TRUE]
# [1] 2 3 6

whereas

DT[ , .I[A > 1]]
# [1] NA  2  3  6
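The difference follows base R semantics: indexing with a logical NA yields NA, while base which() drops NAs. A minimal base R sketch, using a plain vector as a stand-in for the column A (no data.table needed; drops_na and keeps_na are illustrative names):

```r
# Plain vector standing in for DT$A
A = c(NA, 3, 5, 0, 1, 2)
cond = A > 1                    # NA TRUE TRUE FALSE FALSE TRUE

drops_na = which(cond)          # which() ignores NA elements
keeps_na = seq_along(A)[cond]   # a logical NA index yields an NA element

print(drops_na)
# [1] 2 3 6
print(keeps_na)
# [1] NA  2  3  6
```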

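We can use e.g. na.omit to mirror the data.table behavior, like na.omit(.I[A > 1]) (na.omit has to wrap the result; applied to the condition itself it shortens the logical index, which is then recycled), but doing the reverse seems awkward: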

idx = DT[is.na(A) | A > 1, which = TRUE]
is.na(idx) <- is.na(DT[idx]$A)
idx
# [1] NA  2  3  6

I thought this in ?data.table might be helpful:

which

TRUE returns the row numbers of x that i matches to. If NA, returns the row numbers of i that have no match in x. By default FALSE and the rows in x that match are returned.
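For context, the wording above describes joins, where i is a table matched against x. A small sketch of that case, assuming the data.table package is attached; the tables X and Y and the column id are made up for illustration and are not from the question:

```r
library(data.table)

# Hypothetical tables for illustration only
X = data.table(id = 1:3)
Y = data.table(id = c(2L, 4L))

# Row numbers of i (here Y) that have no match in x (here X):
# Y's second row (id = 4) does not appear in X
unmatched = X[Y, on = "id", which = NA]
print(unmatched)
# [1] 2
```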

This reads as though DT[A > 1, which = NA] should return 1L, but it errors instead:

DT[A > 1, which = NA]

# Error in if (which) return(if (is.null(irows)) seq_len(nrow(x)) else irows) :
#   missing value where TRUE/FALSE needed
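The message is the standard base R failure for an if condition that evaluates to NA, and the call shown in the error is if (which), so the NA passed as which appears to reach that test unhandled when i is a logical expression. A minimal base R reproduction of the mechanism (which_arg and msg are illustrative names, not data.table internals):

```r
# if() needs a non-missing TRUE/FALSE condition; NA raises an error
which_arg = NA
msg = tryCatch(
  if (which_arg) "row numbers" else "rows",
  error = function(e) conditionMessage(e)
)
print(msg)
# [1] "missing value where TRUE/FALSE needed"
```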

Should we change the documentation, or is this a bug?
