Scope for %plike%

Just reading the new dev notes and noticed #3333. I was actually going to file a feature request for `%likep%` the other day (it would make sense to conform to `%plike%`), but decided against it, thinking the consensus was that fewer convenience wrappers are better for `data.table`. Is there any particular reason why data.table can't incorporate another one that leverages the `perl = TRUE` argument?

With `perl = TRUE` you often get considerable speed improvements, plus a number of [other features / behaviors](https://stackoverflow.com/questions/47240375/regular-expressions-in-base-r-perl-true-vs-the-default-pcre-vs-tre).
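As a small illustration of the "other features" point, here is a sketch using a lookahead, which PCRE supports. The sample strings and pattern are made up for the example, and my understanding is that the default engine rejects this pattern, so that call is wrapped in `tryCatch` rather than assumed to fail in a particular way.

```r
# Made-up sample data for illustration only.
x = c("carpet", "cargo", "scar")

# Lookahead: match "car" only when it is immediately followed by "pet".
pattern = "car(?=pet)"

# With perl = TRUE the lookahead is understood by the PCRE engine.
perl_result = grepl(pattern, x, perl = TRUE)
print(perl_result)

# Without perl = TRUE the same pattern is expected to be rejected,
# so capture the error message instead of letting the call abort.
default_result = tryCatch(
    grepl(pattern, x),
    error = function(e) conditionMessage(e)
)
print(default_result)
```

A `perl = TRUE` wrapper would make patterns like this usable directly inside `DT[...]`.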

```R
# The following packages are required.
# install.packages(c("stringi", "microbenchmark"))

# Load data.table.
library(data.table)

# Create a data.table of 100,000 random strings (20 chars in length).
DT = data.table(x = stringi::stri_rand_strings(100000, 20))

# Define a trivial regex pattern.
regex_pattern = "car|blah|far|nah"

# Create an alternative to %like% that sets `perl = TRUE`.
`%likep%` = function (vector, pattern) {
    if (is.factor(vector)) {
        as.integer(vector) %in% grep(pattern, levels(vector), perl = TRUE)
    }
    else {
        grepl(pattern, vector, perl = TRUE)
    }
}

# Microbenchmark the results to demonstrate speed improvements.
microbenchmark::microbenchmark(
    like  = DT[x %like% regex_pattern],
    likep = DT[x %likep% regex_pattern]
)
# Unit: milliseconds
#   expr     min       lq     mean   median       uq      max neval
#   like 84.1235 86.56265 91.51547 87.74410 91.16710 159.6292   100
#  likep 16.0932 16.64750 17.81476 16.95985 17.82195  34.1415   100
```
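For what it's worth, a quick sanity check that the wrapper behaves the same on character and factor input, and that `perl = TRUE` does not change the matches for this particular pattern. This is only a sketch: the wrapper is repeated so the snippet runs on its own, and the sample vector is made up.

```r
# Wrapper repeated from the benchmark above so this snippet is self-contained.
`%likep%` = function(vector, pattern) {
    if (is.factor(vector)) {
        # Match against the levels only, then map back to the elements.
        as.integer(vector) %in% grep(pattern, levels(vector), perl = TRUE)
    } else {
        grepl(pattern, vector, perl = TRUE)
    }
}

regex_pattern = "car|blah|far|nah"

# Made-up sample data, including a non-match and a missing value.
x = c("carpet", "blah", "nothing", "afar", NA)

res_chr = x %likep% regex_pattern          # character branch
res_fct = factor(x) %likep% regex_pattern  # factor branch
res_default = grepl(regex_pattern, x)      # default engine, no perl = TRUE

print(res_chr)
print(identical(res_chr, res_fct))
print(identical(res_chr, res_default))
```

The factor branch only runs the regex once per level, which is the same trick the character/factor split in the wrapper relies on.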


