Skip to content

Adding plike#4129

Merged
mattdowle merged 13 commits intoRdatatable:masterfrom
KyleHaynes:adding_plike
Aug 4, 2021
Merged

Adding plike#4129
mattdowle merged 13 commits intoRdatatable:masterfrom
KyleHaynes:adding_plike

Conversation

@KyleHaynes
Copy link
Copy Markdown
Contributor

Added %plike%.

Closes #3702

@codecov
Copy link
Copy Markdown

codecov bot commented Dec 18, 2019

Codecov Report

Merging #4129 (2099862) into master (2791043) will increase coverage by 0.13%.
The diff coverage is 100.00%.

❗ Current head 2099862 differs from pull request most recent head c476ec7. Consider uploading reports for the commit c476ec7 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master    #4129      +/-   ##
==========================================
+ Coverage   99.47%   99.60%   +0.13%     
==========================================
  Files          75       72       -3     
  Lines       14808    13918     -890     
==========================================
- Hits        14730    13863     -867     
+ Misses         78       55      -23     
Impacted Files Coverage Δ
R/like.R 100.00% <100.00%> (ø)
src/fmelt.c 99.00% <0.00%> (-1.00%) ⬇️
src/ijoin.c 95.29% <0.00%> (-0.18%) ⬇️
src/fsort.c 95.83% <0.00%> (-0.10%) ⬇️
src/cj.c 100.00% <0.00%> (ø)
R/fcast.R 100.00% <0.00%> (ø)
R/fmelt.R 100.00% <0.00%> (ø)
R/frank.R 100.00% <0.00%> (ø)
R/fread.R 100.00% <0.00%> (ø)
... and 48 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 60a4553...c476ec7. Read the comment docs.

Comment thread NEWS.md Outdated

5. `nafill` and `setnafill` gain `nan` argument to say whether `NaN` should be considered the same as `NA` for filling purposes, [#4020](https://github.com/Rdatatable/data.table/issues/4020). Prior versions had an implicit value of `nan=NaN`; the default is now `nan=NA`, i.e., `NaN` is treated as if it's missing. Thanks @AnonymousBoba for the suggestion. Also, while `nafill` still respects `getOption('datatable.verbose')`, the `verbose` argument has been removed.

6. New convenience function `%plike%` which map the existing `like()` argument `perl`, [#3702](https://github.com/Rdatatable/data.table/issues/3702). `%plike%` uses Perl-compatible regular expression (PCRE) which extends on TRE and is more efficient. Thanks @KyleHaynes for the suggestion and PR.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to say "more efficient", it might help to either point to someone else's benchmark or add our own. I'm not sure it's widely known that perl=TRUE is faster (I certainly didn't).

PS I prefer to disambiguate -- memory efficient, or computationally efficient?

Comment thread R/like.R
# Don't use * or % like SQL's like. Uses regexpr syntax - more powerful.
# returns 'logical' so can be combined with other where clauses.
like = function(vector, pattern, ignore.case = FALSE, fixed = FALSE) {
like = function(vector, pattern, ignore.case = FALSE, fixed = FALSE, perl = FALSE) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's common enough to need another infix for it, but we might as well add useBytes as an argument too? Comes up occasionally when working with messy strings.

Or maybe even just change the signature to function(vector, pattern, ...) and pass it on, although grepl is more limited than grep...

@MichaelChirico
Copy link
Copy Markdown
Member

Travis error is due to man/ issues:

Codoc mismatches from documentation object 'like':
like
  Code: function(vector, pattern, ...)
  Docs: function(vector, pattern, ignore.case = FALSE, fixed = FALSE,
                 perl = FALSE)
  Argument names in code not in docs:
    ...
  Argument names in docs not in code:
    ignore.case fixed perl
  Mismatches in argument names:
    Position: 3 Code: ... Docs: ignore.case

@jangorecki
Copy link
Copy Markdown
Member

@KyleHaynes are you planning to work out this PR?

@KyleHaynes
Copy link
Copy Markdown
Contributor Author

@jangorecki, Sorry for the delay. I've updated to a point I'm happy with it.

I note that the codecov/project isn't passing (not overly familiar with codecov) and not sure why (seems to reference some code commited by Matt quite some time ago)?

@MichaelChirico
Copy link
Copy Markdown
Member

@KyleHaynes yea don't worry about that Codecov bit. A line got uncovered when moving to R 4.1... you'll see all the other PRs failing for the same reason

Comment thread R/like.R Outdated
Comment thread NEWS.md Outdated

14. Added support for `round()` and `trunc()` to extend functionality of `ITime`. `round()` and `trunc()` can be used with argument units: "hours" or "minutes". Thanks to @JensPederM for the suggestion and PR.

15. New convenience function `%plike%` which map the existing `like()` argument `perl`, [#3702](https://github.com/Rdatatable/data.table/issues/3702). `%plike%` uses Perl-compatible regular expression (PCRE) which extends on TRE and is computationally more efficient. Thanks @KyleHaynes for the suggestion and PR.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes it seem like it's unequivocally better, but I've definitely found cases where perl=TRUE is slower, e.g.:

#4447

@MichaelChirico MichaelChirico added this to the 1.14.1 milestone May 10, 2021
@mattdowle
Copy link
Copy Markdown
Member

Sorry for the very long delay here.
I've added you to DESCRIPTION and invited you to be a project member which amongst other things allows you to create branches in the main project which makes it easier for others to push to the branch. The invite should be a button in your GitHub profile or projects page that you need to click to accept.
Many thanks.

@KyleHaynes
Copy link
Copy Markdown
Contributor Author

Thanks @mattdowle (and @MichaelChirico); a small contribution, but thanks for accepting it!

Keep up the good work on such an invaluable package!

@mattdowle mattdowle merged commit d610642 into Rdatatable:master Aug 4, 2021
@jangorecki jangorecki modified the milestones: 1.14.9, 1.15.0 Oct 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Scope for %plike%

4 participants