Performance improvement of dvcignore#3967
Conversation
1. add new command `dvc remote rename <old> <new>`. 2. add tests for this new command.
Co-authored-by: Peter Rowlands (변기호) <peter@pmrowla.com>
fix#3869 1.Use big regex.
There was a problem hiding this comment.
LGTM, though I would not close #3869 just yet.
@karajan1001 you mentioned the flashtext library, and I could not find the description of the experiment conducted, that gave the author conclusion that it makes sense to use trie-based approach only above some number of keywords. What about matching few gitignore patterns against hundreds of files? Maybe there could be gain too.
Also, I wonder whether re.compile has limit of pattern size. Checked against 100k, seems to not be a problem. I did not find resources pointing to the limit.
|
@karajan1001 Windows build is failing. |
Maybe this is because |
|
@pared @Suor This is the result after this submit. The new version took 24ms, and the old one took 417ms. (17.35 times faster) And In the worst case where the rules in the order that ignore rules and include rules are separated from each other. And there is no big regex exists(Hardly could this happened in real). The new version took 84ms, and the old one took 416ms. (4.95 times faster) |
Suor
left a comment
There was a problem hiding this comment.
This one looks good. We should merge it.
The thing is bothering me is that this doesn't help with many ignore files. Can we make a bench for that? How many ignore files do we need to be in trouble?
|
@karajan1001, thanks for this PR! Now, that implementation of |


Fix #3869
❗ I have followed the Contributing to DVC checklist.
📖 If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here. If the CLI API is changed, I have updated tab completion scripts.
❌ I will check DeepSource, CodeClimate, and other sanity checks below. (We consider them recommendatory and don't expect everything to be addressed. Please fix things that actually improve code or fix bugs.)
Thank you for the contribution - we'll try to review it as soon as possible. 🙏