[std.regex] Bit-NFA kickstart engine#4286
Win32 is blocked by https://issues.dlang.org/show_bug.cgi?id=15989
Now with immutable regexes!
Looks like you have a typo in the issue number. 9381 isn't about regex.
Right, my bad, the issue is 9391.
All green!
andralex left a comment
This is terrific work, and a large project on its own. I can't pretend to be able to review the logic of the additions in a reasonable time. Hopefully a paper or article will follow :o).
Two things:
- Some lines that seem important are not covered in unittests.
- Although this is in a way a project with its own life, it would be good to bring at least newly added code in alignment with the prevalent Phobos style. There, we use spaces around all operators; here, sometimes spaces are present but some other times they're omitted.
    break;
    case IR.Trie:
    if (charsets.length && charsets[ir[0].data].byInterval.length <= 8)
    if (charsets.length && charsets[ir[0].data].length <= 8)
This line seems to be not covered.
Also has two spaces before <= :)
    break;
    slot += 1;
    if (slot == table.length)
    slot = 0;

    break;
    case LookaheadStart, NeglookaheadStart, LookbehindStart,
    NeglookbehindStart:
    paths.push(j + IRL!LookaheadStart + ir[j].data + IRL!LookaheadEnd);
uncovered (this would be important, right?)
    i += (s-1)*IRL!OrChar;
    bitCount++;
    if (bitCount == 32)
    break outer;

    finalMask |= 1u<<bitMapping[i];
    break;
    case Any:
    uint mask = 1u<<bitMapping[i];
@DmitryOlshansky by the new flow, you can merge your work now, optionally with follow-up work after the review.
std/regex/package.d
    @@ -320,13 +320,11 @@ public alias StaticRegex(Char) = std.regex.internal.ir.StaticRegex!(Char);
    Throws: $(D RegexException) if there were any errors during compilation.
    +/
This doc comment needs to be moved down to regex(); now it refers to the newly inserted regexPure().
@andralex Will try to resolve the issues first, then merge.
Auto-merge toggled on
Seems like this caused a regression w/ building Higgs.
The new version is more demanding on memory during CTFE, so yeah, most likely it now fails for some patterns. Dunno what to do here; I would really love to see the new engine sometime soon.
Seems to me we should disable that engine in CTFE for now.
- consumes too much memory, introduced by e98fa4a (dlang#4286)
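The suggestion to disable the engine during CTFE could look roughly like the sketch below. It only illustrates the general technique of branching on D's built-in `__ctfe` variable; the `Kickstart` enum and `chooseKickstart` function are hypothetical names invented for this example and are not the actual std.regex internals or the change that was eventually made.

```d
/// Hypothetical stand-in for an engine choice; not the real std.regex types.
enum Kickstart { none, bitNfa }

/// Picks a prefix-search strategy (assumed helper, for illustration only).
/// `__ctfe` is a built-in D variable that is true only while the function
/// is being evaluated at compile time.
Kickstart chooseKickstart()
{
    if (__ctfe)
        return Kickstart.none;   // skip the memory-hungry setup during CTFE
    return Kickstart.bitNfa;     // keep the fast path for run-time patterns
}

// `static assert` forces compile-time evaluation, so the CTFE branch is taken.
static assert(chooseKickstart() == Kickstart.none);
```

At run time the same call returns `Kickstart.bitNfa`, so only patterns compiled during CTFE would lose the kickstart step.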
Seems to have caused another regression.
@MartinNowak Will look into it, though it sounds highly suspicious.

This adds an alternative "kickstart" engine to locate a prefix of a given pattern.
It is far more general than the original ShiftOr while being a few times slower;
still, it's many times faster than any full-blown engine.
UPDATE: it's even ~2x faster than grep! Now that's some news :)
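
For readers unfamiliar with the ShiftOr baseline mentioned in the description, here is a minimal sketch of the classic Shift-Or technique for a plain literal. It is not the Bit-NFA engine from this PR and not the code in std.regex; `shiftOrFind` is a hypothetical helper, and the 32-byte limit is an assumption that follows from using a single `uint` as the state.

```d
/// Hypothetical helper, not part of std.regex: finds the first occurrence of
/// `needle` (1 to 32 bytes) in `haystack` with the classic Shift-Or scheme.
/// Returns the start index of the match, or -1 if there is none.
ptrdiff_t shiftOrFind(const(char)[] haystack, const(char)[] needle)
{
    assert(needle.length >= 1 && needle.length <= 32);

    // masks[c] has bit i cleared if and only if needle[i] == c.
    uint[256] masks = uint.max;
    foreach (i, c; needle)
        masks[c] &= ~(1u << i);

    // A zero bit i in `state` means needle[0 .. i + 1] matches the text
    // that ends at the current position.
    uint state = uint.max;
    immutable uint accept = 1u << (needle.length - 1);

    foreach (pos, c; haystack)
    {
        // One shift and one OR per input byte advance all partial matches.
        state = (state << 1) | masks[c];
        if ((state & accept) == 0)
            return cast(ptrdiff_t)(pos + 1 - needle.length);
    }
    return -1;
}
```

For example, `shiftOrFind("hello world", "world")` returns 6. The state update costs a constant amount of work per byte, which is why this family of matchers is attractive as a fast pre-filter ahead of a full regex engine.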