96 changes: 50 additions & 46 deletions Readme.md
@@ -13,22 +13,27 @@ Unique algorithms that was implemented on native unmanaged C++ but easily access

[![Build history](https://buildstats.info/appveyor/chart/3Fs/regxwild-github?buildCount=20&includeBuildsFromPullRequest=true&showStats=true)](https://ci.appveyor.com/project/3Fs/regxwild-github/history)

Samples [⏯](regXwildTest/EssSamplesTest.cpp) | regXwild filter | n
----------------------|----------------------|---------
number = '1271'; | number = '????'; | 0 - 4
year = '2020'; | '##'\|'####' | 2; 4
year = '20'; | = '##??' | 2; 4
number = 888; | number = +??; | 1 - 3


Samples [⏯](regXwildTest/EssSamplesTest.cpp) | regXwild filter
----------------------|----------------------
everything is ok | ^everything\*ok$
systems | system?
systems | sys###s
A new 'X1' project | ^A\*'+' pro?ect
professional system | pro\*system
regXwild in action | pro?ect$\|open\*source+act\|^regXwild

```cpp
= searchEssC(L"number = '1271';", L"number = '????';", true);
```
```cpp
= searchEss(data, _T("^main*is ok$"));
= searchEss(data, _T("^new*pro?ection"));
= searchEss(data, _T("pro*system"));
= searchEss(data, _T("sys###s"));
= searchEss(data, _T("new+7+system"));
= searchEss(data, _T("some project$|open*and*star|^system"));
```

## Why regXwild ?

It was designed to be faster than just fast, when using more features that usually go beyond the typical wildcards.
It was designed to be faster than just fast for features that usually go beyond the typical wildcards. Seriously, we love regex; I love it, you love it. 2013 is far behind us, but regXwild is still relevant for its speed and powerful wildcard-like features, such as `##??` (which means 2 or 4) ...

### 🔍 Easy to start

@@ -58,43 +63,42 @@ using(var l = new ConariL("regXwild.dll"))

ESS version (advanced EXT version)

```cpp
enum MetaSymbols
{
MS_ANY = _T('*'), // {0, ~}
MS_SPLIT = _T('|'), // str1 or str2 or ...
MS_ONE = _T('?'), // {0, 1}, ??? {0, 3}, ...
MS_BEGIN = _T('^'), // [str... or [str1... |[str2...
MS_END = _T('$'), // ...str] or ...str1]| ...str2]
MS_MORE = _T('+'), // {1, ~}, +++ {3, ~}, ...
MS_SINGLE = _T('#'), // {1}, ## {2}, ### {3}, ...
MS_ANYSP = _T('>'), // as [^/]*
};
```
metasymbol | meaning
-----------|----------------
\* | {0, ~}
\| | str1 or str2 or ...
? | {0, 1}, ??? {0, 3}, ...
^ | [str... or [str1... |[str2...
$ | ...str] or ...str1]| ...str2]
\+ | {1, ~}, +++ {3, ~}, ...
\# | {1}, ## {2}, ### {3}, ...
\> | as [^/]*

EXT version (more simplified than ESS)

```cpp
enum MetaSymbols
{
MS_ANY = _T('*'),
MS_ANYSP = _T('>'), //as [^/\\]+
MS_SPLIT = _T('|'),
MS_ONE = _T('?'),
};
```

🧮 Quantifiers

regex | regXwild
---------|----------
.* | *
.+ | +
.? | ?
.{1} | #
.{2} | ##
.{2, } | ++
.{0, 2} | ??
metasymbol | meaning
-----------|----------------
\* | {0, ~}
\> | as [^/\\]+
\| | str1 or str2 or ...
? | {0, 1}, ??? {0, 3}, ...


### 🧮 Quantifiers

regex | regXwild | n
----------------|------------|---------
.\* | \* | 0+
.+ | + | 1+
.? | ? | 0; 1
.{1} | # | 1
.{2} | ## | 2
.{2, } | ++ | 2+
.{0, 2} | ?? | 0 - 2
.{2, 4} | ++?? | 2 - 4
(?:.{2}\|.{4}) | ##?? | 2; 4
.{3, 4} | +++? | 3 - 4
(?:.{1}\|.{3}) | #?? | 1; 3

and similar ...
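
The Quantifiers table can be read as a rewriting rule from regXwild runs to regex quantifiers. Below is a minimal sketch of that reading, assuming only what the table states: `essToRegex` and `matches` are hypothetical helpers (not part of regXwild), they use `std::regex` rather than the library's own algorithm, and they cover only sequential runs of `#`, `+`, `?` plus `*`, `>`, `|`, `^`, `$`.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper (not part of regXwild): rewrites an ESS-style filter
// into an ECMAScript regex string, following the Quantifiers table only.
std::string essToRegex(const std::string& filter)
{
    // Length of the run of character `c` starting at index `from`.
    auto run = [&](std::size_t from, char c) {
        std::size_t n = 0;
        while (from + n < filter.size() && filter[from + n] == c) ++n;
        return n;
    };

    std::string out;
    for (std::size_t i = 0; i < filter.size(); )
    {
        const char c = filter[i];
        if (c == '#' || c == '+')
        {
            const std::size_t a = run(i, c);        // leading # or + run
            const std::size_t b = run(i + a, '?');  // optional trailing ? run
            const std::string lo = std::to_string(a);
            const std::string hi = std::to_string(a + b);

            if (b == 0)        out += (c == '#') ? ".{" + lo + "}" : ".{" + lo + ",}";
            else if (c == '+') out += ".{" + lo + "," + hi + "}";        // ++?? -> 2 - 4
            else               out += "(?:.{" + lo + "}|.{" + hi + "})"; // ##?? -> 2; 4
            i += a + b;
        }
        else if (c == '?')
        {
            const std::size_t n = run(i, '?');
            out += ".{0," + std::to_string(n) + "}";
            i += n;
        }
        else
        {
            if (c == '*')      out += ".*";
            else if (c == '>') out += "[^/]*"; // ESS meaning from the metasymbol table
            else if (c == '|' || c == '^' || c == '$') out += c;
            else
            {
                // Escape literals that would otherwise be regex syntax.
                if (std::string("\\.[]{}()").find(c) != std::string::npos) out += '\\';
                out += c;
            }
            ++i;
        }
    }
    return out;
}

// Unanchored search, analogous in spirit to a substring-style filter match.
bool matches(const std::string& text, const std::string& filter)
{
    return std::regex_search(text, std::regex(essToRegex(filter)));
}
```

For instance, `essToRegex("##??")` yields `(?:.{2}|.{4})`, the same row as in the table. The library's real matching may differ from a regex engine in edge cases, so treat this only as a way to reason about the table.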

109 changes: 61 additions & 48 deletions regXwild.nuspec
@@ -3,35 +3,48 @@
<metadata>
<id>regXwild</id>
<version>1.2.0</version>
<title>[ regXwild ] Fast advanced wildcards</title>
<title>[ regXwild ] Fast Advanced wildcards</title>
<authors>github.com/3F/regXwild</authors>
<license type="file">LICENSE</license>
<owners>reg</owners>
<licenseUrl>https://aka.ms/deprecateLicenseUrl</licenseUrl>
<projectUrl>https://github.com/3F/regXwild</projectUrl>
<repository type="git" url="https://github.com/3F/regXwild" />
<requireLicenseAcceptance>false</requireLicenseAcceptance>
<description>
Small and super Fast advanced wildcards! `*,|,?,^,$,+,#,&gt;` in addition to slow regex engine and more.
<description>Small and super Fast Advanced wildcards! `*,|,?,^,$,+,#,&gt;` in addition to slow regex engines and more.

Unique algorithms that were implemented in native unmanaged C++ but are also easily accessible in .NET
through Conari (recommended due to caching of 0x29 opcodes and other related optimization).

Samples [⏯](run) | regXwild filter | n
----------------------|----------------------|---------
number = '1271'; | number = '????'; | 0 - 4
year = '2020'; | '##'|'####' | 2; 4
year = '20'; | = '##??' | 2; 4
number = 888; | number = +??; | 1 - 3
...
----------------------|----------------------
everything is ok | ^everything*ok$
systems | system?
systems | sys###s
A new 'X1' project | ^A*'+' pro?ect
professional system | pro*system
regXwild in action | pro?ect$|open*source+act|^regXwild


This package contains x64 + x32 Unicode + MultiByte modules
and provides both support of the unmanaged and managed projects:

* For native: .\lib\native\{Platform}-(Unicode or MultiByte)\ ~ regXwild.dll, regXwild.lib, regXwild.exp, include\*.h
* For .NET it will put x32 &amp; x64 regXwild into (TargetDir). Use it with your .net modules through Conari ( https://github.com/3F/Conari ) and so on.

```
= searchEssC(L"number = '1271';", L"number = '????';", true);
```

## Why regXwild ?

It was designed to be faster than just fast, when using more features that usually go beyond the typical wildcards.
It was designed to be faster than just fast for features that usually go beyond the typical wildcards.
Seriously, we love regex; I love it, you love it. 2013 is far behind us, but regXwild is still relevant for its speed and powerful wildcard-like features,
such as `##??` (which means 2 or 4) ...

🔍 Easy to start:
🔍 Easy to start

Unmanaged native C++ or managed .NET project. It doesn't matter, just use it:

@@ -55,61 +68,61 @@
}
```

🏄 Amazing meta symbols:
🏄 Amazing meta symbols

ESS version (advanced EXT version)

```cpp
enum MetaSymbols
{
MS_ANY = _T('*'), // {0, ~}
MS_SPLIT = _T('|'), // str1 or str2 or ...
MS_ONE = _T('?'), // {0, 1}, ??? - {0, 3}, ...
MS_BEGIN = _T('^'), // [str... or [str1... |[str2...
MS_END = _T('$'), // ...str] or ...str1]| ...str2]
MS_MORE = _T('+'), // {1, ~}
MS_SINGLE = _T('#'), // {1}
MS_ANYSP = _T('&gt;'), // as [^/]*
};
```

EXT version (more simplified than ESS)

```cpp
enum MetaSymbols
{
MS_ANY = _T('*'),
MS_ANYSP = _T('&gt;'), //as [^/\\]+
MS_SPLIT = _T('|'),
MS_ONE = _T('?'),
};
```

Check it with our actual **Unit-Tests**.

🚀 Awesome speed:
metasymbol | meaning
-----------|----------------
* | {0, ~}
| | str1 or str2 or ...
? | {0, 1}, ??? {0, 3}, ...
^ | [str... or [str1... |[str2...
$ | ...str] or ...str1]| ...str2]
+ | {1, ~}, +++ {3, ~}, ...
# | {1}, ## {2}, ### {3}, ...
&gt; | as [^/]*

🧮 Quantifiers

regex | regXwild | n
----------------|------------|---------
.* | * | 0+
.+ | + | 1+
.? | ? | 0; 1
.{1} | # | 1
.{2} | ## | 2
.{2, } | ++ | 2+
.{0, 2} | ?? | 0 - 2
.{2, 4} | ++?? | 2 - 4
(?:.{2}|.{4}) | ##?? | 2; 4
.{3, 4} | +++? | 3 - 4
(?:.{1}|.{3}) | #?? | 1; 3

and similar ...

Play with our actual **Unit-Tests**.

🚀 Awesome speed

* [~2000 times faster when C++](https://github.com/3F/regXwild#speed).
* For .NET (including modern .NET Core), [Conari](https://github.com/3F/Conari) provides optional caching of 0x29 opcodes (Calli) and more to get as similar a result as possible.

🍰 Open and Free:
🍰 Open and Free

Open Source project; MIT License, Enjoy 🎉

- - - - - - - - - - - - - - - -
https://github.com/3F/regXwild
- - - - - - - - - - - - - - - -

~~~~~~~~
Get it via GetNuTool:
===================================
=======================================
gnt /p:ngpackages="regXwild/1.2.0"
===================================
* https://github.com/3F/GetNuTool

================== https://github.com/3F/GetNuTool

</description>
<summary>Small and super Fast advanced wildcards! `*,|,?,^,$,+,#,&gt;` in addition to slow regex engine and more. https://github.com/3F/regXwild</summary>
<tags>wildcards advanced-wildcards fast-wildcards fast-regex extended-wildcards strings text filter search matching search-in-text regex filters powerful-wildcards regexp cpp c dotnet dotnetcore csharp Conari regXwild native</tags>
<summary>Small and super Fast advanced wildcards! `*,|,?,^,$,+,#,&gt;` in addition to slow regex engines and more. https://github.com/3F/regXwild</summary>
<tags>wildcards advanced-wildcards fast-wildcards fast-regex extended-wildcards strings text filter search matching search-in-text regex glob filters powerful-wildcards regexp cpp c dotnet dotnetcore csharp Conari regXwild native</tags>
<releaseNotes> changelog: https://github.com/3F/regXwild/blob/master/changelog.txt </releaseNotes>
<copyright>Copyright (c) 2013-2014, 2016-2017, 2020 Denis Kuzmin &lt; x-3F@outlook.com &gt; GitHub/3F</copyright>
</metadata>
47 changes: 36 additions & 11 deletions regXwild/core/ESS/AlgorithmEss.cpp
@@ -117,24 +117,22 @@ bool AlgorithmEss::search(const tstring& text, const tstring& filter, bool ignor
if(rewindToNextBlock(it)){ continue; } return false;
}

// ++?? and ##??
if(item.mask.prev & (MORE | SINGLE) && item.mask.curr & ONE)
{
item.mixpos = item.overlay + 1;
item.mixms = item.mask.prev;
}

// Sequential combinations of #, ?, +
if((item.mask.curr & SINGLE && item.mask.prev & SINGLE)
|| (item.mask.curr & ONE && item.mask.prev & ONE)
|| (item.mask.curr & MORE && item.mask.prev & MORE))
{
++item.overlay;
++item.overlay;
}
else{ item.overlay = 0; }

// disable all combinations for SINGLE. TODO: stub - _stubSINGLECombination()
if( (item.mask.prev & (BOL | EOL)) == 0 &&
(
(item.mask.curr & SINGLE && (item.mask.prev & SINGLE) == 0) ||
(item.mask.prev & SINGLE && (item.mask.curr & SINGLE) == 0) ))
{
if(rewindToNextBlock(it)){ continue; } return false;
}

++item.pos;

// TODO: implement non-sequential MS combinations ~ unsigned short int ..
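
The bookkeeping this hunk adds can be traced on its own. The sketch below is an illustration under assumptions, not the library's parser: the flag values, `MixState`, `classify`, and `scan` are hypothetical names, and the way `prev` is shifted from `curr` on each symbol is assumed rather than taken from the source. It keeps the same order as the hunk: record `mixpos`/`mixms` first, then update `overlay`.

```cpp
#include <string>

// Simplified stand-ins for the metasymbol flags used in the hunk;
// the numeric values are illustrative, not the library's.
enum Ms : unsigned { BOL = 1, ONE = 2, MORE = 4, SINGLE = 8, OTHER = 16 };

struct MixState
{
    unsigned prev = BOL, curr = BOL;
    unsigned short overlay = 0; // repeats of the current symbol, minus one
    unsigned short mixpos = 0;  // length of the leading + or # run, 0 if none
    unsigned mixms = 0;         // MORE or SINGLE: which run preceded the ? run
};

inline unsigned classify(char c)
{
    switch (c) {
        case '?': return ONE;
        case '+': return MORE;
        case '#': return SINGLE;
        default:  return OTHER;
    }
}

MixState scan(const std::string& quantifier)
{
    MixState s;
    for (char c : quantifier)
    {
        s.prev = s.curr; // assumption: the real parser shifts the mask in a similar way
        s.curr = classify(c);

        // ++?? and ##??: remember how long the leading run was and what it was.
        if ((s.prev & (MORE | SINGLE)) && (s.curr & ONE)) {
            s.mixpos = static_cast<unsigned short>(s.overlay + 1);
            s.mixms = s.prev;
        }

        // Sequential combinations of #, ?, +
        if ((s.curr & (SINGLE | ONE | MORE)) && s.curr == s.prev) ++s.overlay;
        else s.overlay = 0;
    }
    return s;
}
```

With this model, scanning `##??` ends with `mixpos == 2` and `overlay == 1`, which gives the 2 and 2 + 1 + 1 = 4 bounds that `interval()` computes later in this diff.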
@@ -211,8 +209,10 @@ bool AlgorithmEss::search(const tstring& text, const tstring& filter, bool ignor
}

item.pos = item.left;
if(item.mask.curr & SPLIT) {
if(item.mask.curr & SPLIT)
{
words.left = 0;
item.mixpos = 0;
item.mask.prev = BOL;
continue; //to next block
}
@@ -253,6 +253,31 @@ bool AlgorithmEss::search(const tstring& text, const tstring& filter, bool ignor

udiff_t AlgorithmEss::interval()
{
// ++?? or ##??
if(item.mask.prev & ONE && item.mixpos > 0)
{
size_t len = item.prev.length();
diff_t delta = words.found - words.left;

diff_t min = item.mixpos;
diff_t max = min + item.overlay + 1;

if(item.mixms & SINGLE && delta != min && delta != max) {
return tstring::npos;
}

if(delta < min || delta > max) {
return tstring::npos;
}

if(_text.substr(words.found - len - delta, len).compare(item.prev) == 0) {
return words.found;
}

return tstring::npos;

}

// "#"
if(item.mask.prev & SINGLE)
{
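
The range test in the new `interval()` branch can be isolated as a pure function. This is a hedged sketch: `gapAccepted` and `matchBetween` are hypothetical names, `gap` stands in for `delta` (`words.found - words.left`), and the brute-force search over `std::string::find` is a stand-in for the library's own word tracking, not a copy of it.

```cpp
#include <cstddef>
#include <string>

// Mirrors the mixed-quantifier branch: min = mixpos, max = min + overlay + 1.
// For a leading # run (single == true) only the two endpoints are valid;
// for a leading + run every length in [min, max] is valid.
bool gapAccepted(std::ptrdiff_t gap, std::ptrdiff_t mixpos, std::ptrdiff_t overlay, bool single)
{
    const std::ptrdiff_t min = mixpos;
    const std::ptrdiff_t max = min + overlay + 1;

    if (single && gap != min && gap != max) return false;
    return gap >= min && gap <= max;
}

// Illustrative only: true if `text` contains `left`, then an accepted gap, then `right`.
bool matchBetween(const std::string& text, const std::string& left, const std::string& right,
                  std::ptrdiff_t mixpos, std::ptrdiff_t overlay, bool single)
{
    for (std::size_t p = text.find(left); p != std::string::npos; p = text.find(left, p + 1))
    {
        const std::size_t from = p + left.size(); // first character after `left`
        for (std::size_t q = text.find(right, from); q != std::string::npos; q = text.find(right, q + 1))
        {
            if (gapAccepted(static_cast<std::ptrdiff_t>(q - from), mixpos, overlay, single)) return true;
        }
    }
    return false;
}
```

For `'##??';` the parser state would be `mixpos = 2`, `overlay = 1`, so `matchBetween("year = '2020';", "'", "';", 2, 1, true)` accepts a gap of 4, while a three-character value such as `'202'` is rejected because 3 is neither endpoint.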
7 changes: 5 additions & 2 deletions regXwild/core/ESS/AlgorithmEss.h
@@ -91,17 +91,20 @@ namespace net { namespace r_eg { namespace regXwild { namespace core { namespace
Mask mask;
unsigned short int overlay;

unsigned short int mixpos; // ++??
MetaOperation mixms;

/** enough of this.. */
tstring prev;

void reset()
{
pos = left = delta = overlay = 0;
pos = left = delta = overlay = mixpos = 0;
mask.curr = mask.prev = BOL;
curr.clear();
prev.clear();
};
Item(): pos(0), left(0), delta(0), overlay(0) { };
Item(): pos(0), left(0), delta(0), overlay(0), mixpos(0) { };
} item;

/**