Not all matches trigger a callback when using vectored mode #202

@dagardner-nv

Using the vectored mode example from the Quickstart guide:

from typing import Any, Optional

import hyperscan

def on_match(
    id: int,
    from_: int,
    to: int,
    flags: int,
    context: Optional[Any] = None,
) -> Optional[bool]:
    print(f"id={id}, from={from_} - to={to}")


db = hyperscan.Database(mode=hyperscan.HS_MODE_VECTORED)
patterns = (
    # expression,  id, flags
    (br'fo+',      0,  0),
    (br'^foobar$', 1,  hyperscan.HS_FLAG_CASELESS),
    (br'BAR',      2,  hyperscan.HS_FLAG_CASELESS
                       | hyperscan.HS_FLAG_SOM_LEFTMOST),
)
expressions, ids, flags = zip(*patterns)
db.compile(
    expressions=expressions, ids=ids, elements=len(patterns), flags=flags
)
print(db.info().decode())
buffers = [
    bytearray(b'xxxfooxxx'),
    bytearray(b'xxfoxbarx'),
    bytearray(b'barxxxxxx'),
]

db.scan(buffers, match_event_handler=on_match)

Output:

Version: 5.4.11 Features:  Mode: VECTORED
id=0, from=0 - to=5
id=0, from=0 - to=6
id=2, from=9 - to=12

Unless I'm misunderstanding how vectored mode works, I would have expected expression id=2 to match twice: once in bytearray(b'xxfoxbarx') and once in bytearray(b'barxxxxxx'). In addition, the reported range 9:12 doesn't appear to correspond to either occurrence of bar.
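
To make the expectation concrete, here is a small sketch that uses only Python's `re` module (not Hyperscan) to locate `bar` in the same buffers. It assumes vectored mode would report offsets either per buffer or relative to the buffers joined into one logical stream; which of the two applies is an assumption here, not something confirmed by the output above. The helper name `literal_spans` is made up for illustration.

```python
import re

buffers = [
    bytearray(b'xxxfooxxx'),
    bytearray(b'xxfoxbarx'),
    bytearray(b'barxxxxxx'),
]


def literal_spans(data, needle=rb'bar'):
    """Return (start, end) spans of a caseless literal, found with re (not Hyperscan)."""
    return [m.span() for m in re.finditer(needle, bytes(data), re.IGNORECASE)]


# Interpretation 1: offsets are relative to each individual buffer.
per_buffer = [literal_spans(buf) for buf in buffers]

# Interpretation 2 (assumption): offsets are relative to all buffers
# treated as one contiguous stream.
stream = literal_spans(b''.join(buffers))

print(per_buffer)  # [[], [(5, 8)], [(0, 3)]]
print(stream)      # [(14, 17), (18, 21)]
```

Under either interpretation there are two occurrences of `bar`, and neither yields the reported range 9:12.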
