Skip to content

Implement CombiMatch to improve reasoner performance #411

@Sophietje

Description

@Sophietje

In GitLab by @han.kruiger.tno.nl on Feb 10, 2023, 17:00

Based on today's discussions.

For the reasoner to work, it needs to combine graph patterns that represent different 'pieces' of knowledge.

A match can be found by looking at one or more triple patterns from two graph patterns, and observing that they are compatible. In essence, this means that these parts of knowledge can 'flow into eachother'.

Currently, we already have a Match data structure, which describes how some triple patterns match to other triple patterns, including how any variables in the first triple pattern map to variables/resources in the second.

In the matching algorithm, these matches are linked (or linkable) to rule nodes (which have a antecedent and consequent graph pattern) on both sides.

Example of a match

Input:

?s <type> ?t .
?b <type> <Sensor> .

This is the match (using the toString implementation of the Match):

|?s| type ?t=|?b| type Sensor,
?s type |?t|=?b type |Sensor|

This match conveys that the first triple pattern of the first graph pattern matches with the first triple pattern of the second graph pattern with two mappings: ?s maps to ?b and ?t maps to <Sensor>.

A more involved example of a match

Input:

?s <type> ?t .
?s <hasVal> ?d .
?b <type> <Sensor> .
?b <hasVal> ?v .

These are the matches (using the toString implementation of the Match):

|?s| type ?t=|?b| type Sensor,
?s type |?t|=?b type |Sensor|

(Same as the previous example)

|?s| hasVal ?d=|?b| hasVal ?v,
?s hasVal |?d|=?b hasVal |?v|

This match conveys that the second triple pattern of the first graph pattern matches with the second triple pattern of the second graph pattern with two mappings: ?s maps to ?b and ?d maps to ?v.

|?s| hasVal ?d=|?b| hasVal ?v,
?s hasVal |?d|=?b hasVal |?v|,
|?s| type ?t=|?b| type Sensor,
?s type |?t|=?b type |Sensor|

(This final match is not elementary; it is the combination of the first two.)

This match conveys that the first triple pattern of the first graph pattern matches with the first triple pattern of the second graph pattern with two mappings: ?s maps to ?b and ?t maps to <Sensor>.

Unfruitful matches

It turned out that, in the current implementation, a lot of matches are being generated that do not contribute to a result that fully covers a goal graph pattern.
Let's call these matches unfruitful.

In many use cases of the reasoner, we know that the goal is to produce full bindings for a graph pattern, so these unfruitful matches are not that useful.

So to improve performance, unfruitful matches can be disregarded at some convenient moment in the algorithm.

To detect these unfruitful matches, we think the data structure CombiMatch will be useful.

Match is only for the matches between a pair of graph patterns (let's call the LHS "goal" and the RHS "candidate"), whereas a CombiMatch is for a (non-empty) collection of "candidate" graph patterns with respect to a "goal" graph pattern.

A possible moment in the algorithm to do use this is after having merged matches between the graph pattern pairs.

All of these Match objects can be trivially converted into a CombiMatch with only a single candidate graph pattern, and then the CombiMatches from different candidate graph patterns can be attempted to be merged together.

Then, after exhaustively merging the CombiMatches, we can discard all CombiMatches that do not fully cover the goal graph pattern, and only keep the Matches that contributed to the CombiMatches that remained.

We expect that this will result in significantly fewer matches.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions