FILTER keyword for latest anchor queries (planner/memory) #150
Merged
thiagovas merged 26 commits into google:master on Oct 16, 2020
thiagovas
approved these changes
Oct 7, 2020
58f6a07 to
338df90
rbkloss
reviewed
Oct 8, 2020
…ter test FILTER) Two new triples were added so that, for both the predicate and object positions, there are examples in which two triples share the same latest anchor (so "FILTER latest" is expected to return 2 rows in this case, instead of only 1 as in the cases tested so far). The triples added for this were the one with predicate ""bought"@[2016-04-01T00:00:00-08:00]" and the one with object ""turned"@[2016-04-01T00:00:00-08:00]". A third triple was added so that both "/u<peter>" and "/u<paul>" have two temporal predicates in common; this way we can test whether a single "FILTER latest(?p)" works as expected across multiple graph clauses inside of WHERE (when they share the same binding "?p"). The triple added for this was the one with predicate ""bought"@[2016-01-01T00:00:00-08:00]".
… anchor only for now)
…ake FILTER for latest return multiple triples if they share the same predicate and same latest anchor
…n be applied to, improving error checking, and also move verification for supported filter functions to the planner level
…dition of multiple filter functions in the future)
…-prone when implementing the driver)
…ne when implementing the driver)
9dc5c4a to
f8c694c
thiagovas
approved these changes
Oct 14, 2020
…into auxiliary functions)
…peration" anymore
72a44d4 to
5fd5a95
jlsotomayorm
approved these changes
Oct 15, 2020
…ide "compatibleBindingsInClauseForFilterOperation"
This was referenced Nov 18, 2020
This PR is part of #129 and finishes setting up a `FILTER` keyword in BadWolf at the planner and memory levels. The first `FILTER` function implemented is `latest`, which addresses the request in #86.

A `FILTER` keyword in BQL lets us filter out part of a query's results at a level closer to the storage (closer to the driver), which improves performance. This PR finishes introducing it, completing the work started in PR #149. It also showcases a full implementation of how the keyword can work on the driver/storage side (exemplified by the changes to the volatile open-source driver in `memory.go` below).

The user can now specify, inside of `WHERE`, which bindings a `FILTER` should apply to. The lookup on storage becomes more fine-grained, which avoids retrieving unnecessary data and improves query performance.

To illustrate, queries such as the one below are now possible:
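A minimal sketch of such a query, held in a Go string constant, together with a simplified in-memory model of what `latest` is described as doing. The `?test` graph, the `?p` binding, the `/u<peter>` subject and the `FILTER latest(?p)` form appear in this PR; the selected columns, the object values, the `triple` type and the `filterLatest` and `at` helpers are assumptions made for illustration and are not BadWolf's implementation.

```go
package main

import (
	"fmt"
	"time"
)

// latestQuery is an illustrative BQL query; the selected columns are an assumption.
const latestQuery = `SELECT ?p, ?o
FROM ?test
WHERE {
  /u<peter> ?p ?o .
  FILTER latest(?p)
};`

// triple is a simplified, hypothetical stand-in for a BadWolf triple.
type triple struct {
	subject, predicateID, object string
	anchor                       *time.Time // nil marks an immutable predicate
}

// at builds a time anchor at midnight UTC (hypothetical helper).
func at(year, month, day int) *time.Time {
	t := time.Date(year, time.Month(month), day, 0, 0, 0, 0, time.UTC)
	return &t
}

// filterLatest keeps, for each predicate ID, the temporal triples whose anchor
// is the latest one among the given triples. Immutable triples are skipped, and
// triples that share the same latest anchor are all kept.
func filterLatest(ts []triple) []triple {
	latest := map[string]time.Time{}
	for _, t := range ts {
		if t.anchor == nil {
			continue
		}
		if cur, ok := latest[t.predicateID]; !ok || t.anchor.After(cur) {
			latest[t.predicateID] = *t.anchor
		}
	}
	var out []triple
	for _, t := range ts {
		if t.anchor != nil && t.anchor.Equal(latest[t.predicateID]) {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	fmt.Println(latestQuery)
	triples := []triple{
		{"/u<peter>", "bought", "/c<mini>", at(2016, 1, 1)},
		{"/u<peter>", "bought", "/c<model s>", at(2016, 4, 1)},
		{"/u<peter>", "bought", "/c<model x>", at(2016, 4, 1)},
		{"/u<peter>", "parent_of", "/u<mary>", nil},
	}
	for _, t := range filterLatest(triples) {
		fmt.Printf("%s %q@[%s] %s\n", t.subject, t.predicateID, t.anchor.Format(time.RFC3339), t.object)
	}
}
```

In this model the two `bought` triples anchored on 2016-04-01 are both kept, the older one is dropped, and the immutable `parent_of` triple is skipped.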
Such a query would return all the temporal triples of the `?test` graph that have the latest timestamp of the time series they are part of (a recurrent use case in BadWolf), skipping immutable triples found along the way. This `FILTER` function also works for objects `?o` in the case of reification/blank nodes: the returned triples are then the ones whose object is a temporal predicate with the latest timestamp among the predicates with the same predicate ID (in the "object" position of the clause), analogous to the behavior for `?p`.

The `FILTER` for `latest` anchor above also works for alias bindings obtained with the `AS` keyword, for both predicates and objects.

At the moment only one `FILTER` is supported for each graph clause inside of `WHERE`, and only one `FILTER` is supported for each given binding.

In the future, to add a new `FILTER` function, the steps to follow are:

1. … `filter.go`;
2. update the `SupportedOperations` map in `filter.go` to map the lowercase string of the `FILTER` function being added to its corresponding `filter.Operation` element;
3. update the `String` method of `Operation` in `filter.go`;
4. update `compatibleBindingsInClauseForFilterOperation` in `planner.go` to specify which fields and bindings of a clause the newly added `filter.Operation` can be applied to;
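The steps above can be sketched in a self-contained form as follows. This is not the code in `filter.go` or `planner.go`: only the names `Operation`, `SupportedOperations` and `String` come from the description, while the `Earliest` operation, the `Field` type, the `compatibleFields` table and the `lookup` helper are hypothetical stand-ins (the last two loosely playing the role of `compatibleBindingsInClauseForFilterOperation`).

```go
package main

import (
	"fmt"
	"strings"
)

// Operation identifies a FILTER function.
type Operation int

const (
	// Latest is the operation introduced by this PR.
	Latest Operation = iota + 1
	// Earliest is a HYPOTHETICAL new operation, used only to illustrate the steps.
	Earliest
)

// SupportedOperations maps the lowercase FILTER function name to its Operation.
var SupportedOperations = map[string]Operation{
	"latest":   Latest,
	"earliest": Earliest, // entry added for the new operation
}

// String returns a readable name for the operation.
func (op Operation) String() string {
	switch op {
	case Latest:
		return "latest"
	case Earliest: // case added for the new operation
		return "earliest"
	default:
		return fmt.Sprintf("undefined filter operation %d", int(op))
	}
}

// Field is a hypothetical enumeration of the positions in a graph clause.
type Field int

const (
	SubjectField Field = iota
	PredicateField
	ObjectField
)

// compatibleFields is a hypothetical table of the clause positions each
// operation may be applied to.
var compatibleFields = map[Operation][]Field{
	Latest:   {PredicateField, ObjectField},
	Earliest: {PredicateField, ObjectField},
}

// lookup resolves a FILTER function name (lowercased here as an assumption)
// and checks that it can be applied to the given clause position.
func lookup(name string, f Field) (Operation, error) {
	op, ok := SupportedOperations[strings.ToLower(name)]
	if !ok {
		return 0, fmt.Errorf("filter function %q is not supported", name)
	}
	for _, c := range compatibleFields[op] {
		if c == f {
			return op, nil
		}
	}
	return 0, fmt.Errorf("filter function %q cannot be applied to field %d", name, int(f))
}

func main() {
	op, err := lookup("LATEST", PredicateField)
	fmt.Println(op, err)
	_, err = lookup("latest", SubjectField)
	fmt.Println(err)
}
```

Keeping the name-to-operation map and the compatibility table as data means that, in this sketch, supporting a new function touches only declarations and one `switch` case rather than the lookup logic itself.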