FILTER keyword for latest anchor queries #146
rogerlucena wants to merge 34 commits into google:master
defer func() {
	lo.FilterOptions = (*storage.FilteringOptions)(nil)
}()
This is resetting lo.FilterOptions, right?
Could you add a comment stating what is being done and why it is necessary?
I have just added the comments, but this is only necessary here because we are not yet deprecating the LatestAnchor bool in lookupOptions.
To keep the tests related to this bool passing in memory_test.go (the only place where it is used), I decided the best approach for now was to keep supporting it in the volatile OSS driver, but using a FILTER latest(?p) behind the scenes, since the behavior is the same.
As for resetting lo.FilterOptions, it should not be strictly necessary, as I already do this at the planner level with resetFilterOptions. But since for LatestAnchor I am artificially creating a new lo.FilterOptions at the driver level, it is safer to also guarantee that it is reset to nil at the driver level, in case someone adds a test in memory_test.go that chains a LatestAnchor lookup with a normal call to the driver or with a FILTER for something else (without calling the planner function resetFilterOptions). This does not happen at the moment (so no tests would break without this defer) and I doubt someone will do it in the future, but I thought it was safer to add it.
This raises the question of when we plan to deprecate the LatestAnchor bool from lookupOptions. Can we do this in a follow-up PR, or should we wait a bit more? It would let us simplify multiple parts of memory.go, deleting all the boilerplate related to LatestAnchor, and also delete a number of tests in memory_test.go that at the moment just retest FILTER.
I talked with Thiago about this, and it seems some refactoring on the internal side will be necessary before we can deprecate the LatestAnchor bool from lookupOptions. So, for now, I will keep supporting it on the OSS side as well.
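The reset-on-return pattern discussed in this thread can be sketched in isolation as follows. This is a minimal illustration, not the code in memory.go: the `FilteringOptions` and `LookupOptions` types, their fields, and the `lookup` function are simplified, hypothetical stand-ins for the real storage types.

```go
package main

import "fmt"

// FilteringOptions is a hypothetical, simplified stand-in for
// storage.FilteringOptions.
type FilteringOptions struct {
	Operation string
	Field     string
}

// LookupOptions is a hypothetical, simplified stand-in for the lookup
// options that carry both the legacy bool and the new filter options.
type LookupOptions struct {
	LatestAnchor  bool
	FilterOptions *FilteringOptions
}

// lookup mimics a driver entry point. When the legacy LatestAnchor bool
// is set, it synthesizes an equivalent "latest" filter and guarantees,
// via defer, that the synthetic filter is cleared before returning.
func lookup(lo *LookupOptions) string {
	if lo.LatestAnchor && lo.FilterOptions == nil {
		lo.FilterOptions = &FilteringOptions{Operation: "latest", Field: "predicate"}
		// Reset on return so the synthetic filter cannot leak into a
		// later call that reuses the same options value.
		defer func() { lo.FilterOptions = nil }()
	}
	if lo.FilterOptions == nil {
		return "no filter"
	}
	return fmt.Sprintf("filter %s on %s", lo.FilterOptions.Operation, lo.FilterOptions.Field)
}

func main() {
	lo := &LookupOptions{LatestAnchor: true}
	fmt.Println(lookup(lo))              // the synthetic filter is used for this call
	fmt.Println(lo.FilterOptions == nil) // and is gone afterwards

	lo.LatestAnchor = false
	fmt.Println(lookup(lo)) // a later plain call sees no leftover filter
}
```

The return value is computed before the deferred function runs, so the call that synthesized the filter still uses it, while the caller gets the options back in a clean state.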
…ter test FILTER) Two new triples were added so that, for both the predicate and the object positions, there are examples in which two triples share the same latest anchor (expecting 2 rows for "FILTER latest" in this case, instead of only 1 as in the usual case tested so far). The triples added for that were the one with predicate "bought"@[2016-04-01T00:00:00-08:00] and the one with object "turned"@[2016-04-01T00:00:00-08:00]. A third triple was added so that both /u<peter> and /u<paul> have two temporal predicates in common; this way we can test whether a single "FILTER latest(?p)" works as expected across multiple graph clauses inside WHERE (when they share the same binding ?p). The triple added for this was the one with predicate "bought"@[2016-01-01T00:00:00-08:00].
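The tie case this commit tests (two rows when two triples share the same latest anchor) can be sketched as below. This is a simplified illustration, not the driver's implementation: the `Triple` type, the `latestByPredicate` and `at` helpers, and the sample objects are hypothetical, and grouping by predicate ID alone is an assumption made for the sketch.

```go
package main

import (
	"fmt"
	"time"
)

// Triple is a hypothetical, simplified stand-in for a BadWolf triple.
// Anchor is nil for immutable predicates.
type Triple struct {
	Subject     string
	PredicateID string
	Anchor      *time.Time
	Object      string
}

// latestByPredicate keeps, for each predicate ID, every temporal triple
// whose anchor equals the latest anchor seen for that ID. Immutable
// triples are skipped, and ties on the latest anchor are all returned.
func latestByPredicate(ts []Triple) []Triple {
	latest := make(map[string]time.Time)
	for _, t := range ts {
		if t.Anchor == nil {
			continue // immutable triple, not part of any time series
		}
		if cur, ok := latest[t.PredicateID]; !ok || t.Anchor.After(cur) {
			latest[t.PredicateID] = *t.Anchor
		}
	}
	var out []Triple
	for _, t := range ts {
		if t.Anchor != nil && t.Anchor.Equal(latest[t.PredicateID]) {
			out = append(out, t)
		}
	}
	return out
}

// at parses an RFC 3339 timestamp and panics on malformed input.
func at(s string) *time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return &t
}

func main() {
	// Illustrative data only; the objects are made up for this sketch.
	ts := []Triple{
		{"/u<peter>", "parent_of", nil, "/u<mary>"},
		{"/u<peter>", "bought", at("2016-01-01T00:00:00-08:00"), "/c<a>"},
		{"/u<peter>", "bought", at("2016-04-01T00:00:00-08:00"), "/c<b>"},
		{"/u<peter>", "bought", at("2016-04-01T00:00:00-08:00"), "/c<c>"},
	}
	for _, t := range latestByPredicate(ts) {
		fmt.Println(t.Subject, t.PredicateID, t.Anchor.Format(time.RFC3339), t.Object)
	}
}
```

With this data, the two "bought" triples anchored at 2016-04-01 are kept, while the older "bought" triple and the immutable "parent_of" triple are dropped.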
… anchor only for now)
…ake FILTER for latest return multiple triples if they share the same predicate and same latest anchor
…n be applied to, improving error checking, and also move verification for supported filter functions to the planner level
…dition of multiple filter functions in the future)
…-prone when implementing the driver)
…ne when implementing the driver)
This PR comes as a part of #129, setting up a `FILTER` keyword in BadWolf. The first `FILTER` function chosen to be implemented is `latest`, solving what was requested by #86.

It would be very useful to have in BQL a `FILTER` keyword that allows us to filter out part of the results of a query at a level closer to the storage (closer to the driver), improving performance. This is exactly what this PR introduces, with the full implementation of how this keyword could work on the driver/storage side as well (exemplified by the changes to the volatile open-source driver in `memory.go`).

Now the user can specify, inside of `WHERE`, which bindings they want to apply a `FILTER` to, proceeding with a more fine-grained lookup on storage, avoiding unnecessary retrieval of data and optimizing query performance.

To illustrate, a query can now include a clause such as `FILTER latest(?p)` inside of `WHERE`. That would return all the temporal triples of the `?test` graph that have the latest timestamp of the time series they are part of (a recurring use case in BadWolf), skipping immutable triples found along the way. This `FILTER` function also works for objects `?o` in the case of reification/blank nodes: in this case the returned triples are the ones in which the object is necessarily a temporal predicate with the latest timestamp among the predicates with that same predicate ID (in the "object" position of the clause), analogously to what happens with `?p`.

This `FILTER` for `latest` anchor also works for alias bindings obtained with the `AS` keyword, for predicates and objects alike.

At the moment only one `FILTER` is supported for each graph clause inside of `WHERE`, and only one `FILTER` is supported for each given binding as well.

Regarding their position inside `WHERE`, `FILTER` clauses must come after all the graph pattern clauses, at the end of the `WHERE` (next to its closing bracket). Regarding trailing dots, a `FILTER` clause is treated just like any other graph clause inside of `WHERE`: the trailing dot is mandatory at the end of each clause (`FILTER` or graph ones alike), with the exception of the last one (for which the dot is optional).

In the future, to add a new `FILTER` function, the steps to follow are:
- … `filter.go`;
- … `SupportedOperations` map in `filter.go` to map the lowercase string of the `FILTER` function being added to its corresponding `filter.Operation` element;
- … `String` method of `Operation` in `filter.go`;
- … `compatibleBindingsInClauseForFilterOperation` in `planner.go` to specify which fields and bindings of a clause the newly added `filter.Operation` can be applied to;