It would be great to have the ability to express a filter on a dimension with an extractionFn.
Problem
It is currently not easy to express:
LOOKUP(`some_id`, 'some_lookup') REGEXP "^lol"
as a Druid filter. See this issue: implydata/pivot#119
Note: this is technically possible with the newly added cascade extractionFn, by combining a javascript extractionFn with the extraction filter like so: Cascade(lookup(some_lookup), javascript(d.match(/^lol/))) == "true". That is cumbersome, though, and while Plywood could be programmed to do it, people using the Druid API directly will have a hard time working that one out.
Suggested solution
Every filter that takes a dimension should also be able to take an extraction function, i.e. the filter's dimension would become a DimensionSpec.
So:
{
  "type": "regex",
  "dimension": "city",
  "pattern": "^lol"
}
Would be equivalent to:
{
  "type": "regex",
  "dimension": {
    "type": "default",
    "dimension": "city"
  },
  "pattern": "^lol"
}
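The equivalence of the two forms could be handled by a small normalization step before the filter is evaluated. The following is only a minimal Python sketch of that idea, not Druid code: the helper name `normalize_dimension` is hypothetical, and the spec shapes are assumed to match the JSON above.

```python
import copy


def normalize_dimension(filter_spec):
    """Return a copy of filter_spec whose "dimension" is always a DimensionSpec dict.

    Hypothetical helper for illustration only; not part of Druid.
    """
    normalized = copy.deepcopy(filter_spec)
    dimension = normalized.get("dimension")
    if isinstance(dimension, str):
        # A bare string is treated as shorthand for the default DimensionSpec.
        normalized["dimension"] = {"type": "default", "dimension": dimension}
    return normalized


short_form = {"type": "regex", "dimension": "city", "pattern": "^lol"}
long_form = {
    "type": "regex",
    "dimension": {"type": "default", "dimension": "city"},
    "pattern": "^lol",
}
```

With this in place, existing queries that pass a plain string would keep working unchanged, since both forms normalize to the same filter.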
But one could also write:
{
  "type": "regex",
  "dimension": {
    "type": "extraction",
    "extractionFn": {
      "type": "lookup",
      "lookup": ...
    }
  },
  "pattern": "^lol"
}
This would apply to all filters.
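To make the intended semantics concrete, here is a rough Python sketch of how such a regex filter might evaluate a row: the lookup is applied to the raw dimension value first, and the pattern is then tested against the result. The lookup table contents, the row data, and the function name are all made up for illustration, and the handling of missing lookup entries is an assumption rather than Druid's documented behavior.

```python
import re

# Hypothetical lookup table standing in for 'some_lookup'; the values are made up.
SOME_LOOKUP = {"1": "lolcat", "2": "doge", "3": "lollipop"}


def regex_filter_matches(row, dimension, lookup, pattern):
    """Apply the lookup to the raw dimension value, then test the regex on the result."""
    raw = row.get(dimension)
    extracted = lookup.get(raw)
    # Assumption: a value with no lookup entry is treated as a non-match.
    return extracted is not None and re.search(pattern, extracted) is not None


rows = [{"some_id": "1"}, {"some_id": "2"}, {"some_id": "3"}, {"some_id": "4"}]
matching = [
    row["some_id"]
    for row in rows
    if regex_filter_matches(row, "some_id", SOME_LOOKUP, "^lol")
]
```

Here `matching` ends up holding the ids whose looked-up value starts with "lol", which is what LOOKUP(`some_id`, 'some_lookup') REGEXP "^lol" is meant to express.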
Note that this change will make the extraction filter redundant as you could simply write:
{
  "type": "selector",
  "dimension": {
    "type": "extraction",
    "extractionFn": ...
  },
  "value": "blah"
}
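The redundancy can be illustrated by mechanically rewriting an existing extraction filter into the proposed selector form. This is a hedged Python sketch: the function name is hypothetical, the extraction filter is assumed to carry "dimension", "value", and "extractionFn" keys, and the extraction DimensionSpec is assumed to name its source dimension. The `upper` extractionFn is just a stand-in for whatever function the filter actually uses.

```python
def extraction_filter_to_selector(filter_spec):
    """Rewrite an extraction filter into a selector filter on an extraction DimensionSpec.

    Hypothetical helper for illustration only; other filter types pass through unchanged.
    """
    if filter_spec.get("type") != "extraction":
        return filter_spec
    return {
        "type": "selector",
        "dimension": {
            "type": "extraction",
            # Assumption: the extraction DimensionSpec names the source dimension.
            "dimension": filter_spec["dimension"],
            "extractionFn": filter_spec["extractionFn"],
        },
        "value": filter_spec["value"],
    }


old_style = {
    "type": "extraction",
    "dimension": "city",
    "value": "BLAH",
    "extractionFn": {"type": "upper"},  # illustrative stand-in extractionFn
}
new_style = extraction_filter_to_selector(old_style)
```

Since the rewrite is purely structural, the extraction filter could be kept as a deprecated alias for a transition period if backward compatibility matters.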