Ability to use extractionFns in any filter #2643

@vogievetsky

It would be great to have the ability to express a filter on a dimension with an extractionFn.

Problem

It is currently not easy to express:

LOOKUP(`some_id`, 'some_lookup') REGEXP "^lol"

as a Druid filter. See this issue: implydata/pivot#119

Note: this is technically possible with the newly added cascade extractionFn, by chaining the lookup with a JavaScript extractionFn and wrapping the result in the extraction filter, like so: Cascade(lookup(some_lookup), javascript(d.match(/^lol/))) == "true". But that is cumbersome, and while Plywood could be programmed to do it, people using the Druid API directly will have a hard time working that one out.

Suggested solution

Every filter that takes a dimension should also be able to take an extraction function, i.e. the filter's dimension would become a DimensionSpec.

So:

{
  "type": "regex",
  "dimension": "city",
  "pattern": "^lol"
}

Would be equivalent to:

{
  "type": "regex",
  "dimension": {
    "type": "default",
    "dimension": "city"
  },
  "pattern": "^lol"
}
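
The equivalence above could be handled by a small normalization step on the filter. The following is only a sketch in Python, not how Druid implements anything: the helper name `normalize_filter_dimension` is hypothetical, and it assumes filters are plain dicts shaped like the JSON in this issue.

```python
import copy


def normalize_filter_dimension(filter_spec):
    """Return a copy of filter_spec whose "dimension" is a DimensionSpec dict.

    Hypothetical helper: a bare string dimension is expanded to the
    {"type": "default", ...} form proposed in this issue. A dimension
    that is already a dict is left unchanged.
    """
    normalized = copy.deepcopy(filter_spec)
    dimension = normalized.get("dimension")
    if isinstance(dimension, str):
        normalized["dimension"] = {"type": "default", "dimension": dimension}
    return normalized


short_form = {"type": "regex", "dimension": "city", "pattern": "^lol"}
long_form = {
    "type": "regex",
    "dimension": {"type": "default", "dimension": "city"},
    "pattern": "^lol",
}

# Both spellings normalize to the same filter.
assert normalize_filter_dimension(short_form) == normalize_filter_dimension(long_form)
```

With a step like this, existing queries that pass a string would keep working while every filter internally sees a DimensionSpec.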

But one could also write:

{
  "type": "regex",
  "dimension": {
    "type": "extraction",
    "extractionFn": {
        "type": "lookup",
        "lookup": ...
     }
  },
  "pattern": "^lol"
}

This would apply to all filters.

Note that this change would make the extraction filter redundant, since you could simply write:

{
  "type": "selector",
  "dimension": {
    "type": "extraction",
    "extractionFn": ...
  },
  "value": "blah"
}
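
To illustrate why the extraction filter would become redundant, here is a rough Python sketch of the rewrite. It assumes an extraction filter is a dict with the keys "dimension", "value" and "extractionFn"; the function name `extraction_filter_to_selector` is hypothetical, and carrying the source column as an inner "dimension" key of the extraction DimensionSpec is also an assumption, mirroring the default DimensionSpec form.

```python
def extraction_filter_to_selector(extraction_filter):
    """Rewrite an extraction filter as a selector over an extraction DimensionSpec.

    Hypothetical helper; assumes the input dict has the keys
    "dimension", "value" and "extractionFn".
    """
    if extraction_filter.get("type") != "extraction":
        raise ValueError("expected a filter of type 'extraction'")
    return {
        "type": "selector",
        "dimension": {
            "type": "extraction",
            # Assumption: the spec names the column the function is applied to.
            "dimension": extraction_filter["dimension"],
            "extractionFn": extraction_filter["extractionFn"],
        },
        "value": extraction_filter["value"],
    }


# The extractionFn body is a placeholder, not a real Druid function type.
old_style = {
    "type": "extraction",
    "dimension": "some_id",
    "value": "blah",
    "extractionFn": {"type": "placeholder"},
}
new_style = extraction_filter_to_selector(old_style)
assert new_style["type"] == "selector"
assert new_style["value"] == "blah"
```

Since the rewrite is mechanical, a client library such as Plywood could apply it automatically if the proposal were adopted.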
