[Rust] [DataFusion] Add better and faster support for dictionary types #24639

@asfimport

Description

Use case: efficiently process large columns of low-cardinality strings.
 

  • BatchIterator should accept both DictionaryBatch and RecordBatch.
  • The type coercion optimizer rule should inject an expression that converts dictionary value types to index types (for equality expressions and IN(values, ...)).
  • The physical expression would look up the index of each dictionary value referenced in the query, so that at runtime only indices are compared per batch.
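
The index-comparison idea in the last two points can be sketched roughly as follows. This is a minimal illustration using only the Rust standard library, not the Arrow or DataFusion API: the DictColumn type and its from_rows, key_of, eq_literal, and in_list methods are hypothetical names introduced here, and null handling is left out. The literal from the query is resolved to a dictionary index once, and the per-row work is then an integer comparison only.

```rust
use std::collections::HashMap;

/// Hypothetical dictionary-encoded string column (not an Arrow type):
/// each row stores a small integer key pointing into `values`.
struct DictColumn {
    values: Vec<String>, // distinct values, i.e. the dictionary
    keys: Vec<u32>,      // one dictionary index per row
}

impl DictColumn {
    /// Build the dictionary and the per-row keys from plain strings.
    fn from_rows(rows: &[&str]) -> Self {
        let mut index: HashMap<&str, u32> = HashMap::new();
        let mut values: Vec<String> = Vec::new();
        let mut keys = Vec::with_capacity(rows.len());
        for &row in rows {
            // Assign the next free index the first time a value is seen.
            let key = *index.entry(row).or_insert_with(|| {
                values.push(row.to_string());
                (values.len() - 1) as u32
            });
            keys.push(key);
        }
        DictColumn { values, keys }
    }

    /// Resolve a query literal to its dictionary index, once per batch.
    fn key_of(&self, literal: &str) -> Option<u32> {
        self.values
            .iter()
            .position(|v| v == literal)
            .map(|i| i as u32)
    }

    /// `column = literal`: string comparison happens only in `key_of`;
    /// the per-row loop compares integers.
    fn eq_literal(&self, literal: &str) -> Vec<bool> {
        match self.key_of(literal) {
            Some(target) => self.keys.iter().map(|&k| k == target).collect(),
            // A literal absent from the dictionary cannot match any row.
            None => vec![false; self.keys.len()],
        }
    }

    /// `column IN (literals...)`: mark matching dictionary entries once,
    /// then answer each row with a table lookup on its key.
    fn in_list(&self, literals: &[&str]) -> Vec<bool> {
        let mut hit = vec![false; self.values.len()];
        for lit in literals {
            if let Some(k) = self.key_of(lit) {
                hit[k as usize] = true;
            }
        }
        self.keys.iter().map(|&k| hit[k as usize]).collect()
    }
}
```

For a column built from the rows "us", "de", "us", "fr", "de", calling eq_literal("de") performs at most three string comparisons (one per dictionary entry) and five integer comparisons (one per row), which is where the benefit for low-cardinality columns would come from.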

Reporter: Andy Grove / @andygrove

Note: This issue was originally created as ARROW-8464. Please see the migration documentation for further details.
