
Future of experimental optimizer datafusion-tokomak #440

@Dandandan

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
This issue is for discussing the future of datafusion-tokomak, an experimental optimizer using the egg library.

It currently optimizes Exprs and includes many optimizations that DataFusion does not yet perform.
I envision it could be extended to support logical plans or physical plans too.

The optimizer using egg has the following nice properties, which are hard to achieve otherwise:

  • Developing new rules is easy: most of them can be added in one line, or they could hook into a trait.
  • It performs more aggressive optimizations than a handwritten optimizer, because it applies multiple rules at once and can apply and remember rewrites that don't reduce the cost (such as reordering expressions).
  • It supports custom cost functions. It currently uses the size of the AST, but this could easily be changed to something else.
  • It is fast for big programs / trees.
  • It is written in Rust, so it integrates really well.
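
To make the second and third points concrete, here is a minimal sketch of why cost-neutral rewrites matter and what "cost = size of the AST" means. It does not use egg or datafusion-tokomak: the `Expr` type, the rules, and `optimize` are all hypothetical, and a brute-force search over equivalent expressions stands in for the e-graph, which stores the same set of alternatives far more compactly.

```rust
use std::collections::HashSet;

/// Toy expression type (hypothetical; not DataFusion's `Expr`).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum Expr {
    Num(i64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
}

/// Convenience constructor for `a + b`.
fn add(a: Expr, b: Expr) -> Expr {
    Expr::Add(Box::new(a), Box::new(b))
}

/// Cost function: the number of nodes in the AST.
fn ast_size(e: &Expr) -> usize {
    match e {
        Expr::Add(a, b) => 1 + ast_size(a) + ast_size(b),
        _ => 1,
    }
}

/// Every expression obtained by applying one rule at one position in `e`.
fn rewrites(e: &Expr) -> Vec<Expr> {
    let mut out = Vec::new();
    if let Expr::Add(a, b) = e {
        // Commutativity: a + b => b + a (does not change the cost).
        out.push(add((**b).clone(), (**a).clone()));
        // Associativity: (x + y) + b => x + (y + b) (does not change the cost).
        if let Expr::Add(x, y) = &**a {
            out.push(add((**x).clone(), add((**y).clone(), (**b).clone())));
        }
        // Constant folding: m + n => the literal sum (reduces the cost).
        if let (Expr::Num(m), Expr::Num(n)) = (&**a, &**b) {
            if let Some(sum) = m.checked_add(*n) {
                out.push(Expr::Num(sum));
            }
        }
        // Apply the same rules inside each child.
        for a2 in rewrites(a) {
            out.push(add(a2, (**b).clone()));
        }
        for b2 in rewrites(b) {
            out.push(add((**a).clone(), b2));
        }
    }
    out
}

/// Remember every equivalent expression found within `max_rounds` rounds of
/// rewriting, then extract one with the lowest cost.
fn optimize(start: &Expr, max_rounds: usize) -> Expr {
    let mut seen: HashSet<Expr> = HashSet::new();
    seen.insert(start.clone());
    let mut frontier = vec![start.clone()];
    for _ in 0..max_rounds {
        let mut next = Vec::new();
        for e in &frontier {
            for r in rewrites(e) {
                // Only expressions not seen before are explored further.
                if seen.insert(r.clone()) {
                    next.push(r);
                }
            }
        }
        if next.is_empty() {
            break; // Nothing new was found: the search is saturated.
        }
        frontier = next;
    }
    seen.into_iter().min_by_key(|e| ast_size(e)).unwrap()
}
```

For `(1 + x) + 2`, which has 5 nodes, `optimize` with a budget of 5 rounds returns a 3-node expression, either `x + 3` or `3 + x` depending on set iteration order. Getting there requires first reordering to `(x + 1) + 2` and then re-associating to `x + (1 + 2)`, neither of which reduces the cost on its own; a greedy optimizer that only accepts cost-reducing steps would stop at the original expression.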

Some material about egg is available at https://egraphs-good.github.io/

Describe the solution you'd like
Some options:

  • Integrate it into DataFusion as an optional feature.
  • Add it to DataFusion as a separate crate.
  • Keep it in a separate repo as is, with releases to crates.io in sync with DataFusion releases.
  • Add it as an experimental repo / branch under the Apache organization.

Describe alternatives you've considered
n/a
