[Rust] [DataFusion] Implement "coalesce batches" operator #18437

@asfimport

Description

When a plan contains a FilterExec, the filter can produce many small batches, and we therefore lose the efficiency of vectorized operations.

We should implement a new CoalesceBatchExec and wrap every FilterExec with one of these so that small batches can be recombined into larger batches to improve the efficiency of upstream operators.
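
To make the idea concrete, here is a minimal sketch of the buffering logic such an operator might use. It is not the DataFusion implementation: a batch is modeled as a plain `Vec<i64>` rather than an Arrow `RecordBatch`, and the names `Batch`, `BatchCoalescer`, `target_rows`, and `coalesce_all` are hypothetical, chosen only for illustration. The assumption is that small batches are buffered until a target row count is reached and then concatenated into one output batch.

```rust
/// Simplified stand-in for an Arrow RecordBatch: a single column of values.
/// (Assumption for illustration only; a real operator works on RecordBatch.)
type Batch = Vec<i64>;

/// Buffers small batches and emits one combined batch once at least
/// `target_rows` rows have accumulated. Hypothetical name and design.
struct BatchCoalescer {
    target_rows: usize,
    buffer: Vec<Batch>,
    buffered_rows: usize,
}

impl BatchCoalescer {
    fn new(target_rows: usize) -> Self {
        BatchCoalescer { target_rows, buffer: Vec::new(), buffered_rows: 0 }
    }

    /// Accept one input batch; return a combined batch if the target is reached.
    fn push(&mut self, batch: Batch) -> Option<Batch> {
        if batch.is_empty() {
            return None; // a filter may drop every row; nothing to buffer
        }
        self.buffered_rows += batch.len();
        self.buffer.push(batch);
        if self.buffered_rows >= self.target_rows {
            self.flush()
        } else {
            None
        }
    }

    /// Concatenate and emit whatever is buffered (also called at end of input).
    fn flush(&mut self) -> Option<Batch> {
        if self.buffer.is_empty() {
            return None;
        }
        let mut out = Vec::with_capacity(self.buffered_rows);
        for b in self.buffer.drain(..) {
            out.extend(b);
        }
        self.buffered_rows = 0;
        Some(out)
    }
}

/// Drive the coalescer over a whole input stream of batches.
fn coalesce_all(input: Vec<Batch>, target_rows: usize) -> Vec<Batch> {
    let mut coalescer = BatchCoalescer::new(target_rows);
    let mut out = Vec::new();
    for batch in input {
        if let Some(big) = coalescer.push(batch) {
            out.push(big);
        }
    }
    // Emit the final partial batch, if any rows remain.
    if let Some(rest) = coalescer.flush() {
        out.push(rest);
    }
    out
}
```

With a target of 3 rows, the input batches `[1, 2]`, `[3]`, `[]`, `[4, 5, 6]`, `[7]` come out as `[1, 2, 3]`, `[4, 5, 6]`, `[7]`. Note that in this sketch an output batch can exceed the target, because a batch is never split; whether a real operator should also split oversized batches is a separate design decision.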

Reporter: Andy Grove / @andygrove
Assignee: Andy Grove / @andygrove

Note: This issue was originally created as ARROW-11058. Please see the migration documentation for further details.