Closed
Labels
enhancement (New feature or request)
Description
Is your feature request related to a problem or challenge?
EXPLAIN SELECT c1, c2, c3 FROM sink_table WHERE c3 > 0 LIMIT 5;
----
logical_plan
Limit: skip=0, fetch=5
--Filter: sink_table.c3 > Int16(0)
----TableScan: sink_table projection=[c1, c2, c3]
physical_plan
GlobalLimitExec: skip=0, fetch=5
--CoalescePartitionsExec
----CoalesceBatchesExec: target_batch_size=8192
------FilterExec: c3@2 > 0
--------RepartitionExec: partitioning=RoundRobinBatch(3), input_partitions=1
----------StreamingTableExec: partition_sizes=1, projection=[c1, c2, c3], infinite_source=true
The example query in repartition.slt produces no output until CoalesceBatchesExec has buffered target_batch_size (8192) rows, even though the limit only needs 5. With an infinite source, this delays the query result. We can push the limit down into CoalesceBatchesExec here.
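To make the delay concrete, here is a minimal sketch of a coalescer that buffers small batches, with an optional fetch that lets it flush early. This is a toy model, not DataFusion's CoalesceBatchesExec: the `Coalescer` type, its `fetch` field, and the `push` method are hypothetical names introduced only for illustration.

```rust
/// Toy stand-in for a batch coalescer; NOT DataFusion's implementation.
/// A "batch" here is just a slice of row values.
struct Coalescer {
    target_batch_size: usize,
    /// Hypothetical: total rows the downstream limit needs, if known.
    fetch: Option<usize>,
    buffer: Vec<i64>,
    emitted: usize,
}

impl Coalescer {
    fn new(target_batch_size: usize, fetch: Option<usize>) -> Self {
        Self { target_batch_size, fetch, buffer: Vec::new(), emitted: 0 }
    }

    /// Buffer one input batch; return an output batch once enough rows exist.
    fn push(&mut self, batch: &[i64]) -> Option<Vec<i64>> {
        self.buffer.extend_from_slice(batch);
        // Rows still needed to satisfy the limit, if a limit was pushed down.
        let remaining = self.fetch.map(|f| f.saturating_sub(self.emitted));
        let threshold = match remaining {
            Some(r) => r.min(self.target_batch_size),
            None => self.target_batch_size,
        };
        if threshold == 0 {
            // The limit is already satisfied; drop any further input.
            self.buffer.clear();
            return None;
        }
        if self.buffer.len() < threshold {
            return None; // keep waiting for more rows
        }
        let mut out = std::mem::take(&mut self.buffer);
        if let Some(r) = remaining {
            out.truncate(r); // never emit more rows than the limit needs
        }
        self.emitted += out.len();
        Some(out)
    }
}
```

Without a fetch, `Coalescer::new(8192, None)` keeps returning `None` while a slow source trickles in a few rows at a time. With `Coalescer::new(8192, Some(5))`, the first batch is emitted as soon as five rows have arrived.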
Describe the solution you'd like
A similar rule already exists in logical planning. We can add a physical optimizer rule that pushes the limit count down the plan until it encounters a limit-breaking operator (joins, windows, sorts). If the limit reaches a CoalesceBatchesExec before such an operator, the rule can set a new requested batch size on it.
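The proposed rule could be sketched roughly as follows, over a toy plan tree rather than DataFusion's actual ExecutionPlan trait. The `Plan` enum, the `fetch` field on `CoalesceBatches`, and `push_down_limit` are all hypothetical. The sketch also makes several assumptions of its own: it ignores `skip`, it treats a partition-merging node as transparent to the limit, it stops descending once it reaches a CoalesceBatches node, and it conservatively stops at a filter because a filter can drop rows.

```rust
/// Toy physical plan; variant names only loosely mirror DataFusion operators.
#[derive(Debug, PartialEq)]
enum Plan {
    Limit { fetch: usize, input: Box<Plan> },
    CoalescePartitions { input: Box<Plan> },
    CoalesceBatches { target_batch_size: usize, fetch: Option<usize>, input: Box<Plan> },
    Sort { input: Box<Plan> },
    Filter { input: Box<Plan> },
    Scan,
}

/// Walk the plan top-down, carrying the tightest limit seen so far.
fn push_down_limit(plan: Plan, limit: Option<usize>) -> Plan {
    match plan {
        Plan::Limit { fetch, input } => {
            // Combine with any limit already carried from above.
            let tighter = Some(limit.map_or(fetch, |l| l.min(fetch)));
            Plan::Limit { fetch, input: Box::new(push_down_limit(*input, tighter)) }
        }
        // Assumption: merging partitions does not change how many rows
        // each partition needs to supply at most.
        Plan::CoalescePartitions { input } => Plan::CoalescePartitions {
            input: Box::new(push_down_limit(*input, limit)),
        },
        // The limit reached a coalescer: record it so the node can flush early.
        Plan::CoalesceBatches { target_batch_size, fetch, input } => Plan::CoalesceBatches {
            target_batch_size,
            fetch: limit.or(fetch),
            input: Box::new(push_down_limit(*input, None)),
        },
        // Limit breaker: a sort needs all of its input, so drop the limit.
        Plan::Sort { input } => Plan::Sort {
            input: Box::new(push_down_limit(*input, None)),
        },
        // Conservative choice in this sketch: a filter may discard rows,
        // so the row count cannot be pushed below it.
        Plan::Filter { input } => Plan::Filter {
            input: Box::new(push_down_limit(*input, None)),
        },
        Plan::Scan => Plan::Scan,
    }
}
```

Applied to a tree shaped like the plan above (Limit over CoalescePartitions over CoalesceBatches over Filter over Scan), calling `push_down_limit(plan, None)` leaves the tree unchanged except that the CoalesceBatches node now carries `fetch: Some(5)`. If a Sort sits between the Limit and the CoalesceBatches node, the fetch stays `None`.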
Describe alternatives you've considered
No response
Additional context
No response