What is the problem the feature request solves?
I added some debug logging to CometNativeIterator to show the size of the batches processed when running TPC-H q14, and I see lots of small batches:
Creating batch with 97 rows
Creating batch with 86 rows
Creating batch with 87 rows
Creating batch with 72 rows
Creating batch with 80 rows
...
The query processes 73,456 batches with fewer than 1000 rows and 2,448 batches with at least 1000 rows, so roughly 97% of the 75,904 batches are small.
I wonder if there would be a performance benefit in coalescing these small batches into larger batches (if that is even possible -- I do not have full information on the context yet).
My theory is that we have some overhead per batch and that we could reduce that overhead if we had larger batches. This issue is for analyzing this and writing up some findings.
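To make the idea concrete, here is a minimal sketch of what coalescing could look like. It is plain Python over lists of rows rather than Arrow batches, and the names `Batch`, `coalesce_batches`, and `target_rows` are made up for illustration; none of this is Comet or DataFusion code. It only shows the buffering logic, not where in the pipeline it would need to go.

```python
from typing import Iterable, Iterator, List

# Hypothetical stand-in for a columnar record batch: one int per row.
Batch = List[int]


def coalesce_batches(batches: Iterable[Batch], target_rows: int) -> Iterator[Batch]:
    """Yield batches with at least target_rows rows, except possibly the last."""
    if target_rows <= 0:
        raise ValueError("target_rows must be positive")
    buffered: Batch = []
    for batch in batches:
        if not buffered and len(batch) >= target_rows:
            # Already large enough and nothing pending: pass through without copying.
            yield batch
            continue
        # Accumulate rows until the buffer reaches the target size.
        buffered.extend(batch)
        if len(buffered) >= target_rows:
            yield buffered
            buffered = []
    if buffered:
        # Flush whatever is left at the end of the input.
        yield buffered


# Batch sizes taken from the debug log above; the target of 200 is arbitrary.
small = [list(range(n)) for n in (97, 86, 87, 72, 80)]
merged = list(coalesce_batches(small, target_rows=200))
print([len(b) for b in merged])  # prints [270, 152]
```

With these five logged batches the downstream operator would see two batches instead of five. Whether that saves more than the copy into the buffer costs is exactly what the analysis needs to measure.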
Describe the potential solution
No response
Additional context
No response