Description
Now that we have a good spilling implementation in #18207, do we still want bounded channels for the in-memory data? This was first introduced in #4867.
My feeling is that we could probably drop it. The one situation I am worried about is when RepartitionExec's buffers fill up and consume the entire memory budget, at which point the query fails. If we had "cooperative" spilling, RepartitionExec would be the ideal candidate to spill and this would not be a problem: an upstream GroupBy could ask other operators to spill, and RepartitionExec could spill easily and free up memory. But today that's not the case.
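To make the concern concrete, here is a minimal sketch of the difference between today's behavior and "cooperative" spilling. Everything in it is hypothetical: `Pool`, `try_grow_cooperative`, and the byte counts are made up for illustration and are not DataFusion's `MemoryPool` API. "Spilling" is modeled simply as resetting an operator's in-memory usage to zero.

```rust
use std::collections::HashMap;

/// Hypothetical memory pool with a fixed budget (NOT DataFusion's API).
struct Pool {
    budget: usize,
    /// Bytes currently held in memory, keyed by operator name.
    used: HashMap<&'static str, usize>,
    /// Operators that could cheaply spill their buffers to disk.
    spillable: Vec<&'static str>,
}

impl Pool {
    fn new(budget: usize) -> Self {
        Pool { budget, used: HashMap::new(), spillable: Vec::new() }
    }

    fn total(&self) -> usize {
        self.used.values().sum()
    }

    /// Non-cooperative reservation: fail as soon as the budget would be exceeded.
    fn try_grow(&mut self, who: &'static str, bytes: usize) -> Result<(), String> {
        if self.total() + bytes > self.budget {
            return Err(format!("{who}: cannot reserve {bytes} bytes"));
        }
        *self.used.entry(who).or_insert(0) += bytes;
        Ok(())
    }

    /// Cooperative reservation (assumed design): before failing, ask other
    /// spillable operators to release their in-memory buffers.
    fn try_grow_cooperative(&mut self, who: &'static str, bytes: usize) -> Result<(), String> {
        let candidates = self.spillable.clone();
        for victim in candidates {
            if self.total() + bytes <= self.budget {
                break; // enough room already, no need to spill more
            }
            if victim != who {
                // Stand-in for writing the victim's buffers to disk.
                self.used.insert(victim, 0);
            }
        }
        self.try_grow(who, bytes)
    }
}

/// Returns (plain reservation succeeded, cooperative reservation succeeded).
fn demo() -> (bool, bool) {
    let mut pool = Pool::new(100);
    pool.spillable.push("RepartitionExec");
    // RepartitionExec's buffers take most of the budget.
    pool.try_grow("RepartitionExec", 90).unwrap();
    // Without cooperation the GroupBy reservation fails, so the query fails.
    let plain = pool.try_grow("GroupBy", 50).is_ok();
    // With cooperation RepartitionExec spills first and the reservation fits.
    let cooperative = pool.try_grow_cooperative("GroupBy", 50).is_ok();
    (plain, cooperative)
}

fn main() {
    let (plain, cooperative) = demo();
    println!("plain: {plain}, cooperative: {cooperative}");
}
```

In this toy model the plain reservation fails while the cooperative one succeeds, which is the gap described above: without some way for the GroupBy to trigger a spill elsewhere, unbounded RepartitionExec buffers can starve it.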
One way to collect more information is to run ClickBench (and other benchmarks) without the Distribution infrastructure and compare runtimes, peak memory use, and behavior under constrained memory budgets.