Closed
Description
Calculating `ArrayRechunkRun._slicing` takes a long time during the initialization of the worker-side state and creates a huge mapping that can quickly grow to several gigabytes or more. From an algorithmic perspective, we only need the slicing information of a single input chunk at any given point in time. To improve the memory footprint and startup time of P2P rechunking, we should calculate the slicing information on the fly.
Note that we must also generate the information about where each slice belongs in the output chunk efficiently.
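
To make the idea concrete, here is a rough sketch of how the slicing for a single input chunk could be computed lazily. It is not the existing implementation: `axis_boundaries`, `axis_slices`, and `chunk_slices` are hypothetical names, and the sketch assumes that `old` and `new` are chunk-size tuples per axis, as in `dask.array`'s `chunks` attribute. For each overlap it yields the output chunk index, the slice to take from the input chunk, and the slice where that data lands in the output chunk.

```python
from bisect import bisect_right
from itertools import accumulate, product


def axis_boundaries(chunks):
    """Cumulative chunk offsets along one axis, e.g. (4, 4) -> [0, 4, 8]."""
    return [0, *accumulate(chunks)]


def axis_slices(old_bounds, new_bounds, i):
    """Yield (new_index, slice_in_old_chunk, slice_in_new_chunk) for
    old chunk ``i`` along a single axis."""
    start, stop = old_bounds[i], old_bounds[i + 1]
    # Binary search for the first new chunk that can overlap `start`.
    j = bisect_right(new_bounds, start) - 1
    while j < len(new_bounds) - 1 and new_bounds[j] < stop:
        lo = max(start, new_bounds[j])
        hi = min(stop, new_bounds[j + 1])
        if lo < hi:  # skip empty overlaps (zero-length chunks)
            yield (
                j,
                slice(lo - start, hi - start),  # relative to the old chunk
                slice(lo - new_bounds[j], hi - new_bounds[j]),  # relative to the new chunk
            )
        j += 1


def chunk_slices(old, new, index):
    """Lazily yield (new_chunk_index, source_slices, target_slices) for the
    N-dimensional input chunk at ``index``.

    The boundaries are recomputed on every call to keep the sketch short;
    a real implementation would likely compute them once and reuse them.
    """
    per_axis = [
        list(axis_slices(axis_boundaries(o), axis_boundaries(n), i))
        for o, n, i in zip(old, new, index)
    ]
    for combo in product(*per_axis):
        yield (
            tuple(c[0] for c in combo),
            tuple(c[1] for c in combo),
            tuple(c[2] for c in combo),
        )


# Example: rechunk a 2-D array from chunks ((4, 4), (5,)) to ((3, 3, 2), (2, 3))
old = ((4, 4), (5,))
new = ((3, 3, 2), (2, 3))
for new_index, source, target in chunk_slices(old, new, (1, 0)):
    print(new_index, source, target)
```

With this approach, the only state that needs to be kept is the per-axis boundaries, which grow with the number of chunks per axis rather than with the total number of slices.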