Description
When miniblock encoding is used in a Lance file, reading the file with the v2 FileReader via the read_stream_projected API can become inefficient if the provided ReadBatchParams::Indices contains many nearby but non-contiguous row indices.
For example:
29, 168, 180, 194, 376, 559, 574, 665, 666, 667, ..., 968, 969, 970, 973, 975, ...
This kind of access pattern causes the same chunk to be decoded repeatedly, resulting in slow performance and high CPU usage.
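To illustrate why this pattern is costly, here is a minimal sketch of grouping sorted row indices by the chunk that contains them, so that each chunk would need to be decoded only once. This is not Lance's implementation: `group_by_chunk` and the fixed `rows_per_chunk` are hypothetical names introduced for illustration, and real miniblock chunk boundaries come from the file's metadata and need not be uniform.

```rust
/// Groups sorted row indices into consecutive runs that fall in the same chunk.
///
/// `rows_per_chunk` is a hypothetical fixed chunk size used only for this
/// illustration; it does not reflect how Lance lays out miniblock chunks.
/// Returns `(chunk_id, rows_in_that_chunk)` pairs in input order.
fn group_by_chunk(indices: &[u64], rows_per_chunk: u64) -> Vec<(u64, Vec<u64>)> {
    assert!(rows_per_chunk > 0, "rows_per_chunk must be non-zero");
    let mut groups: Vec<(u64, Vec<u64>)> = Vec::new();
    for &row in indices {
        let chunk = row / rows_per_chunk;
        // Extend the current group if this row lands in the same chunk.
        if let Some((last_chunk, rows)) = groups.last_mut() {
            if *last_chunk == chunk {
                rows.push(row);
                continue;
            }
        }
        // Otherwise start a new group for the new chunk.
        groups.push((chunk, vec![row]));
    }
    groups
}
```

With an assumed chunk size of 256 rows, the first ten indices from the example above fall into only three chunks, so a reader that treats each index as a separate request could decode up to ten chunks where three decodes would suffice.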