perf: avoid re-open shard indices and small reads #6026
BubbleCal commented on Feb 26, 2026
- reuse the readers and avoid reopening them in each iteration
- read multiple partitions in one read to avoid small IOs
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
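The two changes described above can be illustrated with a minimal sketch. This is not the code from this PR: the function name `read_partitions_windowed`, the byte-offset layout, and the use of `std::io::Read + Seek` are assumptions made for illustration only. The idea is that the caller opens the reader once and passes it in, and each window of consecutive partitions is fetched with a single contiguous read that is then sliced per partition.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical helper: reads all partitions from an already-open reader.
///
/// `offsets` has `num_partitions + 1` entries; partition `i` occupies the
/// byte range `offsets[i]..offsets[i + 1]` (an assumed layout).
/// `window` is the number of consecutive partitions fetched per read.
fn read_partitions_windowed<R: Read + Seek>(
    reader: &mut R, // opened once by the caller and reused for every window
    offsets: &[u64],
    window: usize,
) -> io::Result<Vec<Vec<u8>>> {
    assert!(window > 0, "window must be non-zero");
    let num_partitions = offsets.len().saturating_sub(1);
    let mut out = Vec::with_capacity(num_partitions);

    let mut start = 0;
    while start < num_partitions {
        let end = (start + window).min(num_partitions);

        // One contiguous read covering partitions start..end,
        // instead of one small read per partition.
        let base = offsets[start];
        let len = (offsets[end] - base) as usize;
        let mut buf = vec![0u8; len];
        reader.seek(SeekFrom::Start(base))?;
        reader.read_exact(&mut buf)?;

        // Slice the window buffer back into individual partitions.
        for p in start..end {
            let lo = (offsets[p] - base) as usize;
            let hi = (offsets[p + 1] - base) as usize;
            out.push(buf[lo..hi].to_vec());
        }
        start = end;
    }
    Ok(out)
}
```

With `window = 1` this degenerates to one read per partition; larger windows trade fewer IO calls for a bigger temporary buffer.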
Code Review
Summary: Performance optimization to avoid re-opening shard readers and batch partition reads using a 32-partition window.
Feedback
P1 - Test Coverage Gap: The existing
Minor observation: Holding all
The implementation logic appears correct with good error handling for row count mismatches.
Codecov Report❌ Patch coverage is
just out of curiosity, how did you choose 512 as the partition window size? how big should we expect that to be in memory?
i assume we have roughly 4K rows per partition (the default index param)
good catch, just added a test for that window size
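For a rough sense of the memory question, the arithmetic can be sketched as follows. The function names and the per-row byte size are hypothetical; only the 512-partition window and the roughly 4K rows per partition come from the discussion above, and 4K is taken to mean 4096 here.

```rust
/// Rows held in memory for one window of partitions.
fn window_rows(partitions_per_window: u64, rows_per_partition: u64) -> u64 {
    partitions_per_window * rows_per_partition
}

/// Estimated bytes for one window, given an assumed size per row.
/// `bytes_per_row` is a placeholder: it depends on the index type and
/// is not stated anywhere in this thread.
fn window_bytes(partitions_per_window: u64, rows_per_partition: u64, bytes_per_row: u64) -> u64 {
    window_rows(partitions_per_window, rows_per_partition) * bytes_per_row
}
```

Under these assumptions, `window_rows(512, 4096)` is 2,097,152 rows, so the in-memory size of a window is about 2.1 million times whatever a single row costs.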
1. reuse the readers and avoid reopening them in each iteration
2. read multiple partitions in one read to avoid small IOs
---------
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
Co-authored-by: Lei Xu <lei@lancedb.com>