perf: improve parallelism of data_stats#5990

Merged
wjones127 merged 2 commits intolance-format:mainfrom
wkalt:task/improve-data-stats-parallelism
Feb 23, 2026
@wkalt (Contributor) commented Feb 23, 2026

Prior to this commit, data_stats looped serially over fragment metadata, issuing one IO per fragment. For datasets with a large number of fragments, this can take a long time.

This commit parallelises this call, up to the parallelism of the object store.
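
For illustration, here is a minimal sketch of the serial-versus-bounded-parallel pattern described above. It is not the code from this PR: `Fragment`, `load_fragment_bytes`, `data_size_serial`, `data_size_parallel`, and the `io_parallelism` parameter are all hypothetical names, and the sketch uses std-only scoped threads to stay self-contained, whereas the actual change may well use async streams instead.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

/// Hypothetical stand-in for one fragment's metadata.
struct Fragment {
    num_rows: u64,
}

/// Hypothetical stand-in for the single IO issued per fragment.
/// Here we just pretend every row occupies 8 bytes.
fn load_fragment_bytes(fragment: &Fragment) -> u64 {
    fragment.num_rows * 8
}

/// Serial baseline: one IO after another.
fn data_size_serial(fragments: &[Fragment]) -> u64 {
    fragments.iter().map(load_fragment_bytes).sum()
}

/// Bounded-parallel variant: at most `io_parallelism` workers pull
/// fragment indices from a shared counter until none are left.
fn data_size_parallel(fragments: &[Fragment], io_parallelism: usize) -> u64 {
    // Never spawn zero workers, and never more workers than fragments.
    let workers = io_parallelism.max(1).min(fragments.len().max(1));
    let next = AtomicUsize::new(0);

    thread::scope(|scope| {
        let handles: Vec<_> = (0..workers)
            .map(|_| {
                scope.spawn(|| {
                    let mut total = 0u64;
                    loop {
                        // Claim the next unprocessed fragment index.
                        let i = next.fetch_add(1, Ordering::Relaxed);
                        match fragments.get(i) {
                            Some(fragment) => total += load_fragment_bytes(fragment),
                            None => break total,
                        }
                    }
                })
            })
            .collect();

        // Combine the per-worker partial sums.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("worker panicked"))
            .sum::<u64>()
    })
}
```

Calling `data_size_parallel(&fragments, 8)` should return the same total as `data_size_serial(&fragments)`; only the number of IOs in flight at once differs. Capping the worker count, rather than spawning one task per fragment, is what keeps the request rate within what the store can reasonably serve.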

@wjones127 wjones127 merged commit d27efcc into lance-format:main Feb 23, 2026
23 of 28 checks passed
codecov Bot commented Feb 23, 2026

Codecov Report

❌ Patch coverage is 0% with 33 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| rust/lance/src/dataset/statistics.rs | 0.00% | 19 Missing ⚠️ |
| rust/lance/src/dataset/fragment.rs | 0.00% | 14 Missing ⚠️ |
