perf: intern RowDatasetVersionMeta inline bytes to reduce manifest memory #6499
Merged
jackye1995 merged 4 commits into lance-format:main on Apr 14, 2026
…mory
Change RowDatasetVersionMeta::Inline from Vec<u8> to Arc<[u8]> and deduplicate identical byte payloads during manifest deserialization via DataFileFieldInterner.
Post-compaction, all fragments are stamped with the same version metadata, so at 20M fragments this saves ~480 MB of redundant heap allocations for each of last_updated_at and created_at version meta.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
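The deduplication described in this commit can be sketched as follows. This is only an illustration under stated assumptions: `BytesInterner`, `unique_payloads`, and `stamp_fragments` are hypothetical names invented for the example, and the real `DataFileFieldInterner` may be structured quite differently.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Hypothetical interner for inline version-meta bytes (not the actual
/// DataFileFieldInterner from the PR).
#[derive(Default)]
struct BytesInterner {
    // Arc<[u8]> hashes and compares by content, so the set can be
    // queried directly with a plain &[u8].
    seen: HashSet<Arc<[u8]>>,
}

impl BytesInterner {
    /// Returns a shared handle for `bytes`. A new heap allocation is made
    /// only the first time a given payload is seen.
    fn intern(&mut self, bytes: &[u8]) -> Arc<[u8]> {
        if let Some(existing) = self.seen.get(bytes) {
            return Arc::clone(existing);
        }
        let shared: Arc<[u8]> = Arc::from(bytes);
        self.seen.insert(Arc::clone(&shared));
        shared
    }

    /// Number of distinct payloads seen so far.
    fn unique_payloads(&self) -> usize {
        self.seen.len()
    }
}

/// Simulates deserializing `n` fragments that all carry the same payload,
/// as happens after a compaction. Every returned handle points at the
/// same allocation.
fn stamp_fragments(payload: &[u8], n: usize) -> Vec<Arc<[u8]>> {
    let mut interner = BytesInterner::default();
    (0..n).map(|_| interner.intern(payload)).collect()
}
```

With a per-fragment `Vec<u8>`, the same call pattern would produce `n` separate heap allocations; here the fragments only hold reference-counted pointers to one.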
Measures deserialization throughput and memory usage with and without interning for DataFile fields, column_indices, and RowDatasetVersionMeta inline bytes.
At 100K fragments (10 fields):
  Without interning: 39.47 MB, 24.3 ms
  With interning: 29.74 MB, 28.2 ms
  Savings: 9.73 MB (24.6%), +16% deser time
At 100K fragments (50 fields):
  Without interning: 69.99 MB, 30.9 ms
  With interning: 29.74 MB, 35.3 ms
  Savings: 40.24 MB (57.5%), +14% deser time
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace pure Vec cache with InternCache that uses linear scan for <=16 entries and upgrades to HashMap for larger caches. This keeps the common case fast (1-3 unique values → no hashing) while avoiding O(n) scan degradation with many unique values (e.g., 500 distinct version payloads from many small appends).
At 100K fragments (10 fields):
  Uniform (1 unique): 24.5 ms (no intern) → 17.9 ms (intern), 27% faster
  Diverse (100 unique): 26.0 ms (no intern) → 23.4 ms (intern), 10% faster
  Diverse (500 unique): 26.0 ms (no intern) → 22.8 ms (intern), 12% faster
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
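One way to picture the hybrid cache from this commit is the sketch below. It is an assumption-laden illustration, not the PR's code: the enum layout, the `LINEAR_SCAN_MAX` constant, and the `is_hashed` and `len` helpers are hypothetical, and a `HashSet` stands in for the HashMap mentioned in the commit message to keep the example short. Only the 16-entry threshold comes from the source.

```rust
use std::collections::HashSet;
use std::hash::Hash;
use std::sync::Arc;

/// Threshold from the commit message: linear scan for <= 16 entries.
const LINEAR_SCAN_MAX: usize = 16;

/// Hypothetical hybrid intern cache; the real InternCache<T> may differ.
enum InternCache<T> {
    /// Few unique values: compare against each entry, no hashing.
    Small(Vec<Arc<T>>),
    /// Many unique values: hashed lookup (HashSet used here for brevity).
    Large(HashSet<Arc<T>>),
}

impl<T: Eq + Hash> InternCache<T> {
    fn new() -> Self {
        InternCache::Small(Vec::new())
    }

    /// Returns a shared handle equal to `value`, reusing a cached one if present.
    fn intern(&mut self, value: T) -> Arc<T> {
        match self {
            InternCache::Small(entries) => {
                for entry in entries.iter() {
                    if **entry == value {
                        return Arc::clone(entry);
                    }
                }
                let shared = Arc::new(value);
                entries.push(Arc::clone(&shared));
                // Upgrade once the linear scan would cover more than 16 entries.
                if entries.len() > LINEAR_SCAN_MAX {
                    let set: HashSet<Arc<T>> = std::mem::take(entries).into_iter().collect();
                    *self = InternCache::Large(set);
                }
                shared
            }
            InternCache::Large(set) => {
                if let Some(hit) = set.get(&value) {
                    return Arc::clone(hit);
                }
                let shared = Arc::new(value);
                set.insert(Arc::clone(&shared));
                shared
            }
        }
    }

    /// True once the cache has switched to hashed lookup.
    fn is_hashed(&self) -> bool {
        matches!(self, InternCache::Large(_))
    }

    /// Number of distinct values currently cached.
    fn len(&self) -> usize {
        match self {
            InternCache::Small(entries) => entries.len(),
            InternCache::Large(set) => set.len(),
        }
    }
}
```

The design trade-off is that a short scan over a handful of entries avoids computing a hash for every fragment, while the upgrade bounds the per-lookup cost when a manifest carries hundreds of distinct payloads.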
jackye1995 (Contributor) approved these changes on Apr 13, 2026
the feature overall looks good to me, pending CI fix
CI clippy denies print_stderr; benchmarks use eprintln! to report memory stats alongside criterion output.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
- Change RowDatasetVersionMeta::Inline from Vec<u8> to Arc<[u8]> so that fragments with identical version metadata share a single heap allocation
- Extend DataFileFieldInterner to deduplicate these inline byte payloads during manifest deserialization
- Add InternCache<T>: a hybrid cache that uses Vec linear scan for ≤16 entries and upgrades to HashMap for larger caches
- Add custom Serialize/Deserialize impls for RowDatasetVersionMeta to handle Arc<[u8]> transparently
Motivation
Follow-up to #6477 (interning DataFile.fields/column_indices). After a compaction, all fragments are stamped with the same version metadata (both last_updated_at_version_meta and created_at_version_meta), but each fragment previously owned its own Vec<u8> copy.
Per-fragment memory breakdown (before)
- last_updated_at_version_meta: Inline(Vec<u8>)
- created_at_version_meta: Inline(Vec<u8>)
After this change
With interning, all 20M fragments share a single Arc<[u8]> allocation per unique payload.
Benchmark results
Microbenchmark at 100K fragments (10 fields per fragment):
Both memory and speed improve across all scenarios. The hybrid InternCache uses a fast Vec scan for the common case (1-3 unique values) and upgrades to HashMap when diversity exceeds 16 entries.
Run with:
cargo bench -p lance-table --bench manifest_intern
Changes
- rust/lance-table/src/rowids/version.rs: Inline(Vec<u8>) → Inline(Arc<[u8]>), custom serde impls, updated protobuf conversions
- rust/lance-table/src/format/fragment.rs: InternCache<T> (Vec/HashMap hybrid), extended DataFileFieldInterner with version meta interning
- rust/lance-table/benches/manifest_intern.rs: microbenchmark covering uniform and diverse scenarios
Compatibility
- … Arc<[u8]> as [u8])
- from_sequence() still works as before (converts internally)
Test plan
- cargo check --workspace --tests passes
- cargo clippy -p lance-table -p lance -- -D warnings passes
- lance-table tests pass
- cargo fmt --all -- --check passes
🤖 Generated with Claude Code