refactor: use exact base-scoped store bindings#6422

Merged
Xuanwo merged 3 commits into main from
xuanwo/per-base-runtime-store-params
Apr 15, 2026

Conversation

@Xuanwo
Collaborator

@Xuanwo Xuanwo commented Apr 7, 2026

This changes per-base runtime configuration to use exact ObjectStoreParams bindings keyed by BasePath.path instead of per-base storage option overrides. Dataset-level and write-level store params now act only as fallbacks, while reads, target-base writes, and external blob resolution all consult the same base-scoped binding model.

This keeps provider-specific runtime state out of the manifest and follows the direction in discussion #6307 to keep BasePath focused on identity.
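
A minimal sketch of the lookup described above, assuming hypothetical `StoreParams` and `StoreBindings` types that stand in for the real `ObjectStoreParams` and dataset builder (the type names, fields, and method names here are illustrative assumptions, not the code in this PR):

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for `ObjectStoreParams`.
/// Only two illustrative fields are modeled.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct StoreParams {
    pub block_size: Option<usize>,
    pub storage_options: HashMap<String, String>,
}

/// Hypothetical resolver: exact per-base bindings keyed by the base path
/// string, with dataset-level params as the only fallback.
#[derive(Debug, Default)]
pub struct StoreBindings {
    fallback: StoreParams,
    per_base: HashMap<String, StoreParams>,
}

impl StoreBindings {
    pub fn new(fallback: StoreParams) -> Self {
        Self {
            fallback,
            per_base: HashMap::new(),
        }
    }

    /// Bind exact params to one base path. In this sketch the binding lives
    /// only in memory, mirroring the "runtime-only" intent of the PR.
    pub fn bind(&mut self, base_path: &str, params: StoreParams) {
        self.per_base.insert(base_path.to_string(), params);
    }

    /// Return the params bound to `base_path` if a binding exists,
    /// otherwise the dataset-level fallback.
    pub fn resolve(&self, base_path: &str) -> &StoreParams {
        self.per_base.get(base_path).unwrap_or(&self.fallback)
    }
}
```

In this sketch, reads, target-base writes, and blob resolution would all call the same `resolve` with the base's path, which is one way to get the shared binding model the description mentions.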

@github-actions github-actions Bot added the enhancement (New feature or request) label Apr 7, 2026
@Xuanwo Xuanwo marked this pull request as ready for review April 7, 2026 08:51
@codecov

codecov Bot commented Apr 7, 2026

Codecov Report

❌ Patch coverage is 82.46445% with 37 lines in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
rust/lance/src/dataset/builder.rs | 44.18% | 24 Missing ⚠️
rust/lance/src/dataset/write.rs | 88.50% | 10 Missing ⚠️
rust/lance/src/dataset.rs | 88.88% | 1 Missing and 1 partial ⚠️
rust/lance/src/dataset/blob.rs | 98.38% | 0 Missing and 1 partial ⚠️

Member

@westonpace westonpace left a comment

This seems to be OK as is, but I have a few questions.

  • Is it possible to set an option for the default base? Or would you need to override that option in every base?
  • Can you mask an option to hide it from the bases? For example, maybe the default storage options have a proxy_url but you don't want to use that in the non-default bases?

@Xuanwo
Collaborator Author

Xuanwo commented Apr 14, 2026

Thank you @westonpace for the review. I polished the design a bit. Can you take another look?

Member

@westonpace westonpace left a comment

I like the simpler and more flexible approach (ability to set other object store params). I still don't see any way to "unset" a storage option using an override but this is not a blocking concern.

Comment thread rust/lance/src/dataset/builder.rs Outdated
Comment on lines +455 to +458
/// Set runtime-only object store params for a specific registered base path.
///
/// These params are not persisted in the manifest. They are used whenever
/// the dataset resolves an object store for the given `BasePath.path`.
Member

@westonpace westonpace Apr 14, 2026

Can we better describe how the two sets of options are merged? We now have two ObjectStoreParams instances: the dataset store params and the override params. What actually gets used?

If I'm reading the code correctly I think:

  • All options that are not storage_options (e.g. block_size, use_constant_size_upload_parts) are replaced entirely by the override.
  • Storage options are handled differently. The value is the union of the two sets of params. If a storage_option is set in both then the value in the override is preferred.
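
The two rules in this reading can be sketched as follows. This is only an illustration of the behavior as the reviewer describes it, using a hypothetical `StoreParams` struct and a hypothetical `merge_as_described` function; the author's reply in this thread states that the current implementation does not merge at all.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for `ObjectStoreParams`.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct StoreParams {
    pub block_size: Option<usize>,
    pub storage_options: HashMap<String, String>,
}

/// Merge rule as read by the reviewer (an assumption, not the merged code):
/// - non-storage fields such as `block_size` come entirely from the override;
/// - storage options are the union of both maps, and the override wins
///   when a key is set in both.
pub fn merge_as_described(dataset: &StoreParams, overrides: &StoreParams) -> StoreParams {
    let mut storage_options = dataset.storage_options.clone();
    // `extend` replaces the value for keys that already exist.
    storage_options.extend(
        overrides
            .storage_options
            .iter()
            .map(|(k, v)| (k.clone(), v.clone())),
    );
    StoreParams {
        block_size: overrides.block_size,
        storage_options,
    }
}
```

Under this rule a key present only in the dataset-level options (for example `proxy_url`) always survives the merge, which is why there would be no way to "unset" it from an override.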

Collaborator Author

You're right that the old wording was ambiguous. In the current implementation, per-base bindings are exact ObjectStoreParams keyed by BasePath.path, and dataset-level params are only the fallback. There is no merge between dataset-level and per-base params. I updated the comments in the PR to make that explicit.
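
A small sketch of what "no merge" means in practice, assuming a hypothetical `StoreParams` struct and a hypothetical `effective_params` helper (neither name comes from the PR): exactly one of the two param sets is selected, and nothing is copied from the other.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for `ObjectStoreParams`.
#[derive(Clone, Debug, Default, PartialEq)]
pub struct StoreParams {
    pub storage_options: HashMap<String, String>,
}

/// Select one param set and never combine them: the per-base binding is used
/// as-is when present, otherwise the dataset-level params are used.
pub fn effective_params<'a>(
    dataset_level: &'a StoreParams,
    per_base: Option<&'a StoreParams>,
) -> &'a StoreParams {
    per_base.unwrap_or(dataset_level)
}
```

In this sketch, a per-base binding that omits `proxy_url` resolves without it even when the dataset-level params set one, so a binding has to restate every option it wants to keep.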

@Xuanwo Xuanwo changed the title from "feat: add runtime per-base store params for dataset reads" to "refactor: use exact base-scoped store bindings" Apr 14, 2026
@Xuanwo Xuanwo merged commit e5ceacb into main Apr 15, 2026
30 of 32 checks passed
@Xuanwo Xuanwo deleted the xuanwo/per-base-runtime-store-params branch April 15, 2026 05:52
Labels

enhancement New feature or request

2 participants