Skip to content

feat: make blob v2 dedicated threshold configurable#5719

Merged
Xuanwo merged 2 commits intolance-format:mainfrom
yanghua:feat-dedicated-threshold-env
Jan 20, 2026
Merged

feat: make blob v2 dedicated threshold configurable#5719
Xuanwo merged 2 commits intolance-format:mainfrom
yanghua:feat-dedicated-threshold-env

Conversation

@yanghua
Copy link
Copy Markdown
Collaborator

@yanghua yanghua commented Jan 15, 2026

No description provided.

@github-actions github-actions Bot added the enhancement New feature or request label Jan 15, 2026
@github-actions
Copy link
Copy Markdown
Contributor

Code Review

P1 - Thread Safety Issue in Test

The test test_dedicated_threshold_env_parsing_only manipulates environment variables using std::env::set_var and std::env::remove_var. Since Rust tests run in parallel by default, this can cause race conditions with other tests that may also read environment variables or use get_dedicated_threshold().

Suggested fix: Either:

  1. Add #[serial] attribute using the serial_test crate to ensure this test runs in isolation
  2. Or use a test-specific approach that doesn't rely on global env state (e.g., test the parsing logic in isolation with a helper that takes the value as a parameter)

The overall implementation approach is reasonable. The threshold is read once at BlobPreprocessor construction time, which avoids repeated env lookups during processing.

Copy link
Copy Markdown
Collaborator

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this, I hope this can be a column metadata like blob_dedicated_size_threshold instead.

@codecov
Copy link
Copy Markdown

codecov Bot commented Jan 15, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@yanghua
Copy link
Copy Markdown
Collaborator Author

yanghua commented Jan 15, 2026

Thank you for working on this, I hope this can be a column metadata like blob_dedicated_size_threshold instead.

OK, got it.

@yanghua yanghua force-pushed the feat-dedicated-threshold-env branch from df3cb95 to e82609c Compare January 16, 2026 03:35
Copy link
Copy Markdown
Collaborator

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Others LGTM

Comment thread rust/lance-arrow/src/lib.rs
Comment thread rust/lance/src/dataset/blob.rs Outdated
@yanghua
Copy link
Copy Markdown
Collaborator Author

yanghua commented Jan 20, 2026

@Xuanwo I have addressed your comments.

Copy link
Copy Markdown
Collaborator

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this!

@Xuanwo Xuanwo merged commit 277b369 into lance-format:main Jan 20, 2026
29 checks passed
jackye1995 pushed a commit to jackye1995/lance that referenced this pull request Jan 21, 2026
vivek-bharathan pushed a commit to vivek-bharathan/lance that referenced this pull request Feb 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants