test: standardize IO tests and improve assertions#4923

Merged
wjones127 merged 6 commits intolance-format:mainfrom
wjones127:test/better-io-tests
Oct 10, 2025
@wjones127 wjones127 commented Oct 9, 2025

  • `IOTracker` is now public in `lance-io`, so it can be re-used throughout the codebase.
  • Added new assertions `assert_io_eq!()`, `assert_io_lt!()` and `assert_io_gt!()` which will print out the list of requests in case of failure:
thread 'dataset::tests::test_load_manifest_iops' panicked at rust/lance/src/dataset.rs:2882:9:
assertion failed: `(left == right)`: Expected read_iops to be 3, got 2. Requests: [
    IORequest(method=list, path="test/_versions"),
    IORequest(method=get_opts, path="test/_versions/1.manifest"),
]

Diff < left / right > :
<2
>3

@github-actions github-actions Bot added the chore label Oct 9, 2025
@wjones127 wjones127 marked this pull request as ready for review October 10, 2025 17:25
codecov-commenter commented Oct 10, 2025

Codecov Report

❌ Patch coverage is 71.05263% with 66 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.67%. Comparing base (a3ed68d) to head (58b6f11).
⚠️ Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| rust/lance-io/src/utils/tracking_store.rs | 56.37% | 65 Missing ⚠️ |
| rust/lance/src/dataset/write/commit.rs | 96.00% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4923      +/-   ##
==========================================
+ Coverage   81.64%   81.67%   +0.03%     
==========================================
  Files         333      334       +1     
  Lines      131594   132492     +898     
  Branches   131594   132492     +898     
==========================================
+ Hits       107444   108219     +775     
- Misses      20550    20637      +87     
- Partials     3600     3636      +36     

| Flag | Coverage Δ |
|---|---|
| unittests | 81.67% <71.05%> (+0.03%) ⬆️ |


@westonpace westonpace left a comment

This is definitely an improvement, thanks!

I think the only thing I still find annoying is the need to set up the object_store_wrapper. Thoughts for a future PR: I think (could be wrong) that we wrap all object stores today with the tracing object store anyway. I wonder if it would be possible to just put IOP tracking inside the tracing object store. Then the dataset could just have an io_stats method that returns stats for the dataset's usage of the object store (or even global object store stats, if the object store is shared among datasets).
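
One way to picture this suggestion is the sketch below, where counting lives inside a wrapper that is always applied and the dataset exposes the totals through an `io_stats` method. Everything here is an assumption made for illustration: the `Store` trait is a tiny stand-in rather than the `object_store` trait, and `TrackedStore`, `MemoryStore`, `Dataset` and `load_manifest` are hypothetical names, not Lance APIs.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical counters; the real stats type may carry more fields.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct IoStats {
    pub read_iops: u64,
    pub read_bytes: u64,
}

/// Minimal stand-in for a store interface (not the `object_store` trait).
pub trait Store {
    fn get(&self, path: &str) -> Vec<u8>;
}

/// Toy backend that returns the path bytes as the object contents.
pub struct MemoryStore;

impl Store for MemoryStore {
    fn get(&self, path: &str) -> Vec<u8> {
        path.as_bytes().to_vec()
    }
}

/// Wrapper that counts every request it forwards to the inner store.
pub struct TrackedStore<S> {
    inner: S,
    stats: Arc<Mutex<IoStats>>,
}

impl<S: Store> Store for TrackedStore<S> {
    fn get(&self, path: &str) -> Vec<u8> {
        let data = self.inner.get(path);
        let mut stats = self.stats.lock().unwrap();
        stats.read_iops += 1;
        stats.read_bytes += data.len() as u64;
        data
    }
}

/// Hypothetical dataset that always wraps its store, so tests need no setup.
pub struct Dataset<S> {
    store: TrackedStore<S>,
}

impl<S: Store> Dataset<S> {
    pub fn new(inner: S) -> Self {
        let stats = Arc::new(Mutex::new(IoStats::default()));
        Self { store: TrackedStore { inner, stats } }
    }

    /// Illustrative operation that issues exactly one read.
    pub fn load_manifest(&self) -> Vec<u8> {
        self.store.get("test/_versions/1.manifest")
    }

    /// Returns a snapshot of the I/O performed through this dataset.
    pub fn io_stats(&self) -> IoStats {
        self.store.stats.lock().unwrap().clone()
    }
}
```

Sharing the `Arc<Mutex<IoStats>>` between several datasets would give the global variant mentioned in the comment, at the cost of tests no longer seeing only their own requests.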

Comment on lines +343 to +372
fn list(&self, prefix: Option<&Path>) -> BoxStream<'static, OSResult<ObjectMeta>> {
    let _guard = self.hop_guard();
    self.record_read("list", prefix.cloned().unwrap_or_default(), 0, None);
    self.target.list(prefix)
}

fn list_with_offset(
    &self,
    prefix: Option<&Path>,
    offset: &Path,
) -> BoxStream<'static, OSResult<ObjectMeta>> {
    self.record_read(
        "list_with_offset",
        prefix.cloned().unwrap_or_default(),
        0,
        None,
    );
    self.target.list_with_offset(prefix, offset)
}

async fn list_with_delimiter(&self, prefix: Option<&Path>) -> OSResult<ListResult> {
    let _guard = self.hop_guard();
    self.record_read(
        "list_with_delimiter",
        prefix.cloned().unwrap_or_default(),
        0,
        None,
    );
    self.target.list_with_delimiter(prefix).await
}
Doesn't have to be this PR, but I wonder if we should start recording lists as a separate operation entirely (i.e. not as reads).
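
As a rough illustration of what a separate list counter could look like, the sketch below classifies a request by its method name and keeps one counter per kind. The names `OpKind`, `IoCounts` and `classify` are hypothetical, and treating any method that starts with `put` as a write is an assumption; only the `list*` and `get_opts` method names come from this PR.

```rust
/// Hypothetical operation kinds, with listing split out from reads.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OpKind {
    Read,
    Write,
    List,
}

/// Hypothetical per-kind counters.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct IoCounts {
    pub read_iops: u64,
    pub write_iops: u64,
    pub list_iops: u64,
}

impl IoCounts {
    /// Increments the counter that matches the operation kind.
    pub fn record(&mut self, kind: OpKind) {
        match kind {
            OpKind::Read => self.read_iops += 1,
            OpKind::Write => self.write_iops += 1,
            OpKind::List => self.list_iops += 1,
        }
    }
}

/// Maps a method name to an operation kind.
/// Assumption: write methods start with "put"; everything else is a read.
pub fn classify(method: &str) -> OpKind {
    if method.starts_with("list") {
        OpKind::List
    } else if method.starts_with("put") {
        OpKind::Write
    } else {
        OpKind::Read
    }
}
```

With this split, the two requests in the failure output from the PR description would count as one list and one read, rather than two reads.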

@wjones127
Contributor Author

> I think the only thing I still find annoying is the need to setup the object_store_wrapper.

It's not just that. You also have to use the memory:// or file-object-store:// URI, since we often bypass the object store APIs when you just use file paths. So it's a bit annoying.

> Thoughts for future PR: I think (could be wrong) that we wrap all object stores today with the tracing object store anyways. I wonder if it would be possible to just put IOP tracking inside the tracing object store.

Hmm, now that's an interesting idea. Yeah we could have some sort of test feature that enables collection of IO requests. And then it wouldn't require much setup. I'll think more about that.

@wjones127 wjones127 merged commit 7e65e8b into lance-format:main Oct 10, 2025
28 of 29 checks passed
jackye1995 pushed a commit to jackye1995/lance that referenced this pull request Jan 21, 2026