Skip to content

fix: split index_statistics to reduce rustc query depth#5894

Merged
Xuanwo merged 2 commits intolance-format:mainfrom
Xuanwo:luban/frequent-impulse
Feb 5, 2026
Merged

fix: split index_statistics to reduce rustc query depth#5894
Xuanwo merged 2 commits intolance-format:mainfrom
Xuanwo:luban/frequent-impulse

Conversation

@Xuanwo
Copy link
Copy Markdown
Collaborator

@Xuanwo Xuanwo commented Feb 5, 2026

This refactors DatasetIndexExt::index_statistics into smaller helper functions to reduce the async state-machine complexity.

This avoids rustc query depth overflow when compiling with nightly + coverage instrumentation and many enabled features.


Parts of this PR were drafted with assistance from Codex (with gpt-5.2) and fully reviewed and edited by me. I take full responsibility for all changes.

@github-actions github-actions Bot added the bug Something isn't working label Feb 5, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Feb 5, 2026

Code Review

This is a well-structured refactoring that addresses the rustc query depth overflow issue when compiling with nightly + coverage instrumentation.

Summary

The PR successfully breaks down the large index_statistics async function into smaller helper functions:

  • sum_indexed_rows_per_delta - synchronous row counting
  • unique_indexed_fragment_count - synchronous fragment deduplication
  • serialize_index_statistics - JSON serialization helper
  • migrate_and_recompute_index_statistics - migration handling
  • index_statistics_frag_reuse / index_statistics_mem_wal - special index handlers
  • index_statistics_scalar - main scalar index statistics
  • collect_scalar_indices_statistics - per-index statistics collection
  • fragment_statistics_or_migrate - fragment statistics with migration fallback

Verification

The behavior is preserved correctly. I noted that the original code tracked index_typename but never used it in the output JSON (only index_type was used), so its removal is appropriate cleanup.

LGTM

This avoids rustc 'queries overflow the depth limit' errors in nightly llvm-cov builds by boxing a few deep futures so the async state machines stay small.
Copy link
Copy Markdown
Member

@westonpace westonpace left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for investigating into this. I saw the CI error and had no idea what to do 😆. I have a few naming nits but they aren't absolutely required.

Comment thread rust/lance/src/index.rs Outdated
Comment thread rust/lance/src/index.rs Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented Feb 5, 2026

Codecov Report

❌ Patch coverage is 73.24841% with 42 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
rust/lance/src/index.rs 73.24% 29 Missing and 13 partials ⚠️

📢 Thoughts on this report? Let us know!

@Xuanwo Xuanwo force-pushed the luban/frequent-impulse branch from e457959 to 62658cd Compare February 5, 2026 16:26
Use clearer helper names per review feedback.
@Xuanwo Xuanwo merged commit 5fb1c70 into lance-format:main Feb 5, 2026
32 checks passed
jackye1995 added a commit that referenced this pull request Feb 12, 2026
Cherry-pick from main to fix CI recursion limit error.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
jackye1995 added a commit that referenced this pull request Feb 12, 2026
Cherry-pick from main to fix CI recursion limit error.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants