[opt](inverted index) Enhance I/O statistics collection for the inverted index in file cache scenarios #48950
Conversation
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
75a9b79 to
2dc3954
Compare
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
…ted index in file cache scenarios
airborne12
left a comment
LGTM
Pull Request Overview
This PR adds IOContext propagation to various inverted index reader and query components and extends file cache statistics and profiling to separately track inverted-index-specific I/O metrics.
- Pass `io::IOContext` through inverted index readers, visitors, and query classes for context-aware I/O.
- Introduce new fields in `FileCacheStatistics` and `FileCacheProfileReporter` for inverted-index local/remote I/O counts, bytes, and timers.
- Update `CachedRemoteFileReader` to increment the new inverted-index counters when reading data.
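To make the three points above concrete, here is a minimal, self-contained sketch of how a context flag could route a read into inverted-index-specific counters. It is not the Doris implementation: the struct layouts, the free function `update_stats`, and the field `inverted_index_bytes_read_from_remote` are assumptions made for illustration. Only `is_inverted_index`, `inverted_index_num_local_io_total`, `inverted_index_num_remote_io_total`, and `inverted_index_bytes_read_from_local` are names that appear in this PR's review.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical, trimmed-down stand-in for the statistics struct in io_common.h.
struct FileCacheStatistics {
    // Aggregate counters covering every read (names assumed).
    int64_t num_local_io_total = 0;
    int64_t num_remote_io_total = 0;
    int64_t bytes_read_from_local = 0;
    int64_t bytes_read_from_remote = 0;
    // Inverted-index-only counters.
    int64_t inverted_index_num_local_io_total = 0;
    int64_t inverted_index_num_remote_io_total = 0;
    int64_t inverted_index_bytes_read_from_local = 0;
    int64_t inverted_index_bytes_read_from_remote = 0;  // assumed name
};

// Hypothetical stand-in for io::IOContext: the caller marks the read as
// belonging to an inverted index and supplies the stats sink.
struct IOContext {
    bool is_inverted_index = false;
    FileCacheStatistics* file_cache_stats = nullptr;
};

// Sketch of the bookkeeping a cached remote reader might do after one read.
// cache_hit == true means the bytes were served from the local file cache.
inline void update_stats(const IOContext* io_ctx, size_t bytes, bool cache_hit) {
    if (io_ctx == nullptr || io_ctx->file_cache_stats == nullptr) {
        return;  // no context propagated: nothing to record
    }
    FileCacheStatistics& s = *io_ctx->file_cache_stats;
    const int64_t n = static_cast<int64_t>(bytes);
    if (cache_hit) {
        s.num_local_io_total += 1;
        s.bytes_read_from_local += n;
    } else {
        s.num_remote_io_total += 1;
        s.bytes_read_from_remote += n;
    }
    if (io_ctx->is_inverted_index) {
        // Mirror the read into the inverted-index-specific counters.
        if (cache_hit) {
            s.inverted_index_num_local_io_total += 1;
            s.inverted_index_bytes_read_from_local += n;
        } else {
            s.inverted_index_num_remote_io_total += 1;
            s.inverted_index_bytes_read_from_remote += n;
        }
    }
}
```

In this sketch the inverted-index counters are a subset of the aggregate ones, so a profile can report index I/O separately without losing the overall totals. Whether the real code counts index reads in both places is not stated in the overview.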
Reviewed Changes
Copilot reviewed 24 out of 24 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| be/src/olap/rowset/segment_v2/inverted_index_reader.{h,cpp} | Add io_ctx parameter paths and store in InvertedIndexVisitor. |
| be/src/olap/rowset/segment_v2/inverted_index/util/*.h | Add io_ctx to ensure_term_* helper signatures. |
| be/src/olap/rowset/segment_v2/inverted_index/query/*.{h,cpp} | Store and use _io_ctx when calling Lucene readers. |
| be/src/io/io_common.h | Define new inverted-index-specific statistics fields. |
| be/src/io/cache/cached_remote_file_reader.cpp | Update inverted-index I/O stats in _update_stats. |
| be/src/io/cache/block_file_cache_profile.h | Add and update profiling counters for inverted-index I/O metrics. |
Comments suppressed due to low confidence (3)
be/src/io/cache/block_file_cache_profile.h:129
- [nitpick] The counter name 'inverted_index_bytes_scanned_from_cache' is inconsistent with the statistic field 'inverted_index_bytes_read_from_local'. Consider renaming one to match the other (e.g., use 'bytes_read_from_cache' or rename the struct field to 'bytes_scanned_from_cache').
inverted_index_bytes_scanned_from_cache = ADD_CHILD_COUNTER_WITH_LEVEL(
be/src/io/cache/block_file_cache_profile.h:159
- [nitpick] Updating 'inverted_index_bytes_scanned_from_cache' from 'statistics->inverted_index_bytes_read_from_local' highlights the naming mismatch. Aligning these names will reduce confusion when analyzing profiling output.
COUNTER_UPDATE(inverted_index_bytes_scanned_from_cache,
be/src/io/cache/cached_remote_file_reader.cpp:357
- New inverted-index-specific counters are updated here. Consider adding or updating unit tests to verify that 'inverted_index_num_local_io_total', 'inverted_index_num_remote_io_total', and related byte/timer counters are correctly incremented under both cache hits and misses.
if (is_inverted_index) {
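A test of the kind suggested in the comment above could be sketched as follows. This is an illustrative model rather than a Doris unit test: `IndexIoCounters`, `record_read`, and `counters_behave_as_expected` are hypothetical helpers, and `inverted_index_bytes_read_from_remote` is an assumed field name. The check covers one cache hit, one cache miss, and one non-index read that must leave the index counters untouched.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical minimal model of the inverted-index counters.
struct IndexIoCounters {
    int64_t inverted_index_num_local_io_total = 0;
    int64_t inverted_index_num_remote_io_total = 0;
    int64_t inverted_index_bytes_read_from_local = 0;
    int64_t inverted_index_bytes_read_from_remote = 0;  // assumed name
};

// Hypothetical stand-in for the `if (is_inverted_index) { ... }` branch.
inline void record_read(IndexIoCounters& c, bool is_inverted_index,
                        bool cache_hit, int64_t bytes) {
    if (!is_inverted_index) {
        return;  // ordinary data reads never touch the index counters
    }
    if (cache_hit) {
        c.inverted_index_num_local_io_total += 1;
        c.inverted_index_bytes_read_from_local += bytes;
    } else {
        c.inverted_index_num_remote_io_total += 1;
        c.inverted_index_bytes_read_from_remote += bytes;
    }
}

// Returns true when hits, misses, and non-index reads are counted as expected.
inline bool counters_behave_as_expected() {
    IndexIoCounters c;
    record_read(c, /*is_inverted_index=*/true, /*cache_hit=*/true, 100);
    record_read(c, /*is_inverted_index=*/true, /*cache_hit=*/false, 40);
    record_read(c, /*is_inverted_index=*/false, /*cache_hit=*/true, 999);
    return c.inverted_index_num_local_io_total == 1 &&
           c.inverted_index_num_remote_io_total == 1 &&
           c.inverted_index_bytes_read_from_local == 100 &&
           c.inverted_index_bytes_read_from_remote == 40;
}
```

A real test would drive `CachedRemoteFileReader` against a populated and an empty cache instead of calling a helper directly. The model only pins down which counters should move in each case.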
|
run buildall |
TPC-H: Total hot run time: 33874 ms |
TPC-DS: Total hot run time: 192977 ms |
ClickBench: Total hot run time: 29.41 s |
…ted index in file cache scenarios (apache#48950) Problem Summary: add io statistics in file cache stats for inverted index
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None