[Optimize] Take 'tablet scan frequency' into consideration when selecting a tablet for compaction #4837
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -66,7 +66,9 @@ Tablet::Tablet(TabletMetaSharedPtr tablet_meta, DataDir* data_dir, | |
| _last_cumu_compaction_success_millis(0), | ||
| _last_base_compaction_success_millis(0), | ||
| _cumulative_point(K_INVALID_CUMULATIVE_POINT), | ||
| - _cumulative_compaction_type(cumulative_compaction_type) { | ||
| + _cumulative_compaction_type(cumulative_compaction_type), | ||
| + _last_record_scan_count(0), | ||
| + _last_record_scan_count_timestamp(time(nullptr)) { | ||
| // construct _timestamped_versioned_tracker from rs and stale rs meta | ||
| _timestamped_version_tracker.construct_versioned_tracker(_tablet_meta->all_rs_metas(), | ||
| _tablet_meta->all_stale_rs_metas()); | ||
|
|
@@ -1311,4 +1313,16 @@ void Tablet::generate_tablet_meta_copy_unlocked(TabletMetaSharedPtr new_tablet_m | |
| new_tablet_meta->init_from_pb(tablet_meta_pb); | ||
| } | ||
|
|
||
| double Tablet::calculate_scan_frequency() { | ||
| time_t now = time(nullptr); | ||
| int64_t current_count = query_scan_count->value(); | ||
| double interval = difftime(now, _last_record_scan_count_timestamp); | ||
| double scan_frequency = (current_count - _last_record_scan_count) * 60 / interval; | ||
|
Contributor
Why multiply by 60?
Contributor
Author
Multiplying by 60 gives the average number of tablet scans per minute. Without it, the value would be the average number of tablet scans per second. |
||
| if (interval >= config::tablet_scan_frequency_time_node_interval_second) { | ||
| _last_record_scan_count = current_count; | ||
| _last_record_scan_count_timestamp = now; | ||
| } | ||
| return scan_frequency; | ||
| } | ||
|
|
||
| } // namespace doris | ||
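The bookkeeping in `Tablet::calculate_scan_frequency()` can be illustrated outside the tablet class. The following is a minimal stand-alone sketch, not the code in this PR: `ScanFrequencyTracker`, its field names, and the explicit `now` parameter are hypothetical, introduced only so the arithmetic can be exercised deterministically. It also assumes that a non-positive interval should yield 0 rather than a division by zero, which the diff above does not guard against (for example, if the function is called in the same second the tablet is constructed).

```cpp
#include <cassert>
#include <cstdint>
#include <ctime>

// Hypothetical stand-alone model of the per-tablet scan-frequency bookkeeping.
struct ScanFrequencyTracker {
    std::int64_t last_count = 0;      // counter value at the last checkpoint
    std::time_t last_timestamp = 0;   // time of the last checkpoint
    // Mirrors config::tablet_scan_frequency_time_node_interval_second (default 300).
    std::int64_t checkpoint_interval_s = 300;

    // Returns the average number of scans per minute since the last checkpoint.
    double calculate(std::int64_t current_count, std::time_t now) {
        double interval = std::difftime(now, last_timestamp);
        double per_minute = 0.0;
        // Assumption: report 0 when no time has elapsed instead of dividing by zero.
        if (interval > 0) {
            per_minute = static_cast<double>(current_count - last_count) * 60.0 / interval;
        }
        // Move the checkpoint forward only once the configured interval has passed.
        if (interval >= static_cast<double>(checkpoint_interval_s)) {
            last_count = current_count;
            last_timestamp = now;
        }
        return per_minute;
    }
};
```

With a checkpoint at t = 1000 s, a call at t = 1120 s with a counter of 100 reports 50 scans per minute and leaves the checkpoint untouched, because 120 s is shorter than the 300 s interval. A later call at t = 1300 s measures against the same checkpoint and then advances it.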
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -180,6 +180,24 @@ Metrics: {"filtered_rows":0,"input_row_num":3346807,"input_rowsets_count":42,"in | |
|
|
||
| ### `column_dictionary_key_size_threshold` | ||
|
|
||
| ### `compaction_tablet_compaction_score_factor` | ||
|
Contributor
Why is there no documentation for these two configs?
Contributor
Author
Done. |
||
|
|
||
| * Type: int32 | ||
| * Description: The weight of the compaction score in the formula that computes the tablet score when selecting a tablet for compaction. | ||
| * Default: 1 | ||
|
|
||
| ### `compaction_tablet_scan_frequency_factor` | ||
|
|
||
| * Type: int32 | ||
| * Description: The weight of the tablet scan frequency in the formula that computes the tablet score when selecting a tablet for compaction. | ||
| * Default: 0 | ||
|
|
||
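| When selecting a tablet for a compaction task, the tablet's scan frequency can be used as a selection criterion, so that tablets scanned frequently in the recent period are compacted first. | ||
| The tablet score is calculated with the following formula: | ||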
|
|
||
| tablet_score = compaction_tablet_scan_frequency_factor * tablet_scan_frequency + compaction_tablet_compaction_score_factor * compaction_score | ||
|
|
||
|
|
||
| ### `compress_rowbatches` | ||
|
|
||
| ### `create_tablet_worker_count` | ||
|
|
@@ -607,6 +625,12 @@ Stream Load is generally suitable for importing data within a few GB; it is not suitable for importing overly large | |
|
|
||
| ### `tablet_meta_checkpoint_min_new_rowsets_num` | ||
|
|
||
| ### `tablet_scan_frequency_time_node_interval_second` | ||
|
|
||
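| * Type: int64 | ||
| * Description: The time interval at which the metric 'query_scan_count' is recorded. To compute a tablet's scan frequency over the recent period, the metric 'query_scan_count' must be recorded at regular intervals. | ||
| * Default: 300 | ||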
|
|
||
| ### `tablet_stat_cache_update_interval_second` | ||
|
|
||
| ### `tablet_rowset_stale_sweep_time_sec` | ||
|
|
||
Do they need to be normalized? If needed, you should define them as double.
Normalization is not required.
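
To make the weighting concrete, here is a minimal sketch of the tablet score as the documentation describes it: each input is multiplied by its own integer factor and the results are summed, with no normalization. `CompactionScoreConfig` and `tablet_score` are hypothetical names used only for illustration. The field names and defaults follow the two documented configs, and the sketch assumes the compaction score is weighted by `compaction_tablet_compaction_score_factor`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two BE config values, with their documented defaults.
struct CompactionScoreConfig {
    std::int32_t compaction_tablet_scan_frequency_factor = 0;
    std::int32_t compaction_tablet_compaction_score_factor = 1;
};

// Weighted sum of scan frequency (scans per minute) and compaction score.
// No normalization is applied, so the two terms are on their raw scales.
double tablet_score(const CompactionScoreConfig& cfg,
                    double scan_frequency,
                    std::uint32_t compaction_score) {
    return cfg.compaction_tablet_scan_frequency_factor * scan_frequency +
           cfg.compaction_tablet_compaction_score_factor *
                   static_cast<double>(compaction_score);
}
```

With the defaults (0 and 1), the tablet score equals the compaction score, so scan frequency has no effect on selection until `compaction_tablet_scan_frequency_factor` is raised. Because the terms are not normalized, a tablet scanned 120 times per minute with a factor of 2 contributes 240 to the score, which can easily outweigh a typical compaction score.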