[Proposal] Take 'tablet scan frequency' into consideration when selecting a tablet for compaction #4834

@weizuo93

Description

A large number of small segment files makes scan operations inefficient. Compaction merges multiple small files into a larger one. So we could take the tablet scan frequency into consideration when selecting a tablet for compaction, and preferentially compact the tablets that have been scanned frequently during the most recent period of time.

Using the compaction strategy of Kudu for reference, the scan frequency of each tablet can be calculated over the most recent period of time and taken into consideration when calculating the compaction score. The new compaction score can be calculated like this:

new_compaction_score = k1 * tablet_scan_frequency + k2 * old_compaction_score

k1 and k2 can be set dynamically through the HTTP interface /api/update_config.
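
As a rough illustration of the weighted score, here is a minimal Python sketch. The names `TabletStats`, `new_compaction_score` and `pick_tablet_for_compaction` are hypothetical and not part of the existing code base; `k1` and `k2` are passed in as plain arguments, standing in for values that would be read from the dynamic config.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TabletStats:
    # Hypothetical container for the two inputs of the proposed formula.
    tablet_id: int
    tablet_scan_frequency: float   # scans per second over the recent interval
    old_compaction_score: float    # score produced by the existing policy


def new_compaction_score(t: TabletStats, k1: float, k2: float) -> float:
    # new_compaction_score = k1 * tablet_scan_frequency + k2 * old_compaction_score
    return k1 * t.tablet_scan_frequency + k2 * t.old_compaction_score


def pick_tablet_for_compaction(
    tablets: List[TabletStats], k1: float, k2: float
) -> Optional[TabletStats]:
    # Choose the tablet with the highest weighted score; None if there are no candidates.
    return max(tablets, key=lambda t: new_compaction_score(t, k1, k2), default=None)


tablets = [
    TabletStats(tablet_id=1, tablet_scan_frequency=0.0, old_compaction_score=10.0),
    TabletStats(tablet_id=2, tablet_scan_frequency=5.0, old_compaction_score=8.0),
]
# With k1 = 0 the choice falls back to the old score alone.
old_choice = pick_tablet_for_compaction(tablets, k1=0.0, k2=1.0)
# With k1 = 1 the frequently scanned tablet wins.
new_choice = pick_tablet_for_compaction(tablets, k1=1.0, k2=1.0)
```

Setting k1 to 0 reproduces the current behaviour, so the weights could be tuned gradually without changing the selection logic itself.
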
We can add a metric query_scan_count for each tablet, which records the scan count of the tablet. The tablet scan frequency can then be calculated like this:
tablet_scan_frequency = (now_query_scan_count - last_query_scan_count) / (now_time - last_time)
last_query_scan_count and last_time are updated each time an interval passes.
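
The bookkeeping for the frequency could look roughly like the following Python sketch. The class `ScanFrequencyTracker` and its methods are hypothetical names used only for illustration; timestamps are passed in explicitly (in seconds) so the arithmetic is easy to follow, and the guard against a zero-length interval is an assumption added here, not something the proposal specifies.

```python
class ScanFrequencyTracker:
    """Per-tablet scan counter with a periodically refreshed frequency (illustrative only)."""

    def __init__(self, now_time: float) -> None:
        self.query_scan_count = 0        # monotonically increasing metric
        self.last_query_scan_count = 0   # snapshot taken at the last update
        self.last_time = now_time        # time of the last update, in seconds
        self.tablet_scan_frequency = 0.0

    def record_scan(self) -> None:
        # Called once for every scan that touches the tablet.
        self.query_scan_count += 1

    def update(self, now_time: float) -> float:
        # Called once per interval by a background task.
        elapsed = now_time - self.last_time
        if elapsed <= 0:
            # Assumption: keep the previous value rather than divide by zero.
            return self.tablet_scan_frequency
        self.tablet_scan_frequency = (
            self.query_scan_count - self.last_query_scan_count
        ) / elapsed
        # Roll the snapshot forward so the next update covers only the new interval.
        self.last_query_scan_count = self.query_scan_count
        self.last_time = now_time
        return self.tablet_scan_frequency


tracker = ScanFrequencyTracker(now_time=0.0)
for _ in range(30):
    tracker.record_scan()
busy = tracker.update(now_time=10.0)   # 30 scans over 10 seconds
idle = tracker.update(now_time=20.0)   # no scans in the following 10 seconds
```

Because the snapshot is rolled forward on every update, the frequency reflects only the latest interval, so a tablet that stops being scanned drops in priority after one interval.
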
