[Proposal] The execution model of compaction needs to improve

### Current model
Each dir creates M threads for base compaction and N threads for cumulative compaction. Each compaction thread executes at most one compaction per cycle (it may skip the cycle if no best tablet is found).
Too many concurrent compaction tasks may run out of memory, so we limit the maximum number of running compaction tasks with a semaphore.
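
To make the current model concrete, here is a minimal Python sketch of a counting semaphore that caps the number of running compactions. The class name `ConcurrencyLimitedCompactor`, the `demo` helper and the limit of 2 are hypothetical, chosen only for illustration; this is not the actual implementation.

```python
import threading
import time


class ConcurrencyLimitedCompactor:
    """Hypothetical model: a semaphore caps how many compactions run at once."""

    def __init__(self, max_running):
        self._slots = threading.BoundedSemaphore(max_running)
        self._lock = threading.Lock()
        self.running = 0
        self.peak_running = 0

    def run(self, compaction):
        # Every compaction takes exactly one slot, no matter how heavy it is.
        with self._slots:
            with self._lock:
                self.running += 1
                self.peak_running = max(self.peak_running, self.running)
            try:
                compaction()
            finally:
                with self._lock:
                    self.running -= 1


def demo(num_threads=6, max_running=2):
    """Start more compaction threads than slots and report the observed peak."""
    compactor = ConcurrencyLimitedCompactor(max_running)
    threads = [
        threading.Thread(target=compactor.run, args=(lambda: time.sleep(0.01),))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return compactor.peak_running
```

Note that the semaphore counts tasks only: a compaction over a few segments and one over hundreds of segments each hold a single slot.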
#### Problem
This only limits the number of threads. If the running threads consume too much memory, we have no defense against it.
If we reduce concurrency to avoid OOM, compactions can't be done in time, and we may end up with even heavier compactions later.
So limiting concurrency alone is not enough.

### Proposal
The most desirable solution is to limit memory directly. But that assumes we can estimate the memory usage of a single compaction, which is difficult.
So we can only refer to the tablet score (the number of segments). It is positively correlated with memory usage, but we can't simply estimate memory usage from it with a scale factor.
What about a model that limits the total score?
A compaction must acquire permits equal to its score before it starts, and release them when it finishes. This gives low concurrency while high-score compactions are running, and high concurrency while low-score compactions are running.
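
A rough Python sketch of the score-permit idea follows. The names `ScorePermitLimiter`, `acquire`, `try_acquire` and `release` are hypothetical. Clamping a score to the total permit count is an assumption added here so that a single very heavy compaction can still run alone instead of blocking forever; the proposal itself does not specify this.

```python
import threading


class ScorePermitLimiter:
    """Hypothetical limiter: each compaction holds permits equal to its score."""

    def __init__(self, total_permits):
        if total_permits <= 0:
            raise ValueError("total_permits must be positive")
        self._total = total_permits
        self._available = total_permits
        self._cond = threading.Condition()

    def _needed(self, score):
        # Assumption: clamp to [1, total] so an oversized compaction can run alone.
        return min(max(score, 1), self._total)

    def acquire(self, score):
        """Block until enough permits are free; return the number taken."""
        need = self._needed(score)
        with self._cond:
            while self._available < need:
                self._cond.wait()
            self._available -= need
        return need

    def try_acquire(self, score):
        """Non-blocking variant; return the permits taken, or 0 if not enough are free."""
        need = self._needed(score)
        with self._cond:
            if self._available < need:
                return 0
            self._available -= need
            return need

    def release(self, permits):
        """Give back permits previously returned by acquire or try_acquire."""
        with self._cond:
            self._available += permits
            self._cond.notify_all()


# Usage: with 10 permits, one score-8 compaction blocks a score-3 one,
# while three score-3 compactions can run together.
limiter = ScorePermitLimiter(10)
heavy = limiter.try_acquire(8)     # takes 8 permits
blocked = limiter.try_acquire(3)   # only 2 permits left, so this fails
limiter.release(heavy)
light = [limiter.try_acquire(3) for _ in range(4)]  # fourth attempt fails
```

One design point this sketch leaves open is fairness: permits are not handed out in arrival order, so a high-score compaction could wait a long time while low-score ones keep acquiring permits.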

 

[Proposal] The execution model of compaction needs to improve #3624
