Description
Current model
Each data dir creates M threads for base compaction and N threads for cumulative compaction. Each compaction thread executes at most one compaction per cycle (it may skip a cycle when there is no best tablet to compact).
Too many concurrent compaction tasks may run out of memory, so we limit the maximum number of running compaction tasks with a semaphore.
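The current model can be sketched roughly as below. This is only an illustration in Python, not the actual implementation: `MAX_CONCURRENT_COMPACTIONS`, `run_compaction` and `run_all` are hypothetical names, and the sleep stands in for the real merge work.

```python
import threading
import time

# Hypothetical limit on concurrently running compactions.
MAX_CONCURRENT_COMPACTIONS = 2

# One slot per allowed running compaction.
compaction_slots = threading.BoundedSemaphore(MAX_CONCURRENT_COMPACTIONS)

_stats_lock = threading.Lock()
running = 0        # compactions running right now
peak_running = 0   # highest concurrency observed so far


def run_compaction(tablet_id, duration=0.01):
    """Run one compaction while holding exactly one slot."""
    global running, peak_running
    with compaction_slots:  # blocks while all slots are taken
        with _stats_lock:
            running += 1
            peak_running = max(peak_running, running)
        time.sleep(duration)  # stand-in for the real merge work
        with _stats_lock:
            running -= 1


def run_all(tablet_ids):
    """Start one thread per tablet and return the peak concurrency."""
    threads = [threading.Thread(target=run_compaction, args=(t,))
               for t in tablet_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak_running
```

Note that every compaction takes exactly one slot here, no matter how many segments it merges, so the semaphore knows nothing about how heavy each task is.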
Problem
The semaphore only limits the number of running threads. If the running compactions cost too much memory, we have no defense against it.
If we reduce concurrency to avoid OOM, we can't do compaction in time, and we may then face more heavy compactions.
So limiting concurrency alone is not enough.
Proposal
The most desirable solution is to limit memory directly. But this solution assumes that we can estimate the memory usage of one compaction, which is difficult.
So we can only refer to the tablet score (the number of segments). The score is positively correlated with memory usage, but we can't simply estimate the memory usage by multiplying the score by a scale factor.
What about a model that limits the total score instead?
A compaction needs to acquire permits equal to its score and release them when it finishes. So concurrency will be low while high-score compactions are running, and high while low-score compactions are running.
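A minimal sketch of such a score-based limiter, again in Python and only for illustration. The class name `ScorePermits`, its methods, and the total of 100 permits in the usage note are hypothetical. Clamping a score that exceeds the total budget is also an assumption added here (the proposal does not say how to handle that case); without it, such a compaction could never start.

```python
import threading


class ScorePermits:
    """Weighted semaphore: a compaction holds as many permits as its score."""

    def __init__(self, total_permits):
        self._total = total_permits
        self._available = total_permits
        self._cond = threading.Condition()

    def _needed(self, score):
        # Assumption: clamp oversized scores so a single huge compaction
        # can still run alone instead of waiting forever.
        return min(score, self._total)

    def try_acquire(self, score):
        """Non-blocking: return the permits granted, or None if unavailable."""
        need = self._needed(score)
        with self._cond:
            if self._available < need:
                return None
            self._available -= need
            return need

    def acquire(self, score):
        """Blocking: wait until enough permits are free, then take them."""
        need = self._needed(score)
        with self._cond:
            while self._available < need:
                self._cond.wait()
            self._available -= need
            return need

    def release(self, permits):
        """Give back the permits returned by acquire/try_acquire."""
        with self._cond:
            self._available += permits
            self._cond.notify_all()

    @property
    def available(self):
        with self._cond:
            return self._available
```

With a total of 100 permits, for example, two compactions of score 60 cannot run together, while five compactions of score 20 can. One thing to watch in a design like this: a high-score compaction may wait a long time if low-score compactions keep taking the permits that are freed, so some fairness policy may be needed.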