[Improvement] DataSegment intern improvement (reduce 60% memory consumption on coordinator) #8165
pzhdfy wants to merge 3 commits into apache:master from
Conversation
private static final Interner<String> STRING_INTERNER = Interners.newWeakInterner();
private static final Interner<List<String>> DIMENSIONS_INTERNER = Interners.newWeakInterner();
private static final Interner<List<String>> METRICS_INTERNER = Interners.newWeakInterner();
private static final Interner<DataSegment> DATA_SEGMENT_INTERNER = Interners.newWeakInterner();
This humongous Map with millions of weak references would be a problem for GC in itself: see #6357 for context.
You should adopt the design with BatchServerInventoryView and SQLSegmentMetadataManager probing into each other's memory, similarly to what is explained here: #7395 (comment)
private final Map<String, Object> loadSpec;
private final List<String> dimensions;
private final List<String> metrics;
private volatile Map<String, Object> loadSpec;
Making DataSegment non-immutable was ruled out in this discussion: #7571. Please read it in full. I proposed a solution here: #7571 (comment). Please check if you can implement it in this PR.
{
  DataSegment result = DATA_SEGMENT_INTERNER.intern(dataSegment);
  if (updateLoadSpec) {
    result.dimensions = dataSegment.dimensions;
I think DataSegment shouldn't become mutable. However, it would be nice if you would solve this problem in this PR: #6358.
This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the dev@druid.apache.org list. Thank you for your contributions.

This pull request/issue has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.
Description
In our Druid cluster, we have 10 million active segments, and we set a load rule with 2 replicas.
We then found that the coordinator consumed 50GB of memory and caused GC problems.
We dumped the JVM and analyzed the heap, and found about 30 million DataSegment objects.
This is because each segment generates 3 DataSegment objects:
one comes from polling the DB in SQLMetadataSegmentManager, and
the other two come from the ZooKeeper announcements (2 replicas) in BatchServerInventoryView.
These three DataSegment objects are usually identical,
so we can use
Interner<DataSegment> DATA_SEGMENT_INTERNER = Interners.newWeakInterner();
to deduplicate them.
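A weak interner as described above behaves roughly like the following stdlib-only sketch (the `WeakInterner` class is an illustration of what Guava's `Interners.newWeakInterner()` provides, not its actual implementation): equal instances collapse to one canonical object, which the GC can still reclaim once no strong references remain.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Illustrative weak interner: equal values map to one canonical instance,
// held only weakly so it does not pin memory after callers drop it.
class WeakInterner<T> {
    private final Map<T, WeakReference<T>> pool = new WeakHashMap<>();

    synchronized T intern(T value) {
        WeakReference<T> ref = pool.get(value);
        T canonical = (ref == null) ? null : ref.get();
        if (canonical == null) {
            // First time we see this value (or the old canonical was GC'd):
            // make this instance the canonical one.
            pool.put(value, new WeakReference<>(value));
            canonical = value;
        }
        return canonical;
    }
}

public class InternDemo {
    public static void main(String[] args) {
        WeakInterner<String> interner = new WeakInterner<>();
        // Two distinct but equal String objects...
        String a = interner.intern(new String("segment-id"));
        String b = interner.intern(new String("segment-id"));
        // ...collapse to the same canonical instance.
        System.out.println(a == b); // prints "true"
    }
}
```

With three equal DataSegment copies per segment, deduplicating through such an interner keeps only one canonical instance alive, which is where the memory saving comes from.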
1. When polling from the DB or reading from a znode, we use DATA_SEGMENT_INTERNER to deduplicate.
2. When polling from the DB, we always update the loadSpec in DataSegment; this is useful for deep storage migration.
3. When reading from a znode, we skip interning segments from realtime nodes, because realtime segments are short-lived and have incorrect size, dimensions, and loadSpec.
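The three rules above can be sketched as one intern helper. This is a hypothetical, self-contained model (the `Segment` stub, `internSegment`, and the `fromRealtime` flag are illustrative names, not the PR's actual code; the PR works on the real DataSegment class with Guava's interner):

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.WeakHashMap;

public class InternPolicyDemo {
    // Minimal stand-in for DataSegment: identified by id, with a mutable loadSpec.
    static class Segment {
        final String id;
        volatile Map<String, Object> loadSpec;

        Segment(String id, Map<String, Object> loadSpec) {
            this.id = id;
            this.loadSpec = loadSpec;
        }

        @Override public boolean equals(Object o) {
            return o instanceof Segment && ((Segment) o).id.equals(id);
        }
        @Override public int hashCode() { return Objects.hash(id); }
    }

    // Weak intern pool: canonical instances stay GC-able once unreferenced.
    static final Map<Segment, WeakReference<Segment>> POOL = new WeakHashMap<>();

    static synchronized Segment internSegment(Segment s, boolean fromRealtime, boolean updateLoadSpec) {
        if (fromRealtime) {
            // Rule 3: realtime segments are short-lived and carry incorrect
            // metadata, so interning them would only pollute the pool.
            return s;
        }
        // Rule 1: deduplicate equal segments to one canonical instance.
        WeakReference<Segment> ref = POOL.get(s);
        Segment canonical = (ref == null) ? null : ref.get();
        if (canonical == null) {
            POOL.put(s, new WeakReference<>(s));
            canonical = s;
        }
        if (updateLoadSpec) {
            // Rule 2: a DB poll carries the authoritative loadSpec, so it
            // overwrites the canonical one (deep storage migration case).
            canonical.loadSpec = s.loadSpec;
        }
        return canonical;
    }

    public static void main(String[] args) {
        Map<String, Object> dbSpec = new HashMap<>();
        dbSpec.put("type", "hdfs");
        Segment fromDb = new Segment("seg1", dbSpec);
        Segment fromZk = new Segment("seg1", new HashMap<>());

        Segment a = internSegment(fromDb, false, true);   // DB poll
        Segment b = internSegment(fromZk, false, false);  // ZK announcement
        System.out.println(a == b);                 // prints "true"
        System.out.println(a.loadSpec.get("type")); // prints "hdfs"
    }
}
```

The DB copy and the two ZooKeeper copies of a segment resolve to the same canonical instance, while the DB-polled loadSpec wins, matching rules 1 and 2 above.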
With this improvement, memory consumption on the coordinator drops to 20GB.
It also helps on the broker: from 35GB to 18GB.