-
Notifications
You must be signed in to change notification settings - Fork 963
[bk-gc] Prevent OOM: Restrict number of entry-loggers extraction #1938
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
I think a better long-term fix here is to remove the entyLogMetadata maps from the bookie memory and instead store them on disk (eg: in rocksdb). That will ensure that the amount of memory required by a bookie is completely independent from the amount of data stored in it. |
could be good idea. I will make the change. |
|
@rdhabalia to begin with, this is a major change touching one of the core component - gc. Functional tests validating the correctness and tests proving there wont be regression would be helpful. |
|
@rdhabalia correct me if I'm wrong with the following math first of all I'm surprised that you have 10K entrylog files in a Bookie. What is your 'logSizeLimit'? the default value is 1 GB. If it is the default value then 10K entrylog file would take 10 TB. lets assume you have 10 K entrylog files, for each entrylog file there will be EntryLogMetadata. Each EntryLogMetadata mainly contains Map of <long ledgerid, long ledgersize>. Lets assume on average in each entrylog file there will be entries of 100 ledgers. So it would be 100 * (8+8) = 1.6 KB. Including other overhead lets make it 4 KB. So 10 K * 4 KB = 40 MB. Even if we assume that there are entries of 1000 ledgers in an entrylog, including overhead memory size of EntryLogMetaData should be 40 KB, so 10K * 40KB = 400 MB. So I'm not sure how memory needed for entryLogMetaMap could become bottleneck. |
| entryLogMetaMap = extractMetaFromEntryLogs(entryLogMetaMap, isIteration); | ||
|
|
||
| // gc inactive/deleted ledgers | ||
| doGcLedgers(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is very expensive operation. It iterates over all the ledger ranges in the metadata server/all the ledgers in the bookie, this shouldn't be called multiple times
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, that's correct. therefore, have provided option maxEntryLoggersPerScan=2000. In normal case, bookie will have < 500 entryLog files so, it will not require multiple iteration but it will target issue where bookie is not able to recover with large number of entrylog files. and therefore, @merlimat suggested to persist entryLogmetadataMap into rocksDB so, I have created PR: #1949
| doGcLedgers(); | ||
|
|
||
| // gc entry logs | ||
| doGcEntryLogs(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this method iterates over all the entries in entryLogMetaMap, again calling this multiple times is not sensible.
|
@reddycharan |
|
if you are planning to do - #1949 , then is this change needed? |
|
@merlimat @reddycharan what's the status of this patch ? |
…1949) ### Motivation It addresses #1938, where it provides interface for `entyLogMetadataMap` so, `entrylogMetadata` can be cached into RocksDB instead main-memory. ### Modification - introduced RocksDB entryLogMetadataMap to cache entry-log metadata. - it can fallback to in-memory implementation using `gcPersistentEntrylogMetadataMapEnabled` flag and that can be removed in future release and keep only rocksDB implementation in future. ### Result - It will help to avoid storing entryLogMetadata into main-memory.
…pache#1949) ### Motivation It addresses apache#1938, where it provides interface for `entyLogMetadataMap` so, `entrylogMetadata` can be cached into RocksDB instead main-memory. ### Modification - introduced RocksDB entryLogMetadataMap to cache entry-log metadata. - it can fallback to in-memory implementation using `gcPersistentEntrylogMetadataMapEnabled` flag and that can be removed in future release and keep only rocksDB implementation in future. ### Result - It will help to avoid storing entryLogMetadata into main-memory.
Motivation
Due to #1937, bookies may not run GC that can lead to have large number of entry-logger files (10K) on the disk. After restarting bookie, bookie can trigger and GC tries to extract and loads all entrylog-metadata into memory by adding into entryLogMetaMap which eventually causes OOM in bookie and bookies keep restarting. this issue can be easily reproduced by adding large number of entrylogmetadata objects into the map and jvm will go OOM.
Changes
Bookie should extract entry-loggers in chunk to avoid memory pressure and OOM issues.