[fix][broker] Use expireAfterAccess instead of expireAfterWrite for metadata cache#21095
codelipenghui wants to merge 1 commit into apache:master
Conversation
/pulsarbot run-failure-checks
horizonzy left a comment
We should also override the `expireAfterAccess` value in the `LeaderElectionImpl`.
Maybe we don't want to expire the load data?
If we don't want to expire the load data, then after this PR it will expire by
     */
    @Builder.Default
-   private final long expireAfterWriteMillis = 2 * DEFAULT_CACHE_REFRESH_TIME_MILLIS;
+   private final long expireAfterWriteMillis = 0L;
Change the default value to -1. Using -1 to disable expiration may be easier to understand.
+1
### Modifications
After upgrading Pulsar from 2.9.2 to 2.10.3, the isolated group feature stopped working. We eventually found the cause. In `IsolatedBookieEnsemblePlacementPolicy`, when it gets the bookie rack from the metadata store cache, it uses `future.isDone()` to avoid a synchronous operation. If the future is incomplete, it returns empty blacklists. The cache may expire because of the Caffeine cache `getExpireAfterWriteMillis` config, and if the cache has expired, the future may be incomplete. (#21095 will correct the behavior.) In 2.9.2, it fetches the data from the metadata store synchronously, and we should keep that behavior.
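The failure mode described above can be sketched with plain `CompletableFuture`s. This is not the Pulsar code: the class and method names (`IsDoneSketch`, `excludedNonBlocking`, `excludedBlocking`) and the bookie address are hypothetical. It only shows why an `isDone()` guard yields an empty result while a cache reload is still in flight, whereas a bounded blocking read waits for the data.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class IsDoneSketch {
    // Hypothetical non-blocking lookup: if the load has not finished
    // (for example, the cached entry just expired and is being reloaded),
    // it silently falls back to an empty set.
    static Set<String> excludedNonBlocking(CompletableFuture<Set<String>> future) {
        if (future.isDone() && !future.isCompletedExceptionally()) {
            return future.join();
        }
        return Collections.emptySet();
    }

    // Hypothetical blocking lookup: waits up to timeoutMillis for the load
    // to finish, so an in-flight reload does not produce an empty result.
    static Set<String> excludedBlocking(CompletableFuture<Set<String>> future,
                                        long timeoutMillis) throws Exception {
        return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        // Simulates a reload that is still pending after the entry expired.
        CompletableFuture<Set<String>> reload = new CompletableFuture<>();
        System.out.println("non-blocking, pending: " + excludedNonBlocking(reload));

        // Another thread completes the reload; the blocking read waits for it.
        CompletableFuture.runAsync(() -> reload.complete(Set.of("bookie-3:3181")));
        System.out.println("blocking: " + excludedBlocking(reload, 1000));
        System.out.println("non-blocking, done: " + excludedNonBlocking(reload));
    }
}
```

The blocking variant trades latency for correctness, which is why a timeout is used here rather than an unbounded wait.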
The PR had no activity for 30 days; marked with the Stale label.
Closing this PR since `expireAfterAccess` costs more CPU resources to track accesses.
Motivation
#14154 introduced a metadata cache expiration policy to avoid a potential memory leak in the metadata caches. But it uses `expireAfterWrite`, which force-expires cache entries whether or not they have been actively accessed.
We also found another related regression from 2.9 to 2.10. The BookKeeper placement policy doesn't work as expected if the cache expires. The code itself also has the problem, but without `expireAfterWrite` it works fine.
pulsar/pulsar-broker-common/src/main/java/org/apache/pulsar/bookie/rackawareness/IsolatedBookieEnsemblePlacementPolicy.java
Lines 189 to 198 in b69f4ef
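To make the difference between the two policies concrete, here is a minimal sketch of write-based versus access-based expiration. It does not use Caffeine; `TinyCache`, `Policy`, and the manual clock are hypothetical stand-ins, and the TTL and read interval are arbitrary. An entry that is read every 6 ms with a 10 ms TTL is dropped under the write-based policy but stays alive under the access-based one.

```java
import java.util.HashMap;
import java.util.Map;

public class ExpirySketch {
    enum Policy { AFTER_WRITE, AFTER_ACCESS }

    // Hypothetical single-threaded cache with a manually advanced clock.
    static final class TinyCache {
        private static final class Entry {
            final String value;
            long deadline;
            Entry(String value, long deadline) {
                this.value = value;
                this.deadline = deadline;
            }
        }

        private final Map<String, Entry> entries = new HashMap<>();
        private final Policy policy;
        private final long ttlMillis;
        long nowMillis = 0; // manual clock, advanced by the caller

        TinyCache(Policy policy, long ttlMillis) {
            this.policy = policy;
            this.ttlMillis = ttlMillis;
        }

        void put(String key, String value) {
            entries.put(key, new Entry(value, nowMillis + ttlMillis));
        }

        String getIfPresent(String key) {
            Entry e = entries.get(key);
            if (e == null) {
                return null;
            }
            if (nowMillis >= e.deadline) {
                entries.remove(key); // expired: drop the entry
                return null;
            }
            if (policy == Policy.AFTER_ACCESS) {
                // Access-based expiry pays extra bookkeeping on every read.
                e.deadline = nowMillis + ttlMillis;
            }
            return e.value;
        }
    }

    // Reads the same key every 6 ms against a 10 ms TTL, three times.
    static boolean survivesWithReads(Policy policy) {
        TinyCache cache = new TinyCache(policy, 10);
        cache.put("k", "v");
        for (int i = 0; i < 3; i++) {
            cache.nowMillis += 6;
            if (cache.getIfPresent("k") == null) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("AFTER_WRITE survives: " + survivesWithReads(Policy.AFTER_WRITE));
        System.out.println("AFTER_ACCESS survives: " + survivesWithReads(Policy.AFTER_ACCESS));
    }
}
```

The deadline update on each read is the kind of per-access tracking that makes access-based expiration more expensive than write-based expiration.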
For most cases, disabling `expireAfterWrite` and enabling `expireAfterAccess` will resolve the issue, because new ledgers are created more often than every 10 minutes when the broker serves some topics.
Modification
- Disable `expireAfterWrite` by default
- Enable `expireAfterAccess` by default
Does this pull request potentially affect one of the following parts:
If the box was checked, please highlight the changes
Documentation
- doc
- doc-required
- doc-not-needed
- doc-complete