Eagerly unmap during resize by shikhar · Pull Request #7540 · apache/kafka

shikhar · 2019-10-17T02:44:59Z

The mmap field gets re-initialized here, so the old object will become garbage-collectible. We should be eagerly unmapping here too, not just for Windows support.
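
To illustrate the pattern described above, here is a minimal Java sketch of a resize that re-initializes the mmap field. It is not Kafka's actual code: `MappedIndexSketch` and all of its methods are hypothetical names. The old `MappedByteBuffer` is simply dropped, so its mapping stays alive until the garbage collector reclaims it. The JDK has no public unmap API, so the sketch only marks where an eager unmap would go.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;

// Hypothetical stand-in for an index backed by a memory-mapped file.
// Assumes a POSIX-like OS, where a file can be resized while still mapped.
class MappedIndexSketch {
    private final Path path;
    private MappedByteBuffer mmap;

    MappedIndexSketch(Path path, int size) throws IOException {
        this.path = path;
        this.mmap = map(size);
    }

    // Set the file length and map the whole file read-write.
    // The mapping stays valid after the channel is closed.
    private MappedByteBuffer map(int size) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path.toFile(), "rw")) {
            raf.setLength(size);
            return raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, size);
        }
    }

    void resize(int newSize) throws IOException {
        int position = mmap.position();
        mmap.force(); // flush pending writes before remapping
        // An eager unmap of the old buffer would go here. Without it, the
        // old mapping is released only once the GC collects the old object.
        mmap = map(newSize);
        mmap.position(Math.min(position, newSize));
    }

    int capacity() { return mmap.capacity(); }

    void putInt(int value) { mmap.putInt(value); }

    int getInt(int index) { return mmap.getInt(index); }
}
```

After `resize`, the field points at a fresh mapping of the resized file, while the previous mapping is unreachable but possibly still mapped.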

shikhar · 2019-10-17T02:49:51Z

cc @ijuma since you were involved in such PRs

The motivation for this change is that we are seeing pauses on the brokers that look a lot like https://issues.apache.org/jira/browse/KAFKA-4614. That issue has a fix version of 0.10.2.0 (the PR was #2352). I also saw that you made an improvement in #5757, which has been in Kafka since 2.2.

We are on Kafka 2.2.

shikhar · 2019-10-18T00:01:38Z

After further investigation today, it looks like the pauses happen not only during GC but also at other safepoints. Time-to-safepoint is high, which seems like something that can easily bite with memory-mapped IO and page faults to disks in cloud environments.

Interesting threads
https://groups.google.com/forum/m/#!msg/mechanical-sympathy/htQ3Rc1JEKk/ThuVpe5kBgAJ
https://groups.google.com/forum/#!msg/mechanical-sympathy/tepoA7PRFRU/wyKeIyCjBwAJ

UPDATE: For reference, our safepoint issues and the high IO throttling to cloud disks went away after tweaking these virtual memory sysctls:

vm.dirty_ratio=60
vm.dirty_background_ratio=5
vm.dirty_expire_centisecs=500

mlex · 2020-07-10T11:53:20Z

We are running into a similar issue and see lots of references to already deleted index files held by the Kafka process. Is there a reason why safeForceUnmap shouldn't be called inside the resize method? With the current code, trimToValidSize will always leave behind an mmap reference that is not unmapped and is only dealt with by GC, which can lead to exactly the problem described in https://issues.apache.org/jira/browse/KAFKA-4614.

ijuma · 2021-07-17T21:00:55Z

The challenge is that we don't lock on operations like lookup on Linux. So, we have to ensure no reference to mmap is held before we can unmap.
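
One way to picture the constraint above is a read-write lock: lookups hold the read lock, and the swap holds the write lock, so no reader can still be using the old buffer when it is released. The following is a minimal sketch under that assumption, not Kafka's implementation. `GuardedBufferSketch`, `swap`, and the `release` callback (standing in for an unmap) are hypothetical, and a plain `ByteBuffer` is used so the example runs without JDK-internal APIs.

```java
import java.nio.ByteBuffer;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;

// Hypothetical holder that never lets a reader touch a released buffer.
class GuardedBufferSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private ByteBuffer buffer;

    GuardedBufferSketch(ByteBuffer initial) {
        this.buffer = initial;
    }

    // Readers access the buffer only while holding the read lock
    // and never leak a reference to it.
    int lookup(int index) {
        lock.readLock().lock();
        try {
            return buffer.getInt(index);
        } finally {
            lock.readLock().unlock();
        }
    }

    // The writer swaps buffers under the write lock. While it is held, no
    // lookup is in progress, so releasing the old buffer is safe.
    void swap(ByteBuffer replacement, Consumer<ByteBuffer> release) {
        lock.writeLock().lock();
        try {
            ByteBuffer old = buffer;
            buffer = replacement;
            release.accept(old); // stand-in for an explicit unmap
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

The cost of this design is that every lookup now pays for lock acquisition, which is the overhead that lock-free lookups avoid.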

Eagerly unmap during resize

4dcd9a5

shikhar closed this Sep 19, 2024

shikhar wants to merge 1 commit into apache:trunk from shikhar:patch-1