Skip to content

[fix][plugin][pulsar] Clean message context after message is recycled to prevent memory leak.#405

Merged
wu-sheng merged 3 commits intoapache:mainfrom
tjiuming:dev/pulsar_msg_recycle
Dec 7, 2022
Merged

[fix][plugin][pulsar] Clean message context after message is recycled to prevent memory leak.#405
wu-sheng merged 3 commits intoapache:mainfrom
tjiuming:dev/pulsar_msg_recycle

Conversation

@tjiuming
Copy link
Copy Markdown
Contributor

@tjiuming tjiuming commented Dec 7, 2022

Currently, we set new MessageEnhanceRequiredInfo() into MessageImpl to store the tracing context.
But Pulsar has a Recycle mechanism, see: https://github.com/apache/pulsar/blob/master/pulsar-client/src/main/java/org/apache/pulsar/client/impl/MessageImpl.java#L126
The MessageImpl will not GC by JVM, so the new MessageEnhanceRequiredInfo() settled into MessageImpl won't be GC too, it will lead to memory leak.
The PR will clean the new MessageEnhanceRequiredInfo() settled into MessageImpl to prevent memory leak.

  • If this pull request closes/resolves/fixes an existing issue, replace the issue number. Closes #.
  • Update the CHANGES log.

@tjiuming tjiuming marked this pull request as draft December 7, 2022 11:23
@tjiuming tjiuming marked this pull request as ready for review December 7, 2022 11:25
@wu-sheng wu-sheng added bug Something isn't working plugin labels Dec 7, 2022
@wu-sheng
Copy link
Copy Markdown
Member

wu-sheng commented Dec 7, 2022

The MessageImpl will not GC by JVM, so the new MessageEnhanceRequiredInfo() settled into MessageImpl won't be GC too, it will lead to memory leak.

If you mean MessageImpl objects are pooled, then the current codes could lead to dirty data. But leak? I doubt that.

@tisonkun @codelipenghui Could any of you confirm this?

@wu-sheng wu-sheng added this to the 8.14.0 milestone Dec 7, 2022
@tjiuming
Copy link
Copy Markdown
Contributor Author

tjiuming commented Dec 7, 2022

The MessageImpl will not GC by JVM, so the new MessageEnhanceRequiredInfo() settled into MessageImpl won't be GC too, it will lead to memory leak.

If you mean MessageImpl objects are pooled, then the current codes could lead to dirty data. But leak? I doubt that.

in my opinion, I just feeling that release the tracing context as soon as the message recycled will be helpful to saving the heap memory.
My words are wrong, it's not memory leak, because the MessageImpl is reachable and usable.

@tjiuming
Copy link
Copy Markdown
Contributor Author

tjiuming commented Dec 7, 2022

there is also a problem, I'm not sure that if fix it is better:
we enhance ConsumerImpl#messageProccessed to generate Consumer/Receive event, but usually, the duration of executing ConsumerImpl#messageProccessed is veeeery short.
uses are using Consumer#receive()/receive(long,TimeUnit)/receiveAsync()/batchReceive()/batchReceiveAsync() cannot get the actual duration

@wu-sheng
Copy link
Copy Markdown
Member

wu-sheng commented Dec 7, 2022

we enhance ConsumerImpl#messageProccessed to generate Consumer/Receive event, but usually, the duration of executing ConsumerImpl#messageProccessed is veeeery short.
uses are using Consumer#receive()/receive(long,TimeUnit)/receiveAsync()/batchReceive()/batchReceiveAsync() cannot get the actual duration

About how to get the duration accurate, I think you are better than me. Pulsar plugin mostly was added in 2019, and there was no async-span concept back then.
Now, if you want to make a span duration accurate, we could open the span, and declare it is async and finish it in the callback or another thread when you need. The tracing context would be held by the agent kernel until you finish the span async.

Ref doc, https://skywalking.apache.org/docs/skywalking-java/next/en/setup/service-agent/java-agent/java-plugin-development-guide/#advanced-apis

@tjiuming
Copy link
Copy Markdown
Contributor Author

tjiuming commented Dec 7, 2022

There is one point I need to explain:
I searched the new MessageEnhanceRequiredInfo() settled into MessageImpl usage, in PulsarConsumerInterceptor, it checked if the result of (MessageEnhanceRequiredInfo) msg .getSkyWalkingDynamicField() is null. see: https://github.com/apache/skywalking-java/blob/main/apm-sniffer/apm-sdk-plugin/pulsar-common/src/main/java/org/apache/skywalking/apm/plugin/pulsar/common/PulsarConsumerInterceptor.java#L87
it won't lead to NPE

@wu-sheng
Copy link
Copy Markdown
Member

wu-sheng commented Dec 7, 2022

This field is initialized as null. Basically, this should be fine.

@wu-sheng wu-sheng merged commit 1a7d2f7 into apache:main Dec 7, 2022
@wu-sheng
Copy link
Copy Markdown
Member

wu-sheng commented Dec 7, 2022

@tjiuming Oops, you seem missed the changes.md update. Please submit another PR to add one.

@tjiuming
Copy link
Copy Markdown
Contributor Author

tjiuming commented Dec 7, 2022

@tjiuming Oops, you seem missed the changes.md update. Please submit another PR to add one.

OK

@tjiuming tjiuming deleted the dev/pulsar_msg_recycle branch December 7, 2022 15:58
@tjiuming tjiuming mentioned this pull request Dec 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working plugin

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants