Fix crash in Http2ClientSession::release_netvc#4147
Merged
shinrich merged 1 commit intoapache:masterfrom Aug 27, 2018
Merged
Conversation
SolidWallOfCode
approved these changes
Aug 24, 2018
Member
Author
|
This fix has been running in one of our colos for 4 days and this crash has not appeared again. |
Contributor
|
I'm going to remove the 7.1.5 Project on this, since going forward, we will need to make a second PR against the 7.1.x for proposed back ports. |
Contributor
|
@masaori335 @maskit Is this a fix that we want / need for 7.1.x ? |
Contributor
|
IMO, we need this for 7.1.x. |
Contributor
|
Cool. This doesn't cherry-pick cleanly to 7.1.x, can either of you make a PR against the 7.1.x branch for this please? |
Contributor
|
We face same crash on 8.0.3 (rc). We need to backport this on 8.0.x. |
Contributor
|
Cherry-picked to 8.0.x |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
We've seen this crash every few days for a while. Finally think I figured out the issue.
The stack is confusing when you look at the code be cause it appears that we are calling through a null client_vc, but the line above we check that client_vc is not null. I think the real problem is that client_vc has been closed and frees in the ProxyClientSession::handle_api_return() call down the stack.
If the original event came in on client_vc, this would have been ok because the vc's recursion count would be > 0 and the actual free wouldn't occur until we unwound the stack. But in this case the EOS event came in on the server VC, and this client VC is accessed form the HttpSM shutdown.
Just reordering the calls to vc->do_io_close and this->release_netvc in ProxyClientSession::handle_api_return should fix the issue. But looking at the bigger picture, I don't think that release_netvc is needed at all at this point. All it does is make sure that there is no lingering client session data on the netvc. But it is always called right next to a call to close the netvc, so at this point it adds no value.
I'm putting up this PR, but will be running this change internally for a bit to verify that it does address the issue.