Call initiating_close and delay delete_stream #3716
shinrich wants to merge 1 commit into apache:master
|
I have one concern with replacing
On the other hand, |
|
Yes, I'm not entirely sure this change is necessary. I'm testing with this backed out of the environment that is exhibiting the leak. Will close this PR assuming that the other changes make this good. |
|
Ran without these fixes last night and the zombie event triggered multiple times on the INACTIVITY_TIMEOUT. At the time of the zombie assert, fini_received is true, total_client_stream_count is 1, and stream_list is empty. I'll adjust my test build to change the EOS to ERROR in initiating_close() but not make the other changes. If that doesn't work, I'll try making the change from delete_stream to initiating_close only for cleanup_streams. Sadly, the only machine I have access to exhibiting this behavior is in a far time zone and this issue only triggers at high traffic time, so I probably have to wait 15 hours to see if my current attempt fixes the issue. |
|
Just changing the EOS to ERROR did not suffice. I'm now running with delete_stream changed to initiating_close only in the cleanup_streams method. |
3a8ec5e to 35aa746
|
Pushed a new version. My problem machine ran all weekend and one business day with this fix and did not trigger the zombie debug crash. We only replace delete_stream with initiating_close in the cleanup_streams case, which gets called when the session is coming down hard. I also changed the destroy to transaction_done so that the case where the HttpSM has already been cleared follows the more standard cleanup process. My last asserts were actually this case: the current_reader (HttpSM) was null, so it was never getting freed and terminate_stream was never being set. Our version did not have the destroy in place. Actually, it looks like @masaori335 already caught this with commit 469ccb4. I'll go ahead and try my problem case with the destroy added instead. |
|
Closing this for now. I need to get back to this scenario and reverify. |
Another issue found while tracking down leaked Http2ClientSessions using the zombie event (PR #3713).
In some cases the stream would be removed from the stream_list but then never deleted. Replacing the calls to delete_stream with a call to stream->initiating_close solved the problem. delete_stream is still called when the stream is destroyed.
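To make the failure mode concrete, here is a minimal toy model of the bookkeeping described above. It is only a sketch: `ToySession`, `ToyStream`, `unlink_all_without_destroy` and `is_zombie` are hypothetical names invented for illustration, not the real Http2ConnectionState or Http2Stream code. The only things taken from the description are the idea that the session tracks a stream_list and a total_client_stream_count, and that delete_stream is reached from the stream's destroy.

```cpp
#include <cassert>
#include <list>

// Toy model only: hypothetical stand-ins, not the real Traffic Server classes.
class ToySession;

class ToyStream {
public:
  explicit ToyStream(ToySession *s) : session(s) {}
  // Close the stream; the close path ends in destroy().
  void initiating_close();

private:
  // Tell the session to drop its bookkeeping, then free this object.
  void destroy();
  ToySession *session;
};

class ToySession {
public:
  ToyStream *create_stream() {
    ToyStream *s = new ToyStream(this);
    stream_list.push_back(s);
    ++total_client_stream_count;
    return s;
  }

  // Bookkeeping only: unlink and decrement. Called from ToyStream::destroy().
  void delete_stream(ToyStream *s) {
    assert(total_client_stream_count > 0);
    stream_list.remove(s);
    --total_client_stream_count;
  }

  // Models the leak: streams leave the list but nobody destroys them,
  // so the count is never decremented.
  std::list<ToyStream *> unlink_all_without_destroy() {
    std::list<ToyStream *> orphans;
    orphans.swap(stream_list);
    return orphans;
  }

  // Models the fix: close each stream and let its destroy call delete_stream.
  void cleanup_streams() {
    while (!stream_list.empty()) {
      stream_list.front()->initiating_close();
    }
  }

  // The inconsistent state seen at the zombie assert: empty list, nonzero count.
  bool is_zombie() const {
    return stream_list.empty() && total_client_stream_count != 0;
  }

private:
  std::list<ToyStream *> stream_list;
  int total_client_stream_count = 0;
};

void ToyStream::initiating_close() {
  destroy(); // must not touch members after this returns
}

void ToyStream::destroy() {
  session->delete_stream(this);
  delete this;
}

// Returns true: the list is empty but the count still says one stream.
bool leaky_teardown_leaves_zombie() {
  ToySession session;
  session.create_stream();
  std::list<ToyStream *> orphans = session.unlink_all_without_destroy();
  bool zombie = session.is_zombie();
  for (ToyStream *s : orphans) {
    delete s; // free the orphans so the demo itself does not leak
  }
  return zombie;
}

// Returns false: every stream is closed, destroyed, and counted down.
bool close_based_teardown_leaves_zombie() {
  ToySession session;
  session.create_stream();
  session.create_stream();
  session.cleanup_streams();
  return session.is_zombie();
}
```

In this sketch the count only goes down inside delete_stream, and delete_stream is only reached through the stream's own destroy, so a teardown that goes through initiating_close cannot leave the list and the count out of step. Calling `leaky_teardown_leaves_zombie()` and `close_based_teardown_leaves_zombie()` from a small `main` shows the two outcomes side by side.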