Skip to content

Bug 4156: Work around commStartHalfClosedMonitor() assertion#1442

Closed
uhliarik wants to merge 1 commit intosquid-cache:masterfrom
uhliarik:bug4156
Closed

Bug 4156: Work around commStartHalfClosedMonitor() assertion#1442
uhliarik wants to merge 1 commit intosquid-cache:masterfrom
uhliarik:bug4156

Conversation

@uhliarik
Copy link
Contributor

@uhliarik uhliarik commented Aug 4, 2023

No description provided.

@squid-prbot
Copy link
Collaborator

Can one of the admins verify this patch?

@uhliarik
Copy link
Contributor Author

uhliarik commented Aug 4, 2023

I have easily reproduced the crash with the latest squid 6.1:

# echo "half_closed_clients on" >> /etc/squid/squid.conf
# systemctl start squid
# cat request
GET http://www.example.com/ HTTP/1.0
Host: www.example.com


# nc localhost 3128 <request

GDB backtrace:

(gdb) bt
#0  0x00007fce82e90824 in __pthread_kill_implementation () from /lib64/libc.so.6
#1  0x00007fce82e3e8ee in raise () from /lib64/libc.so.6
#2  0x00007fce82e268ff in abort () from /lib64/libc.so.6
#3  0x0000558f9167b72a in xassert (msg=<optimized out>, file=<optimized out>, line=<optimized out>) at debug/debug.cc:1269
#4  0x0000558f915827a0 in commStartHalfClosedMonitor (fd=10) at /usr/src/debug/squid-6.1-2.fc39.x86_64/src/comm.cc:1579
#5  0x0000558f915592e6 in commMarkHalfClosed (fd=<optimized out>) at servers/../../src/comm.h:94
#6  Server::doClientRead (this=0x558f92f886b8, io=...) at servers/Server.cc:162
#7  0x0000558f91327f1c in CommCbMemFunT<Server, CommIoCbParams>::doDial (this=<optimized out>) at /usr/src/debug/squid-6.1-2.fc39.x86_64/src/CommCalls.h:190
#8  JobDialer<Server>::dial (call=..., this=<optimized out>) at base/AsyncJobCalls.h:175
#9  AsyncCallT<CommCbMemFunT<Server, CommIoCbParams> >::fire (this=0x558f92f8fd20) at ../src/base/AsyncCall.h:147
#10 0x0000558f91571500 in AsyncCallQueue::fire (this=0x558f92a66240) at base/../../src/base/RefCount.h:73
#11 0x0000558f913669c2 in EventLoop::dispatchCalls (this=0x7ffc951f6c80) at base/AsyncCallQueue.cc:42
#12 EventLoop::runOnce (this=0x7ffc951f6c80) at /usr/src/debug/squid-6.1-2.fc39.x86_64/src/EventLoop.cc:121
#13 0x0000558f914617d8 in EventLoop::run (this=0x7ffc951f6c80) at /usr/src/debug/squid-6.1-2.fc39.x86_64/src/EventLoop.cc:83
#14 SquidMain (argc=argc@entry=6, argv=argv@entry=0x7ffc951f6fb8) at /usr/src/debug/squid-6.1-2.fc39.x86_64/src/main.cc:1661
#15 0x0000558f9130d106 in SquidMainSafe (argv=0x7ffc951f6fb8, argc=6) at /usr/src/debug/squid-6.1-2.fc39.x86_64/src/main.cc:1353
#16 main (argc=6, argv=0x7ffc951f6fb8) at /usr/src/debug/squid-6.1-2.fc39.x86_64/src/main.cc:1341

@uhliarik
Copy link
Contributor Author

uhliarik commented Aug 4, 2023

After applying patch from https://bugs.squid-cache.org/show_bug.cgi?id=4156#c18 , squid is not crashing anymore.

@rousskov rousskov changed the title Bug4156 - Fix for crash with half_closed_client on Bug 4156: Work around commStartHalfClosedMonitor() assertion Aug 4, 2023
}

debugs(5, 5, "adding FD " << fd << " to " << *TheHalfClosed);
assert(isOpen(fd) && !commHasHalfClosedMonitor(fd));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to work around this assertion without fixing the underlying bug, then the condition of the proposed new if statement would have to change to become specific to the condition(s) in this assertion. And the handling of that condition would have to be changed to printing an Squid BUG 4156: ... ERROR, probably followed by return; (but that needs to be tested -- I do not know enough about this bug to propose the exact workaround).

Do you know what assertion condition is failing, isOpen(fd) or !commHasHalfClosedMonitor(fd)? If you do not, you can find out by splitting that assert() into two, the way it should have been written:

Suggested change
assert(isOpen(fd) && !commHasHalfClosedMonitor(fd));
assert(isOpen(fd));
assert(!commHasHalfClosedMonitor(fd));

One of the above assertions should fail in your tests (without the proposed if statement workaround!). Once we know which assertion fails (without the proposed if statement), we can start working on adjusting the proposed if statement accordingly.

Needless to say, it would be much better to actually fix the bug, but I do not have enough free time right now to triage and volunteer a proper fix, unfortunately.

{
if (!TheHalfClosed->empty()) {
debugs(5, 5, HERE << "half_closed is on fd = " << fd );
commCallCloseHandlers(fd);
Copy link
Contributor

@rousskov rousskov Aug 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

commStartHalfClosedMonitor() should not Call Close Handlers. Close handlers should be called on read errors. And they should remain set until the connection is actually being closed. I know this change stops the assertion from happening, but so does disabling the obviously broken half_closed_client feature. The change is trading a loud assertion for silent uncertainty. This is a bad tradeoff IMO.

I am suggesting a path towards a possibly acceptable workaround in a dedicated change request.

@rousskov rousskov added the S-waiting-for-author author action is expected (and usually required) label Aug 4, 2023
@uhliarik
Copy link
Contributor Author

uhliarik commented Aug 4, 2023

Hello Alex!

Thank you for your explanation. I have divided the condition into two parts so now I can say for sure which function call is triggering the assertion - it is !commHasHalfClosedMonitor(fd) call:

(gdb) bt
#0  0x00007f3d69690824 in __pthread_kill_implementation () from /lib64/libc.so.6
#1  0x00007f3d6963e8ee in raise () from /lib64/libc.so.6
#2  0x00007f3d696268ff in abort () from /lib64/libc.so.6
#3  0x000056502616f77a in xassert (msg=<optimized out>, file=<optimized out>, line=<optimized out>) at debug/debug.cc:1269
#4  0x00005650260767b9 in commStartHalfClosedMonitor (fd=10) at /usr/src/debug/squid-6.1-3.fc39.x86_64/src/comm.cc:1580
#5  0x000056502604d316 in commMarkHalfClosed (fd=<optimized out>) at servers/../../src/comm.h:94
#6  Server::doClientRead (this=0x565027d2b6b8, io=...) at servers/Server.cc:162
#7  0x0000565025e1bf0c in CommCbMemFunT<Server, CommIoCbParams>::doDial (this=<optimized out>) at /usr/src/debug/squid-6.1-3.fc39.x86_64/src/CommCalls.h:190
#8  JobDialer<Server>::dial (call=..., this=<optimized out>) at base/AsyncJobCalls.h:175
#9  AsyncCallT<CommCbMemFunT<Server, CommIoCbParams> >::fire (this=0x565027d32d20) at ../src/base/AsyncCall.h:147
#10 0x0000565026065510 in AsyncCallQueue::fire (this=0x565027809240) at base/../../src/base/RefCount.h:73
#11 0x0000565025e5a9b2 in EventLoop::dispatchCalls (this=0x7ffc35c90e50) at base/AsyncCallQueue.cc:42
#12 EventLoop::runOnce (this=0x7ffc35c90e50) at /usr/src/debug/squid-6.1-3.fc39.x86_64/src/EventLoop.cc:121
#13 0x0000565025f557f8 in EventLoop::run (this=0x7ffc35c90e50) at /usr/src/debug/squid-6.1-3.fc39.x86_64/src/EventLoop.cc:83
#14 SquidMain (argc=argc@entry=6, argv=argv@entry=0x7ffc35c91188) at /usr/src/debug/squid-6.1-3.fc39.x86_64/src/main.cc:1661
#15 0x0000565025e01106 in SquidMainSafe (argv=0x7ffc35c91188, argc=6) at /usr/src/debug/squid-6.1-3.fc39.x86_64/src/main.cc:1353
#16 main (argc=6, argv=0x7ffc35c91188) at /usr/src/debug/squid-6.1-3.fc39.x86_64/src/main.cc:1341
(gdb) f 4
#4  0x00005650260767b9 in commStartHalfClosedMonitor (fd=10) at /usr/src/debug/squid-6.1-3.fc39.x86_64/src/comm.cc:1580
1580	    assert(!commHasHalfClosedMonitor(fd));
(gdb) list
1575	void
1576	commStartHalfClosedMonitor(int fd)
1577	{
1578	    debugs(5, 5, "adding FD " << fd << " to " << *TheHalfClosed);
1579	    assert(isOpen(fd));
1580	    assert(!commHasHalfClosedMonitor(fd));
1581	    (void)TheHalfClosed->add(fd); // could also assert the result
1582	    fd_table[fd].codeContext = CodeContext::Current();
1583	    commPlanHalfClosedCheck(); // may schedule check if we added the first FD
1584	}
(gdb) 

@rousskov
Copy link
Contributor

rousskov commented Aug 4, 2023

I have divided the condition into two parts so now I can say for sure which function call is triggering the assertion - it is !commHasHalfClosedMonitor(fd) call

Thank you for sharing that information. I posted a fix for this bug as PR #1443.

@rousskov rousskov added S-waiting-for-PR Closure of other PR(s), current or future, is expected (and usually required) and removed S-waiting-for-author author action is expected (and usually required) labels Aug 4, 2023
@rousskov
Copy link
Contributor

@uhliarik, please consider closing this workaround PR because the comprehensive fix in #1443 has landed (master/v7 commit 5da786e).

@uhliarik
Copy link
Contributor Author

@uhliarik, please consider closing this workaround PR because the comprehensive fix in #1443 has landed (master/v7 commit 5da786e).

Hi Alex. I have already backported the fix from #1443 to RHEL and Fedora. I'm closing this PR and thanks for your help!

@uhliarik uhliarik closed this Sep 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-for-PR Closure of other PR(s), current or future, is expected (and usually required)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants