Fixed clientside_mark and client port logging in TPROXY mode by gozzy · Pull Request #150 · squid-cache/squid

gozzy · 2018-02-13T09:41:21Z

The clientside_mark ACL was not working with TPROXY because a
conntrack query could not find connmark without a true client port.

Ip::Intercept::Lookup() must return true client address, but its
TproxyTransparent() component was reseting the client port. We should
use zero port when we compute the source address for the Squid-to-peer
connection instead.

squid-prbot · 2018-02-13T09:41:22Z

Can one of the admins verify this patch?

gozzy · 2018-02-13T09:43:58Z

Actually, the initial behavior looks strange: why change client's port? If we do really transparent proxying, we should bind client's "address:port" pair to a socket. Instead the port was changed for some reasons and squid->server connection used a random port. I have not seen any recommendations to choose arbitrary port in TPROXY mode and the change made in this commit does not seem to break anything, at least on our setup.

yadij · 2018-02-13T13:32:20Z

The "reasons" for setting port to 0 on TPROXY traffic is that TCP/IP considers each unique set of src-IP:port, dst-IP:port to be a connection and that is enforced by modern kernels and/or NIC device drivers - attempts to reuse the same IP:port for anything result in "address already in use" errors from the networking layers.

Since dst-IP:port is identical for both the client->Squid and Squid->server connections, and the TPROXY is all about spoofing the src-IP to also be identical on both. That leaves the src-port as the only possible way for TCP/IP to avoid dropping one of the connections as invalid or routing packets on it to the wrong recipent.

Setting port to 0 in the records of the client connection is not right, but at the time TPROXY was implemented there was no better place. Things have changed since then in how Squid connection state is managed. It may be time to move the 0 assignment to a later timing, but I have not checked where that may be.

gozzy · 2018-02-13T14:09:54Z

Maybe it could be moved to Comm::TcpAcceptor::acceptOne? Just right after getting nfmark.

rousskov · 2018-02-13T14:58:51Z

IMHO, assuming Squid cares about the actual client port number (e.g., for logging it and sending it to external ACLs), the only logical place to zero that port is when passing a copy of the actual client address to the bind(2) call wrapper (or equivalent). That approach preserves the true client port number for all high-level Squid purposes while making low-level OS IP stack happy.

gozzy · 2018-02-13T21:48:24Z

I am still not sure if zeroizing in Ip::Intercept::TproxyTransparent() is necessary. At the end of FwdState::connectStart() I see this:

serverDestinations[0]->local.port(0);

Also, in squid's log there is something like this:

ConnOpener: will connect to local=192.168.2.2 remote=172.17.9.80:80 flags=25 with 60 timeout

Local address is logged without a port. If I understand correctly, the address is written without port if the port is random. Currently it's hard for me to trace everything that happens but there definetely should be a reason why everything works without 'newConn->remote.port(0)' :) Could it be that zeroizing is also done in connectStart?

rousskov · 2018-02-14T00:11:59Z

At the end of FwdState::connectStart() I see this:
serverDestinations[0]->local.port(0);

Good point, although I do not see a similar reset in tunnel.cc (which duplicates a lot of FwdState code/purpose) so be careful about relying on the above code alone.

Again, none of those existing random places feel logical for a local port reset if a zero local port is always needed for opening a TCP connection from Squid.

yadij · 2018-02-14T05:03:35Z

I think the right place in current code is getOutgoingAddress() in FwdState.cc (and its equivalent in tunnel.cc). There is already a section for tproxy there which should set conn->local.port(0) just before setting the flags.

This will need testing of TPROXY intercepted port 80, 443, 25 (FTP) traffic of course, and also an intercepted CONNECT message intended for another proxy (for the tunnel.cc code), and also ICAP traffic. If any of those results in that "Address already in use" or similar errors the source of its outbound IP:port will need to be tracked down and fixed - if possible by making it use the getOutgoingAddress() function.

gozzy · 2018-02-14T09:22:43Z

tunnel.cc seems to use PeerSelector, which in it's turn uses getOutgoingAddress(), if I'm not mistaken. There's a following chain:
tunnelStart() -> startSelectingDestinations() -> peerSelect() -> resolveSelected() -> handlePath() -> getOutgoingAddress()

At the moment I think that adding conn->local.port(0) to getOutgoingAddress() should do the trick. Correct me if I'm wrong.

gozzy · 2018-02-14T13:36:33Z

Updated the PR: zeroizing moved to FwdState. Things seem to work so far... We've checked the following cases (HTTP, HTTPS and FTP):

TPROXY with a direct access to requested servers
TPROXY that intercepts explicit proxy requests (i.e. CONNECT intented for another proxy)

I don't completely understand what should be checked about ICAP: tunneling or regular squid+ICAP?
When on_unsupported_protocol is specified, ICAP is tunneled fine, otherwise it results in error in parseHttpVersionField().

yadij · 2018-02-14T15:01:43Z

As far as I know; testing that ICAP connections use the right (client) IP when the HTTP is received over a TPROXY port. The c-icap test echo service should be okay for the test.

gozzy · 2018-02-14T15:33:56Z

You mean we should enable 'adaptation_send_client_ip' and check c-icap's logs?

gozzy · 2018-02-14T15:57:22Z

c-icap gets correct client's address in X-Client-IP header.

rousskov · 2018-02-14T16:07:32Z

@yadij: I think the right place in current code is getOutgoingAddress() in FwdState.cc (and its equivalent in tunnel.cc). There is already a section for tproxy there which should set conn->local.port(0) just before setting the flags.

Why not all the way down to comm_openex()? In other words, do we have any use cases where a COMM_BIND TCP address in comm_openex() should not use a zero source port?

gozzy · 2018-02-14T18:23:29Z

@rousskov do you mean something like this:

Comm::ConnectionPointer conn = new Comm::Connection;
conn->local = addr;
conn->local.port(0);        // <----
conn->fd = new_socket;

?

If yes, does this need to be remarked somehow?

gozzy · 2018-02-14T18:24:36Z

We need to come to an agreement :)

rousskov · 2018-02-14T19:34:20Z

conn->local.port(0); // <----

Not exactly because conn->local is not used for binding the socket -- comm_apply_flags() uses addr -- and also because comm_openex() should not reset port for opening requests that are not both COMM_DOBIND and SOCK_STREAM. Even if we can reset the port here for TCP, we still do not want to screw up UDP sockets (at least) that may need to bind to a specific source port.

We need to come to an agreement :)

Yes, and the key question to answer here, AFAICT, is whether there are use cases where a COMM_BIND TCP address in comm_openex() should not use a zero source port. If there are no such use cases, then we should zero the port at the lowest level possible, to avoid screwing up logging/ACLs/marking (the bug this PR is fixing) and to avoid mistakes where the port is not zero by accident (past bugs).

gozzy · 2018-02-14T20:07:53Z

So we have two options: zeroize port either in getOutgoingAddress or in comm_openex.

The first option is proven to work and resets port only when it's really needed. The second option is probably better because no connection is created without comm_openex(), but there're some caveats.

@yadij any thoughts?

yadij · 2018-02-15T03:28:27Z

Why not all the way down to comm_openex()?

because all the debugging, ACLs etc applied between getOutgoingAddr() and comm_openex() will be saying that the port about to be opened will be the one the client used, when it will actually be OS assigned.

rousskov · 2018-02-15T05:24:03Z

all the debugging, ACLs etc applied between getOutgoingAddr() and comm_openex() will be saying that the port about to be opened will be the one the client used

The above problems do not imply we should not zero the port in comm_openex() AFAICT.

You mentioned two very different problems:

Debugging: I do not think getOutgoingAddr() should return an address with some bogus port number. That function should set the port to zero or, better, not use/copy the bogus port in the first place. Rule of thumb: "Do not return garbage even if you think nobody is going to notice it".
ACLs should not even have access to the Squid-peer connection source address until comm_openex() creates such a connection. If they do, there is a bug somewhere, but the port value (zero or not) is not related to that bug.

rousskov · 2018-02-15T05:30:00Z

So we have two options: zeroize port either in getOutgoingAddress or in comm_openex.

Those two options are not mutually exclusive:

The former is needed to avoid returning garbage port number from getOutgoingAddress (which should probably be renamed to getOutgoingIp to reduce confusion).
The latter may be a good idea if no TCP connection should originate from a specific port number. Zeroing here makes this low-level code more robust when it is being fed garbage. Or, to be more precise, it works around the API deficiency that forces comm_openex() callers to supply a TCP source port that they do not care about.

gozzy · 2018-02-15T05:57:28Z

Maybe you are right and comm_openex should take care of a source port too... However, does it really need to be included in this patch? It looks like a separate problem to me. Also, I don't know if none of TCP connections made by squid requires source port.

rousskov · 2018-02-15T16:34:25Z

Maybe you are right and comm_openex should take care of a source port too... However, does it really need to be included in this patch?

No, I do not think it has to be included. We can fix getOutgoingAddress() alone because its breakage (i.e., returning garbage ports) is independent from comm_openex() API/robustness issues.

rousskov

I just have a small (but important) polishing request for this fix.

rousskov · 2018-02-15T17:38:20Z

src/FwdState.cc

            // some flags need setting on the socket to use this address
            conn->flags |= COMM_DOBIND;
            conn->flags |= COMM_TRANSPARENT;
+            conn->local.port(0); // allow random outgoing port to prevent address clashes


Please move this line up, right after the local assignment, without an empty line in-between: We are fixing that assignment, and the local value remains wrong until we reset the port. We should keep this fix as close to the assignment it is fixing as possible...

This assignment-fixing logic is repeated in many places, including ClientRequestContext::clientAccessCheck(), ftpOpenListenSocket(), Eui::Eui48::lookup(), IcmpSquid::Recv(), Ftp::Server::listenForDataConnection(), and Ftp::Server::createDataConnection(). We could add and use a dedicated Ip::Address method:

/// makes us the same as addr except for a possibly different port void reset(const Address &addr, const int port);

but that method does not help avoid similar bugs because buggy code can still mindlessly copy the whole Ip::Address using other public methods and operators. Thus, I doubt we should add this or a similar method. On the other hand, using reset() would make the code a little cleaner when there are two if/else address assignments (as there are in our case). Your call.

rousskov · 2018-02-15T17:42:04Z

src/FwdState.cc

            // some flags need setting on the socket to use this address
            conn->flags |= COMM_DOBIND;
            conn->flags |= COMM_TRANSPARENT;
+            conn->local.port(0); // allow random outgoing port to prevent address clashes


Please s/allow random outgoing/let OS pick the source/ in the comment.

rousskov · 2018-02-15T18:14:33Z

I adjusted the future commit message (i.e., PR title/description), twice, to better reflect the scope of the problem (and formatted them to satisfy commit message character limits that will be eventually enforced by the auto-merging bot). Please adjust further if needed.

The clientside_mark ACL was not working with TPROXY because a conntrack query could not find connmark without a true client port. Ip::Intercept::Lookup() must return true client address, but its TproxyTransparent() component was reseting the client port. We should use zero port when we compute the source address for the Squid-to-peer connection instead.

gozzy · 2018-02-15T18:28:46Z

Done.

rousskov

Thank you for addressing my change requests. I do not see any more problems with this PR, but will not merge it immediately to give @yadij a chance to review it. Amos knows more about the being-fixed code than I do.

rousskov · 2018-02-15T19:12:06Z

OK to test

yadij · 2018-02-16T08:59:25Z

Looks good to me.

…he#150) The clientside_mark ACL was not working with TPROXY because a conntrack query could not find connmark without a true client port. Ip::Intercept::Lookup() must return true client address, but its TproxyTransparent() component was reseting the client port. We should use zero port when we compute the source address for the Squid-to-peer connection instead.

The clientside_mark ACL was not working with TPROXY because a conntrack query could not find connmark without a true client port. Ip::Intercept::Lookup() must return true client address, but its TproxyTransparent() component was reseting the client port. We should use zero port when we compute the source address for the Squid-to-peer connection instead.

…he#150) The clientside_mark ACL was not working with TPROXY because a conntrack query could not find connmark without a true client port. Ip::Intercept::Lookup() must return true client address, but its TproxyTransparent() component was reseting the client port. We should use zero port when we compute the source address for the Squid-to-peer connection instead.

The clientside_mark ACL was not working with TPROXY because a conntrack query could not find connmark without a true client port. Ip::Intercept::Lookup() must return true client address, but its TproxyTransparent() component was reseting the client port. We should use zero port when we compute the source address for the Squid-to-peer connection instead.

rousskov requested changes Feb 15, 2018

View reviewed changes

rousskov changed the title ~~clientside_mark ACL was not working in TPROXY mode.~~ Fix client-port-dependent ACLs in TPROXY mode Feb 15, 2018

rousskov changed the title ~~Fix client-port-dependent ACLs in TPROXY mode~~ Fixed clientside_mark and client port logging in TPROXY mode Feb 15, 2018

rousskov approved these changes Feb 15, 2018

View reviewed changes

Merge branch 'master' into clientside_mark_tproxy

efcbd6b

yadij merged commit 3b65960 into squid-cache:master Feb 16, 2018

gozzy deleted the clientside_mark_tproxy branch February 16, 2018 11:58

Conversation

gozzy commented Feb 13, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

squid-prbot commented Feb 13, 2018

Uh oh!

gozzy commented Feb 13, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

yadij commented Feb 13, 2018

Uh oh!

gozzy commented Feb 13, 2018

Uh oh!

rousskov commented Feb 13, 2018

Uh oh!

gozzy commented Feb 13, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rousskov commented Feb 14, 2018

Uh oh!

yadij commented Feb 14, 2018

Uh oh!

gozzy commented Feb 14, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gozzy commented Feb 14, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

yadij commented Feb 14, 2018

Uh oh!

gozzy commented Feb 14, 2018

Uh oh!

gozzy commented Feb 14, 2018

Uh oh!

rousskov commented Feb 14, 2018

Uh oh!

gozzy commented Feb 14, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gozzy commented Feb 14, 2018

Uh oh!

rousskov commented Feb 14, 2018

Uh oh!

gozzy commented Feb 14, 2018

Uh oh!

yadij commented Feb 15, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rousskov commented Feb 15, 2018

Uh oh!

rousskov commented Feb 15, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gozzy commented Feb 15, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rousskov commented Feb 15, 2018

Uh oh!

rousskov left a comment

Choose a reason for hiding this comment

Uh oh!

rousskov Feb 15, 2018

Choose a reason for hiding this comment

Uh oh!

rousskov Feb 15, 2018

Choose a reason for hiding this comment

Uh oh!

rousskov commented Feb 15, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

gozzy commented Feb 15, 2018

Uh oh!

rousskov left a comment

Choose a reason for hiding this comment

Uh oh!

rousskov commented Feb 15, 2018

Uh oh!

yadij commented Feb 16, 2018

Uh oh!

Reviewers

gozzy commented Feb 13, 2018 •

edited

Loading

gozzy commented Feb 13, 2018 •

edited

Loading

gozzy commented Feb 13, 2018 •

edited

Loading

gozzy commented Feb 14, 2018 •

edited

Loading

gozzy commented Feb 14, 2018 •

edited

Loading

gozzy commented Feb 14, 2018 •

edited

Loading

yadij commented Feb 15, 2018 •

edited

Loading

rousskov commented Feb 15, 2018 •

edited

Loading

gozzy commented Feb 15, 2018 •

edited

Loading

rousskov commented Feb 15, 2018 •

edited

Loading