Skip to content

Do not send more than one self-signed certificate#1058

Closed
rousskov wants to merge 6 commits intosquid-cache:masterfrom
measurement-factory:SQUID-642-do-not-send-two-roots
Closed

Do not send more than one self-signed certificate#1058
rousskov wants to merge 6 commits intosquid-cache:masterfrom
measurement-factory:SQUID-642-do-not-send-two-roots

Conversation

@rousskov
Copy link
Contributor

@rousskov rousskov commented May 22, 2022

Configured chain L-M-R-R was sent as L-M-R-R instead of just L-M-R.

Configured chain R was sent as R-R instead of just R!

... where R is a self-signed (i.e. "Root CA") certificate that signed
(possibly indirectly, via an intermediate certificate M) the traffic
signing (i.e. "Leaf" or "end-entity") certificate L (if any).

Security::KeyData::loadX509ChainFromFile() has other problems, but
properly solving those problems needs significant code refactoring. It
is best done in a dedicated no-functionality-changes PR. The same
refactoring is likely to simplify the existing code logic, handling all
root certificates in one place.

Also removed disabled code so that we do not have to keep maintaining
it. Incorrectly disabling this code at the very end of commit 540e296
work created the bugs we are fixing in this branch. For some, it is
easier to comprehend active code without disabled code misdirection.
This disabled code removal does not mean that Squid should keep
sending Root CA certificates (when they are not the signing/end-entity
certificates). However, stopping that behavior deserves a dedicated PR.

rousskov added 4 commits May 21, 2022 21:15
Incorrectly disabling this code at the last minute of commit 540e296
work created a bug we are fixing in this branch. It may be easier to
comprehend the fix without being distracted by this disabled code.
A single-certificate chain containing a self-signed certificate R was
sent as R-R (instead of just R).

The original "will not be chained" debugs() message contents for the
self-signed signing certificate felt odd because the signing
certificates should never be added to the chain, regardless of whether
they are self-signed. What we probably meant to say is that Squid will
not form/sent a chain of CA certificates after the signing certificate
because/when that signing certificate is self-signed.
L-M-R-...-R was sent as L-M-R-R instead of just L-M-R

... where R is a self-signed (i.e. "root") certificate that signed
(possibly indirectly, via intermediate certificate(s) M) the traffic
signing (i.e. "leaf") certificate L.
yadij
yadij previously requested changes May 23, 2022
Copy link
Contributor

@yadij yadij left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I object to this change in its entirety.

  • It is a direct regression re-opening bug 4831.

  • The code wrapped in TLS_CHAIN_NO_SELFSIGNED was intentionally disabled per discussion in #187 (comment).
    AFAIK there has been no evidence that it was needing to be enabled by anyone yet. As such the expectation is still that the code block be removed instead of enabled.

Copy link
Contributor Author

@rousskov rousskov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I object to this change in its entirety.

Your objection is based on several false assumptions:

  • It is a direct regression re-opening bug 4831.

What makes you think that? PR #187 fixing bug 4831 summarizes that bug as "the leaf certificate being added twice at the prefix of a chain in the serverHello". The changes in this PR add no certificates at all! Thus, they cannot reopen a bug about adding an extra certificate. If PR #187 summary was also incorrect, then please specify what regression you think this PR introduces.

  • The code wrapped in TLS_CHAIN_NO_SELFSIGNED was intentionally disabled per discussion in #187 (comment).

While that code was intentionally disabled indeed, disabling it did not have (just) the effect that the author thought it had. Disabling it also created two bugs fixed in this PR. Unfortunately, I did not review disabling changes when they happened back then (AFAICT, I was only given 12 hours, from 2AM to 2PM to do that before those changes were merged). I am fixing those bugs now, in this PR.

AFAIK there has been no evidence that it was needing to be enabled by anyone yet.

The above statement appears to assume that this PR re-enabled disabled code. That is not an accurate assumption. TLS_CHAIN_NO_SELFSIGNED disables two if statements in the official code. Only one of those two statements is resurrected in this PR. The other if statement is different from the disabled one. The result of this PR changes is not equivalent to enabling that previously disabled code.

As for the evidence of need, please see this PR description for the summary of two bugs that this PR is fixing -- official Squid is sending two copies of the same certificate in certain cases. In one of those cases, Squid is not even (mis)configured to send two copies but sends two copies anyway. This is a bug-fixing PR.

As such the expectation is still that the code block be removed instead of enabled.

Agreed, and that is how I recommend one views this PR -- the disabled code was removed. Now forget about it. It was never running and now it is gone. End of that story. What was added after that removal is new code that should be evaluated from scratch rather than based on (false) assumptions about what the disabled code did or did not do.

Please dismiss your negative review or re-review with the above corrections in mind.

P.S. As promised, I have removed all PR-added comments that you did not like and will not waste time on defending myself from personal attacks in reviews of those comments.

@rousskov rousskov requested a review from yadij May 23, 2022 15:22
@rousskov rousskov added the S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box label May 23, 2022
@yadij
Copy link
Contributor

yadij commented Jun 4, 2022

I object to this change in its entirety.

Your objection is based on several false assumptions:

  • It is a direct regression re-opening bug 4831.

What makes you think that?

The client-side behaviour of this PR will be the same TLS verify failures. Duplicate leaf is just one of the cases that can lead to that rejection. more below.

  • The code wrapped in TLS_CHAIN_NO_SELFSIGNED was intentionally disabled per discussion in #187 (comment).

While that code was intentionally disabled indeed, disabling it did not have (just) the effect that the author thought it had.

Correction, the effect was exactly what the author knew it would do.

Disabling it also created two bugs fixed in this PR.

Intentionally so, for bug compatibility with existing TLS/SSL clients that reject traffic where the root CA is omitted.

AFAIK there has been no evidence that it was needing to be enabled by anyone yet.

The above statement appears to assume that this PR re-enabled disabled code.
That is not an accurate assumption.

The PR diff shows it clearly doing so.

TLS_CHAIN_NO_SELFSIGNED disables two if statements in the official code. Only one of those two statements is resurrected in this PR.

No. Both are converted from dead code to always-occuring.

The first one at line 100 simply has its #if wrappers removed.

  • This has the effect of erasing the chain portion of server handshake when Squid is configured with a self-signed tls-cert= option. This is a problem when the self-signed cert was cross-signed by an alternative CA and the client only trusts that alternative CA's root.

The second one at line 112 has its wrappers removed and moved down two lines into a sub-scope.

  • This has the effect of truncating a multi-root chain when the first root CA is encountered.
    *-> expected behaviour is that both branches of the multi-root chain are delivered. Optionally sans the root CA's (only).

As such the expectation is still that the code block be removed instead of enabled.

Agreed, and that is how I recommend one views this PR -- the disabled code was removed. Now forget about it. It was never running and now it is gone. End of that story. What was added after that removal is new code that should be evaluated from scratch rather than based on (false) assumptions about what the disabled code did or did not do.

The PR diff contradicts the validity of this approach. What was "added after that removal" is "new code" identical to the so-called "removed" code.

Please dismiss your negative review or re-review with the above corrections in mind.

If you really insist on continuing with this, please consider and test your changes with the following certificate chains in the tls-cert= file. At the very least case 1 & 2 have active users we should not break things for.

Case 1) L-A-R (aka. CA cross-signing)

where L is self-signed, L is signed by A, and A signed by R.
-> client verify should always work when client only trusts one of (L, A, R)

Case 2) L-A-B-R (aka. CA key transitions)

where L is signed by A, L is signed by B, B signed by A, A signed by R, and B signed by R
-> client verify should always work when client only trusts one of (A, B, R)

Case 3) L-A-R1-B-R2 (aka. composite chains)

where L is signed by A, A signed by R1, L is signed by B, and B signed by R2
-> client verify should always work when client only trusts one of (A, B, R1, R2)

** this case (3) is not supported by the current code. So I do not insist on it, but if you can get it going we will have some more happy users.

@yadij yadij added S-waiting-for-author author action is expected (and usually required) and removed S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box labels Jun 4, 2022
@rousskov
Copy link
Contributor Author

rousskov commented Jun 5, 2022

Amos: The client-side behaviour of this PR will be the same TLS verify failures [as in bug 4831].

Thank you for providing specific examples of the alleged "same failures" (later in your response). Please see below for their analysis. In summary, I am still unaware of any use case that we want to support, that was working before this PR, and that will stop working if this PR is merged. In other words, the PR fixes a few problems without breaking the already supported use cases.

Amos: The code wrapped in TLS_CHAIN_NO_SELFSIGNED was intentionally disabled

Alex: Disabling it also created two bugs fixed in this PR.

Amos: Intentionally so, for bug compatibility with existing TLS/SSL clients that reject traffic where the root CA is omitted.

The fixed bugs could not help with any legitimate use cases -- sending two copies of the same certificate is never useful. The cases you are thinking about are not the cases affected by this PR AFAICT.

Please note that this PR changes do not create situations with "traffic where the root CA is omitted": If the configured port chain ends in a root CA certificate R, then this PR does not affect R presence in the sent TLS ServerHello. In other words, PR code still sends R in cases where the official code sends R (e.g., R, L-R, L-M-R, and similar sent certificate sequences; see below for details and more cases). This PR only removes extra R entries, sending the first R copy just as the official code sends it.

Amos: The first one at line 100 simply has its #if wrappers removed.

This has the effect of erasing the chain portion of server handshake when Squid is configured with a self-signed tls-cert= option.

I am not sure what you mean by the "chain portion" in this context. To reduce the number of review iterations, I will cover several configuration cases where Squid is given a self-signed certificate (R) in tls-cert. Perhaps one of them will match the case you are thinking about (see further below for the legend). If not, please give a specific example.

  1. tls-cert=R. Prior to this PR, Squid was sending R-R (i.e. two identical certificates -- a bug). After this PR, Squid sends just one R, as it should.

  2. tls-cert=R-R.... Prior to this PR, Squid was sending R-R-R... (i.e. at least three identical certificates R). After this PR, Squid sends just one R, as it should.

  3. tls-cert=R-X... (where X is not R). Prior to this PR, Squid was sending R-R (i.e. two identical certificates). After this PR, Squid sends just one R, as it should.

  4. tls-cert=L-M-R. No changes here. Squid still sends L-M-R. This is a typical case.

  5. tls-cert=L-M-R-R.... Prior to this PR, Squid was sending L-M-R-R... (i.e. two or more identical certificates R at the end of the chain). After this PR, Squid sends just L-M-R, as it should.

  6. tls-cert=L-M-R-X... (where X is not R). No changes here. Squid still sends L-M-R, dropping X.

In the above, the left-most certificate in all chains is the "signing" certificate used to encrypt Squid port traffic. Each chain represents the certificates in the tls-cert bundle, in the specified order. Here is the legend for the certificate letters used above (just like in the PR description plus X):

  • L: Leaf end-entity certificate. Cannot issue/sign any certificates.
  • M: InterMediate CA certificate (that issued/signed L).
  • R: Root CA (a.k.a. "self-signed") certificate. Issued/signed M (where present). Issued/signed R.
  • X: Any certificate. The relationship (or lack thereof) with other certificates including other X certificates, is unimportant (except as explicitly stated in specific use cases).

I see no PR-relevant problems with the above use cases.

Amos: The second one at line 112 has its wrappers removed and moved down two lines into a sub-scope.
This has the effect of truncating a multi-root chain when the first root CA is encountered.

Please define "multi-root chain".

FWIW, the second added if only avoids adding extra copies of the already chained root CA certificate (i.e. case 5 above: L-M-R-R...). It does not change how Squid handles certificate bundles with multiple different roots -- Squid still sends only one chain to the client -- no changes there.

*-> expected behaviour is that both branches of the multi-root chain are delivered. Optionally sans the root CA's (only).

AFAICT, the behavior you expect is not supported by official Squid (and that support is not added by this bug-fixing PR). Official Squid sends certificates that form a single chain. That chain starts with the first certificate in the bundle. Certificates are considered in the order they are stored, and the first-discovered path/chain wins. All other certificates are dropped. The IssuedBy() check takes care of all that (R duplication bugs notwithstanding).

If you really insist on continuing with this, please consider and test your changes with the following certificate chains in the tls-cert= file. At the very least case 1 & 2 have active users we should not break things for.

Again, thank you for providing specific cases you are concerned about. Here is their analysis:

Case 1) L-A-R (aka. CA cross-signing)

where L is self-signed, L is signed by A, and A signed by R. -> client verify should always work when client only trusts one of (L, A, R)

Official Squid sends L-L.
PR code sends L, fixing that bug.

Warning: Here, I am using L as you define it for this case. (In my own examples/cases and in the PR description, L is a leaf end-entity certificate, never a self-signed CA certificate).

BTW, this is not a valid cross-signing case: If L is self-signed, then the client would not follow the chain from L to A. This is true even if the client does not trust L but has and trusts A. The chain always stops at the self-signed certificate. Before trust verification happens, clients restore certificate chains based on issued-by pairings (among other conditions). In case of a self-signed certificate, the issuer is the certificate subject itself so the client would not add any other certificate to the candidate chain -- that chain is done/terminated.

In a valid cross-signing case, L is not self-signed (it is a leaf end-entity or an intermediate CA certificate).

FWIW, the following Stack Exchange answer has nice illustrations for various cross-signing cases: https://security.stackexchange.com/a/15583

Case 2) L-A-B-R (aka. CA key transitions)

where L is signed by A, L is signed by B, B signed by A, A signed by R, and B signed by R -> client verify should always work when client only trusts one of (A, B, R)

AFAICT, this case is not supported by the official code: Squid just sends L-A-R (because B did not issue A and, hence, Squid skips B after adding A to the chain). This PR does not change this behavior. Again, the official IssuedBy() check takes care of this.

Case 3) L-A-R1-B-R2 (aka. composite chains)

where L is signed by A, A signed by R1, L is signed by B, and B signed by R2 -> client verify should always work when client only trusts one of (A, B, R1, R2)

this case (3) is not supported by the current code. So I do not insist on it, but if you can get it going we will have some more happy users.

Since this case is not yet supported, let's not discuss it in this bug-fixing PR. One this PR is merged, I am happy to discuss this case elsewhere (e.g., Zulip?). Just ping me then.


FWIW, we should really stop mentioning removed disabled code in any shape or form in this change request context IMO -- it only delays progress. Let's focus on the changes to the non-disabled code instead! Only they should matter in this discussion. However, if you insist on continuing that part of the conversation, here are the relevant corrections/clarifications:

Amos: The second one at line 112 has its wrappers removed and moved down two lines into a sub-scope.

... and its if condition was changed! It is still an if-condition-then-continue code but with a new condition, and a new pre-condition. Any similarities with the previously disabled code are completely unimportant. There is no value (and clearly real harm!) in trying to compare this new code to the officially-disabled old code IMO.

Alex: TLS_CHAIN_NO_SELFSIGNED disables two if statements in the official code. Only one of those two statements is resurrected in this PR.

Amos: No. Both are converted from dead code to always-occuring.

The first block of disabled code was effectively resurrected -- we are in agreement there.

The second disabled block was removed and was not resurrected. Yes, there is a similar block of new code, but it is different code in a different location, not a resurrection! That similar but new if statement has a different condition (and, hence, a different effect) compared to the second (disabled by TLS_CHAIN_NO_SELFSIGNED) if statement. There is an abridged diff below that pinpoints the difference between the removed and new code.

Amos: As such the expectation is still that the code block be removed instead of enabled.

Alex: Agreed, and that is how I recommend one views this PR -- the disabled code was removed. Now forget about it. It was never running and now it is gone. End of that story. What was added after that removal is new code that should be evaluated from scratch rather than based on (false) assumptions about what the disabled code did or did not do.

Amos: The PR diff contradicts the validity of this approach. What was "added after that removal" is "new code" identical to the so-called "removed" code.

... except it is not identical! Here is an abridged diff that might help you see the difference:

 while (...) {

-#if TLS_CHAIN_NO_SELFSIGNED
-    if (conditionB)
-       continue;
-#endif

    if (conditionX) {
+       if (conditionC)
+           continue;
        ...

See? ConditionB is gone. The added continue code runs if conditionX and conditionC are true. Those are different from conditionB! Or, simplifying further, the PR change is equivalent to this:

 while (...) {

-#if TLS_CHAIN_NO_SELFSIGNED
-    if (conditionB)
-       continue;
-#endif

+   if (conditionXC)
+       continue;

    if (conditionX)
        ...

Where conditionXC is (conditionX and conditionC). ConditionXC is not the previously disabled and now removed conditionB. No resurrection here...

@rousskov rousskov added S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box review-2 and removed S-waiting-for-author author action is expected (and usually required) labels Jun 5, 2022
@squid-anubis squid-anubis added M-failed-other https://github.com/measurement-factory/anubis#pull-request-labels and removed M-failed-other https://github.com/measurement-factory/anubis#pull-request-labels labels Jun 9, 2022
@yadij
Copy link
Contributor

yadij commented Jun 12, 2022

Please define "multi-root chain".

see commet case (2) and (3).

See? ConditionB is gone. The added continue code runs if conditionX and conditionC are true. Those are different from conditionB! Or, simplifying further, the PR change is equivalent to this:

What you are calling conditionC(ca) is actually conditionB(++ca). You have simply swapped the order of two "&&" criteria being evaluated and obfuscated them across loop cycles.

Unwrapping the loops makes this clearer:
Existing code:

// while:
if (conditionB( n=0 ) ) continue;
if (conditionX( n=1 ) )
 ...

if (conditionB( n=1 ) ) continue;
if (conditionX( n=2 ) )
 ...

if (conditionB( n=2 ) ) continue;
if (conditionX( n=3 ) ) ...

This PR:

if (conditionB( n=0 )) return;

// while:
if (conditionX( n=1 )
if (conditionB( n=1) ) continue;
 ...

if (conditionX( n=2 )
if (conditionB( n=2) ) continue;
  ...

if (conditionX( n=3 )
if (conditionB( n=3) ) continue;
 ...

@yadij
Copy link
Contributor

yadij commented Jun 12, 2022

I am giving up arguing about this. I will let the users bug reports demonstrate issues in future.

@yadij yadij added M-cleared-for-merge https://github.com/measurement-factory/anubis#pull-request-labels and removed S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box labels Jun 12, 2022
@yadij yadij dismissed their stale review June 12, 2022 23:13

giving up

@yadij yadij removed the review-2 label Jun 12, 2022
@rousskov
Copy link
Contributor Author

rousskov commented Jun 13, 2022

Here is another simple way to prove that this PR effect is not resurrection of commented out code. Consider an ordinary certificate bundle containing L-M-R.

  • Official code built with TLS_CHAIN_NO_SELFSIGNED=1: Sends L-M (because conditionB skips ca=R).
  • PR code: Sends L-M-R (because conditionC(M) is still false during ca=R iteration).

I hope that is sufficient proof that, for certificate bundles that do not start with a self-signed certificate, this PR does not resurrect the previously commented out code but implements different logic.

I will let the users bug reports demonstrate issues in future.

No bug reports are needed -- we already know this code is buggy! This PR does not address all known bugs. It only eliminates duplicate root certificate entries in sent chains.


Alex: ConditionB is gone. The added continue code runs if conditionX and conditionC are true. Those are different from conditionB! Or, simplifying further, the PR change is equivalent to this:

Amos: What you are calling conditionC(ca) is actually conditionB(++ca).

Even this simple statement is false: conditionC is not evaluated for ca. It is evaluated for latestCa that may skip some cas.

Also, we must not ignore the presence of conditionX in the PR code (which is ANDed with conditionC). To claim resurrection, the entire condition XC would have to be equivalent to condition B.

Amos: You have simply swapped the order of two "&&" criteria being evaluated and obfuscated them across loop cycles.

No, the PR code is substantially different from the official code with built with TLS_CHAIN_NO_SELFSIGNED=1. The code manipulations you are describing do not preserve the (incorrect) outcome of the official code built with TLS_CHAIN_NO_SELFSIGNED=1, as demonstrated by the L-M-R example above.

@rousskov rousskov removed the request for review from yadij June 13, 2022 13:24
squid-anubis pushed a commit that referenced this pull request Jun 13, 2022
Configured chain L-M-R-R was sent as L-M-R-R instead of just L-M-R.

Configured chain R was sent as R-R instead of just R!

... where R is a self-signed (i.e. "Root CA") certificate that signed
(possibly indirectly, via an intermediate certificate M) the traffic
signing (i.e. "Leaf" or "end-entity") certificate L (if any).

Security::KeyData::loadX509ChainFromFile() has other problems, but
properly solving those problems needs significant code refactoring. It
is best done in a dedicated no-functionality-changes PR. The same
refactoring is likely to simplify the existing code logic, handling all
root certificates in one place.

Also removed disabled code so that we do not have to keep maintaining
it. Incorrectly disabling this code at the very end of commit 540e296
work created the bugs we are fixing in this branch. For some, it is
easier to comprehend active code without disabled code misdirection.
This disabled code removal does _not_ mean that Squid should keep
sending Root CA certificates (when they are not the signing/end-entity
certificates). However, stopping that behavior deserves a dedicated PR.
@squid-anubis squid-anubis added the M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels label Jun 13, 2022
@squid-anubis squid-anubis added M-merged https://github.com/measurement-factory/anubis#pull-request-labels and removed M-waiting-staging-checks https://github.com/measurement-factory/anubis#pull-request-labels M-cleared-for-merge https://github.com/measurement-factory/anubis#pull-request-labels labels Jun 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

M-merged https://github.com/measurement-factory/anubis#pull-request-labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants