Tr default cert#3392
Conversation
|
Refer to this link for build results (access rights to CI server needed): |
There was a problem hiding this comment.
is there other code that also synchronizes on this object?
There was a problem hiding this comment.
Yes. The importCertificateData also synchronizes on the CertificateRegistry instance. Also, the getInstance() method could be called by multiple threads.
There was a problem hiding this comment.
importCertificateDataList is a synchronized method, which I think is the same as if the method body was synchronized on CertificateRegistry.class, right?
There was a problem hiding this comment.
Yes. I placed this Synchronize here to protect the data structure in the event that more than one thread calls the 'getInstance' method during startup. There are in fact three endpoints being initialized using the LanguidNioProtocol at startup. Once initialized this method never gets called again so they synchronized doesn't have any impact on performance.
There was a problem hiding this comment.
The data structure would only be protected from race conditions if the other code that accesses it is also synchronized on the same object. synchronized (CertificateRegistryHolder.DELIVERY_SERVICE_CERTIFICATES) and synchronized (CertificateRegistry.class) are locking on two different things, so both of those methods could overlap with each other.
There was a problem hiding this comment.
this is done in new commits
There was a problem hiding this comment.
Ah yeah I just realized that getInstance is now static synchronized while importCertificateDataList is non-static synchronized. So those methods aren't actually locking on the same object yet (static locks on the class while non-static locks on a single instance).
So if one thread calls getInstance and another thread calls importCertificateDataList at the same time, both methods will interleave and mutate handshakeDataMap at the same time. That's what you're trying to prevent with the synchronization here, right?
There was a problem hiding this comment.
I'm more concerned about simultaneous calls to 'getInstance' during startup. The poller that calls 'importCertificateDataList' launches after initialization has succeeded.
There was a problem hiding this comment.
This discussion has caused me to realize that there is a bug in getInstance. I've just committed another change.
There was a problem hiding this comment.
I committed one more change based on our discussion and just moved the default cert creation into the importCertificateData method.
There was a problem hiding this comment.
should this check if the created value is null? what would happen if it ends up being null?
There was a problem hiding this comment.
Yes. I've updated the PR. Now it returns a null CertificateRegistry if it can't create the cert.
|
Also, can you add a line in CHANGELOG.md about TR's new default cert behavior? |
dc7c08e to
ac74234
Compare
|
Refer to this link for build results (access rights to CI server needed): |
Improved descriptions in the default self-signed cert Improved warning messages in x509 certificate validation
|
Refer to this link for build results (access rights to CI server needed): |
| */ | ||
| @Override | ||
| protected void doRun() { | ||
| final SocketWrapperBase<NioChannel> localWrapper = socketWrapper; |
There was a problem hiding this comment.
this line got mixed up w/ tabs and spaces
|
|
||
| import com.comcast.cdn.traffic_control.traffic_router.protocol.RouterNioEndpoint; | ||
| import com.comcast.cdn.traffic_control.traffic_router.shared.CertificateData; | ||
| //import com.fasterxml.jackson.core.type.TypeReference; |
There was a problem hiding this comment.
these imports can be removed if they're unused
| */ | ||
| @Override | ||
| protected void doRun() { | ||
| final SocketWrapperBase<NioChannel> localWrapper = socketWrapper; |
There was a problem hiding this comment.
this line is still way over-indented -- its using tabs and it looks like the other lines use spaces
There was a problem hiding this comment.
Okay. It should be better now. There seems to be an issue with one of my code templates.
There was a problem hiding this comment.
Thanks, it looks much better now.
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
|
Refer to this link for build results (access rights to CI server needed): |
| if (defaultHd == null){ | ||
| log.error("Failed to initialize the CertificateRegistry because of a problem with the 'default' " + | ||
| "certificate. Returning the Certificate Registry without a default."); | ||
| return; |
There was a problem hiding this comment.
If this ends up being the case, would we still want to do handshakeDataMap = master before returning so that all the newly-processed certs would still be loaded (just without a default cert), or would it never make sense to keep the rest of the certs without the default cert in that case?
There was a problem hiding this comment.
In both situation I think it will fail in either case when it goes to RouterNioEndpoint because there has to be a default SSLHost. This condition should only occur when TR is being initiated. I did perform a test and made sure that these lines only get called once. After that they are skipped because here is always a default cert.
There was a problem hiding this comment.
In other words. I think this is the appropriate behavior. If it can't create the default certificate, something bigger is wrong and TR should not start.
There was a problem hiding this comment.
Thanks, you make a fair point.
|
Refer to this link for build results (access rights to CI server needed): |
| */ | ||
| @Override | ||
| protected void doRun() { | ||
| final SocketWrapperBase<NioChannel> localWrapper = socketWrapper; |
There was a problem hiding this comment.
Thanks, it looks much better now.
| if (defaultHd == null){ | ||
| log.error("Failed to initialize the CertificateRegistry because of a problem with the 'default' " + | ||
| "certificate. Returning the Certificate Registry without a default."); | ||
| return; |
There was a problem hiding this comment.
Thanks, you make a fair point.
* Update maven-pmd-plugin to 3.14.0, which uses PMD 6.29.0 * Update Java PMD rule names to their PMD 6 equivalents * Unlist rules that are scheduled for removal in PMD 7 * Suppress PMD.CyclomaticComplexity and PMD.NPathComplexity * Fix PMD warnings * Set JAVA_HOME so Maven can find javac * Rebuild the Traffic Router builder image if its Dockerfile was changed * Use a fetch-depth of 4 to check files changed in the last 3 commits for the build-rpms GHA * Update Traffic Router Java version to 11 * Do not set UseSplitVerifier, SplitVerifier is required in Java 11 * Resolve new CloseResource PMD failure in Java 11 * Use RSAPrivateCrtKey, RSAPrivateCrtKeyImpl is not visible in Java 11 * Decrease NPath complexity of DeliveryService constructor * Do not escape '"' * Commant and unabbreviate commit comparison * Close resources in LetsEncryptDnsChallengeWatcher.readConfigFile() * Do not close socket before handling the request * Bind port to wildcard host for OpenJDK 11 compatibility * Store socketWrapper socket for use after super.doRun() * Fix PMD failures where all PMD failures were suppressed * Remove temporary hack from #3392 that resolved an SSL issue in Tomcat * Use OpenJDK 11 in Systemd service files * Use OpenJDK 11 for Ansible Java HOME
Which issue is fixed by this PR? If not related to an existing issue, what does this PR do?
This PR sets and explicit self-signed certificate as the 'default' certificate which gets returned during an SSL handshake if the requesting host does not match any of the other valid certificates TR hosts. The default certificate will not match any hostname and should be rejected by the client.
Which TC components are affected by this PR?
What is the best way to verify this PR? Please include manual steps or automated tests.
(If no tests are part of this PR, please provide explanation as to why no tests are included.)
There are unit tests and integration tests built into the TR build which test this PR. To manually test:
T1. use curl -kv --resolve to connect with an SNI host that matches the domain of an existing DS, but not the regular expression. For instance try https://cdn.someds.somecdn.toplevel for a DS that responds to https://edge.someds.somecdn.toplevel
R1. Inspect the curl output and you should see that the TLS handshake was successful but the certificate returned was for domain : Invalid
Check all that apply