TLS: Add support for specifying TLS session ticket keys,#1747
TLS: Add support for specifying TLS session ticket keys,#1747mattklein123 merged 18 commits intoenvoyproxy:masterfrom
Conversation
|
This depends on envoyproxy/data-plane-api#178 |
There was a problem hiding this comment.
This will point at the hash for envoyproxy/data-plane-api#178 after it is merged.
for resuming TLS sessions across hot-restarts and/or across multiple instances of Envoy. Signed-off-by: Greg Greenway <ggreenway@apple.com>
e105195 to
c7f8ac8
Compare
| namespace Ssl { | ||
|
|
||
| const unsigned char ContextImpl::SERVER_SESSION_ID_CONTEXT = 1; | ||
| int ContextImpl::ssl_context_index_ = -1; |
There was a problem hiding this comment.
Nit: I think this should be upper case? cc @htuch
There was a problem hiding this comment.
I think this is Ok, as it's not a constant. I have a vague preference for CONSTRUCT_ON_FIRST_USE here, but not if it makes things a pain.
|
|
||
| std::copy_n(src_key.begin(), dst_key.name.size(), dst_key.name.begin()); | ||
| size_t pos = dst_key.name.size(); | ||
| std::copy_n(src_key.begin() + pos, dst_key.aes_key.size(), dst_key.aes_key.begin()); |
There was a problem hiding this comment.
AES key should be after HMAC key, this way keying material will be compatible with NGINX and possibly other proxies.
| std::vector<uint8_t> parsed_alt_alpn_protocols_; | ||
| struct SessionTicketKey { | ||
| std::array<uint8_t, SSL_TICKET_KEY_NAME_LEN> name; | ||
| std::array<uint8_t, 32> aes_key; |
There was a problem hiding this comment.
AES key should be after HMAC key, this way keying material will be compatible with NGINX and possibly other proxies.
| // CA cert. This ensures that the client is always validated against | ||
| // the correct CA cert, even if session resumption across different listeners | ||
| // is enabled. | ||
| int ret = X509_digest(ca_cert_.get(), EVP_sha256(), session_context_buf, &session_context_len); |
There was a problem hiding this comment.
This is definitely not enough and you should also include all validation options, possibly also listener's name.
With the current code, session established with a listener that has ca_cert_file configured can be used to resume session on any listener that has the same ca_cert_file, even when extra restrictions are in place (e.g. verify_subject_alt_name), effectively bypassing them.
| RELEASE_ASSERT(session_ticket_keys_.size() >= 1); | ||
| const SessionTicketKey& enc_key = session_ticket_keys_.front(); | ||
|
|
||
| RELEASE_ASSERT(enc_key.name.size() == SSL_TICKET_KEY_NAME_LEN); |
There was a problem hiding this comment.
This should be checked when adding key to the session_ticket_keys_ and not when using it.
There was a problem hiding this comment.
It is checked when the key is added, I just wanted to make it clear at the point-of-use that the value is the correct size.
What I really want here is a static_assert() that the size is correct. I wanted to have it at the point where the data is used so it's super-clear that this won't result in a buffer-overrun due to size mismatch between this code and the SSL library's expectations.
I'll see if I can figure out a way to make this into a static_assert.
| return -1; | ||
| } | ||
|
|
||
| ASSERT(enc_key.aes_key.size() == EVP_CIPHER_key_length(cipher)); |
There was a problem hiding this comment.
This should be checked when adding key to the session_ticket_keys_ and not when using it.
| // Decrypt | ||
| bool is_enc_key = true; // first element is the encryption key | ||
| for (const SessionTicketKey& key : session_ticket_keys_) { | ||
| RELEASE_ASSERT(key.name.size() == SSL_TICKET_KEY_NAME_LEN); |
There was a problem hiding this comment.
This should be checked when adding key to the session_ticket_keys_ and not when using it.
| return 0; | ||
| } | ||
|
|
||
| ASSERT(key.aes_key.size() == EVP_CIPHER_key_length(cipher)); |
There was a problem hiding this comment.
This should be checked when adding key to the session_ticket_keys_ and not when using it.
| if (encrypt == 1) { | ||
| // Encrypt | ||
| RELEASE_ASSERT(session_ticket_keys_.size() >= 1); | ||
| const SessionTicketKey& enc_key = session_ticket_keys_.front(); |
| "type" : "array", | ||
| "items" : { | ||
| "type" : "string" | ||
| } |
There was a problem hiding this comment.
You've added this one level too deep (right now it's part of verify_subject_alt_name), breaking the whole schema.
There was a problem hiding this comment.
Yuck, that must have happened during a rebase/merge, and somehow I missed it.
|
|
||
| if (ssl_context_index_ < 0) { | ||
| ssl_context_index_ = SSL_CTX_get_ex_new_index(0, nullptr, nullptr, nullptr, nullptr); | ||
| if (ssl_context_index_ < 0) { |
There was a problem hiding this comment.
drive by review comment: Please see https://github.com/envoyproxy/envoy/blob/master/STYLE.md#error-handling for guidance on error handling. IMO a lot of the error handling added in this PR should be crashing asserts. (Obviously anything that involves loading a ticket file, etc. should have normal error handling). Will defer to @PiotrSikora on which should be which.
There was a problem hiding this comment.
That's a really good point. What's the preferred way to handle that? RELEASE_ASSERT()? PANIC()?
There was a problem hiding this comment.
Typically we use RELEASE_ASSERT() so it can all be inline.
Signed-off-by: Greg Greenway <ggreenway@apple.com>
Signed-off-by: Greg Greenway <ggreenway@apple.com>
Signed-off-by: Greg Greenway <ggreenway@apple.com>
|
@htuch @mattklein123 @PiotrSikora This is now building; I've addressed or commented on all the PR comments so far. Please take another look. |
Signed-off-by: Greg Greenway <ggreenway@apple.com>
mattklein123
left a comment
There was a problem hiding this comment.
Just skimmed through and overall this looks generally sane to me. It's been several years since I have looked at session ticket TLS stuff so will defer to @PiotrSikora and potentially @davidben for full review.
@ggreenway what kind of code coverage do we have on the error paths? It would be good to try to cover them as much as possible for the ones where we are not crashing on purpose.
| translateCommonTlsContext(json_tls_context, *downstream_tls_context.mutable_common_tls_context()); | ||
| JSON_UTIL_SET_BOOL(json_tls_context, downstream_tls_context, require_client_certificate); | ||
|
|
||
| std::vector<std::string> paths = |
| } | ||
| break; | ||
| case envoy::api::v2::DownstreamTlsContext::kSessionTicketKeysSdsSecretConfig: | ||
| // TODO(ggreenway): handle ticket keys from SDS |
| for (const auto& datasource : config.session_ticket_keys().keys()) { | ||
| switch (datasource.specifier_case()) { | ||
| case envoy::api::v2::DataSource::kFilename: { | ||
| const std::string key_data = Filesystem::fileReadToEnd(datasource.filename()); |
There was a problem hiding this comment.
You should verify that the correct length (i.e. 80 bytes) was read from the file.
There was a problem hiding this comment.
That's checked in ServerContextImpl::ServerContextImpl(). I'm following the pattern of the existing code and doing the validation in the ContextImpl.
There was a problem hiding this comment.
Thinking about this a bit more, I agree that this seems like a nicer place to do the check. However, I don't want this config setting to be handled differently (in terms of where it is validated) from all the other settings for the context. Maybe in a future change we should move all the checks to here?
There was a problem hiding this comment.
I think this is a bit different because most of the other values are validated by using them (i.e. passing value to BoringSSL when creating context), with this, we validate it inside Envoy.
cc @mattklein123 to see if he has any strong preference.
There was a problem hiding this comment.
I don't have a preference. If both of you feel it's better to check it here feel free to do that.
There was a problem hiding this comment.
Ok, I'm moving the validation to here.
| break; | ||
| } | ||
| case envoy::api::v2::DataSource::kInline: { | ||
| const std::string& key_data = datasource.inline_(); |
There was a problem hiding this comment.
You should verify that the correct length (i.e. 80 bytes) was provided.
There was a problem hiding this comment.
That's checked in ServerContextImpl::ServerContextImpl(). I'm following the pattern of the existing code and doing the validation in the ContextImpl.
| const std::vector<uint8_t>& src_key = config.sessionTicketKeys().at(i); | ||
| SessionTicketKey& dst_key = session_ticket_keys_.at(i); | ||
|
|
||
| if (src_key.size() != sizeof(SessionTicketKey)) { |
There was a problem hiding this comment.
I don't think you need this here if you check during the config parsing.
|
|
||
| if (!config.sessionTicketKeys().empty()) { | ||
| session_ticket_keys_.resize(config.sessionTicketKeys().size()); | ||
| for (unsigned i = 0; i < config.sessionTicketKeys().size(); ++i) { |
There was a problem hiding this comment.
This could be:
for (const std::vector<uint8_t>& key : config.sessionTicketKeys()) { ... }
once you move length check to the config parsing.
| pos += dst_key.hmac_key.size(); | ||
| std::copy_n(src_key.begin() + pos, dst_key.aes_key.size(), dst_key.aes_key.begin()); | ||
| pos += dst_key.aes_key.size(); | ||
| ASSERT(src_key.begin() + pos == src_key.end()); |
There was a problem hiding this comment.
This seems unnecessary, since we already know that the length is correct.
| "Expected key.name length"); | ||
| if (std::equal(key.name.begin(), key.name.end(), key_name)) { | ||
| if (!HMAC_Init_ex(hmac_ctx, key.hmac_key.data(), key.hmac_key.size(), hmac, nullptr)) { | ||
| return 0; |
There was a problem hiding this comment.
This should be return -1;.
There was a problem hiding this comment.
Can this even happen for cases other than OOM?
|
|
||
| RELEASE_ASSERT(key.aes_key.size() == EVP_CIPHER_key_length(cipher)); | ||
| if (!EVP_DecryptInit_ex(ctx, cipher, nullptr, key.aes_key.data(), iv)) { | ||
| return 0; |
There was a problem hiding this comment.
This should be return -1;.
There was a problem hiding this comment.
Same for this. Can it happen in any cases we care about? If not, I can just ASSERT success.
| "Expected key.name length"); | ||
| std::copy_n(key.name.begin(), SSL_TICKET_KEY_NAME_LEN, key_name); | ||
|
|
||
| int got_rand_bytes = RAND_bytes(iv, EVP_CIPHER_iv_length(cipher)); |
|
|
||
| // Include SERVER_SESSION_ID_CONTEXT so that if all the other verify-settings are unset | ||
| // we have a deterministic value. | ||
| rc = EVP_DigestUpdate(&md, &SERVER_SESSION_ID_CONTEXT, sizeof(SERVER_SESSION_ID_CONTEXT)); |
There was a problem hiding this comment.
Perhaps we could use "envoy" here? SERVER_SESSION_ID_CONTEXT doesn't seem to provide any value.
Also, the deterministic fallback could be used only when ca_cert_ == nullptr.
There was a problem hiding this comment.
I wasn't sure what the reason for SERVER_SESSION_ID_CONTEXT was; I agree it doesn't add much/any value. I'll switch to "envoy", unless @mattklein123 has a reason it needs to remain.
| RELEASE_ASSERT(rc == 1); | ||
| } | ||
|
|
||
| for (const std::string& name : verify_subject_alt_name_list_) { |
There was a problem hiding this comment.
Both verify_subject_alt_name_list_ and verify_certificate_hash_ require ca_cert_ to work, so you can pull everything into if (ca_cert_ != nullptr) { ... }.
|
In general in the crypto path, I have my paranoid-level set to 11, which is why there are a lot of RELEASE_ASSERTs right next to calls into boringssl. None of those things should be possible to happen; but if they do, it means we may not correctly/safely encrypt the session ticket. That's why I'm asserting things in ServerContextImpl::sessionTicketProcess() that have been validated as impossible in the constructor (or that are impossible in boringssl, such as RAND_bytes not succeeding). I'm doing this in case someone modifies the checks elsewhere forgets to change it in another place, or someone tries to build against openssl 5 years from now, or something else unexpected. I can dial back my paranoia and remove a bunch of the ASSERTs if you guys are less paranoid than me, but it makes me nervous because I don't know of a way to functional-test some of these things (like if the AES key is too short so half of it is uninitialized-but-predictable memory). Thoughts on this @PiotrSikora @mattklein123 @htuch ? |
|
@mattklein123 re: coverage: I don't know how to simulate the error paths. You're talking about EVP_EncryptInit_ex, HMAC_Init_ex, and EVP_DecryptInit_ex right? I don't know if they can even fail other than OOM, and I also don't know where to find an answer to the question of when/why they fail. So I'm sure we don't have coverage on those paths, unfortunately. |
Signed-off-by: Greg Greenway <ggreenway@apple.com>
Signed-off-by: Greg Greenway <ggreenway@apple.com>
@PiotrSikora and @davidben can likely advise. If things can only fail if OOM, etc. I would just make them RELEASE_ASSERT and just crash. Per your other point, I would err on the paranoid side. |
|
@ggreenway I wanted you to dial back to |
|
Does anyone still compile Envoy against openssl? If not or if that's unsupported, then I'll remove that RELEASE_ASSERT() |
|
@ggreenway we don't guarantee that it works with anything other than the currently linked version of BoringSSL. But please leave |
Signed-off-by: Greg Greenway <ggreenway@apple.com>
Signed-off-by: Greg Greenway <ggreenway@apple.com>
| for (const auto& datasource : config.session_ticket_keys().keys()) { | ||
| switch (datasource.specifier_case()) { | ||
| case envoy::api::v2::DataSource::kFilename: { | ||
| const std::string key_data = Filesystem::fileReadToEnd(datasource.filename()); |
There was a problem hiding this comment.
I think this is a bit different because most of the other values are validated by using them (i.e. passing value to BoringSSL when creating context), with this, we validate it inside Envoy.
cc @mattklein123 to see if he has any strong preference.
| EVP_MD_CTX md; | ||
| int rc = EVP_DigestInit(&md, EVP_sha256()); | ||
| RELEASE_ASSERT(rc == 1); | ||
|
|
There was a problem hiding this comment.
Actually, thinking about this a bit more... I think we should also include server's certificate in the digest to prevent resumption across different certificates, even without any requirements on the client certificates.
Thoughts?
There was a problem hiding this comment.
One use case I'd like to support it resuming across certificates. The use case for this:
Front-proxy boxes come up, generate a certificate and CSR, get it signed, and now have a certificate unique to that box, and the private key never leaves the box.
Another use-case is when certificates are renewed. If we can resume cross-certificate, there won't be a spike in handshakes caused by resumption failures.
Maybe we should hash in the CN and SANs?
There was a problem hiding this comment.
The solution for the renewals is actually pretty easy - hash SPKI instead of the certificate (this is used in certificate pinning, i.e. RFC7469, for example).
The unique certificate per box is a bit more tricky, and it seems that indeed, we need to either hash CN and SANs (perhaps also the Issuer?) or add the ability to configure Session Context ID.
Let's just hash CN and SANs (and Issuer?), and we can revisit SPKI / Session Context ID in the future, if there is any need for that.
There was a problem hiding this comment.
I've never dealt with the boringssl/openssl API for dealing with names. I'm trying to use X509_get_subject_name() and X509_NAME_digest(), but cross-cert resumption fails. I'm working on debugging the problem (trying to figure out what is included in X509_get_subject_name(); I suspect it includes the serial number, subject key identifier, or similar). But if you happen to know of more appropriate calls to use, I'd appreciate any hints.
Either way, hopefully I'll have something put together on this soon.
There was a problem hiding this comment.
I think you could use:
X509_NAME_print_ex(..., X509_get_subject_name(cert.get()), 0, XN_FLAG_RFC2253);
to be consistent with what we use in ConnectionImpl::subjectPeerCertificate() and hash on that.
The result is Subject=emailAddress=frontend-team@lyft.com,CN=Test Frontend Team,OU=Lyft Engineering,O=Lyft,L=San Francisco,ST=California,C=US, so it shouldn't change across your certificates (unless you add some randomness to the subject to identify nodes).
There was a problem hiding this comment.
I was already well on the path to digging out just the CN. What I ended up with never deals with the CN as a c-string, so it's nicer in that respect. And it's similar to how the SANs are handled. Let me know if you want me to switch to X509_NAME_print_ex(...) instead.
| case envoy::api::v2::DownstreamTlsContext::kSessionTicketKeysSdsSecretConfig: | ||
| NOT_IMPLEMENTED; | ||
| break; | ||
| case envoy::api::v2::DownstreamTlsContext::SESSION_TICKET_KEYS_TYPE_NOT_SET: |
There was a problem hiding this comment.
Do we really need this? Shouldn't this fall into default case?
There was a problem hiding this comment.
Right now the default case throws, because it got a format it doesn't understand.
This case will probably be the most-common case: no keys set at all, which should not be an error. Even if we don't think the default case should throw, it should at least log or something, whereas this case should be silent I think.
|
|
||
| if (encrypt == 1) { | ||
| // Encrypt | ||
| if (session_ticket_keys_.empty()) { |
There was a problem hiding this comment.
Sorry, I assumed this would fallback to "don't send SessionTicket", but that's not the case and we can't have an empty session_ticket_keys_ list after the callback is configured. Please revert this to RELEASE_ASSERT() that you had before.
We'll need to enforce this in SDS.
Signed-off-by: Greg Greenway <ggreenway@apple.com>
|
@ggreenway FYI I'm going to wait for @PiotrSikora to give the approve on this one, then either myself or @htuch can skim though and rubber stamp it. |
| // sessions can only be resumed to a certificate for the same name, but allows | ||
| // resuming to unique certs in the case that different Envoy instances each have | ||
| // their own certs. | ||
| X509* cert = SSL_CTX_get0_certificate(ctx_.get()); |
There was a problem hiding this comment.
Add RELEASE_ASSERT(cert != nullptr);.
PiotrSikora
left a comment
There was a problem hiding this comment.
LGTM. There is one missing RELEASE_ASSERT(), and we need a decision from @mattklein123 and/or @htuch on whether to do validation of the key data in ServerContextImpl or in ServerContextConfigImpl.
|
Will trade @mattklein123 the final pass on this for me reviewing #1859 :D |
|
Yup will look today. |
|
@ggreenway looking through now. Quick note: Can we please get docs for this feature. Should be something in the high level overview here: https://envoyproxy.github.io/envoy/intro/arch_overview/ssl.html. And also the specific configuration section on listener TLS config (key format, etc.). |
|
P.S. please also doc the new stats. |
mattklein123
left a comment
There was a problem hiding this comment.
LGTM. I think just docs and the new random things that @PiotrSikora pointed out. I think people will be excited about this feature!
| namespace Ssl { | ||
|
|
||
| const unsigned char ContextImpl::SERVER_SESSION_ID_CONTEXT = 1; | ||
| int ContextImpl::sslContextIndex() { |
There was a problem hiding this comment.
For my own edification, why is this needed? Was the old fixed index of 1 just broken?
There was a problem hiding this comment.
These two things are unrelated; they just happen to be nearby in the file.
The OpenSSL API let's you store arbitrary data on both an SSL_CTX and on an SSL. You do this by registering for a global index, then you can set and get on any SSL_CTX or SSL with that global index. This function registers for an index on first use, then returns it there-after.
We use this to store a this-pointer on the SSL_CTX so we can retrieve it in various callbacks that don't let us set an a void* context on the callback.
-Documentation -Add missing RELEASE_ASSERT Signed-off-by: Greg Greenway <ggreenway@apple.com>
| by, for example, putting the new keyfile first, and the previous keyfile second | ||
|
|
||
| Each keyfile must contain exactly 80 bytes of cryptographically-secure random data. For example, | ||
| the output of ``openssl rand 80``. |
There was a problem hiding this comment.
Could you add a paragraph saying that:
- This feature is a security nightmare and that the session ticket keys must be (a) protected at the same or higher level that private keys of the TLS certificates, (b) rotated frequently (hourly or daily). See: https://www.imperialviolet.org/2013/06/27/botchingpfs.html.
- Session Tickets are supported by Envoy by default and this feature should be only used in setups where TLS traffic is distributed across multiple Envoy instances.
| break; | ||
| } | ||
| case envoy::api::v2::DataSource::kInline: { | ||
| const std::string& key_data = datasource.inline_(); |
There was a problem hiding this comment.
FWIW, I suspect this will need to be const std::string, because of Google's protobuf string != std::string issue, but I can fix it during next import if that's the case.
There was a problem hiding this comment.
The generated code at least currently returns a const std::string& . Are you saying it's going to change in a newer version of protobuf? If so, I'd say let's fix it when it changes.
There was a problem hiding this comment.
No, what I meant is that Google's internal version of protobuf uses string type that's not std::string and we've been accommodating that in Envoy via various hacks, which result in a copy vs reference.
But thinking about that, I've just realized that there are no tests that verify this code path - could you add one successful test that loads key data inline? Thanks!
There was a problem hiding this comment.
Ok, I understand now. I'll change it to make it easier on you. Also adding test coverage.
There was a problem hiding this comment.
I imported this change locally and it doesn't look that this is issue after all (but that might because of lack of tests), so let's leave it as-is for now. Thank you, though!
There was a problem hiding this comment.
Too late; I just pushed the fix. I think the new version is fine, but let me know if you want me to put it back.
Signed-off-by: Greg Greenway <ggreenway@apple.com>
| #include "common/common/hex.h" | ||
|
|
||
| #include "fmt/format.h" | ||
| #include "openssl/rand.h" |
There was a problem hiding this comment.
Please also add #include "openssl/hmac.h".
| key_data.size(), sizeof(SessionTicketKey))); | ||
| } | ||
|
|
||
| keys.push_back({}); |
There was a problem hiding this comment.
I believe this should be keys.emplace_back();.
| .. attention:: | ||
|
|
||
| Using this feature has serious security considerations and risks. Improper handling of keys may | ||
| result in loss of secrecy in connections, even if ciphers supporting perfect foward secrecy are |
-Fix typo in docs -Don't use std::string& for inline key type for temporary -Include hmac.h -emplace_back instead of push_back Signed-off-by: Greg Greenway <ggreenway@apple.com>
|
tsan build failed with: no such package '@grpc_httpjson_transcoding//src': Error cloning repository: Connection timed out Is there any wait to start a new build? Or should I push a no-op change? |
Signed-off-by: Greg Greenway <ggreenway@apple.com>
Broken in envoyproxy#1747. Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Broken in #1747. Signed-off-by: Piotr Sikora <piotrsikora@google.com>
Signed-off-by: Jose Nino <jnino@lyft.com> Signed-off-by: JP Simard <jp@jpsim.com>
Signed-off-by: Jose Nino <jnino@lyft.com> Signed-off-by: JP Simard <jp@jpsim.com>
…1747) **Description** This upgrades k8s related tooling as well as adds more recent k8s versions to the e2e test matrics Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>
for resuming TLS sessions across hot-restarts and/or
across multiple instances of Envoy.
Signed-off-by: Greg Greenway ggreenway@apple.com