Skip to content

All Hops Encrypted: TLS between activator and queue-Proxy#12815

Merged
knative-prow[bot] merged 5 commits into
knative:mainfrom
nak3:add-queue-proxy-tls-main
Apr 19, 2022
Merged

All Hops Encrypted: TLS between activator and queue-Proxy#12815
knative-prow[bot] merged 5 commits into
knative:mainfrom
nak3:add-queue-proxy-tls-main

Conversation

@nak3
Copy link
Copy Markdown
Contributor

@nak3 nak3 commented Apr 5, 2022

This patch fixes the following issues:
Fix #12502
Fix #12503

TODO:

@knative-prow knative-prow Bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. area/API API objects and controllers area/autoscale area/networking area/test-and-release It flags unit/e2e/conformance/perf test issues for product features labels Apr 5, 2022
@nak3 nak3 force-pushed the add-queue-proxy-tls-main branch from 440ed18 to 4a79434 Compare April 5, 2022 10:33
@nak3 nak3 force-pushed the add-queue-proxy-tls-main branch from a52ab2a to 87f3d49 Compare April 6, 2022 08:54
@nak3 nak3 force-pushed the add-queue-proxy-tls-main branch from 252faa3 to 4cfda88 Compare April 7, 2022 04:52
@knative-prow-robot knative-prow-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 10, 2022
knative-prow Bot pushed a commit to knative/pkg that referenced this pull request Apr 11, 2022
…S proxy (#2479)

* Add `NewProxyAutoTLSTransport` and `DialTLSWithBackOff` to support TLS proxy

Part of: knative/serving#12503
PoC: knative/serving#12815

This patch `NewProxyAutoTLSTransport` which is `NewProxyAutoTransport + TLS config.
Current proxy does not support TLS but it needs for knative/serving#12503.

`DialTLSWithBackOff` is also `DialWithBackOff` + TLS config. It needs
`newH2Transport` which handles HTTP2 with TLS.

* Fix lint

* Fix review comments
@nak3 nak3 force-pushed the add-queue-proxy-tls-main branch from 4cfda88 to c6a967d Compare April 11, 2022 08:50
@knative-prow-robot knative-prow-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 11, 2022
@nak3 nak3 force-pushed the add-queue-proxy-tls-main branch 4 times, most recently from 986115d to 685a942 Compare April 12, 2022 02:18
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 12, 2022

Codecov Report

Merging #12815 (a377984) into main (ad15a85) will decrease coverage by 0.42%.
The diff coverage is 50.86%.

@@            Coverage Diff             @@
##             main   #12815      +/-   ##
==========================================
- Coverage   87.44%   87.01%   -0.43%     
==========================================
  Files         196      197       +1     
  Lines        9782    14426    +4644     
==========================================
+ Hits         8554    12553    +3999     
- Misses        938     1578     +640     
- Partials      290      295       +5     
Impacted Files Coverage Δ
cmd/activator/main.go 0.00% <0.00%> (ø)
cmd/queue/main.go 0.65% <0.00%> (+0.11%) ⬆️
pkg/reconciler/revision/resources/deploy.go 90.10% <0.00%> (-6.89%) ⬇️
pkg/reconciler/domainmapping/reconciler.go 93.62% <33.33%> (+1.52%) ⬆️
pkg/activator/handler/handler.go 87.67% <45.45%> (-8.08%) ⬇️
pkg/networking/util.go 88.00% <88.00%> (ø)
pkg/reconciler/autoscaling/hpa/hpa.go 91.66% <88.88%> (+0.49%) ⬆️
pkg/http/proxy.go 100.00% <100.00%> (ø)
pkg/reconciler/autoscaling/config/store.go 100.00% <100.00%> (ø)
pkg/reconciler/autoscaling/kpa/kpa.go 95.18% <100.00%> (+0.06%) ⬆️
... and 199 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2de1474...a377984. Read the comment docs.

knative-prow Bot pushed a commit to knative/networking that referenced this pull request Apr 12, 2022
* Add certificates keys in config-network

This patch adds the following certificate variables:

- `activator-server-certs`
- `queue-proxy-ca`
- `queue-proxy-san`
- `queue-proxy-server-certs`

It is similar to #608.

knative/serving#12815 and knative/serving#12770
verifeid the change.

* Rename -server-certs with -cert-secret

* Bump checksum

* Fix lint
@nak3 nak3 force-pushed the add-queue-proxy-tls-main branch from 685a942 to 551f285 Compare April 12, 2022 04:31
@nak3 nak3 changed the title [WIP] All Hops Encrypted: TLS between activator and queue-Proxy All Hops Encrypted: TLS between activator and queue-Proxy Apr 12, 2022
@knative-prow knative-prow Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 12, 2022
@nak3
Copy link
Copy Markdown
Contributor Author

nak3 commented Apr 12, 2022

/cc @evankanderson @skonto @rhuss

@nak3 nak3 force-pushed the add-queue-proxy-tls-main branch from 09e936c to 3375537 Compare April 12, 2022 15:00
@evankanderson
Copy link
Copy Markdown
Member

I'll take a look at this tonight; I'm on my way to sitting break right now.

@nak3 nak3 force-pushed the add-queue-proxy-tls-main branch from 3375537 to 51fb4f4 Compare April 13, 2022 01:16
@knative-prow
Copy link
Copy Markdown

knative-prow Bot commented Apr 13, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: nak3

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Comment thread cmd/activator/main.go Outdated
Comment thread cmd/activator/main.go Outdated
ctx, cancel, _ := rtesting.SetupFakeContextWithCancel(t)
defer cancel()
handler := New(ctx, test.throttler, rt, false /*usePassthroughLb*/, logging.FromContext(ctx))
handler := New(ctx, test.throttler, rt, false /*usePassthroughLb*/, logging.FromContext(ctx), false /* TLS */)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a test where this is true?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is not possible to add the HTTPs proxy in this test code. TestNewHeaderPruningProxyHTTPS covered the part instead.


// Make sure to update this if the activator's main file changes.
ah := New(ctx, fakeThrottler{}, rt, false, logger)
ah := New(ctx, fakeThrottler{}, rt, false, logger, false /* TLS */)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the overhead if tls is true, does it make sense to have run a benchmark with that too?

Comment thread pkg/reconciler/revision/resources/deploy_test.go
Comment thread pkg/reconciler/revision/resources/queue_test.go
@skonto
Copy link
Copy Markdown
Contributor

skonto commented Apr 15, 2022

/hold for @evankanderson

@knative-prow knative-prow Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 15, 2022
@skonto
Copy link
Copy Markdown
Contributor

skonto commented Apr 15, 2022

Afaik this is also protected from accidental misuse in this phase so let's get it in.

Copy link
Copy Markdown
Member

@evankanderson evankanderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a slight concern that this PR will break namespaces which don't have the correct certificate and when the ConfigMap is set (mandatory volume mount in the new Deployment of a secret which doesn't exist, and activator only requesting https for those containers), but if the current e2e tests pass, I'm okay with this being a pre-alpha release.

CC @dprotaso just in case this ends up causing problems for the release in 2 business days.

Comment thread cmd/activator/main.go Outdated
// Enable TLS client when queue-proxy-ca is specified.
// At this moment activator with TLS does not disable HTTP.
// See also https://github.com/knative/serving/issues/12808.
if networkConfig.QueueProxyCA != "" && networkConfig.QueueProxySAN != "" {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires a restart of activator to pick up the new network config.

For pre-alpha, this seems okay, but we should aim to make this configurable without a restart by beta.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: if the secret name wasn't configurable, you could use a volume mount to achieve this addition to the system cert pool.

Comment thread cmd/activator/main.go
pool = x509.NewCertPool()
}

if ok := pool.AppendCertsFromPEM(caSecret.Data["ca.crt"]); !ok {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the secret exists but doesn't have the right data, this will fail.

Do we need to undo the base64 encoding here before AppendCertsFromPEM? (I don't recall what the client-go libraries assist with here.)

Comment thread cmd/activator/main.go Outdated
go concurrencyReporter.Run(ctx.Done())

// Enable TLS against queue-proxy when the CA and SA are specified.
tlsEnabled := networkConfig.QueueProxyCA != "" && networkConfig.QueueProxySAN != ""
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is repeating 160, it looks like. Can we set this there?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(or make it part of the transport)?

Comment thread cmd/queue/main.go Outdated
Comment thread cmd/queue/main.go Outdated
mainTLSServer, drain := buildServer(ctx, env, probe, stats, logger, concurrencyendpoint, true /* enable TLS */)
tlsServers := map[string]*http.Server{
"tlsMain": mainTLSServer,
"tlsAdmin": buildAdminServer(logger, drain),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like moving the admin server to TLS is spread all over this function. What about moving this before the loop on 190 and explicitly moving the server from one map to the other. (It would add another tlsEnabled condition to the function, but that feels simpler than the other logic.)

i.e. build a tlsServer and httpServer map, and then iterate through one or both after building the maps?

Comment thread cmd/queue/main.go
Comment thread cmd/queue/main.go Outdated
Comment thread pkg/http/proxy.go Outdated
extraVolumes = append(extraVolumes, varTokenVolume)
}

if cfg.Network.QueueProxyCertSecret != "" {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this means changing the QueueProxtCertSecret will cause all Knative Serving user deployments to get updated and rolled out, correct?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes't that's correct.

@nak3 nak3 force-pushed the add-queue-proxy-tls-main branch from 9a229a1 to 1bf67e5 Compare April 17, 2022 15:23
Copy link
Copy Markdown
Member

@evankanderson evankanderson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/hold cancel

Still have a few pre-alpha and post -1.4 concerns, but I think this should be safe to go in tomorrow.

Comment thread cmd/queue/main.go
// See also https://github.com/knative/serving/issues/12808.
var tlsServers map[string]*http.Server
if tlsEnabled {
mainTLSServer, drain := buildServer(ctx, env, probe, stats, logger, concurrencyendpoint, true /* enable TLS */)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the drain from line 176 is lost?

Again, okay for pre-alpha, but it would be nice to unify the drains post 1.4

Comment thread cmd/queue/main.go
httpProxy.BufferPool = network.NewBufferPool()
httpProxy.FlushInterval = network.FlushInterval

breaker := buildBreaker(logger, env)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we won't be counting concurrency properly across HTTP and HTTPS while the activator change rolls out (but again, a post-1.4 note, just worth adding a TODO here).

@knative-prow knative-prow Bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 18, 2022
@nak3
Copy link
Copy Markdown
Contributor Author

nak3 commented Apr 18, 2022

Thank you! Updated.

@evankanderson
Copy link
Copy Markdown
Member

/lgtm

@knative-prow knative-prow Bot added the lgtm Indicates that a PR is ready to be merged. label Apr 18, 2022
@nak3
Copy link
Copy Markdown
Contributor Author

nak3 commented Apr 18, 2022

/retest

TestActivatorOverload failed. Not only this PR but also other PRs failed with TestActivatorOverload last night.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/API API objects and controllers area/autoscale area/networking area/test-and-release It flags unit/e2e/conformance/perf test issues for product features lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

All Hops Encrypted: Activator uses queue-proxy TLS if available All Hops Encrypted: Queue-Proxy serving TLS

4 participants