Config switch to turn on/off Quic processing #9679
mattklein123 merged 67 commits into envoyproxy:master from
cc @alyssawilk @danzh2010 @mattklein123 this is an initial draft of the config layout.
alyssawilk
left a comment
Looks good!
Some part of me wants to encourage one big enum (since when we disable QUIC we make the shutdown mode call based on the reason we're shutting down), but honestly this is clearer and will hopefully avoid any decision paralysis when dealing with production fires. And folks can still push a different shutdown mode with their disable if their default doesn't match their disaster :-)
mattklein123
left a comment
Thanks for working on this. Flushing out a few comments.
/wait
Sorry I'm a little confused about what is being proposed here. Do you mind adding some more comments?
@mattklein123 @htuch for turning UDP/QUIC processing off/on, my proposal is to have a boolean switch on the UDP listener config plus a shutdown mode field (for the shutdown type). I submitted this draft config to get community agreement on the approach; code will follow.
Why don't we just remove the listener and drain as normal? Trying to grok what is distinct here vs. any other listener that comes to life at some point, starts accepting, and then is removed and drained.
@htuch thanks for the suggestion. I have not worked much with LDS, so I missed that option. I agree that this is the cleaner way; however, the listener manager currently supports only draining (graceful shutdown) of active listeners:
https://github.com/envoyproxy/envoy/blob/master/source/server/listener_manager_impl.cc#L668
If we go with the LDS approach, it will require a code change to support multiple draining modes, right?
I guess it depends how we structure shared listeners.
I thought if we had QUIC and QUIC-proxy listening on UDP/443 they'd share a listener, and some higher-level filter would determine which packets went where. In that case, as said, it'd be nice to be able to disable them one at a time. We can do without, but it's a nice-to-have.
If we think we'd have a QUIC/443 listener and a QUIC-proxy/443 listener just sharing incoming traffic via some combination of SO_REUSEPORT and BPF this would be fine.
I'd assumed we were going to do more of the former, at which point a non-listener kill switch has value IMO (though again we can live without it)
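For readers unfamiliar with the second option, here is a minimal sketch of two UDP sockets sharing one port via SO_REUSEPORT. It assumes Linux and the loopback address; `openSharedUdpSocket` and `boundPort` are hypothetical helper names, not Envoy code, and the BPF steering mentioned above is left out entirely.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cstdint>

// Hypothetical helper: opens a UDP socket on 127.0.0.1:port with SO_REUSEPORT
// set, so that several sockets can bind the same address and port.
// Returns the file descriptor, or -1 on failure.
inline int openSharedUdpSocket(std::uint16_t port) {
  const int fd = ::socket(AF_INET, SOCK_DGRAM, 0);
  if (fd < 0) {
    return -1;
  }
  const int one = 1;
  // The option must be set before bind() on every socket that shares the port.
  if (::setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one)) != 0) {
    ::close(fd);
    return -1;
  }
  sockaddr_in addr{};
  addr.sin_family = AF_INET;
  addr.sin_port = htons(port);
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
  if (::bind(fd, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) != 0) {
    ::close(fd);
    return -1;
  }
  return fd;
}

// Hypothetical helper: returns the local port a socket is bound to, or 0 on error.
inline std::uint16_t boundPort(int fd) {
  sockaddr_in addr{};
  socklen_t len = sizeof(addr);
  if (::getsockname(fd, reinterpret_cast<sockaddr*>(&addr), &len) != 0) {
    return 0;
  }
  return ntohs(addr.sin_port);
}
```

With this layout each listener owns its socket, so removing one listener via LDS would not touch the other; which socket receives a given packet is then decided by the kernel (or by an attached BPF program, not shown here).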
My actual assumption (potentially naive) was that QUIC and QUIC-proxy would probably be different deployments of the binary in different auto scaling groups, etc. What is the use case in which you envision the same process doing both?
Generic example is for edge proxies with low-value certs that do transparent proxying for latency gains. QUIC-terminate locally served content, UDP proxy where you'd TCP proxy because DNS SRV doesn't work.
We have two different frontline deployments which do TLS / TCP proxy based on VIP and one of them did QUIC+Udp-proxy for years.
We haven't had an issue where we needed to turn off one and not the other yet, but as tunneling over QUIC picks up steam (and that team has agreed to OSS once we land core QUIC) I think it gets more useful to turn different modes down separately.
We can always start simple and add more later though!
"We can always start simple and add more later though!"
That would be my feeling. Right now we don't have any filter chain concept for UDP (I would like to fix that as part of a general listener API cleanup), and without that I'm not sure how we would do something like ^ today. IMO right now I think we could just have a per-listener runtime kill switch for UDP listeners? Or even just for the QUIC listener config itself?
Thanks for the feedback, it's valuable information to get familiar with. So we agreed to introduce separate config switches for UDP and QUIC listeners.
Signed-off-by: Kateryna Nezdolii <nezdolik@spotify.com>
ActiveRawUdpListenerConfigFactory::createActiveUdpListenerFactory(
    const Protobuf::Message& /*message*/) {
    const Protobuf::Message& message) {
  auto& config =
For now this cast will not work; the UDP listener proto config needs to be reorganized to have one top-level message.
mattklein123
left a comment
Thanks for working on this. Flushing out a few more comments.
/wait
  google.protobuf.Duration crypto_handshake_timeout = 3;
}

message ActiveQuicListenerConfig {
I would probably just name this QuicListenerConfig. I think we can drop the Active (having it in the other place is a mistake IMO).
In terms of the shutoff, what I'm proposing is just having a RuntimeFeatureFlag which says whether to process packets or not. I think this is ultimately the kill switch we are looking for? @alyssawilk @danzh2010 WDYT?
Again I think it's not the switch I'm looking for in the long term (because not granular enough) but I'm fine landing whatever controls we think are useful today and we can add the finicky ones once we're running in prod and care more :-)
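The runtime-flag idea discussed above boils down to a runtime key plus a default value, resolved at the time packets arrive. A minimal sketch of that resolution follows; `FeatureFlag`, `RuntimeSnapshot`, and `isEnabled` are hypothetical stand-ins for illustration, not Envoy's `RuntimeFeatureFlag` or Runtime API.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a runtime snapshot: runtime key -> boolean override.
using RuntimeSnapshot = std::unordered_map<std::string, bool>;

// Hypothetical mirror of a runtime feature flag: a runtime key plus the value
// to use when the key has no override.
struct FeatureFlag {
  std::string runtime_key;
  bool default_value;
};

// Returns the runtime override when the key is present, otherwise the default.
inline bool isEnabled(const FeatureFlag& flag, const RuntimeSnapshot& snapshot) {
  const auto it = snapshot.find(flag.runtime_key);
  return it != snapshot.end() ? it->second : flag.default_value;
}
```

Because the value is looked up on every check, flipping the override acts as a kill switch without removing or re-adding the listener.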
}

message ActiveRawUdpListenerConfig {
  enum UdpShutDownMode {
I'm not completely positive we want to bother with this kill switch for raw UDP right now. Maybe defer that for later? If we do it here, we should de-dup the messages and have a common configuration message of some type.
This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!
Spent most of the time on task, which may be out of scope for this change, basically trying to propagate Envoy primitives (like Runtime) all the way down from
Check out
@danzh2010 switched to a real Runtime object and added an end-to-end test for empty config. Please take a look; I believe it addresses your review comments.
quic::QuicBufferedPacketStore* const buffered_packets =
    quic::test::QuicDispatcherPeer::GetBufferedPackets(quic_dispatcher_);
configureQuicRuntimeFlag(/* runtime_enabled = */ true);
// configureQuicRuntimeFlag(/* runtime_enabled = */ true);
"-Wall",
"-Wextra",
"-Werror",
#"-Werror",
@danzh2010 I was getting a complaint from gcc about quiche library code:
bazel-out/k8-fastbuild/bin/external/com_googlesource_quiche/quiche/quic/core/quic_framer.cc: In member function 'bool quic::QuicFramer::AppendIetfAckFrameAndTypeByte(const quic::QuicAckFrame&, quic::QuicDataWriter*)':
bazel-out/k8-fastbuild/bin/external/com_googlesource_quiche/quiche/quic/core/quic_framer.cc:5434:40: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
writer->remaining() - ecn_size <
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
QuicDataWriter::GetVarInt62Len(gap) +
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
QuicDataWriter::GetVarInt62Len(ack_range)) {
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
I'm fixing this issue in quiche now. It will be updated soon.
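For context on the warning above: `-Wsign-compare` fires when one side of a comparison is signed and the other unsigned. The sketch below shows one common way to avoid it, by keeping every operand unsigned and guarding the subtraction against wrap-around. `lacksRoom` is a hypothetical helper written for illustration; it is not the actual quiche fix.

```cpp
#include <cstddef>

// Hypothetical helper: returns true when `needed` bytes do not fit in a buffer
// with `remaining` bytes once `reserved` bytes are set aside.
// All operands share one unsigned type, so the comparison is not mixed-sign.
inline bool lacksRoom(std::size_t remaining, std::size_t reserved, std::size_t needed) {
  if (remaining < reserved) {
    // Subtracting would wrap around for unsigned operands, so bail out first.
    return true;
  }
  return remaining - reserved < needed;
}
```

The early return matters: with unsigned arithmetic, `remaining - reserved` silently becomes a huge value when `reserved` is larger, which would make the size check pass incorrectly.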
void resumeListening() override;
void shutdownListener() override;

bool enabled() { return enabled_.enabled(); }
I don't strongly feel this interface is needed. Can you inline it at call site? If it's only used in test, accessing via a peer class is better.
@danzh2010 alright, I had just looked up that method in other Envoy modules and followed their example.
This is not the only definition of quic listener being enabled. I would prefer not to add this interface.
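The peer-class alternative suggested in this thread can be sketched as follows: the production class exposes no public accessor, and a friend class used only by tests reads the private state. `Listener` and `ListenerPeer` are hypothetical names for illustration, not Envoy's classes.

```cpp
// Hypothetical production class: no public accessor for the enabled state.
class Listener {
public:
  explicit Listener(bool enabled) : enabled_(enabled) {}

  void disable() { enabled_ = false; }

private:
  friend class ListenerPeer; // Grants test-only access to private members.
  bool enabled_;
};

// Hypothetical peer class that would live in test code only.
class ListenerPeer {
public:
  static bool enabled(const Listener& listener) { return listener.enabled_; }
};
```

This keeps the public interface free of a method whose meaning ("is the listener enabled?") may have more than one definition, while tests can still assert on the flag.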
quic_listener_ = std::make_unique<ActiveQuicListener>(*dispatcher_, connection_handler_,
                                                      listen_socket_, listener_config_,
                                                      quic_config_, nullptr, enabledFlag());
Overriding enabledFlag() is a bit indirect. Can you add two methods like below:

    // Builds the QUIC listener factory from a YAML snippet of the QUIC config.
    Network::ActiveUdpListenerFactoryPtr createQuicListenerFactory(const std::string& yaml) {
      std::string listener_name = QuicListenerName;
      auto& config_factory =
          Config::Utility::getAndCheckFactoryByName<Server::ActiveUdpListenerConfigFactory>(
              listener_name);
      ProtobufTypes::MessagePtr config = config_factory.createEmptyConfigProto();
      TestUtility::loadFromYaml(yaml, *config);
      return config_factory.createActiveUdpListenerFactory(*config, /*concurrency=*/1);
    }

    // Subclasses override this to test other configurations.
    virtual std::string yamlForQuicConfig() {
      return R"EOF(
      runtime_key: "quic.enabled"
      default_value: true
    )EOF";
    }

And use them here:

    EXPECT_CALL(listener_config_, listenSocketFactory());
    EXPECT_CALL(listener_config_.socket_factory_, getListenSocket).WillOnce(Return(listen_socket_));
    quic_listener_ = createQuicListenerFactory(yamlForQuicConfig())->createActiveUdpListener(*dispatcher_, connection_handler_, ...);

In ActiveQuicListenerEmptyFlagConfigTest,

    std::string yamlForQuicConfig() override {
      return R"EOF(
      max_concurrent_streams: 10
    )EOF";
    }
quic_listener_ = std::make_unique<ActiveQuicListener>(
    *dispatcher_, connection_handler_, listen_socket_, listener_config_, quic_config_, nullptr);
EXPECT_CALL(listener_config_, listenSocketFactory())
This is needed to wire up the mocks that are used inside the listener factory when creating the QUIC listener.
For this type of thing it's better to use ON_CALL(...).WillByDefault()
Coverage is failing: https://343614-65214191-gh.circle-artifacts.com/0/coverage/index.html but none of it is related to this PR. Going to merge master and see if it helps.
mattklein123
left a comment
LGTM other than small nits. @danzh2010 can you take a final pass? Thank you!
/wait
Network::ListenerConfig& listener_config, const quic::QuicConfig& quic_config,
Network::Socket::OptionsSharedPtr options);
Network::Socket::OptionsSharedPtr options,
const envoy::config::core::v3::RuntimeFeatureFlag enabled);
nit: pass by const ref here and below
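To make the nit concrete, the sketch below counts copies of a small flag object when it is passed by `const` value versus by `const` reference. `Flag`, `byValue`, and `byConstRef` are hypothetical names; the real parameter is a protobuf message, for which a copy is considerably more expensive.

```cpp
// Hypothetical flag type that counts how often it is copy-constructed.
struct Flag {
  // Function-local static used as a shared copy counter.
  static int& copies() {
    static int count = 0;
    return count;
  }

  bool default_value = true;

  Flag() = default;
  Flag(const Flag& other) : default_value(other.default_value) { ++copies(); }
};

// A `const` value parameter still copies the argument at every call.
inline bool byValue(const Flag flag) { return flag.default_value; }

// A `const` reference parameter binds to the caller's object without copying.
inline bool byConstRef(const Flag& flag) { return flag.default_value; }
```

Calling `byValue` with an existing `Flag` increments the counter once per call, while `byConstRef` leaves it unchanged.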
ActiveQuicListenerFactory::ActiveQuicListenerFactory(
    const envoy::config::listener::v3::QuicProtocolOptions& config, uint32_t concurrency)
    : concurrency_(concurrency) {
    : concurrency_(concurrency), enabled_(config.enabled()) {
Thanks for looking into this!
  ReadFromClientSockets();
}

TEST_P(ActiveQuicListenerTest, QuicProcessingDisabled) {
This case is already covered by QuicProcessingDisabledAndEnabled, right? Maybe remove it?
@danzh2010 @mattklein123 applied review comments.
Signed-off-by: Kateryna Nezdolii nezdolik@spotify.com
Description:
Extends the QUIC listener configuration with a runtime feature flag for enabling/disabling QUIC processing. The flag gates the onReadReady() and onWriteReady() callback methods and is enabled by default.
Risk Level: Low/Medium
Testing: Unit tests and integration tests.
Docs Changes: NA?
Release Notes:
Fixes #9230
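As a rough illustration of the behavior described above, the sketch below gates two socket callbacks on a flag that is re-evaluated on every event. `GatedListener` and its counters are hypothetical; this is not the `ActiveQuicListener` implementation, only the general shape of an early-return kill switch.

```cpp
#include <functional>
#include <utility>

// Hypothetical listener whose callbacks do nothing while the flag is off.
class GatedListener {
public:
  // `enabled` is queried on every event, so a runtime change takes effect
  // without recreating the listener.
  explicit GatedListener(std::function<bool()> enabled) : enabled_(std::move(enabled)) {}

  void onReadReady() {
    if (!enabled_()) {
      return; // Processing disabled: skip the read.
    }
    ++reads_processed_;
  }

  void onWriteReady() {
    if (!enabled_()) {
      return; // Processing disabled: skip the write.
    }
    ++writes_processed_;
  }

  int readsProcessed() const { return reads_processed_; }
  int writesProcessed() const { return writes_processed_; }

private:
  std::function<bool()> enabled_;
  int reads_processed_ = 0;
  int writes_processed_ = 0;
};
```

A caller would pass a callable that consults the runtime flag; toggling the flag off stops both callbacks from doing work, and toggling it back on resumes processing.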