
H2 encode flow control #1328

Merged
mattklein123 merged 21 commits into envoyproxy:master from alyssawilk:h2-encode-flow-control
Aug 4, 2017

Conversation

@alyssawilk
Contributor

@alyssawilk alyssawilk commented Jul 25, 2017

#150 (for h2 flow control)

H2 flow control for the encode (backend-to-client) path, specifically enforcing the existing connection limits and applying those same limits to the codec send/recv buffers.

If the connection buffer overflows, the CodecClient notifies the Codec.
If the Codec send buffer overflows (or the connection overflows), the codec calls stream callbacks.
The router listens for stream callbacks and passes the backup to the connection manager via StreamDecoderFilterCallbacks.
The connection manager enables/disables reads on the stream.

Connection buffer limits are already configurable. This adds configurable stream limits via the h2 setting structure.

TBD, the client-to-backend path, filter buffers, and HTTP1.

onDecoderFilterAboveWriteBufferHighWatermark() {
ENVOY_STREAM_LOG(debug, "Read-disabling downstream stream due to filter callbacks.", parent_);
parent_.response_encoder_->getStream().readDisable(true);
parent_.connection_manager_.stats_.named_.downstream_flow_control_paused_reading_total_.inc();
Contributor Author

@mattklein123 Do you think it's also worth adding the source-cluster of the flow control backup or is that overkill?

Member

We should have this in upstream stats, so I think this is overkill (see my comment about router stats). Router should be hitting the cluster stats in this case IMO.

@htuch
Member

htuch commented Jul 26, 2017

It seems to make sense to have separate stream/connection limits, given you have many streams per connection. There's a natural place for them in https://github.com/lyft/envoy/blob/master/source/common/json/config_schemas.cc#L267 (for H2 at least - in v2 of the API we have places for all these settings in https://github.com/lyft/envoy-api/blob/master/api/protocol.proto, including a per_stream_buffer_limit_bytes that is already there :) ).

Comment thread include/envoy/http/filter.h Outdated
virtual void encodeTrailers(HeaderMapPtr&& trailers) PURE;

/**
* Called when a decoder filter goes over its high watermark.
Member

Is it the filter going over the high watermark or the underlying buffers for its stream/connection? This is a distinction that is interesting because at some point the filter itself will have a notion of low/high watermark when managing its own buffers.

Contributor Author

Filter, codec, and connection for the router filter.

Updating to (not upstream yet, still doing h2 config changes):

 * Called when the buffer for a decoder filter or any buffers the filter sends data to go over
 * their high watermark.
 * In the case of a filter such as the router filter, which spills into multiple buffers (codec,
 * connection etc.) this may be called multiple times. Any such filter is responsible for calling
 * the low watermark callbacks an equal number of times as the respective buffers are drained.

Comment thread source/common/http/codec_client.h Outdated
void onBelowWriteBufferLowWatermark() override {}
// Pass watermark events from the connection on to the codec which will pass it to the underlying
// streams.
void onAboveWriteBufferHighWatermark() override { codec_->onAboveWriteBufferHighWatermark(); }
Member

I haven't gone over the full PR yet, but one question I have is how do the connection and stream watermark event edge triggers interact? For example, if connections goes above high watermark, then stream goes above high watermark, are we guaranteed to only ever get one onAboveWriteBufferHighWatermark for the stream's registered callbacks, or will we see two? I recall this was an issue I had to write some funky logic to handle when experimenting with this a while back.

Contributor Author

The connection manager has to handle callbacks from multiple parties since each filter can go over watermark, so I figured it was better to handle all the deduping in one place rather than do it both here and over there. That's all managed by the overrun_buffers() counters.

@alyssawilk
Contributor Author

@htuch I'm not convinced http2_settings is the right place - everything in there is an actual HTTP/2 setting, which gets sent to the peer. Unless you're saying we should use the stream window size, though that adds a constraint that the data you're willing to buffer locally is also what you want to allow to be on the network.

I can put it there temporarily and have it be the odd "setting" out, or I can add a TODO to use per_stream_buffer_limit_bytes when we move over to v2. Preference?

@htuch
Member

htuch commented Jul 26, 2017

I don't think there's a big issue with the semantics of that JSON object today; it's basically a disposable object, since it will be retired soon. I think plumbing the per_stream_buffer_limit_bytes stuff into v1 will make it easier to handle in v2. I think there is a strong case for different buffer limits, since we will have to size the connection limits too conservatively to deal with the connection_buffer_limit * max_streams blowout.


per_stream_buffer_limit
*(optional, integer)* A soft limit on the number of bytes envoy will buffer per-stream in the
codec buffers. Once the buffer reaches this point, Watermark callbacks will fire to stop the
Member

Nit: s/Watermark/watermark/, s/envoy/Envoy/

/**
* Enable/disable further data from this stream.
* Cessation of data may not be immediate. For example, for HTTP/2 this may stop further flow
* control window updates which will result in the peer eventually stopping sending data.
Member

For a later implementation, have you thought about using distance from watermark as a way to size the H2 stream flow control window?

Contributor Author

It would avoid the double buffer, and given the proximity between buffer and data source I think it'd be a nice clean thing to do. But it would make this one buffer + data source combination special and unusual, so I'm not convinced it'd be worth doing unless I get to do the full flow control redesign we've talked about in person, where everything seamlessly avoids the double buffering.

Member

everything seamlessly avoids the double buffering

I assume you mean by rejiggering everything so that encode* returns the number of bytes accepted, etc.? That's certainly doable, though buffering will still need to happen, since whoever fails to write will need to buffer locally until they can write. This pushes more complexity out to the leaves. Not sure it's worth it, but I guess we can see how things go with this approach.

* Called when the buffer for a decoder filter or any buffers the filter sends data to go over
* their high watermark.
*
* In the case of a filter such as the router filter, which spills into multiple buffers (codec,
Member

Can you share this comment with the other variants below? I'm thinking we might want to take the design doc and digest it down to a Markdown file explaining what is actually implemented, and example sequences of going above and below to convey the intuition of when callbacks are fired etc. Maybe you can figure out a better way to convey the intuition in comment form, up to you.

Member

@htuch htuch Jul 27, 2017

Having read more of the PR, I think a MD file explaining the connection + stream level design stuff, and then real examples of callback sequences on different events/overflows would be super helpful. Looking through the PR locally, it looks very solid, but I'm having a hard time connecting all the dots staring at the code, some worked examples or diagrams if you prefer would be great.

int ConnectionImpl::onData(int32_t stream_id, const uint8_t* data, size_t len) {
getStream(stream_id)->pending_recv_data_.add(data, len);
StreamImpl* stream = getStream(stream_id);
stream->pending_recv_data_.add(data, len);
Member

Does the H2 stream flow control window save us from infinite buffering in pending_recv_data_ here? If so, maybe add a comment.

Comment thread source/common/http/http2/codec_impl.h Outdated
// I don't fully understand.
static const uint64_t MAX_HEADER_SIZE = 63 * 1024;

bool buffers_overrun() { return buffers_overrun_ > 0; }
Member

Nit: `bool buffers_overrun() const {`

Member

@mattklein123 mattklein123 left a comment

Overall solid as usual. This is great. I have a bunch of high level comments to get started with. I haven't looked through the tests yet.

no_cluster, Counter, Total requests in which the target cluster did not exist and resulted in a 404
rq_redirect, Counter, Total requests that resulted in a redirect response
rq_total, Counter, Total routed requests
flow_control_paused_downstream_reads_total, Counter, Total times requests backed up enough to pause reading from downstream.
Member

Is this useful at router level? We have http_conn_man stats, and upstream stats, so maybe seems superfluous?

Contributor Author

I thought as we added various other filters we might want to track the source of the reads getting paused. If you think it's overkill I'm happy to remove it :-)

Member

IMO for now it's probably overkill, but up to you. But we definitely need to track the cluster that we paused. Are we doing that with this change? I didn't see it.

Contributor Author

We had cluster counters for when we stopped reading from upstream but not when upstream stopped reads from downstream.

Adding them now. I think it does mean we'll have somewhat redundant TCP and HTTP stats, where the cluster-level "downstream backing up" stats match the "upstream stopped reading" stats and vice versa, but it'll make the H2 case easier to understand for sure.

* Fires when a stream, or the connection the stream is sending to, goes over its high watermark.
*/
virtual void onAboveWriteBufferHighWatermark() PURE;
/**
Member

nit: newline before here


Comment thread source/common/http/http1/codec_impl.h Outdated
void addCallbacks(StreamCallbacks& callbacks) override { addCallbacks_(callbacks); }
void removeCallbacks(StreamCallbacks& callbacks) override { removeCallbacks_(callbacks); }
void resetStream(StreamResetReason reason) override;
void readDisable(bool /*disable*/) override {}
Member

nit: rest of code just deletes param name to deal with unused parameter.

Comment thread source/common/router/router.h Outdated
void onResetStream(Http::StreamResetReason reason) override;
void onAboveWriteBufferHighWatermark() override {
// Have the connection manager disable reads on the downstream stream.
parent_.config_.stats_.flow_control_paused_downstream_reads_total_.inc();
Member

Commented elsewhere, but IMO I would remove router stats here, and do upstream cluster stats.

StreamImpl* stream = getStream(stream_id);
stream->pending_recv_data_.add(data, len);
if (!stream->buffers_overrun()) {
nghttp2_session_consume(session_, stream_id, len);
Member

Q: The current auto window handling code does the "standard" h2 behavior of only sending a window update when we fall below max size / 2 of the window (to avoid excess window updates). Does this code path still do the same thing? I suspect it might, but it would be nice to check and comment.

}

connection_.dispatcher().deferredDelete(stream->removeFromList(active_streams_));
nghttp2_session_consume(session_, stream_id, stream->unconsumed_bytes_);
Member

Out of curiosity was this required to get tests passing? I would think nghttp2 would actually do this accounting at the session level if a stream is destroyed.

Contributor Author

I would have thought so too. I had a random moment of "oh, but how awful would it be to debug if it doesn't?" and added a regression test just in case (Http2CodecImplFlowControlTest.EarlyResetRestoresWindow), which fails without this consume.

You don't need to update stream windows on stream close, but the stream window influences the connection window and that isn't updated. My first reaction was that was terrible. My second reaction was that if a stream is "closed" as far as nghttp2 is concerned (all data is sent, fin is received) but the consumer has the data buffered, it really ought not update the connection level window until the consumer says to. So yeah, necessary by design :-/

Member

OK thanks for looking into this. I might add a comment here for the next person that comes along and wonders about this.

ConnectionCallbackImpl(HttpActiveHealthCheckSession& parent) : parent_(parent) {}
// Network::ConnectionCallbacks
void onEvent(uint32_t events) override { parent_.onEvent(events); }
void onAboveWriteBufferHighWatermark() override {}
Member

General comment: I noticed in the latest coverage report that there are now a bunch of missing lines for watermark callbacks in some network filters (and soon potentially HTTP filters) where we don't do anything. I realize it's kind of "cheap" coverage, but it would be nice to see if we can get everything covered if you see any good way to do it.

Contributor Author

Is this for the no-op functions? I was pretty careful on coverage to cover actual lines of code.
For the no-ops, do we want to have the tests all call onUnusedFunction(), or are coverage annotations to disregard OK?

Member

"coverage annotations to disregard" don't currently work until we switch to lcov. You can bug @htuch about that. 😉

In this case though, I'm not sure we should disregard, as these functions will likely get called. I feel like unit tests on filters, etc. should probably call the functions when they can, just to future-proof against anything strange happening, but not a big deal.

Comment thread source/common/http/http2/codec_impl.cc Outdated
ENVOY_CONN_LOG(debug, "Stream {} disabled: disable {}, unconsumed_bytes {}", parent_.connection_,
stream_id_, disable, unconsumed_bytes_);
if (disable) {
++buffers_overrun_;
Member

nit: I would rename variable to read_disable_count_ to match what is going on here. (or maybe rename readDisable() to something else).

}
}

void ConnectionImpl::StreamImpl::pendingRecvBufferHighWatermark() {
Member

Is it worth having temporary asserts for all 4 of these functions like we did in the other PR, with a TODO to clean them up after we get more prod experience? I would be in favor of that personally. This is all super scary.

Contributor Author

Given the work being factored into the watermark classes I don't think it gains us much but I'm happy to err on the side of caution. Added locally, upstreamed soon :-)

@@ -0,0 +1,103 @@
### Overview
Contributor Author

@htuch obviously fairly rough but I'd appreciate high level comments - is this the sort of detailed walkthrough you were looking for?

Member

Thanks, this is very useful. nit: it would be easier to read if using `foo()` for functions and possibly using some numeric lists/bullet lists in subsections.

Member

@htuch htuch Jul 31, 2017

Yep, this is what I was after. Some more suggestions on how to polish:

  • Explain explicitly the connection between network and stream level handling (and how this relates to plain TCP or H1 network level handling).
  • Any diagrams that are reasonable (i.e. don't take ages to put together) would help. I have in my mind the perfect way to efficiently communicate this entire story, an animation showing what happens on backup, the series of firing callbacks and how the buffers fill and drain, but this would take a bit of time to put together, so I don't expect that from this PR (maybe for a future blog post however..).
  • More explanation of who the principal actors are. Kind of like telling a story and providing some character background. Leading into the callback description, it would be good to know why the HTTP connection manager and the router play the role they do, i.e. that they are the unique bridges across certain kinds of events and subsystem divides in Envoy. Same goes for the codecs.
  • Cover corner cases explicitly, such as the combination of stream and connection level watermark firing.
  • Run spell checker and fix any grammar nits.

@htuch
Member

htuch commented Aug 2, 2017

@alyssawilk Can you merge to resolve conflicts?

Member

@htuch htuch left a comment

This documentation is fantastic, thanks! PR is basically ready to ship IMHO.

Comment thread include/envoy/http/codec.h Outdated
virtual void onBelowWriteBufferLowWatermark() PURE;

/**
* Called when the connection goes from over its high watermark to under its low watermark.
Member

Are these comments the opposite way around (or maybe I'm standing on my head today ;) )?

Contributor Author

hasty push
No, why would I ever do something as silly as that?

Comment thread source/common/http/http2/codec_impl.cc Outdated

void ConnectionImpl::StreamImpl::pendingRecvBufferHighWatermark() {
ENVOY_CONN_LOG(debug, "recv buffer over limit ", parent_.connection_);
ASSERT(pending_receive_buffer_high_watermark_called_ == false);
Member

I think for booleans we don't need equality with the value, just something like `!pending_receive...`

## HTTP/2 codec recv buffer

Given the HTTP/2 `Envoy::Http::Http2::ConnectionImpl::StreamImpl::pending_recv_data_` is processed immediately
there's no real need for buffer limits, but for consistency and to future-proof the implementation,
Member

Yeah, I wondered about this. Seems to make it more complicated than needed for now IMHO, but it's done already, so I don't mind I guess.

Comment thread source/common/http/http2/codec_impl.h Outdated

private:
virtual ConnectionCallbacks& callbacks() PURE;
virtual Http::ConnectionCallbacks& callbacks() PURE;
Member

Nit: Why did this namespace prefixing become necessary in this PR?

Contributor Author

Earlier version had connection callbacks here as well. Reverted.

Comment thread source/docs/flow_control.md Outdated
@@ -0,0 +1,182 @@
### Overview

Flow control in envoy is done by having limits on each buffer, and watermark callbacks. When a
Member

Super minor nit: I really prefer to have s/envoy/Envoy/g. It's not really important, but it takes me to my calm place ;)

Comment thread source/docs/flow_control.md Outdated

![HTTP2 data flow diagram](h2_buffers.png)

For HTTP/2, when filters, streams, or connections back up, the end result is readDisable(true)
Member

Nit: prefer code quotations, readDisable(true) as done elsewhere.

Comment thread source/docs/flow_control.md Outdated
being called on the source stream. This results in the stream ceasing to consume window, and so
not sending further flow control window updates to the peer. This will result in the peer
eventually stopping sending data when the available window is consumed (or nghttp2 closing the
connection if the peer violates the flow control limit). When readDisable(false) is called, any
Member

Ditto etc.

Comment thread source/docs/flow_control.md Outdated
Note that readDisable() on a stream may be called by multiple entities. It is called when any
filter buffers too much, when the stream backs up and has too much data buffered, or the
connection has too much data buffered. Because of this, readDisable() maintains a count of
the number of times it has been called to both enable and disable the stream, resumes reads when
Member

Nit: s/resumes/resuming/

Comment thread source/docs/flow_control.md Outdated
each caller has called the equivalent low watermark callback. For example, if
the TCP window upstream fills up and results in the network buffer backing up,
all the streams associated with that connection will readDisable(true) their
downstream data sources. While the HTTP/2 flow control window fills up an
Member

s/While/When/

Comment thread source/docs/flow_control.md Outdated
all the streams associated with that connection will readDisable(true) their
downstream data sources. While the HTTP/2 flow control window fills up an
individual stream may use all of the window available and call a second
readDisable() its downstream data source. When the upstream TCP socket drains,
Member

s/its/on its/

Comment thread include/envoy/http/codec.h Outdated
static const uint32_t MAX_INITIAL_CONNECTION_WINDOW_SIZE = (1U << 31) - 1;

// By default, do not enforce stream buffer limits beyond the connection level limits.
static const uint32_t DEFAULT_PER_STREAM_BUFFER_LIMIT = 0;
Member

This is somewhat related to our offline convos, but I'm wondering if by default we should set stream limits (since defaults are huge, maybe the stream limit could default to 128MiB or less vs. 256MiB)? Also, from a config perspective, what is the reasoning behind differentiating between per_stream_buffer_limit_ and initial_stream_window_size_? (I will look more through, but figured I would quickly ask this question since I noticed it.)

Contributor Author

Connection limits default to 1M - I'd prefer to have stream limits equal or smaller if that works for you.

Member

Current defaults IIRC are 1MiB for TCP level buffering, but 256MiB for both connection and stream HTTP/2 windows. We can tweak defaults in follow up (or here), but my main question is actually why to have the stream window default be different from stream buffer limit. Seems like they could be the same?

Member

I just looked at the code more, and correct me if I am wrong, but the default OOB behavior will be:

  • 1MiB connection buffer limit, when breached, fires callbacks and pauses all streams.
  • No default codec level buffer limit. If stream/connection windows back up, nothing will be paused and we can use unlimited memory. (In case TCP is open, but other side is not ACKing window updates).

IMO we should set a buffer limit by default, and perhaps merge the stream buffer/window settings into a single setting?

Contributor Author

While I'd prefer something smaller in the long run 256M is strictly better than leaving it infinite :-)

I think lowering should be done in a follow-up anyway. I think there will be complexities we'll want to mull on - @htuch suggests defaulting filter limits to stream limits, which would make this a breaking change for non-streaming filters.

Member

+1 on both turning on as well as tuning defaults. I guess my question is still why we need two different settings for stream window size and buffer size. If there are good reasons for that can we add some comments somewhere?

Contributor Author

Oh, you wanted to remove the option entirely and just go with window size? Hm... I could imagine a use case where a user wanted to keep the stream limits smaller to avoid one stream hogging full connection bandwidth but was willing to buffer locally to spool upstream immediately as window became available. I can comment up the reason they're somewhat distinct use cases, or just remove it until/if someone asks for it back.

Member

I guess my preference would be to remove the extra configuration pivot right now, but don't feel that strongly. If you want to keep it I would just add some comments on why it exists.

Member

@htuch htuch left a comment

This is probably going to be an issue in v2, where we have independent per stream buffer limits and initial window sizes. I kind of preferred it how it was before, but it's fine for a first PR as long as we have some idea of how this will evolve for v2.

* Called when the underlying Network::Connection goes over its high watermark.
*/
virtual void onAboveWriteBufferHighWatermark() PURE;
virtual void onUnderlyingConnectionAboveWriteBufferHighWatermark() PURE;
Member
I'd probably find onNetworkConnections... more descriptive, but potatoes potatahs..

request_encoder_->encodeHeaders(request_headers, false);
Buffer::OwnedImpl body(std::string(1024 * 1024, 'a'));
EXPECT_CALL(client_stream_callbacks, onAboveWriteBufferHighWatermark()).Times(AnyNumber());
;
Member
?

@htuch
Member

htuch commented Aug 2, 2017

Ignore last set of comments except boilerplate, I see you've already had a long thread on this.

{
"virtual_hosts": [
{
"name": "redirect",
Member
Not unique to this PR, but I feel there is a fair bit of boilerplate in these JSON files that is actually unused. A good example is the redirect stuff. In a PR I'm working on right now, I made an effort to cut this out; I suggest giving this JSON a quick pass and seeing what can be removed.

Comment thread source/docs/flow_control.md Outdated
eventually stopping sending data when the available window is consumed (or nghttp2 closing the
connection if the peer violates the flow control limit). When `readDisable(false)` is called, any
outstanding unconsumed data is immediately consumed, which results in resuming window updates to the
peer and the resumption of data.
Member
Might be worth pointing out that we only do H2 flow control on stream level and not the connection level for the purposes of buffer bounding, which is implied here but a useful thing to call out explicitly.


* When `Envoy::Network::ConnectionImpl::write_buffer_` has too much data it calls
`Network::ConnectionCallbacks::onAboveWriteBufferHighWatermark()`.
* When `Envoy::Http::CodecClient` receives `onAboveWriteBufferHighWatermark()` it
Member
Does the doc need updating for the underlying buffer name changes?

@htuch
Member

htuch commented Aug 2, 2017

Can you also file a PR to remove https://github.com/lyft/envoy-api/blob/master/api/protocol.proto#L14? Thanks.

htuch
htuch previously approved these changes Aug 3, 2017
@htuch
Member

htuch commented Aug 3, 2017

@alyssawilk docs build is still failing, otherwise LGTM.

htuch
htuch previously approved these changes Aug 3, 2017
window. Currently, this has the same minimum/maximum/default as :ref:`initial_stream_window_size
<config_http_conn_man_http2_settings_initial_stream_window_size>`.

per_stream_buffer_limit_bytes
Member
This doesn't exist anymore right? Should this text get merged into initial_stream_window_size?

Member

@mattklein123 mattklein123 left a comment

This is awesome! Great work.

@mattklein123 mattklein123 merged commit 2d209cd into envoyproxy:master Aug 4, 2017
@alyssawilk alyssawilk deleted the h2-encode-flow-control branch August 4, 2017 01:00
Because the various buffers in the HTTP/2 stack are fairly complicated, each path from a buffer
going over the watermark limit to disabling data from the data source is documented separately.

![HTTP2 data flow diagram](h2_buffers.png)
Contributor

Is there a source vector drawing that was used to generate this image?

I think the color of some of the buffers in the diagram may need to be updated. Based on how L4 buffering works today, the ConnectionImpl buffer is bounded by the configured connection watermark and should be green. Some other changes are also in the pipeline.

Member
This comes from https://docs.google.com/document/d/1EB3ybx3yTndp158c4AdQ4nutksZA9lL-BQvixhPnb_4/edit#bookmark=id.umrpbw846kb. I'll share the doc with you and you can copy from there if you want.

mathetake added a commit that referenced this pull request Mar 3, 2026
**Description**

Since a long time ago (I think v0.2), there has been a limitation on using a
k8s Service as a target of AIServiceBackend. The detailed context is
described in #902. However, we haven't had any CRD CEL validation that
enforces the limitation, hence many users have been running into it and
hitting *very* hard-to-debug errors, e.g. #1169 and
kserve/kserve#4595 (comment).

Since resolving #902 takes a relatively large amount of effort (it needs
changes in EG), and since using a k8s Service as a target is out of our
primary scope, this adds a CEL validation to save everyone's time.

---------

Signed-off-by: Takeshi Yoneda <t.y.mathetake@gmail.com>