Title: gRPC streaming keepAlive ping never fails when proxied through Envoy
Envoy: envoyproxy/envoy-alpine:cd514cc3f1ad82bfd57b6b832b379eb9a2888891
gRPC: grpc-go 1.7.2
Description:
I have a Docker setup in which Envoy and a gRPC service run in a single container. Envoy proxies port 80 to port 8000, where the service listens. The gRPC service has a server->client unidirectional streaming endpoint with keepAlive enabled, so that a client that disconnects ungracefully won't leave a hanging connection. When I connect to my service directly and Ctrl-Z my test client, the server notices within ~30 seconds that a keepAlive HTTP/2 PING has failed and closes the connection. When I connect to my service through Envoy and Ctrl-Z my test client, the connection hangs forever.
I test this locally by running my docker container, and then from my local machine I first point my gRPC test client to port 8000 to bypass Envoy. I get the following results on Wireshark on the docker0 interface:

At the end, there are 3 groups of 3 TCP frames at 55, 85, and 115 seconds on port 8000. These are presumably the keepAlive HTTP/2 PINGs, although Wireshark shows them only as TCP frames.
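To check whether raw TCP payloads like these really are PINGs, the 9-byte HTTP/2 frame header can be decoded by hand. The following is a minimal sketch, assuming a plaintext (non-TLS) capture; the names parseFrameHeader and isPing and the sample bytes are hypothetical, while the field layout and the PING type 0x6 come from RFC 7540.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// frameHeader holds the fields of the 9-byte HTTP/2 frame header (RFC 7540, section 4.1).
type frameHeader struct {
	Length   uint32 // 24-bit payload length
	Type     uint8
	Flags    uint8
	StreamID uint32 // 31 bits; the reserved high bit is masked off
}

const (
	frameTypePing = 0x6 // PING frame type (RFC 7540, section 6.7)
	flagAck       = 0x1 // set on the reply to a PING
)

// parseFrameHeader decodes the first 9 bytes of an HTTP/2 frame.
func parseFrameHeader(b []byte) (frameHeader, error) {
	if len(b) < 9 {
		return frameHeader{}, errors.New("need at least 9 bytes for a frame header")
	}
	return frameHeader{
		Length:   uint32(b[0])<<16 | uint32(b[1])<<8 | uint32(b[2]),
		Type:     b[3],
		Flags:    b[4],
		StreamID: binary.BigEndian.Uint32(b[5:9]) & 0x7fffffff,
	}, nil
}

// isPing reports whether h describes a well-formed PING:
// type 0x6, an 8-byte payload, and stream identifier 0.
func isPing(h frameHeader) bool {
	return h.Type == frameTypePing && h.Length == 8 && h.StreamID == 0
}

func main() {
	// Hypothetical payload: a PING header followed by 8 opaque bytes.
	sample := []byte{0, 0, 8, 0x6, 0x0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8}
	h, err := parseFrameHeader(sample)
	if err != nil {
		panic(err)
	}
	fmt.Printf("ping=%v ack=%v\n", isPing(h), h.Flags&flagAck != 0)
}
```

A PING and its ack share the same shape and differ only in the ACK flag, so a keepalive exchange should show up as one frame in each direction.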
Here is what happens when I go through Envoy on port 80:

Here I see actual HTTP/2 frames, but only at the start of the connection. No matter how long I listen, I never see any keepAlive frames. I assume my service is still sending the keepAlive PINGs to Envoy on the docker container's loopback interface, but I don't know an easy way to capture that.
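One possible explanation, consistent with HTTP/2 PING being a connection-level frame that is not forwarded across hops, is that Envoy acknowledges the server's PINGs itself. The toy model below only illustrates that hypothesis. It is not how Envoy is implemented, and every type and function name in it is hypothetical.

```go
package main

import "fmt"

// pinger is anything that can answer a connection-level PING.
type pinger interface {
	ping() bool // true means a PING ack came back
}

// client models the test client; suspended mimics pressing Ctrl-Z.
type client struct{ suspended bool }

func (c *client) ping() bool { return !c.suspended }

// proxy models an HTTP/2-terminating proxy. It acknowledges PINGs on its own
// connection to the server and never consults the downstream peer.
type proxy struct{ downstream pinger }

func (p *proxy) ping() bool { return true }

// keepaliveOK is what the server's keepalive check sees for its direct peer.
func keepaliveOK(peer pinger) bool { return peer.ping() }

func main() {
	c := &client{suspended: true}
	fmt.Println("direct:", keepaliveOK(c))                        // direct: false
	fmt.Println("via proxy:", keepaliveOK(&proxy{downstream: c})) // via proxy: true
}
```

Under this model the server's keepalive can only detect a dead Envoy, so detecting a dead client would have to happen on the Envoy-to-client connection.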
gRPC KeepAlive Go config:
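// Server-side keepalive: ping an idle connection after 25s and close it if
// no ack arrives within 5s. infinity is a time.Duration defined elsewhere.
keepAliveOpt := grpc.KeepaliveParams(keepalive.ServerParameters{
	MaxConnectionIdle:     infinity,
	MaxConnectionAge:      infinity,
	MaxConnectionAgeGrace: infinity,
	Time:                  25 * time.Second,
	Timeout:               5 * time.Second,
})
// Enforcement policy for pings sent by clients (not for the server's own pings).
keepAliveEnforcementPolicyOpt := grpc.KeepaliveEnforcementPolicy(keepalive.EnforcementPolicy{
	MinTime:             5 * time.Minute,
	PermitWithoutStream: false,
})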
Envoy config:
Note that I have a separate route for my streaming endpoint, because I needed to set timeout_ms to 0 there.
{
  "listeners": [
    {
      "address": "tcp://0.0.0.0:80",
      "filters": [
        {
          "type": "read",
          "name": "http_connection_manager",
          "config": {
            "codec_type": "auto",
            "stat_prefix": "ingress_http",
            "route_config": {
              "virtual_hosts": [
                {
                  "name": "local_service",
                  "domains": ["*"],
                  "routes": [
                    {
                      "timeout_ms": 0,
                      "prefix": "/gprc.prefix.to.my.streaming/Endpoint",
                      "headers": [
                        {"name": "content-type", "value": "application/grpc"}
                      ],
                      "cluster": "local_service_grpc",
                      "retry_policy": {
                        "retry_on": "5xx",
                        "num_retries": 3
                      }
                    },
                    {
                      "timeout_ms": 10000,
                      "prefix": "/",
                      "headers": [
                        {"name": "content-type", "value": "application/grpc"}
                      ],
                      "cluster": "local_service_grpc",
                      "retry_policy": {
                        "retry_on": "5xx",
                        "num_retries": 3
                      }
                    },
                    {
                      "timeout_ms": 10000,
                      "prefix": "/",
                      "cluster": "local_service_http"
                    }
                  ]
                }
              ]
            },
            "filters": [
              {
                "type": "decoder",
                "name": "router",
                "config": {}
              },
              {
                "type": "both",
                "name": "health_check",
                "config": {
                  "pass_through_mode": true,
                  "endpoint": "/healthcheck"
                }
              }
            ]
          }
        }
      ]
    }
  ],
  "admin": {
    "access_log_path": "/dev/null",
    "address": "tcp://0.0.0.0:8001"
  },
  "cluster_manager": {
    "clusters": [
      {
        "name": "local_service_grpc",
        "connect_timeout_ms": 10000,
        "type": "strict_dns",
        "lb_type": "round_robin",
        "features": "http2",
        "hosts": [
          {
            "url": "tcp://127.0.0.1:8000"
          }
        ]
      },
      {
        "name": "local_service_http",
        "connect_timeout_ms": 10000,
        "type": "strict_dns",
        "lb_type": "round_robin",
        "hosts": [
          {
            "url": "tcp://127.0.0.1:8000"
          }
        ]
      }
    ]
  }
}
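The three routes are order-sensitive, since Envoy uses the first route that matches. The sketch below is a simplified model of that first-match behaviour, assuming exact header matching; it is not Envoy's implementation, and the route type and match function are hypothetical. It shows why the streaming route has to come before the two "/" routes.

```go
package main

import (
	"fmt"
	"strings"
)

// route is a simplified stand-in for one entry in "routes".
type route struct {
	prefix    string
	headers   map[string]string // every entry must match exactly; nil means no constraint
	timeoutMS int               // 0 disables the route timeout
	cluster   string
}

// routes mirrors the order used in the config above.
var routes = []route{
	{"/gprc.prefix.to.my.streaming/Endpoint", map[string]string{"content-type": "application/grpc"}, 0, "local_service_grpc"},
	{"/", map[string]string{"content-type": "application/grpc"}, 10000, "local_service_grpc"},
	{"/", nil, 10000, "local_service_http"},
}

// match returns the first route whose prefix and headers fit the request.
func match(path string, reqHeaders map[string]string) (route, bool) {
	for _, r := range routes {
		if !strings.HasPrefix(path, r.prefix) {
			continue
		}
		ok := true
		for name, want := range r.headers {
			if reqHeaders[name] != want {
				ok = false
				break
			}
		}
		if ok {
			return r, true
		}
	}
	return route{}, false
}

func main() {
	grpcHeaders := map[string]string{"content-type": "application/grpc"}
	r, _ := match("/gprc.prefix.to.my.streaming/Endpoint", grpcHeaders)
	fmt.Println(r.cluster, r.timeoutMS) // local_service_grpc 0
}
```

If the streaming route were listed after the generic gRPC route, the "/" prefix would match first and the stream would be cut off by the 10000 ms timeout.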