broker-ingress pods Ready before accepting connections, "Fanout had an error": "failed to forward reply", "connection refused" #4473

@maschmid

Describe the bug

When horizontalpodautoscaler.autoscaling/broker-ingress-hpa scales up broker-ingress, imc-dispatcher sometimes fails to forward replies, because the new broker-ingress pods are reported Ready before they accept connections (they lack a readiness probe).

The following error can be seen in the imc-dispatcher log (172.30.107.69 is the broker-ingress Service):

{"level":"error","ts":"2020-11-05T20:21:31.470Z","logger":"inmemorychannel-dispatcher","caller":"fanout/fanout_message_handler.go:189","msg":"Fanout had an error","error":"failed to forward reply to http://broker-ingress.knative-eventing.svc.cluster.local/imc-broker-counter-3/broker-0: Post \"http://broker-ingress.knative-eventing.svc.cluster.local/imc-broker-counter-3/broker-0\": dial tcp 172.30.107.69:80: connect: connection refused","stacktrace":"knative.dev/eventing/pkg/channel/fanout.(*MessageHandler).dispatch\n\t/opt/app-root/src/go/src/knative.dev/eventing/pkg/channel/fanout/fanout_message_handler.go:189\nknative.dev/eventing/pkg/channel/fanout.createMessageReceiverFunction.func1.1\n\t/opt/app-root/src/go/src/knative.dev/eventing/pkg/channel/fanout/fanout_message_handler.go:143"}

Expected behavior
broker-ingress should not be Ready before it can accept connections, so that events are not lost when broker-ingress scales up.

To Reproduce
Hard to reproduce reliably. Manually raising minReplicas on broker-ingress-hpa to force a scale-up will probably trigger it.

Knative release version
0.17.2

Additional context

Labels

kind/bug: Categorizes issue or PR as related to a bug.