Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@ contents:
listen health_check_http_url
bind :::50936 v4v6
mode http
monitor-uri /healthz
monitor-uri /readyz
option dontlognull
listen stats
bind localhost:{{`{{ .LBConfig.StatPort }}`}}
Expand All @@ -32,7 +32,7 @@ contents:
stats refresh 30s
stats auth Username:Password
backend masters
option httpchk GET /healthz HTTP/1.0
option httpchk GET /readyz HTTP/1.0
option log-health-checks
balance roundrobin
{{`{{- range .LBConfig.Backends }}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -101,7 +101,7 @@ contents:
livenessProbe:
initialDelaySeconds: 10
httpGet:
path: /healthz
path: /readyz
port: 50936
terminationMessagePolicy: FallbackToLogsOnError
imagePullPolicy: IfNotPresent
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,7 @@ contents:
listen health_check_http_url
bind :::50936 v4v6
mode http
monitor-uri /healthz
monitor-uri /readyz
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't /readyz endpoint used usually for readiness check and /healthyz used for liveness check?
If that is that is the case, I think we should consider keeping /healthz here, as it's used for HAProxy pod Liveness check

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the time /healthz turns red, it's too late -- you'll possibly have sent traffic to the node after it's become unavailable. We're seeing a lot of flakes in CI with connection reset and other errors related to the API server availability.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR includes two changes:
A. The change that API folks recommended doing, change from /healthz to /readyz in HAProxy backend check [1], it's covered in this PR - I'm fine with that.
B. Changing the endpoint name for HAProxy static pod Liveness from /healthz to /readyz, [2]. my comment/question was about the endpoints naming convention for Liveness/Readiness probes.

I just checked the endpoints naming used in other pods (kube-api-server) and seems that they are not using /healthz for Liveness probe and /readyz for Readiness probe.
So I guess we can change it also here

[1] https://github.com/openshift/machine-config-operator/blob/master/templates/master/00-master/baremetal/files/baremetal-haproxy-haproxy.yaml#L35
[2] https://github.com/openshift/machine-config-operator/blob/master/templates/master/00-master/baremetal/files/baremetal-haproxy-haproxy.yaml#L24

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to capture what I said on slack, this healthcheck is an arbitrary endpoint. It doesn't actually relate to the api, it just needs to match whatever the haproxy liveness probe uses. I changed it so we'd be using consistent names here, but we might consider moving this endpoint to a completely different name to make it clear that it has nothing to do with the api endpoints.

option dontlognull
listen stats
bind localhost:{{`{{ .LBConfig.StatPort }}`}}
Expand All @@ -33,7 +33,7 @@ contents:
stats refresh 30s
stats auth Username:Password
backend masters
option httpchk GET /healthz HTTP/1.0
option httpchk GET /readyz HTTP/1.0
option log-health-checks
balance roundrobin
{{`{{- range .LBConfig.Backends }}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -99,7 +99,7 @@ contents:
livenessProbe:
initialDelaySeconds: 10
httpGet:
path: /healthz
path: /readyz
port: 50936
terminationMessagePolicy: FallbackToLogsOnError
imagePullPolicy: IfNotPresent
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -21,7 +21,7 @@ contents:
listen health_check_http_url
bind :::50936 v4v6
mode http
monitor-uri /healthz
monitor-uri /readyz
option dontlognull
listen stats
bind 127.0.0.1:{{`{{ .LBConfig.StatPort }}`}}
Expand All @@ -32,7 +32,7 @@ contents:
stats refresh 30s
stats auth Username:Password
backend masters
option httpchk GET /healthz HTTP/1.0
option httpchk GET /readyz HTTP/1.0
option log-health-checks
balance roundrobin
{{`{{- range .LBConfig.Backends }}
Expand Down
2 changes: 1 addition & 1 deletion templates/master/00-master/ovirt/files/ovirt-haproxy.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -79,7 +79,7 @@ contents:
livenessProbe:
initialDelaySeconds: 10
httpGet:
path: /healthz
path: /readyz
port: 50936
terminationMessagePolicy: FallbackToLogsOnError
imagePullPolicy: IfNotPresent
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@ contents:
listen health_check_http_url
bind :::50936 v4v6
mode http
monitor-uri /healthz
monitor-uri /readyz
option dontlognull
listen stats
bind localhost:{{`{{ .LBConfig.StatPort }}`}}
Expand All @@ -34,7 +34,7 @@ contents:
stats refresh 30s
stats auth Username:Password
backend masters
option httpchk GET /healthz HTTP/1.0
option httpchk GET /readyz HTTP/1.0
option log-health-checks
balance roundrobin
{{`{{- range .LBConfig.Backends }}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -103,7 +103,7 @@ contents:
livenessProbe:
initialDelaySeconds: 10
httpGet:
path: /healthz
path: /readyz
port: 50936
terminationMessagePolicy: FallbackToLogsOnError
imagePullPolicy: IfNotPresent
Expand Down