cc @mattmoor @dgerd @sixolet @steren @shefaliv @mikehelmick @evankanderson
tl;dr: This issue describes a scenario where the Service reports Ready=True before the rollout is complete, and proposes a change (possibly with a new condition) that reports readiness more accurately.
Problem Statement
Consider this runLatest Service scenario. What should the Service's Ready condition be in this case?
kind: Service
metadata:
  name: myservice
  generation: 2
  ...
spec:
  runLatest:
    ...
status:
  conditions:
  - type: Ready
    status: ?
  - type: ConfigurationsReady
    status: True
  - type: RoutesReady
    status: True
  observedGeneration: 2
  latestCreatedRevisionName: myservice.2
  latestReadyRevisionName: myservice.2  # Configuration is True based on new revision
  traffic:
  - revisionName: myservice.1  # Route is True based on the previous revision
    percent: 100
In this case, by the Knative conditions convention, the Service should be Ready=True because ConfigurationsReady = True and RoutesReady = True.
But in reality, the rollout is still in progress. The Configuration has reconciled, successfully created the new revision myservice.2, and become Ready=True, but the Route has not yet been notified of the updated revision and has not migrated traffic to it. Traffic is still on the old revision, so the client's intent in updating the Service is only partially fulfilled.
The client is prematurely notified that their deployment is finished, but in reality traffic has not migrated yet.
Proposal(s)
To address this, one suggestion is to compare status.latestReadyRevisionName (which is propagated from the Configuration) with status.traffic.revisionName (which is propagated from the Route) to determine overall readiness.
One possibility is that Service's Ready condition is no longer simply the AND of RoutesReady & ConfigurationsReady terminal conditions. Instead Service's Ready is based on RoutesReady = True AND ConfigurationsReady = True AND the latestReadyRevisionName = traffic.revisionName.
Doing so could lead to conditions like this:
...
status:
  conditions:
  - type: Ready
    status: Unknown
    message: "Traffic migration is not complete"
  - type: ConfigurationsReady
    status: True
  - type: RoutesReady
    status: True
...
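To make the first option concrete, here is a minimal Go sketch of the readiness check. The types and function names (ServiceStatus, TrafficTarget, ServiceReady, trafficOnLatest) are hypothetical simplifications for illustration, not the actual Knative Serving API. It also assumes that "traffic.revisionName = latestReadyRevisionName" means every target carrying a non-zero percentage names the latest ready revision.

```go
package main

// ConditionStatus mirrors the three-valued status used by conditions.
type ConditionStatus string

const (
	True    ConditionStatus = "True"
	False   ConditionStatus = "False"
	Unknown ConditionStatus = "Unknown"
)

// TrafficTarget is a simplified stand-in for one entry of status.traffic.
type TrafficTarget struct {
	RevisionName string
	Percent      int
}

// ServiceStatus is a hypothetical, trimmed-down view of the Service status.
type ServiceStatus struct {
	ConfigurationsReady     ConditionStatus
	RoutesReady             ConditionStatus
	LatestReadyRevisionName string
	Traffic                 []TrafficTarget
}

// trafficOnLatest reports whether every target that carries traffic points
// at the latest ready revision, and at least one target carries traffic.
func trafficOnLatest(s ServiceStatus) bool {
	serving := false
	for _, t := range s.Traffic {
		if t.Percent == 0 {
			continue // targets without traffic do not affect the rollout
		}
		if t.RevisionName != s.LatestReadyRevisionName {
			return false
		}
		serving = true
	}
	return serving
}

// ServiceReady returns the proposed Ready status and an optional message:
// RoutesReady AND ConfigurationsReady AND traffic on the latest ready revision.
func ServiceReady(s ServiceStatus) (ConditionStatus, string) {
	switch {
	case s.ConfigurationsReady == False || s.RoutesReady == False:
		return False, ""
	case s.ConfigurationsReady != True || s.RoutesReady != True:
		return Unknown, ""
	case !trafficOnLatest(s):
		return Unknown, "Traffic migration is not complete"
	default:
		return True, ""
	}
}
```

With the scenario from the problem statement (latestReadyRevisionName myservice.2, all traffic on myservice.1), ServiceReady returns Unknown with the "Traffic migration is not complete" message, and flips to True once the traffic target names myservice.2.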
Another alternative is to leave the Ready condition as a conventional rollup of terminal conditions, and introduce a third terminal condition named something like RolloutComplete. RolloutComplete would encode the latestReadyRevisionName = traffic.revisionName test, and would be recomputed whenever the Configuration or Route status is updated. This would lead to conditions like:
...
status:
  conditions:
  - type: Ready
    status: Unknown
    message: "Traffic migration is not complete"
  - type: RolloutComplete
    status: Unknown
    message: "Traffic migration is not complete"
  - type: ConfigurationsReady
    status: True
  - type: RoutesReady
    status: True
...
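A rough Go sketch of this second option follows. Ready stays a plain rollup, and the revision comparison lives in its own condition. Condition, RolloutComplete, and Rollup are hypothetical names for illustration and do not reflect the existing Knative condition helpers. The rollup rule used here (any False wins, otherwise any non-True yields Unknown) is an assumption about what "conventional rollup" means.

```go
package main

// ConditionStatus mirrors the three-valued status used by conditions.
type ConditionStatus string

const (
	True    ConditionStatus = "True"
	False   ConditionStatus = "False"
	Unknown ConditionStatus = "Unknown"
)

// Condition is a simplified stand-in for one entry of status.conditions.
type Condition struct {
	Type    string
	Status  ConditionStatus
	Message string
}

// RolloutComplete derives the proposed terminal condition by checking that
// every revision named in status.traffic is the latest ready revision.
func RolloutComplete(latestReady string, trafficRevisions []string) Condition {
	pending := Condition{
		Type:    "RolloutComplete",
		Status:  Unknown,
		Message: "Traffic migration is not complete",
	}
	if len(trafficRevisions) == 0 {
		return pending
	}
	for _, r := range trafficRevisions {
		if r != latestReady {
			return pending
		}
	}
	return Condition{Type: "RolloutComplete", Status: True}
}

// Rollup computes Ready as the AND of the given terminal conditions.
// A False condition makes Ready False; otherwise the first condition that
// is not True makes Ready Unknown and lends its message.
func Rollup(terminal ...Condition) Condition {
	ready := Condition{Type: "Ready", Status: True}
	for _, c := range terminal {
		if c.Status == False {
			return Condition{Type: "Ready", Status: False, Message: c.Message}
		}
		if c.Status != True && ready.Status == True {
			ready.Status = Unknown
			ready.Message = c.Message
		}
	}
	return ready
}
```

The appeal of this shape is that Ready keeps its conventional meaning, and clients that care specifically about traffic migration can watch RolloutComplete directly.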
Note that in either of these suggestions, the logic would be slightly different in Release mode, where the traffic in the spec of the Service (i.e. named revisions and percentages) is compared with the traffic in status.
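For Release mode, the comparison could look something like the Go sketch below. It assumes the Service spec has already been resolved into a list of named revisions with percentages. TrafficTarget, percentByRevision, and TrafficMatchesSpec are hypothetical names for illustration, and ignoring zero-percent targets is an assumption rather than settled behavior.

```go
package main

// TrafficTarget is a simplified stand-in for a revision/percent pair,
// used here for both the desired (spec) and observed (status) traffic.
type TrafficTarget struct {
	RevisionName string
	Percent      int
}

// percentByRevision sums the percentages per revision and drops targets
// that carry no traffic, so ordering and zero entries do not matter.
func percentByRevision(targets []TrafficTarget) map[string]int {
	out := make(map[string]int)
	for _, t := range targets {
		if t.Percent > 0 {
			out[t.RevisionName] += t.Percent
		}
	}
	return out
}

// TrafficMatchesSpec reports whether the traffic observed in status matches
// the traffic requested in spec, revision by revision.
func TrafficMatchesSpec(spec, status []TrafficTarget) bool {
	want, got := percentByRevision(spec), percentByRevision(status)
	if len(want) != len(got) {
		return false
	}
	for rev, pct := range want {
		if got[rev] != pct {
			return false
		}
	}
	return true
}
```

For example, a spec asking for 90% on myservice.1 and 10% on myservice.2 would not match a status that still shows 100% on myservice.1, so the rollout would be reported as incomplete under either proposal.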