Clean-up Runtime Contract #4035
This change makes numerous cleanups to the runtime contract in an attempt to improve the readability of the document and make the document more useful for the intended audience.
* Moves developer-facing statements to a new `runtime-user-guide`. Focuses `runtime-contract` on operator/platform-provider.
* Adds links to Conformance tests that test Runtime Contract statements.
* Corrects, updates, or removes statements to more accurately represent today's Knative runtime.
* Changes most untestable statements to informative, or removes them.
* Copies in important OCI runtime requirements we previously referenced.
* Removes references to the OCI specification that didn't bring new requirements.
Ref: knative#2539, knative#2973, knative#4014, knative#4027
Co-Authored-By: dgerd <dangerd@google.com>
| @@ -181,51 +157,39 @@ for purposes of scaling CPU and removing idle containers. | |||
|
|
|||
| #### Protocols and Ports | |||
There was a problem hiding this comment.
LGTM 👍
Summary: http1 and h2c are supported, http1 by default. No automatic upgrades.
|
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: dgerd, tanzeeb
If they are not already assigned, you can assign the PR to them by writing
The full list of commands accepted by this bot can be found here.
The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
| ## Connection | ||
|
|
||
| A Knative container must package a webserver that can respond to HTTP/1.1 | ||
| requests on a specified port. |
There was a problem hiding this comment.
HTTP/2.0 is covered in the Protocols section.
My understanding is that we expect clients to implement h2c (which requires an initial upgrade request over HTTP/1.1) and not h2. This is why I have set the expectation that responding to an HTTP/1.1 request is required.
There was a problem hiding this comment.
It might be worth clarifying that containers can implement h2c with the logic you provided here.
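To make the "respond to HTTP/1.1 requests on a specified port" expectation concrete, here is a minimal sketch using only the Python standard library. The handler name, the response body, and the `start_server` helper are illustrative assumptions, not part of the contract or of any Knative API; h2c support is deliberately left out.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    # Answer with HTTP/1.1 status lines instead of the HTTP/1.0 default.
    protocol_version = "HTTP/1.1"

    def do_GET(self):
        body = b"hello\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # Content-Length is required so HTTP/1.1 clients know where the body ends.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        # Silence the default per-request access log for this sketch.
        pass


def start_server(port, host="0.0.0.0"):
    """Start the server on a background thread and return it (hypothetical helper)."""
    server = HTTPServer((host, port), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A real container would pass in the port chosen by the platform and keep the process in the foreground; the background thread here only keeps the sketch easy to exercise.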
|
/lgtm Will leave |
evankanderson
left a comment
There was a problem hiding this comment.
I have a slight preference for splitting up functional vs presentational changes in the future, as it allows the two conversations to proceed at a different pace.
Part of the reason for the structure of the existing spec was due to questions asked during the drafting about how different parts of the OCI contract applied -- in some cases, the mechanism was suitable for operator consumption but had multitenancy or security concerns, so needed to be cordoned off from developers. I'm not sure how much these notes add/remove from the spec, but most were added in response to questions. 😁
| particularly concerned with the expectations of _developers_ (and _language and | ||
| tooling developers_, by extension) running code in the environment. | ||
| This document is aimed at _operators_ with the goal of providing a consistent | ||
| runtime environment for _developers_. |
There was a problem hiding this comment.
Not sure if you want to add the additional detail of language and tooling developers being additional roles that are concerned about the environment as consumers.
There was a problem hiding this comment.
| runtime environment for _developers_. | |
| runtime environment for _developers_ (who may interact via tools created by _language and tooling developers_). |
| `poststop` hooks to platform operators rather than developers. All of these | ||
| hooks are defined in the context of the runtime namespace, rather than the | ||
| container namespace, and may expose system-level information (and are | ||
| container MUST be sent a `SIGTERM` signal when it is killed to allow for a |
There was a problem hiding this comment.
This can at most be a SHOULD, because there are a number of conditions which could reasonably cause a container to stop running without receiving a SIGTERM:
- Kernel SIGKILL due to cgroup resource exhaustion / machine resource exhaustion (particularly memory)
- Kernel SIGKILL due to AppArmor / BPF policy violation
- Operator SIGKILL due to exceptional circumstances
- Machine (hardware) failure
There's also the fun case of network failure isolating or partially-isolating the machine, which may not affect SIGTERM/SIGKILL, but may screw up other shutdown.
There was a problem hiding this comment.
I agree here, but I would like to distinguish between "this may not happen because of the reason your container is being terminated" and "this may vary from platform to platform". Perhaps we can bound the situations in which a SIGTERM must be sent (i.e. normal autoscale actions or deployment actions). We can work on the exact wording, but thoughts here?
There was a problem hiding this comment.
Yes, I think saying that "normal orderly shutdown MUST send a SIGTERM" would be acceptable.
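From the container's side, the orderly-shutdown case discussed above could look roughly like the following. This is a sketch assuming a Python process; `install_sigterm_handler` is a hypothetical helper name, and the cases listed earlier (kernel SIGKILL, hardware failure) would bypass it entirely.

```python
import signal
import threading


def install_sigterm_handler():
    """Register a SIGTERM handler and return an Event that is set on receipt."""
    shutdown_requested = threading.Event()

    def handle_sigterm(signum, frame):
        # Only record the request here; the main loop does the actual cleanup.
        shutdown_requested.set()

    # Must be called from the main thread, as Python requires for signal.signal.
    signal.signal(signal.SIGTERM, handle_sigterm)
    return shutdown_requested
```

The main loop would then check the returned event (for example `shutdown_requested.wait(timeout)`) and stop accepting new work once it is set.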
| contents from a particular execution. Because containers (particularly failing | ||
| containers) can experience frequent starts, operators or platform providers | ||
| SHOULD limit the total space consumed by these failures. | ||
| - A container should write its own termination message to `/dev/termination-log` |
There was a problem hiding this comment.
/dev/termination-log is inherited from the Kubernetes terminationMessagePolicy, which is pretty helpful for collecting information from crashing containers.
There was a problem hiding this comment.
I removed it because:
- It is written from the standpoint of the container. Should the container write the termination message on all Knative platforms (i.e. all installs MUST implement and offer terminationMessagePolicy), or just some platforms?
- We do not have test coverage for this and I couldn't quite find where we actually consume this information and present it back to users in our API. It is not clear to me that we do this at all.
- Most importantly, this seems like an overly limiting runtime requirement that is not in the critical path at the MUST/SHOULD level and does not add much developer value at the MAY level.
I am not stuck on this one and could be convinced to just reword it.
There was a problem hiding this comment.
Yeah -- good catch. I don't think we need to keep this, because it could be a scalability / visibility concern to report these up to the control plane.
In particular, what happens if you have a simultaneous crash of 10k Pods -- aggregating termination messages back to the Revision could be fairly expensive, with each Pod crash causing another attempted update of the Revision. It looks like these don't get aggregated from Pod -> ReplicaSet in k8s, so let's follow that precedent.
| ### Hooks | ||
|
|
||
| Operation hooks SHOULD NOT be configurable by the Knative developer. Operators | ||
| or platform providers MAY use hooks to implement their own lifecycle controls. |
There was a problem hiding this comment.
This was removed because it's not developer-facing?
There was a problem hiding this comment.
The only hooks I could find here were the OCI hooks ( https://github.com/opencontainers/runtime-spec/blob/master/config.md#posix-platform-hooks ). We already describe these hooks starting on line 90. This was duplicative.
There was a problem hiding this comment.
SGTM, I copied all the sections from OCI, I think.
| collected and retained in a developer-accessible logging repository. | ||
| A read from the `stdin` file descriptor on the container [SHOULD](https://github.com/knative/serving/blob/master/test/conformance/file_descriptor_test.go) | ||
| always result in `EOF`. The `stdout` and `stderr` file descriptors on the container | ||
| MUST be collected and retained in a developer-accessible logging repository. |
There was a problem hiding this comment.
Wondering about the upgrade from SHOULD to MUST.
There was a problem hiding this comment.
It would be nice to have a minimum logging requirement that is common across all Knative installs. Right now both /var/log and stdout are SHOULD, which means that a developer needs to go hunting through the docs or configuration for each Knative install to understand how logs are collected. My understanding is that right now in Knative Serving we collect stdout and stderr for all containers, so we are already doing this.
There was a problem hiding this comment.
I think one reason for SHOULD was that it wasn't entirely clear how to test that the logs were actually being retained.
I'm fine upgrading to MUST.
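For a developer relying on the `stdout`/`stderr` collection discussed here, a minimal sketch of routing application logs to those two streams with Python's standard `logging` module might look like this. The logger name `app`, the level split at `WARNING`, and the message format are assumptions made for illustration only.

```python
import logging
import sys


def configure_logging(out=None, err=None):
    """Send INFO-level records to stdout and WARNING-and-above to stderr."""
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err

    logger = logging.getLogger("app")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()

    out_handler = logging.StreamHandler(out)
    # Keep warnings and errors off stdout so each record appears exactly once.
    out_handler.addFilter(lambda record: record.levelno < logging.WARNING)

    err_handler = logging.StreamHandler(err)
    err_handler.setLevel(logging.WARNING)

    formatter = logging.Formatter("%(levelname)s %(message)s")
    for handler in (out_handler, err_handler):
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger
```

The optional `out` and `err` parameters exist only so the behavior can be checked against in-memory streams; a container would call `configure_logging()` with no arguments.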
| [MUST](https://github.com/knative/serving/blob/master/test/conformance/user_test.go) run | ||
| the container as the specified user ID if allowed by the platform (see below). | ||
| If no `runAsUser` is specified, a platform-specific default SHALL be used. | ||
| Platform Providers SHOULD document this default behavior. |
There was a problem hiding this comment.
This is removing a documentation recommendation.
Would it make sense instead to have a section of the spec recommending the most significant documentation areas where platform providers may differ?
There was a problem hiding this comment.
I am going to take another stab at updating this after some more recent conversations about how OpenShift handles runAsUser.
It would be nice to have documentation recommendations and requirements elsewhere so you don't have to go through this entire document to find them all.
There was a problem hiding this comment.
Is there a <!-- TODO(dgerd): .... --> here?
|
|
||
| The namespace configuration MUST be provided by the operator or platform | ||
| provider; developers or container providers MUST NOT set or assume a particular | ||
| namespace configuration. |
There was a problem hiding this comment.
Looking at this again I am going to put it back and update it. I was assuming Kubernetes Namespace and not Container Namespace when I was deleting this.
There was a problem hiding this comment.
Looks like this is still missing?
| This option MAY only be set by the operator or platform provider, and MUST NOT | ||
| be configurable by the developer. As masked paths may be part of the platform | ||
| security hardening, operators may tune this from time to time as the threat | ||
| environment changes. |
There was a problem hiding this comment.
Again, wondering if it's worth documenting where the responsibility for these settings lies, given that they are in the OCI spec.
There was a problem hiding this comment.
This section seemed overly limiting and does not seem to prevent the operation of Knative. Why must this not be configurable by the developer? It seemed to me like you could let the developer have control over some of these things. Given that the OCI interface should not be available within the container the only way to configure this would be through the Knative API. An extension of the Knative API to allow certain paths to be masked or read-only didn't seem crazy to me.
There was a problem hiding this comment.
I'm fine with dropping this.
|
|
||
| # Logging | ||
|
|
||
| Log statements should be sent to `stdout` and `stderr`. Log statements sent to |
There was a problem hiding this comment.
It would be nice to include /var/log or /dev/log as alternatives to the streams. /dev/log is particularly nice, as it natively supports multiline log statements (but I think our current fluentd configuration does not provide /dev/log).
There was a problem hiding this comment.
These are not guaranteed to be in a knative install. They are listed as SHOULD items.
|
/hold Will look at this tomorrow |
* Describe current behavior for HTTP1.1 and HTTP2
* Provide links to relevant tests for keywords
Ref: knative#4035
Fixes: knative#4283
| the Knative cluster. | ||
| - **Language and tooling developers** typically write tools used by | ||
| _developers_ to package code into containers. As such, they are concerned | ||
| that tooling which wraps developer code complies with this runtime contract. |
There was a problem hiding this comment.
Add a section clarifying that this is descriptive advice rather than prescriptive requirements? "Your container will fit better if it follows the following guidance..."
| [SHOULD](https://github.com/knative/serving/blob/master/test/conformance/container_test.go) | ||
| restrict the use of `prestart`, `poststart`, and `poststop` hooks to platform | ||
| operators rather than developers. All of these hooks are defined in the context | ||
| of the runtime namespace, rather than the |
There was a problem hiding this comment.
s/runtime namespace/host runtime namespace/ to be extra clear?
| of the runtime namespace, rather than the | ||
| container namespace, and MAY expose system-level information (and are | ||
| non-portable). | ||
| - Failures of the developer-specified process MUST be logged to a |
There was a problem hiding this comment.
I'm assuming this was removed because it's not currently testable?
Can we leave a comment suggesting this for future inclusion once we have appropriate log infrastructure in place? I think it would be testable if we had a standard log-tail interface:
- Start tailing
- Deploy container which runs `echo "Not gonna happen" 1>&2; exit 1`
- Check for string in tail results.
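The check proposed above can be approximated locally while no standard log-tail interface exists. The sketch below is only a stand-in under that assumption: it runs the same shell command as a local subprocess rather than deploying a container, and treats the captured `stderr` as the "tail results". Both function names are hypothetical.

```python
import subprocess

MARKER = "Not gonna happen"


def run_failing_process():
    """Run the failing command from the proposal and capture its output."""
    return subprocess.run(
        ["sh", "-c", f'echo "{MARKER}" 1>&2; exit 1'],
        capture_output=True,
        text=True,
    )


def failure_was_logged(captured_stderr, marker=MARKER):
    """Return True if the failure message appears in the captured stream."""
    return marker in captured_stderr
```

A real conformance test would replace `run_failing_process` with a deployment and read from whatever log interface the platform ends up exposing.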
| SHOULD NOT be exposed within the container. The operator or platform provider | ||
| MAY have the ability to directly interact with the OCI interface, but that is | ||
| beyond the scope of this specification. | ||
| MAY have the ability to directly interact with the OCI interface. |
There was a problem hiding this comment.
@evankanderson @dgerd did we want to reword the OCI interface parts?
There was a problem hiding this comment.
I think Dan was saying that we could probe at some well-known locations, but couldn't definitively guarantee that this was not exposed.
I'm fine with leaving as-is or moving to the developer guide.
evankanderson
left a comment
There was a problem hiding this comment.
Finished my pass; a few missed items from last time and some more discussion, but I think many of these can be closed with <!-- TODO(dgerd): ... --> comments rather than needing to get the final details right in this pass.
Observability in particular I expect to improve over the next 3-5 months, so there may be some better answers there soon.
| #### Dev symbolic links | ||
|
|
||
| As specified by OCI. | ||
| The following symbolic links MUST exist within the container: |
There was a problem hiding this comment.
Where did the OCI reference end up?
|
|
||
| The container MUST accept HTTP/1.1 requests from the environment. The | ||
| environment | ||
| [SHOULD offer an HTTP/2.0 upgrade option](https://http2.github.io/http2-spec/#discover-http) |
There was a problem hiding this comment.
Followed up there -- empty == HTTP/1.1 with later upgrade option seems reasonable.
| Only one inbound `containerPort` SHALL be specified in the | ||
| The developer MAY specify a container port value at deployment; if the developer does not specify a port, the platform provider | ||
| [MUST](https://github.com/knative/serving/blob/master/test/conformance/service_test.go) provide a default. | ||
| Only one inbound `containerPort` [SHALL](https://github.com/knative/serving/blob/master/test/conformance/container_test.go) be specified in the |
There was a problem hiding this comment.
It's not 100% clear here whether not specifying a port in the spec (and just responding to $PORT in the container) is or should be acceptable.
Thoughts?
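On the container side, "just responding to `$PORT`" could be sketched as below. The fallback value of 8080 is an assumption for running outside the platform, not something this diff states, and `resolve_port` is a hypothetical helper name.

```python
import os

DEFAULT_PORT = 8080  # assumed fallback for local runs; not taken from the contract text


def resolve_port(environ=None):
    """Return the port to listen on, preferring the PORT environment variable."""
    environ = os.environ if environ is None else environ
    raw = environ.get("PORT", "")
    if not raw:
        return DEFAULT_PORT
    port = int(raw)
    if not 0 < port < 65536:
        raise ValueError(f"PORT out of range: {port}")
    return port
```

Reading the variable at startup, rather than hard-coding a port, keeps the container working whether the port comes from the developer's spec or from the platform default.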
| of its source, the selected port will be made available in the `PORT` | ||
| environment variable. | ||
|
|
||
| The platform provider SHOULD configure the platform to perform HTTPS termination |
There was a problem hiding this comment.
I don't see a comment on HTTPS termination yet.
| - name: "h2c" | ||
| containerPort: 8081 | ||
| ... | ||
| ``` |
There was a problem hiding this comment.
Define here or in connection that the proxy terminates SSL and may convert HTTP versions automatically, so the user container only needs to speak a single HTTP version.
(I found it strange to have "Connection" and "Protocols" so far separated, FWIW.)
| persistent data as it is not retained in the event of container termination. | ||
|
|
||
| It is recommended to store temporary state such as local caches or working data in `/tmp` | ||
| as this space is guaranteed to be present and mounted Read-Write. |
There was a problem hiding this comment.
It's probably also worth pointing out here the option of off-node storage via API interfaces like Memcache/Redis, and recommending that for session storage.
There was a problem hiding this comment.
In particular:
- Sessions may not be sticky; subsequent requests from the same user may not arrive at the same container.
- Local disk state for session storage management may be lost when containers are scaled down (possibly to zero).
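As an illustration of the `/tmp` guidance, here is a small sketch of a disposable local cache. The function names and the JSON format are assumptions; as the points above note, anything written this way can disappear on scale-down, so it is unsuitable for session state.

```python
import json
import os
import tempfile


def write_cache(name, data, cache_dir=None):
    """Write `data` as JSON under the temp directory, replacing any old copy atomically."""
    cache_dir = cache_dir or tempfile.gettempdir()
    fd, tmp_path = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "w") as f:
        json.dump(data, f)
    final_path = os.path.join(cache_dir, name)
    # os.replace is atomic within one filesystem, so readers never see a partial file.
    os.replace(tmp_path, final_path)
    return final_path


def read_cache(name, cache_dir=None):
    """Return the cached value, or None if it is missing (e.g. after a restart)."""
    path = os.path.join(cache_dir or tempfile.gettempdir(), name)
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return None
```

Callers should treat a `None` result as normal and rebuild the value, since a fresh container starts with an empty `/tmp`.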
| It is recommended to store temporary state such as local caches or working data in `/tmp` | ||
| as this space is guaranteed to be present and mounted Read-Write. | ||
|
|
||
| # Probes |
There was a problem hiding this comment.
Why is this H1 when the other sections above are H2?
| # Probes | ||
|
|
||
| By default, the readiness of a Knative container is determined by a successful TCP | ||
| response on the specified port for incoming traffic. If this readiness check is not |
|
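The default TCP readiness check described in this hunk amounts to "can a connection be opened on the port". A rough client-side sketch of that idea follows; it is not the platform's actual probe implementation, and the function name and timeout are assumptions.

```python
import socket


def tcp_ready(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Note that a check like this only shows the socket is accepting connections, not that the application behind it is able to serve requests.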
Some of this has merged. Some of this is in conflict now. Going to close this and get the remaining updates put in separately. |
/cc @evankanderson @mattmoor @tanzeeb @markusthoemmes @vaikas-google @tcnghia