Description of problem
Today we only think in terms of vCPUs, that is, shares and quota.
Expected result
We need to look at CPU affinity, and in particular the Cpus field:
https://github.com/kubernetes-sigs/cri-o/blob/master/vendor/github.com/opencontainers/runtime-spec/specs-go/config.go#L304
In a default Kubernetes configuration, this mask should be set to cover all CPUs. However, if the cpu-manager policy is configured as static, it becomes possible to set CPU affinities on a best-effort basis at container granularity. With this in place, you'll see specific masks unique to each container.
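To make the field concrete, here is a minimal sketch of reading the cpuset out of an OCI config.json. The struct types and the `cpusetFromSpec` helper are hypothetical local stand-ins that mirror only the `linux.resources.cpu.cpus` path of the runtime spec; real code would presumably use the vendored specs-go types instead.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Hypothetical, trimmed-down mirrors of the OCI runtime-spec types; only the
// fields needed to reach linux.resources.cpu.cpus are declared.
type linuxCPU struct {
	Cpus string `json:"cpus,omitempty"`
}

type linuxResources struct {
	CPU *linuxCPU `json:"cpu,omitempty"`
}

type linuxSection struct {
	Resources *linuxResources `json:"resources,omitempty"`
}

type ociSpec struct {
	Linux *linuxSection `json:"linux,omitempty"`
}

// cpusetFromSpec returns the cpuset string from an OCI config.json document,
// or "" when the field is absent (no affinity requested).
func cpusetFromSpec(config []byte) (string, error) {
	var spec ociSpec
	if err := json.Unmarshal(config, &spec); err != nil {
		return "", fmt.Errorf("decoding OCI spec: %w", err)
	}
	if spec.Linux == nil || spec.Linux.Resources == nil || spec.Linux.Resources.CPU == nil {
		return "", nil
	}
	return spec.Linux.Resources.CPU.Cpus, nil
}
```

An empty result is treated here as "no restriction"; whether that is the right default for the runtime is an open design choice.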
Actual result
Today, if a user were to set up a mixed cluster with runc and kata, the kata runtime ignores the CPU set passed in, so the vCPU (and vhost) threads run across all available CPUs (no isolation is in place; affinity is managed by kubelet itself). As a result, kata-based containers not only miss out on a performance-tuned affinity, they are also likely to use CPUs that kubelet wanted dedicated to other workloads.
Proposal
Mandatory:
The following would need to be done in order to make sure we aren't utilizing CPUs dedicated to other pods.
Optional, and secondary to the first set of changes:
With the mandatory bits in place, we'll be using the CPU set provided, but we won't be providing CPU affinity on a per-container basis. To fully support CPU affinity in K8S, we'd also need to: