Splitting this from #502
So far, what appears to have happened is that a config change broke the steering of jobs to build02, where we have nested virt enabled.
But now, AFAICS our jobs are running on build02 again, and:
kvm-device-plugin is healthy:
```console
$ oc get ds
NAME                DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
kvm-device-plugin   42        42        42      42           42          <none>          39d
$
```
Looking at the pods from one CI run:
```console
$ oc project ci-op-d7w80vyk
Now using project "ci-op-d7w80vyk" on server "https://api.build02.gcp.ci.openshift.org:6443".
$ oc get pods
NAME        READY   STATUS      RESTARTS   AGE
bin-build   0/1     Error       0          33m
src-build   0/1     Completed   0          35m
$
```
That's on `node/build0-gstfj-w-c-gttfw.c.openshift-ci-build-farm.internal`.
Looking at that node, I see:
```console
Capacity:
  attachable-volumes-gce-pd:  127
  cpu:                        16
  devices.kubevirt.io/kvm:    2
  ephemeral-storage:          1257739244Ki
  hugepages-1Gi:              0
  hugepages-2Mi:              0
  memory:                     61832252Ki
  pods:                       250
Allocatable:
  attachable-volumes-gce-pd:  127
  cpu:                        15
  devices.kubevirt.io/kvm:    2
  ephemeral-storage:          1158058743528
  hugepages-1Gi:              0
  hugepages-2Mi:              0
  memory:                     58584124Ki
  pods:                       250
```
So we have `devices.kubevirt.io/kvm` advertised on that node.
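For spot-checking this across the farm without reading full `oc describe node` output, the kvm count can be pulled out with `awk`. A minimal sketch; the `sample` variable here is canned from the capture above so the pipeline itself can be exercised without cluster access (against a real cluster you would pipe `oc describe node <name>` in instead):

```shell
# Canned excerpt of `oc describe node` output (from the capture above).
sample='Allocatable:
  attachable-volumes-gce-pd:  127
  cpu:                        15
  devices.kubevirt.io/kvm:    2
  pods:                       250'

# Print the allocatable kvm device count (second field of the matching line).
printf '%s\n' "$sample" | awk '/devices.kubevirt.io\/kvm/ {print $2}'
# → 2
```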
And that node is backed by `machine/build0-gstfj-w-c-gttfw`, which was booted with `Image: projects/rhcos-cloud/global/images/rhcos-46-82-202011260640-0-gcp-x86-64`, which looks right.
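The boot image can be confirmed by grepping the machine object. The YAML shape below (a `disks` entry carrying an `image` field in the GCP provider spec) is an assumption for illustration, canned here so the extraction step is testable; against the cluster you would pipe `oc get machine build0-gstfj-w-c-gttfw -o yaml` in instead:

```shell
# Canned excerpt of a machine's provider spec (assumed shape for a GCP machine).
machine_yaml='    disks:
    - autoDelete: true
      boot: true
      image: projects/rhcos-cloud/global/images/rhcos-46-82-202011260640-0-gcp-x86-64'

# Extract just the RHCOS image name from the boot disk line.
printf '%s\n' "$machine_yaml" | grep -o 'rhcos-[0-9].*'
# → rhcos-46-82-202011260640-0-gcp-x86-64
```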