When a package is removed from an SCR, the operator triggers an uninstall pod. However, the uninstall pod does not receive the package's configMap (scripts) or env (environment variables). The uninstall pod runs with an empty configuration, silently succeeds without executing any user-provided cleanup logic, and the operator marks the uninstall as complete.
This means any host-level changes made during apply/config are never reversed during uninstall, leaving the node in a dirty state.
Testing environment:
- Skyhook Controller: v0.12.0
➜ workload-clusters git:(testing-skyhook) ✗ k describe deploy -n skyhook skyhook-skyhook-operator-controller-manager | grep Image
Annotations: checkov.io/skip1: CKV_K8S_43=Image digest not required - we use tags
Image: nvcr.io/nvidia/skyhook/operator:v0.12.0@sha256:ce79a9778fca453e54d58506c71c8ff6765b65d44a73fb167441ab851c108dc2
Image: quay.io/brancz/kube-rbac-proxy:v0.15.0@sha256:2c7b120590cbe9f634f5099f2cbb91d0b668569023a81505ca124a5c437e7663
➜ workload-clusters git:(testing-skyhook) ✗
- Kubernetes: v1.34.5
➜ workload-clusters git:(testing-skyhook) ✗ k get nodes
NAME                                        STATUS   ROLES           AGE   VERSION
1u1g-x570-0432.pdc1a2.colossus.nvidia.com   Ready    control-plane   30d   v1.34.5
1u1g-x570-0444.pdc1a2.colossus.nvidia.com   Ready    <none>          30d   v1.34.5
z370-0433.ipp3a1.colossus.nvidia.com        Ready    <none>          30d   v1.34.5
- The SCR I used opens a particular port on the control-plane node.
➜ workload-clusters git:(testing-skyhook) ✗ k get cm -n skyhook
NAME                                                                          DATA   AGE
demo-baz-1.1.0                                                                2      7d23h
kube-root-ca.crt                                                              1      7d23h
open-port-12379-1u1g-x570-0432-pdc1a2-colossus-nvidia-com-metadata-453076ca   3      3m30s
open-port-12379-firewall-1.0.0                                                6      3m30s
➜ workload-clusters git:(testing-skyhook) ✗ k describe cm -n skyhook open-port-12379-1u1g-x570-0432-pdc1a2-colossus-nvidia-com-metadata-453076ca
Name: open-port-12379-1u1g-x570-0432-pdc1a2-colossus-nvidia-com-metadata-453076ca
Namespace: skyhook
Labels: skyhook.nvidia.com/skyhook-node-meta=open-port-12379
Annotations: skyhook.nvidia.com/Node.name: 1u1g-x570-0432.pdc1a2.colossus.nvidia.com
skyhook.nvidia.com/name: open-port-12379
Data
====
packages.json:
----
{"agentVersion":"2bc0fe8c5c11130c843859dd0c8325e316bf4a9bb1d5883554c90a7a0574a771","packages":{"firewall":{"name":"firewall","version":"1.0.0","image":"ghcr.io/nvidia/skyhook-packages/shellscript"}}}
annotations.json:
----
{"cluster.x-k8s.io/annotations-from-machine":"","cluster.x-k8s.io/cluster-name":"pdc-nca-rayaankhan","cluster.x-k8s.io/cluster-namespace":"play","cluster.x-k8s.io/labels-from-machine":"","cluster.x-k8s.io/machine":"pdc-nca-rayaankhan-db5vw-qcdcg","cluster.x-k8s.io/owner-kind":"KubeadmControlPlane","cluster.x-k8s.io/owner-name":"pdc-nca-rayaankhan-db5vw","csi.volume.kubernetes.io/nodeid":"{\"csi.trident.netapp.io\":\"1u1g-x570-0432.pdc1a2.colossus.nvidia.com\"}","node.alpha.kubernetes.io/ttl":"0","projectcalico.org/IPv4Address":"10.46.254.176/16","projectcalico.org/IPv4IPIPTunnelAddr":"100.103.13.128","skyhook.nvidia.com/nodeState_open-port-12379":"{\"firewall|1.0.0\":{\"name\":\"firewall\",\"version\":\"1.0.0\",\"image\":\"ghcr.io/nvidia/skyhook-packages/shellscript\",\"stage\":\"config\",\"state\":\"complete\"}}","skyhook.nvidia.com/status_open-port-12379":"complete","skyhook.nvidia.com/version_open-port-12379":"v0.12.0","volumes.kubernetes.io/controller-managed-attach-detach":"true"}
labels.json:
----
{"beta.kubernetes.io/arch":"amd64","beta.kubernetes.io/os":"linux","kubernetes.io/arch":"amd64","kubernetes.io/hostname":"1u1g-x570-0432.pdc1a2.colossus.nvidia.com","kubernetes.io/os":"linux","node-role.kubernetes.io/control-plane":"","node.kubernetes.io/exclude-from-external-load-balancers":"","skyhook.nvidia.com/status_open-port-12379":"complete"}
BinaryData
====
Events: <none>
➜ workload-clusters git:(testing-skyhook) ✗ k describe cm -n skyhook open-port-12379-firewall-1.0.0
Name: open-port-12379-firewall-1.0.0
Namespace: skyhook
Labels: skyhook.nvidia.com/name=open-port-12379
Annotations: skyhook.nvidia.com/Package.Name: firewall
skyhook.nvidia.com/Package.Version: 1.0.0
skyhook.nvidia.com/name: open-port-12379
Data
====
apply.sh:
----
#!/bin/bash
set -e
if ! nsenter -t 1 -n -- iptables -C INPUT -p tcp --dport $PORT -j ACCEPT -m comment --comment "$COMMENT" 2>/dev/null; then
nsenter -t 1 -n -- iptables -I INPUT -p tcp --dport $PORT -j ACCEPT -m comment --comment "$COMMENT"
echo "Opened port $PORT"
else
echo "Port $PORT already open"
fi
apply_check.sh:
----
#!/bin/bash
set -e
nsenter -t 1 -n -- iptables -C INPUT -p tcp --dport $PORT -j ACCEPT -m comment --comment "$COMMENT" 2>/dev/null
config.sh:
----
#!/bin/bash
set -e
if ! nsenter -t 1 -n -- iptables -C INPUT -p tcp --dport $PORT -j ACCEPT -m comment --comment "$COMMENT" 2>/dev/null; then
nsenter -t 1 -n -- iptables -I INPUT -p tcp --dport $PORT -j ACCEPT -m comment --comment "$COMMENT"
echo "Opened port $PORT"
else
echo "Port $PORT already open"
fi
config_check.sh:
----
#!/bin/bash
set -e
nsenter -t 1 -n -- iptables -C INPUT -p tcp --dport $PORT -j ACCEPT -m comment --comment "$COMMENT" 2>/dev/null
uninstall.sh:
----
#!/bin/bash
set -e
if nsenter -t 1 -n -- iptables -C INPUT -p tcp --dport $PORT -j ACCEPT -m comment --comment "$COMMENT" 2>/dev/null; then
nsenter -t 1 -n -- iptables -D INPUT -p tcp --dport $PORT -j ACCEPT -m comment --comment "$COMMENT"
echo "Removed port $PORT"
else
echo "Port $PORT rule not found, nothing to remove"
fi
uninstall_check.sh:
----
#!/bin/bash
set -e
! nsenter -t 1 -n -- iptables -C INPUT -p tcp --dport $PORT -j ACCEPT -m comment --comment "$COMMENT" 2>/dev/null
BinaryData
====
Events: <none>
➜ workload-clusters git:(testing-skyhook) ✗
Steps I performed:
- I applied the above SCR to my cluster and it succeeded, i.e. the port was opened (I confirmed this on the host).
- I removed the package from the SCR and applied it again.
- While watching the logs of all containers of all pods in the skyhook namespace, I found this:
2026-04-01T08:26:13.72056843Z stdout F [out]2026-04-01T08:26:13.720263 Could not find file /var/lib/skyhook/open-port-12379/firewall-1.0.0-82a01934-6cb6-4ff7-a250-b00b7e7844bd-2/configmaps/uninstall.sh was this in the configmap?
2026-04-01T08:26:13.758153955Z stdout F [out]2026-04-01T08:26:13.720263 SUCEEDED: shellscript_run.sh uninstall
So even though uninstall.sh was configured in the ConfigMap, the uninstall pod could not find it on disk.
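The failure mode can be reproduced outside the cluster with a minimal sketch. Note this only approximates what shellscript_run.sh does: the run_stage function and the directory layout below are illustrative stand-ins, not the real agent code.

```shell
# Sketch of the agent's behavior: when the stage script is missing from the
# mounted configmaps directory, it logs a warning but still exits 0, so the
# operator records the uninstall as complete.
# (run_stage and the paths are hypothetical, modeled on the log lines above.)
run_stage() {
    stage_script="$1/configmaps/$2.sh"
    if [ ! -f "$stage_script" ]; then
        # Nothing was mounted, so nothing runs -- yet the stage "succeeds".
        echo "Could not find file $stage_script was this in the configmap?"
        return 0
    fi
    sh "$stage_script"
}

workdir=$(mktemp -d)              # stands in for the uninstall pod's package dir
mkdir -p "$workdir/configmaps"    # empty: no uninstall.sh was projected here

out=$(run_stage "$workdir" uninstall)   # run the uninstall "stage"
rc=$?
echo "$out"
[ "$rc" -eq 0 ] && echo "SUCEEDED: shellscript_run.sh uninstall"
```

This mirrors the two log lines above: a "Could not find file" warning immediately followed by a SUCEEDED message, with no cleanup executed in between.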
Expected Behavior
The uninstall pod should receive the same configMap scripts and env variables that were used during apply, so that uninstall.sh can execute the cleanup logic (e.g., removing the iptables rule).
Findings
- Faux package missing ConfigMap and Env
In HandleVersionChange, when a package is removed from the spec, a "faux" package is created with only PackageRef and Image:
newPackage := &v1alpha1.Package{
PackageRef: packageStatusRef,
Image: packageStatus.Image,
}
The Env and ConfigMap fields are not set because the node state annotation only stores name, version, image, stage, and state — not the original package configuration.
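This is visible directly in the node-state annotation captured in the metadata ConfigMap above: only five fields per package are persisted, so there is nothing from which to rebuild Env or ConfigMap. A quick plain-shell check over that annotation value:

```shell
# Per-package state the operator persists in the node annotation
# (value copied from skyhook.nvidia.com/nodeState_open-port-12379 above).
state='{"firewall|1.0.0":{"name":"firewall","version":"1.0.0","image":"ghcr.io/nvidia/skyhook-packages/shellscript","stage":"config","state":"complete"}}'

# The keys that ARE stored: name, version, image, stage, state.
for key in name version image stage state; do
    case "$state" in *"\"$key\""*) echo "stored: $key" ;; esac
done

# The keys the uninstall pod would need, which are NOT stored.
for key in env configMap; do
    case "$state" in
        *"\"$key\""*) echo "stored: $key" ;;
        *) echo "missing: $key" ;;   # cannot be reconstructed at uninstall time
    esac
done
```

So any fix has to recover Env and ConfigMap from somewhere else, e.g. the still-existing package ConfigMap in the namespace or an extended node-state record, rather than from this annotation alone.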
The logs and other important details are available at this drive link (accessible only to NVIDIANs).