CocoonSet spec changes do not roll out to existing agent pods #1

@CMGS

Problem

The CocoonSet reconciler only creates missing pods and deletes excess ones. It does not detect spec drift on existing pods or trigger recreation when the CocoonSet spec changes.

This means any field change on a live CocoonSet — forcePull, resources, storage, image, network, etc. — only takes effect for newly created agents. Existing agents continue running with the original spec.

Affected code paths

  • reconciler.go:112-140 — main pod: if it exists, skip creation regardless of spec
  • reconciler.go:148-178 — sub-agents: only fill missing slots, existing pods are not compared against desired spec
  • update.go:21-50 (vk-cocoon) — UpdatePod only handles hibernate/wake transitions, all other spec changes are ignored

Example

# User updates CocoonSet:
spec:
  agent:
    forcePull: true    # changed from false
    memory: 8Gi        # changed from 4Gi

After applying: existing agent pods still run with forcePull: false and memory: 4Gi. Only agents created after this point (e.g., new sub-agent slots, or pods recreated after deletion) pick up the new values.

Expected behavior

When the CocoonSet spec changes, the reconciler should detect the drift and perform a rolling replacement of the affected pods, similar to how a Deployment rolls out pod template changes by creating a new ReplicaSet and scaling down the old one.
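
To make "rolling" concrete, here is a minimal sketch of how one reconcile pass could choose which stale pods to replace without exceeding an availability budget. `PodState`, `NextToReplace`, and the `maxUnavailable` parameter are hypothetical names for illustration; none of them exist in the current reconciler.

```go
package main

// PodState is a hypothetical, minimal view of an agent pod.
type PodState struct {
	Name  string
	Stale bool // pod spec no longer matches the CocoonSet spec
	Ready bool // pod is currently serving
}

// NextToReplace returns the names of stale pods that may be deleted in this
// reconcile pass while keeping at most maxUnavailable pods not ready.
func NextToReplace(pods []PodState, maxUnavailable int) []string {
	// Count pods that are already unavailable for any reason.
	unavailable := 0
	for _, p := range pods {
		if !p.Ready {
			unavailable++
		}
	}

	var victims []string

	// Stale pods that are already not ready cost no extra availability,
	// so they are always eligible.
	for _, p := range pods {
		if p.Stale && !p.Ready {
			victims = append(victims, p.Name)
		}
	}

	// Stale pods that are ready consume one unit of budget each.
	for _, p := range pods {
		if p.Stale && p.Ready && unavailable < maxUnavailable {
			victims = append(victims, p.Name)
			unavailable++
		}
	}
	return victims
}
```

With three ready pods of which two are stale and `maxUnavailable` set to 1, a single pass selects only the first stale pod; the second is picked up by a later pass once the replacement is ready.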

Suggested approach

  1. Compute a spec hash (e.g., SHA256 of the serialized agent spec) and store it as a pod annotation
  2. On reconcile, compare the current spec hash against each pod's annotation
  3. If mismatched, delete the stale pod — the next reconcile loop creates a replacement with the updated spec
  4. Respect PodDisruptionBudget or a max-unavailable setting to avoid killing all agents simultaneously
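
Steps 1 to 3 could look roughly like the sketch below. It uses only the Go standard library; `AgentSpec` and `Pod` are simplified stand-ins for the real types, and the annotation key `cocoonset.example.com/spec-hash` is an assumption made for illustration, not an existing key.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
)

// AgentSpec is a hypothetical stand-in for the real agent spec; it only
// carries fields mentioned in this issue.
type AgentSpec struct {
	Image     string `json:"image"`
	ForcePull bool   `json:"forcePull"`
	Memory    string `json:"memory"`
}

// Pod is a minimal stand-in for a Kubernetes pod object.
type Pod struct {
	Name        string
	Annotations map[string]string
}

// specHashAnnotation is an assumed annotation key.
const specHashAnnotation = "cocoonset.example.com/spec-hash"

// SpecHash returns the hex-encoded SHA-256 of the JSON-serialized spec.
// encoding/json writes struct fields in declaration order, so equal specs
// always produce the same hash.
func SpecHash(spec AgentSpec) (string, error) {
	raw, err := json.Marshal(spec)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:]), nil
}

// StalePods returns the pods whose recorded hash differs from want.
// Pods created before the annotation existed have no hash and are
// therefore treated as stale.
func StalePods(pods []Pod, want string) []Pod {
	var stale []Pod
	for _, p := range pods {
		if p.Annotations[specHashAnnotation] != want {
			stale = append(stale, p)
		}
	}
	return stale
}
```

One design point to decide: treating unannotated pods as stale means the first reconcile after this change ships would replace every existing agent, so the disruption limit from step 4 matters most on that first rollout.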

Context

Discovered while adding forcePull support (cocoonstack/vk-cocoon#2). The field works correctly at create time but cannot be toggled on existing agents without manually deleting the pods.

This is not specific to forcePull — it affects every mutable field in AgentSpec.
