refactor: move CRD apply from Helm hook Job to init container on operator Deployment #6780

Merged
julienmancuso merged 3 commits into main from jsm/dep-784 on Mar 4, 2026
Contributor

@julienmancuso julienmancuso commented Mar 2, 2026

Summary

  • Replace the pre-install,pre-upgrade Helm hook Job for CRD management with an init container on the operator Deployment, removing the dependency on Helm hook lifecycle for CRD application
  • Delete the dedicated hook RBAC (ServiceAccount, ClusterRole, ClusterRoleBinding) since the operator's main ServiceAccount already has full CRD permissions
  • The upgradeCRD flag is preserved with the same semantics (set to false if CRDs are managed externally)
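
As an illustration of the last point, a values override that turns off chart-managed CRDs might look like the sketch below. It assumes the flag sits at the top level of the operator chart's values, which matches the `.Values.upgradeCRD` reference in the template; when installing through a parent chart, the key would be nested under whatever name that chart gives the operator subchart, which is not shown in this PR.

```yaml
# Sketch of a values override, not copied from the chart.
# Assumption: upgradeCRD is a top-level boolean in the operator chart's values.yaml.
# With false, the chart does not apply CRDs itself and they must be
# installed and upgraded by some external process.
upgradeCRD: false
```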

Motivation
The CRD apply hook Job relies on the Helm-specific hook lifecycle (the helm.sh/hook annotation), which is invisible to helm template and behaves unpredictably in GitOps tools (ArgoCD, FluxCD). Moving to an init container makes CRD management part of the standard Deployment spec, visible in rendered manifests, and compatible with all deployment workflows.

Details

  • The init container runs /crd-apply --crds-dir=/opt/dynamo-operator/crds/ --version= using the same operator image
  • Uses server-side apply, so running on every pod restart is safe and idempotent
  • No orphan cleanup needed on upgrade: the old hook resources use hook-succeeded delete policy and are already cleaned up after each successful run
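
For readers who want a picture of the shape described above, here is a minimal sketch of such an init container in a Deployment template. It is not the chart's actual template: the image reference and the main container name are hypothetical placeholders, and the `--version=` argument is left out because its value is elided in this description. Only the container name `crd-apply`, the binary path, the `--crds-dir` value, and the `upgradeCRD` gate come from the PR.

```yaml
# Sketch only: excerpt of a Deployment pod template, under the assumptions stated above.
spec:
  template:
    spec:
      {{- if .Values.upgradeCRD }}
      initContainers:
        - name: crd-apply
          # Hypothetical image reference; the PR only says the init container
          # uses the same image as the operator.
          image: registry.example.com/dynamo-operator:TAG
          command:
            - /crd-apply
          args:
            - --crds-dir=/opt/dynamo-operator/crds/
            # The PR also passes --version=, but the value is elided in the
            # description, so it is omitted here rather than guessed.
      {{- end }}
      containers:
        - name: manager  # hypothetical container name
          image: registry.example.com/dynamo-operator:TAG
```

Because init containers must complete before the main containers start, the operator process only comes up after the CRD apply step has succeeded, and the step reruns on every pod restart.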

Summary by CodeRabbit

  • Chores
    • Updated Custom Resource Definition (CRD) initialization during operator deployment startup for improved reliability.

…ator Deployment

Signed-off-by: Julien Mancuso <jmancuso@nvidia.com>
@julienmancuso julienmancuso requested a review from a team as a code owner March 2, 2026 22:44
@github-actions github-actions Bot added the refactor and deployment::k8s (Relates to dynamo deployment in kubernetes) labels Mar 2, 2026
Contributor

coderabbitai Bot commented Mar 2, 2026

Walkthrough

The pull request migrates CRD application from a Helm pre-install/pre-upgrade hook Job to an init container within the operator Deployment. The hook-based template is removed, and initialization logic is integrated directly into the pod specification, controlled by the same upgradeCRD flag.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CRD Management Refactoring: deploy/helm/charts/platform/components/operator/templates/deployment.yaml | Adds initContainers block with crd-apply container that applies CRDs from a specified directory during pod initialization, gated by .Values.upgradeCRD. |
| CRD Management Refactoring: deploy/helm/charts/platform/components/operator/templates/upgrade-crd.yaml | Removes entire Helm hook template that previously managed CRD application via pre-install/pre-upgrade Job with associated RBAC resources and lifecycle annotations. |
| CRD Management Refactoring: deploy/helm/charts/platform/components/operator/values.yaml | Updates documentation to reflect CRD application shift from hook-based Job to init container approach and clarifies external CRD management semantics. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐰✨ CRDs now hop right in, no hooks to wait,
Init containers start the show—crisp and straight,
One Helm file gone, one deployment blessed,
The operator boots with CRDs pre-dressed! 🚀

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the primary change: moving CRD apply from a Helm hook Job to an init container on the operator Deployment. |
| Description check | ✅ Passed | The PR description is comprehensive and includes all template sections: Overview (Summary), Details, Motivation, and implementation details. It clearly explains the change, rationale, and technical approach. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deploy/helm/charts/platform/components/operator/templates/deployment.yaml`:
- Around line 62-70: The initContainer "crd-apply" is missing a container-level
securityContext and can fail strict PodSecurity admission; update the
initContainers block for the crd-apply container to include a hardened
securityContext (for example: runAsNonRoot: true and an explicit runAsUser, set
allowPrivilegeEscalation: false, readOnlyRootFilesystem: true, drop all
capabilities, and set seccompProfile type to RuntimeDefault) so it matches the
security posture of the other containers and satisfies restricted pod policies.
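
The hardening described in this comment could look roughly like the fragment below. It is a sketch built from the fields the comment lists, not the change that was merged. The `runAsUser` value is a hypothetical UID chosen for illustration, and `readOnlyRootFilesystem: true` assumes the `/crd-apply` binary never writes to the container filesystem.

```yaml
# Sketch only: container-level securityContext for the crd-apply init container.
initContainers:
  - name: crd-apply
    securityContext:
      runAsNonRoot: true
      runAsUser: 65532              # hypothetical non-root UID, adjust to the image
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true  # assumes the binary needs no writable paths
      capabilities:
        drop:
          - ALL
      seccompProfile:
        type: RuntimeDefault
```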

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 93fa4d4 and dbf4e1b.

📒 Files selected for processing (3)
  • deploy/helm/charts/platform/components/operator/templates/deployment.yaml
  • deploy/helm/charts/platform/components/operator/templates/upgrade-crd.yaml
  • deploy/helm/charts/platform/components/operator/values.yaml
💤 Files with no reviewable changes (1)
  • deploy/helm/charts/platform/components/operator/templates/upgrade-crd.yaml

…ator Deployment

Signed-off-by: Julien Mancuso <jmancuso@nvidia.com>
@julienmancuso julienmancuso merged commit a01cd9c into main Mar 4, 2026
43 of 44 checks passed
@julienmancuso julienmancuso deleted the jsm/dep-784 branch March 4, 2026 00:42
julienmancuso added a commit that referenced this pull request Mar 7, 2026
…ator Deployment (#6780)

Signed-off-by: Julien Mancuso <jmancuso@nvidia.com>
yao531441 pushed a commit to yao531441/dynamo that referenced this pull request May 13, 2026
…ator Deployment (ai-dynamo#6780)

Signed-off-by: Julien Mancuso <jmancuso@nvidia.com>
Labels

deployment::k8s (Relates to dynamo deployment in kubernetes), refactor, size/L

2 participants