Skip to content

feat(helm): add Neo 4.x Helm chart with microservice architecture#9

Open
kirkryan wants to merge 1 commit intomainfrom
feature/helm-v4
Open

feat(helm): add Neo 4.x Helm chart with microservice architecture#9
kirkryan wants to merge 1 commit intomainfrom
feature/helm-v4

Conversation

@kirkryan
Copy link
Collaborator

Summary

  • Adds new Helm chart at charts/netapp-neo/ for the v4.x multi-service architecture
  • 6 independently scalable services: api, worker, extractor, ner, ui, postgres (conditional)
  • Shared secrets (db, graph, encryption, license) referenced across services; per-service secrets (jwt, mcp, openai) scoped appropriately

Design highlights

  • Inter-service URLs auto-constructed via template helpers — users never configure them manually
  • DATABASE_URL built from postgresql.auth.* when built-in postgres enabled, or passthrough via postgresql.externalDatabaseUrl
  • Security contexts match docker-compose: worker gets SYS_ADMIN+DAC_READ_SEARCH caps (not privileged), extractor gets privileged: true, api/worker run as UID 1000
  • Health probes on all services with appropriate delays (NER: 180s for model loading)
  • GPU support for NER: opt-in via ner.gpu.enabled, adds nvidia.com/gpu resource limits + nodeSelector/tolerations
  • Init containers (wait-for-db) on api, worker, extractor — conditional on postgresql.enabled
  • API service uses sessionAffinity: ClientIP

Test plan

  • helm lint charts/netapp-neo — passed (0 failures)
  • helm template test charts/netapp-neo — all 26 manifests render correctly
  • Verified DATABASE_URL constructed correctly (internal + external modes)
  • Verified inter-service URLs resolve to correct service names
  • Verified worker has SYS_ADMIN caps but NOT privileged
  • Verified extractor has privileged: true
  • Verified NER GPU resources render only when ner.gpu.enabled=true
  • Verified postgresql.enabled=false omits StatefulSet, Service, and init containers
  • Verified image tags default to 4.0.2 (api/worker/extractor/ner) and 3.2.2 (ui)

New chart at charts/netapp-neo/ for the v4.x multi-service deployment:
6 independently scalable services (api, worker, extractor, ner, ui,
postgres) with shared secrets, per-service configmaps, health probes,
init containers, GPU support for NER, and optional ingress.
@kirkryan
Copy link
Collaborator Author

@romdalf - could you review this PR for 4.x.x code line support. Thanks in advance.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new Helm chart (charts/netapp-neo/) to deploy NetApp NEO v4.x as a multi-service (microservice-style) application with optional in-cluster PostgreSQL and shared configuration/secrets.

Changes:

  • Introduces a new Helm chart definition and default values for API/worker/extractor/NER/UI/PostgreSQL.
  • Adds Kubernetes manifests (Deployments/Services/Ingresses/ConfigMaps/Secrets) for each service plus optional PostgreSQL StatefulSet.
  • Adds helper templates for inter-service URLs, DATABASE_URL construction, and a reusable wait-for-db initContainer.

Reviewed changes

Copilot reviewed 29 out of 29 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
charts/netapp-neo/Chart.yaml New chart metadata for netapp-neo v4.x (appVersion/version).
charts/netapp-neo/values.yaml Default configuration for all services, ingress, GPU toggles, and PostgreSQL settings.
charts/netapp-neo/templates/_helpers.tpl Template helpers for naming/labels, inter-service URLs, DATABASE_URL, image tags, and DB wait initContainer.
charts/netapp-neo/templates/NOTES.txt Post-install guidance and endpoint summary for deployed services/ingresses.
charts/netapp-neo/templates/api-deployment.yaml API Deployment wiring env/config/secrets, probes, and optional DB init container.
charts/netapp-neo/templates/api-service.yaml API Service definition (incl. sessionAffinity).
charts/netapp-neo/templates/api-ingress.yaml Optional API Ingress template.
charts/netapp-neo/templates/configmap-api.yaml API ConfigMap for non-secret runtime settings and internal URLs.
charts/netapp-neo/templates/worker-deployment.yaml Worker Deployment wiring env/config/secrets, probes, and optional DB init container.
charts/netapp-neo/templates/worker-service.yaml Worker Service definition for internal API→worker communication.
charts/netapp-neo/templates/configmap-worker.yaml Worker ConfigMap for concurrency and internal service URLs.
charts/netapp-neo/templates/extractor-deployment.yaml Extractor Deployment wiring env/config/secrets, probes, privileged securityContext, and optional DB init container.
charts/netapp-neo/templates/extractor-service.yaml Extractor Service definition for internal worker→extractor communication.
charts/netapp-neo/templates/configmap-extractor.yaml Extractor ConfigMap for extractor/VLM/OpenAI non-secret settings.
charts/netapp-neo/templates/ner-deployment.yaml NER Deployment wiring env/config, probes, and optional GPU resource limits.
charts/netapp-neo/templates/ner-service.yaml NER Service definition for internal calls.
charts/netapp-neo/templates/configmap-ner.yaml NER ConfigMap for model/runtime settings.
charts/netapp-neo/templates/ui-deployment.yaml UI Deployment wiring NEO_API env var and probes.
charts/netapp-neo/templates/ui-service.yaml UI Service definition.
charts/netapp-neo/templates/ui-ingress.yaml Optional UI Ingress template.
charts/netapp-neo/templates/postgres-statefulset.yaml Optional PostgreSQL StatefulSet with probes and persistence configuration.
charts/netapp-neo/templates/postgres-service.yaml Optional PostgreSQL Service for in-cluster DB access.
charts/netapp-neo/templates/secret-db.yaml Secret for DATABASE_URL and PostgreSQL credentials.
charts/netapp-neo/templates/secret-encryption.yaml Shared encryption key Secret.
charts/netapp-neo/templates/secret-jwt.yaml API JWT Secret.
charts/netapp-neo/templates/secret-license.yaml Shared license/connector id Secret.
charts/netapp-neo/templates/secret-graph.yaml Microsoft Graph credential Secret.
charts/netapp-neo/templates/secret-mcp.yaml MCP OAuth/API key Secret for API integration.
charts/netapp-neo/templates/secret-openai.yaml OpenAI API key Secret for extractor.

Comment on lines +170 to +180
postgresql:
enabled: true
name: neo-postgres
image:
repository: docker.io/library/postgres
tag: "17-alpine"
pullPolicy: IfNotPresent
auth:
username: neo
password: neo_password
database: neo_connector
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PostgreSQL is enabled by default and ships with a well-known default password (neo_password). This can lead to insecure by-default installs if users don’t override values. Consider defaulting postgresql.enabled to false and/or requiring the password to be provided (or generating it and persisting it via lookup on upgrade) instead of hard-coding a default.

Copilot uses AI. Check for mistakes.
Comment on lines +10 to +13
DATABASE_URL: {{ include "netapp-neo.databaseUrl" . | quote }}
POSTGRES_USER: {{ .Values.postgresql.auth.username | quote }}
POSTGRES_PASSWORD: {{ .Values.postgresql.auth.password | quote }}
POSTGRES_DB: {{ .Values.postgresql.auth.database | quote }}
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This Secret always includes POSTGRES_USER/POSTGRES_PASSWORD/POSTGRES_DB even when postgresql.enabled=false (external DB mode). That stores potentially unused credentials in-cluster and can mislead users about which values are in effect. Consider only emitting these keys when postgresql.enabled is true (and relying on DATABASE_URL alone for external mode).

Suggested change
DATABASE_URL: {{ include "netapp-neo.databaseUrl" . | quote }}
POSTGRES_USER: {{ .Values.postgresql.auth.username | quote }}
POSTGRES_PASSWORD: {{ .Values.postgresql.auth.password | quote }}
POSTGRES_DB: {{ .Values.postgresql.auth.database | quote }}
DATABASE_URL: {{ include "netapp-neo.databaseUrl" . | quote }}
{{- if .Values.postgresql.enabled }}
POSTGRES_USER: {{ .Values.postgresql.auth.username | quote }}
POSTGRES_PASSWORD: {{ .Values.postgresql.auth.password | quote }}
POSTGRES_DB: {{ .Values.postgresql.auth.database | quote }}
{{- end }}

Copilot uses AI. Check for mistakes.
Comment on lines +74 to +101
{{- if .Values.postgresql.persistence.enabled }}
volumeClaimTemplates:
- metadata:
name: data
{{- with .Values.postgresql.persistence.annotations }}
annotations:
{{- toYaml . | nindent 10 }}
{{- end }}
spec:
accessModes:
- {{ .Values.postgresql.persistence.accessMode }}
{{- if .Values.postgresql.persistence.storageClass }}
storageClassName: {{ .Values.postgresql.persistence.storageClass }}
{{- end }}
resources:
requests:
storage: {{ .Values.postgresql.persistence.size }}
{{- else }}
volumeClaimTemplates:
- metadata:
name: data
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
{{- end }}
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When postgresql.persistence.enabled is false, this StatefulSet still defines volumeClaimTemplates (and will still request storage). This makes the persistence toggle ineffective and can cause unexpected PVC provisioning/failures. If persistence is disabled, use an emptyDir volume (or omit the PVC template entirely) instead of creating a PVC.

Copilot uses AI. Check for mistakes.
Comment on lines +42 to +49
{{- if .Values.ner.gpu.enabled }}
resources:
limits:
nvidia.com/gpu: {{ .Values.ner.gpu.count }}
{{- else if .Values.ner.resources }}
resources:
{{- toYaml .Values.ner.resources | nindent 12 }}
{{- end }}
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With ner.gpu.enabled=true, the template hard-codes resources.limits.nvidia.com/gpu and ignores any CPU/memory requests/limits users might set in ner.resources. This makes it hard to run reliably on GPU nodes (OOM/evictions). Consider merging GPU limits into ner.resources (or supporting separate ner.gpu.resources overrides) instead of replacing the entire resources block.

Copilot uses AI. Check for mistakes.
Comment on lines +70 to +74
# Encryption
ENCRYPTION_KEY: ""
# License
NETAPP_CONNECTOR_LICENSE: ""
CONNECTOR_ID: "netappneo"
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These worker.env settings (e.g., ENCRYPTION_KEY, NETAPP_CONNECTOR_LICENSE, CONNECTOR_ID) are not referenced by the templates: the worker Deployment sources these values from the shared -encryption / -license Secrets which are populated from api.env.*. Keeping unused values here is confusing and makes it unclear which values actually take effect; consider removing them or wiring the templates to use worker.env.* consistently.

Suggested change
# Encryption
ENCRYPTION_KEY: ""
# License
NETAPP_CONNECTOR_LICENSE: ""
CONNECTOR_ID: "netappneo"

Copilot uses AI. Check for mistakes.
Comment on lines +103 to +104
# Encryption
ENCRYPTION_KEY: ""
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extractor.env.ENCRYPTION_KEY is defined in values, but the extractor Deployment reads ENCRYPTION_KEY from the shared -encryption Secret (populated from api.env.ENCRYPTION_KEY). This value currently has no effect; consider removing it from extractor.env or updating the templates so the extractor’s encryption key is sourced from extractor.env (or a clearly named shared section).

Suggested change
# Encryption
ENCRYPTION_KEY: ""

Copilot uses AI. Check for mistakes.
*/}}
{{- define "netapp-neo.databaseUrl" -}}
{{- if .Values.postgresql.enabled -}}
postgresql://{{ .Values.postgresql.auth.username }}:{{ .Values.postgresql.auth.password }}@{{ include "netapp-neo.fullname" . }}-postgres:{{ .Values.postgresql.service.port }}/{{ .Values.postgresql.auth.database }}
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

netapp-neo.databaseUrl interpolates username/password/database directly into the URL. If any of these contain reserved URL characters (e.g., @, :, /, #), the resulting DATABASE_URL will be invalid. Consider URL-encoding the components (e.g., via sprig functions) before constructing the connection string.

Suggested change
postgresql://{{ .Values.postgresql.auth.username }}:{{ .Values.postgresql.auth.password }}@{{ include "netapp-neo.fullname" . }}-postgres:{{ .Values.postgresql.service.port }}/{{ .Values.postgresql.auth.database }}
postgresql://{{ .Values.postgresql.auth.username | urlquery }}:{{ .Values.postgresql.auth.password | urlquery }}@{{ include "netapp-neo.fullname" . }}-postgres:{{ .Values.postgresql.service.port }}/{{ .Values.postgresql.auth.database | urlquery }}

Copilot uses AI. Check for mistakes.
Comment on lines +75 to +79
# Microsoft Graph
MS_GRAPH_TENANT_ID: ""
MS_GRAPH_CLIENT_ID: ""
MS_GRAPH_CLIENT_SECRET: ""
MS_GRAPH_CONNECTOR_ID: "netappneo"
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These worker.env Microsoft Graph credentials are also not used by the templates: worker consumes Graph settings via the shared -graph Secret populated from api.env.*. Either remove these unused values or update the Secret/template wiring so the worker’s configuration is sourced from worker.env.* as the values file suggests.

Suggested change
# Microsoft Graph
MS_GRAPH_TENANT_ID: ""
MS_GRAPH_CLIENT_ID: ""
MS_GRAPH_CLIENT_SECRET: ""
MS_GRAPH_CONNECTOR_ID: "netappneo"

Copilot uses AI. Check for mistakes.
# =============================================================================
postgresql:
enabled: true
name: neo-postgres
Copy link

Copilot AI Mar 12, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

postgresql.name is defined here but the templates always name PostgreSQL resources as {{ include "netapp-neo.fullname" . }}-postgres and never reference this value. Consider removing postgresql.name to avoid confusion, or update the templates to honor it.

Suggested change
name: neo-postgres

Copilot uses AI. Check for mistakes.
@davidvonthenen
Copy link
Contributor

With a helm lint, was good... just the nit:
[INFO] Chart.yaml: icon is recommended

While deploying, I got the following warning:
I0316 08:16:27.156414 14514 warnings.go:107] "Warning: spec.template.metadata.annotations[container.apparmor.security.beta.kubernetes.io/worker]: deprecated since v1.30; use the "appArmorProfile" field instead"

The chart was deployed, and I got the extractor in a crash backoff loop, which I am 100% sure I am doing something wrong. I am guessing I might need to point to a directory to ingest/chunk documents:
my-release-netapp-neo-extractor-85989675c9-hm2l7 0/1 CrashLoopBackOff 2 (16s ago) 3m9s

Other than that, just from a chart/yaml perspective... this looks good to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants