From e65e3edb4a7788531bf11735aa84cc38d9bf0464 Mon Sep 17 00:00:00 2001 From: Min Zhang Date: Tue, 21 Apr 2026 11:47:45 -0400 Subject: [PATCH 1/3] docs: add ztvp-certificates scenario documentation Covers architecture, extraction phases, platform-specific handling (BareMetal/VSphere proxy CA, custom enterprise CAs, image pull trust), ACM Policy distribution, and automatic rollout strategies. Signed-off-by: Min Zhang --- docs/ztvp-certificates.md | 325 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 325 insertions(+) create mode 100644 docs/ztvp-certificates.md diff --git a/docs/ztvp-certificates.md b/docs/ztvp-certificates.md new file mode 100644 index 00000000..fc5c1297 --- /dev/null +++ b/docs/ztvp-certificates.md @@ -0,0 +1,325 @@ +# ZTVP Certificates + +The `ztvp-certificates` chart manages CA certificate extraction, validation, +bundling, and distribution across the Zero Trust Validated Pattern. It runs as +an ArgoCD-managed application in the `openshift-config` namespace at sync-wave +**21**, ensuring certificates are available before any workload that needs TLS +verification. + +## Architecture + +```text + IngressControllers Service CA Cluster trusted-ca-bundle + (openshift-ingress) (openshift-config) (openshift-config-managed) + | | | + +-----------+-----------+--------------------------+ + | + extract-certificates.sh <-- runs as Job (initial) + CronJob (daily) + | + validates & combines + | + v + ConfigMap: ztvp-trusted-ca + (openshift-config) + | + +-----------+-----------+-----------------------+ + | | | + ACM Policy distributes proxyCA patches imagePullTrust + to target namespaces proxy/cluster patches image.config + (e.g. 
qtodo) (all platforms) (when enabled) +``` + +### Kubernetes Resources + +| Resource | Purpose | +|---|---| +| **ServiceAccount / RBAC** | Grants the extraction Job read access to secrets, configmaps, ingresscontrollers, and proxy across namespaces | +| **ConfigMap (script)** | Holds the templated `extract-certificates.sh` script | +| **Job (initial)** | Runs once at first sync (sync-wave 23, `Prune=false`) to populate the CA bundle | +| **CronJob** | Runs on schedule (default daily at 02:00) for automatic rotation | +| **ACM Policy + Placement** | Distributes the `ztvp-trusted-ca` ConfigMap into target namespaces via ACM governance | +| **ManagedClusterSetBinding** | Binds the `default` ManagedClusterSet in `openshift-config` so the Placement can target `local-cluster` | + +## Extraction Phases + +The extraction script runs through a deterministic sequence of phases. Each +phase is independently gated by values, so the script adapts to the active +configuration. + +| Phase | Gate | What It Does | +|---|---|---| +| 1 -- Custom CA | `customCA.secretRef.enabled` | Reads a user-supplied secret and writes `custom-ca.crt` | +| 2 -- Ingress CA | `autoDetect` | Loops over every `IngressController`, extracts `tls.crt` from the referenced or default router secret | +| 3 -- Service CA | `autoDetect` | Reads `openshift-service-ca.crt` ConfigMap | +| 4 -- Cluster CA Bundle | `autoDetect` | Reads `trusted-ca-bundle` from `openshift-config-managed` (present when a corporate proxy injects CAs) | +| 5 -- Additional Certs | `customCA.additionalCertificates[]` | Reads each additional secret and writes a `.crt` file | +| 6 -- Validation | `validation.enabled` | Checks minimum size and `openssl x509` parse for every `.crt` | +| 7 -- Combine | always | Concatenates all `.crt` files into `tls-ca-bundle.pem`; fails if bundle < 100 bytes | +| 8 -- ConfigMap | always | `oc apply` the `ztvp-trusted-ca` ConfigMap with annotations recording extraction metadata | +| 8.5 -- Proxy CA | 
`proxyCA.enabled` | Creates a separate ConfigMap with ingress + service CAs only | +| 8.6 -- Proxy Patch | `proxyCA.enabled` | Patches `proxy/cluster` to set `trustedCA` (only if not already set to another value) | +| 9 -- Image Pull Trust | `imagePullTrust.enabled` | Creates a ConfigMap keyed by registry hostname and patches `image.config.openshift.io/cluster` | +| 10 -- Rollout | `rollout.enabled` | Restarts Deployments/StatefulSets that consume the certificate bundle | + +## Scenario Handling + +### Scenario 1: Cloud Cluster with Public Certificates (Default) + +Applies to AWS, Azure, GCP, and any cluster whose ingress uses certificates +signed by a public CA. + +**Active settings:** + +* `autoDetect: true` +* `proxyCA.enabled: true` (default -- ensures ACS Central and other workloads + that verify TLS on routes can trust the ingress CA without per-pod volume mounts) +* `imagePullTrust.enabled: false` + +**What happens:** + +1. The Job auto-detects the ingress CA from each `IngressController`'s router + secret in `openshift-ingress`. +2. The service CA is read from `openshift-service-ca.crt`. +3. If a cluster-wide proxy bundle exists, it is included. +4. All certificates are combined into `ztvp-trusted-ca` and distributed via + ACM Policy to target namespaces. +5. A proxy CA ConfigMap (`ztvp-proxy-ca`) is created with ingress + service + CAs and `proxy/cluster` is patched so the Cluster Network Operator injects + these CAs into all workloads automatically. + +No platform override file is needed. The chart's default `values.yaml` handles +this scenario out of the box. + +### Scenario 2: Bare Metal with Self-Signed Ingress + +Bare metal clusters typically use self-signed certificates for the default +ingress. Since `proxyCA` is enabled by default (see Scenario 1), the ingress +CA is automatically injected cluster-wide. Workloads that verify TLS on routes +(e.g., ACS Central connecting to Keycloak) work without extra configuration. 
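The Phase 8.6 guard in the table above ("only if not already set to another value") can be sketched as follows. This is a minimal illustration, not the chart's actual `extract-certificates.sh`. The function names `should_patch_proxy` and `patch_proxy_if_safe` are hypothetical, and it assumes re-applying the chart's own ConfigMap name counts as safe.

```shell
#!/bin/sh
# Hypothetical guard: may proxy/cluster be patched?
# $1 = current spec.trustedCA.name (may be empty), $2 = desired ConfigMap name.
should_patch_proxy() {
  # Allow the patch when nothing is configured yet, or when the desired
  # ConfigMap is already set (re-applying is harmless). Any other value was
  # chosen by an administrator and is left alone.
  [ -z "$1" ] || [ "$1" = "$2" ]
}

# Hypothetical wrapper: read the current value, then patch only if allowed.
# $1 = desired ConfigMap name (defaults to ztvp-proxy-ca).
patch_proxy_if_safe() {
  desired="${1:-ztvp-proxy-ca}"
  current="$(oc get proxy/cluster -o jsonpath='{.spec.trustedCA.name}')"
  if should_patch_proxy "$current" "$desired"; then
    oc patch proxy/cluster --type=merge \
      -p "{\"spec\":{\"trustedCA\":{\"name\":\"$desired\"}}}"
  else
    echo "proxy/cluster already trusts '$current'; leaving it unchanged" >&2
  fi
}
```

Keeping the decision in a separate function with no cluster calls makes the guard easy to check without a cluster.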
+ +**Platform override** (`overrides/values-ztvp-certificates-BareMetal.yaml`): + +```yaml +proxyCA: + enabled: true +``` + +> **Note:** This override is now redundant because the chart default is +> `proxyCA.enabled: true`. It is retained for clarity and backward +> compatibility with older chart versions. + +**Behavior is identical to Scenario 1** -- Phases 8.5 and 8.6 run by default: + +1. Phase 8.5 builds a proxy-specific bundle containing only the ingress and + service CAs (the Cluster Network Operator merges these with system CAs). +2. Phase 8.6 patches `proxy/cluster` to set `spec.trustedCA.name` to + `ztvp-proxy-ca`. +3. The CNO propagates the merged bundle to every node, making the ingress CA + trusted system-wide for all pods without explicit volume mounts. + +### Scenario 3: vSphere with Self-Signed Ingress + +Identical behavior to Bare Metal. vSphere clusters also typically use +self-signed ingress certificates. + +**Platform override** (`overrides/values-ztvp-certificates-VSphere.yaml`): + +```yaml +proxyCA: + enabled: true +``` + +> **Note:** This override is also redundant; the chart default already enables +> `proxyCA`. + +### Scenario 4: Enterprise Custom CA + +When the organization uses a private PKI (e.g., a corporate root CA that +signed the cluster's ingress certificate), the administrator creates a +Kubernetes secret with the CA and enables `customCA.secretRef`. + +**Setup:** + +```bash +oc create secret generic custom-ca-bundle \ + --from-file=ca.crt=/path/to/corporate-root-ca.crt \ + -n openshift-config +``` + +**values-hub.yaml overrides:** + +```yaml +- name: customCA.secretRef.enabled + value: "true" +- name: customCA.secretRef.name + value: custom-ca-bundle +- name: customCA.secretRef.namespace + value: openshift-config +``` + +**What happens:** + +1. Phase 1 extracts the custom CA from the referenced secret. +2. Auto-detect (phases 2-4) still runs, so ingress and service CAs are + included alongside the custom CA. +3. 
The combined bundle contains both the custom CA and the auto-detected + certificates. + +### Scenario 5: Multiple Additional CAs + +When several external CAs are needed (e.g., corporate root CA, a partner CA, +and an intermediate CA), use `additionalCertificates` via the +`extraValueFiles` mechanism. + +**Configuration** (`overrides/values-ztvp-certificates.yaml`): + +```yaml +customCA: + additionalCertificates: + - name: corporate-root-ca + secretRef: + name: corporate-root-ca + namespace: openshift-config + key: ca.crt + - name: partner-ca + secretRef: + name: partner-ca + namespace: openshift-config + key: ca.crt +``` + +**What happens:** + +1. Phase 5 iterates over each entry and extracts the certificate from its + secret. Missing secrets produce a warning but do not fail the job. +2. All additional certificates are combined with auto-detected and custom CAs + in Phase 7. + +### Scenario 6: Image Pull Trust for Built-In Registry + +When an image registry (e.g., Quay or the embedded OpenShift registry) is +exposed behind the cluster ingress with a self-signed or internal CA, kubelet +image pulls fail with `x509: certificate signed by unknown authority`. The +`imagePullTrust` feature solves this at the node level. + +**values-hub.yaml overrides:** + +```yaml +- name: imagePullTrust.enabled + value: "true" +- name: imagePullTrust.registries[0] + value: quay-registry-quay-quay-enterprise.apps.example.com +``` + +**What happens:** + +1. Phase 9 combines all extracted ingress CAs into a single PEM. +2. A ConfigMap (`ztvp-registry-cas`) is created in `openshift-config` with + each registry hostname as a key and the ingress CA PEM as the value. +3. `image.config.openshift.io/cluster` is patched to set + `additionalTrustedCA.name` to that ConfigMap. +4. The Machine Config Operator rolls the trust configuration out to all nodes. 
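Steps 2 and 3 above can be sketched in shell as follows. This is an illustration under assumptions, not the chart's actual script: the helper names `registry_key` and `print_registry_trust_commands` are hypothetical, and the helper only prints the `oc` commands rather than running them. The key conversion reflects the OpenShift convention that a registry with a port is written with `..` in place of `:` in the `additionalTrustedCA` ConfigMap.

```shell
#!/bin/sh
# Convert a registry host[:port] into an additionalTrustedCA ConfigMap key.
# ':' is not valid in a ConfigMap key, so OpenShift expects '..' instead.
registry_key() {
  printf '%s\n' "$1" | sed 's/:/../g'
}

# Hypothetical helper: print (do not run) the commands for a set of registries.
# $1 = path to the combined ingress CA PEM; remaining args = registry hostnames.
print_registry_trust_commands() {
  pem="$1"
  shift
  args=""
  for registry in "$@"; do
    # Every registry key maps to the same ingress CA PEM file.
    args="$args --from-file=$(registry_key "$registry")=$pem"
  done
  echo "oc create configmap ztvp-registry-cas -n openshift-config$args"
  echo "oc patch image.config.openshift.io/cluster --type=merge -p '{\"spec\":{\"additionalTrustedCA\":{\"name\":\"ztvp-registry-cas\"}}}'"
}
```

For example, `print_registry_trust_commands /tmp/ingress-ca.pem quay-registry-quay-quay-enterprise.apps.example.com` prints the two commands for review before anything is applied (the PEM path here is a placeholder).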
+ +### Scenario 7: Custom Source Locations + +In non-standard environments where the ingress CA or service CA are stored in +different locations, `customSource` overrides the default auto-detection +targets. + +```yaml +customSource: + ingressCA: + secretName: my-ingress-ca + secretNamespace: my-namespace + secretKey: tls.crt + serviceCA: + configMapName: my-service-ca + configMapNamespace: my-namespace + configMapKey: service-ca.crt +``` + +Auto-detection will read from the specified locations instead of the standard +OpenShift defaults. + +## Distribution + +Certificate distribution uses **ACM Governance Policies** to replicate the +`ztvp-trusted-ca` ConfigMap from `openshift-config` into each target +namespace. + +```text +openshift-config/ztvp-trusted-ca ---ACM Policy---> qtodo/ztvp-trusted-ca + rhtpa/ztvp-trusted-ca + ... +``` + +The policy uses `fromConfigMap` hub templates so that the ConfigMap data is +always sourced from the hub cluster's copy. Target namespaces are configured +via `distribution.targetNamespaces`. + +**Requirements:** + +* ACM (Advanced Cluster Management) must be installed +* A `ManagedClusterSetBinding` for the `default` cluster set is created + automatically by the chart +* The `Placement` targets clusters with `local-cluster: "true"` + +## Automatic Rollout + +When certificates are updated, consuming workloads need to pick up the new +bundle. 
The chart supports three rollout strategies: + +| Strategy | Behavior | +|---|---| +| `labeled` (default) | Restarts Deployments/StatefulSets matching `ztvp.io/uses-certificates: "true"` in distribution target namespaces | +| `all` | Restarts all Deployments/StatefulSets in target namespaces | +| `specific` | Restarts only the named resources listed in `rollout.targets` | + +To opt a workload into automatic restart, add the label: + +```yaml +metadata: + labels: + ztvp.io/uses-certificates: "true" +``` + +## Sync Wave Ordering + +The chart's resources are ordered within the ArgoCD sync: + +| Wave | Resources | +|---|---| +| 22 | ServiceAccount, RBAC (Role, RoleBinding, ClusterRole, ClusterRoleBinding) | +| 23 | Initial Job, CronJob, ConfigMap (script) | +| 25 | ManagedClusterSetBinding | +| 26 | ACM Policy, PlacementBinding, Placement | + +The application itself sits at sync-wave **21** in `values-hub.yaml`, ensuring +it deploys before operators and workloads that depend on the CA bundle. 
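To inspect the ordering on a live cluster, something like the sketch below may help. It assumes the chart's resources carry Argo CD's standard `argocd.argoproj.io/sync-wave` annotation. The function name `list_sync_waves` is hypothetical, and the output includes every resource of the listed kinds in the namespace, not only those owned by this chart.

```shell
#!/bin/sh
# Hypothetical helper: list namespaced resources with their sync-wave annotation.
# $1 = namespace (defaults to openshift-config).
list_sync_waves() {
  ns="${1:-openshift-config}"
  # Dots inside the annotation key are escaped so they are not treated as
  # path separators; resources without the annotation show <none>.
  oc get serviceaccount,role,rolebinding,configmap,job,cronjob -n "$ns" \
    -o custom-columns='KIND:.kind,NAME:.metadata.name,WAVE:.metadata.annotations.argocd\.argoproj\.io/sync-wave'
}
```

Cluster-scoped resources such as the ClusterRole and ClusterRoleBinding are not covered by this namespaced query and would need a separate `oc get` call.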
+ +## Configuration Reference + +### Top-Level Values + +| Value | Default | Description | +|---|---|---| +| `enabled` | `true` | Master toggle for all chart resources | +| `autoDetect` | `true` | Auto-detect ingress, service, and cluster CAs from OpenShift | +| `configMapName` | `ztvp-trusted-ca` | Name of the output ConfigMap | +| `proxyCA.enabled` | `true` | Create a proxy CA ConfigMap and patch `proxy/cluster` | +| `imagePullTrust.enabled` | `false` | Configure node-level registry trust via `image.config` | +| `rollout.enabled` | `true` | Restart consuming workloads after certificate updates | +| `rollout.strategy` | `labeled` | One of: `labeled`, `all`, `specific` | +| `distribution.enabled` | `true` | Distribute CA bundle via ACM Policy | +| `distribution.method` | `acm-policy` | Distribution mechanism | +| `cronJob.schedule` | `0 2 * * *` | Cron schedule for automatic re-extraction | +| `validation.enabled` | `true` | Validate certificate size and format | +| `debug.verbose` | `false` | Enable `set -x` in the extraction script | + +### Platform Override Files + +| File | When Applied | Effect | +|---|---|---| +| `overrides/values-ztvp-certificates.yaml` | Always | Additional CAs, rollout config | +| `overrides/values-ztvp-certificates-BareMetal.yaml` | `clusterPlatform == BareMetal` | Confirms `proxyCA` (redundant; default is already `true`) | +| `overrides/values-ztvp-certificates-VSphere.yaml` | `clusterPlatform == VSphere` | Confirms `proxyCA` (redundant; default is already `true`) | From 759016ed77e740ca28402755d54c7d3b0c8a3012 Mon Sep 17 00:00:00 2001 From: Min Zhang Date: Fri, 8 May 2026 08:23:37 -0400 Subject: [PATCH 2/3] docs: combine BareMetal and vSphere certificate scenarios Merge Scenario 2 (BareMetal) and Scenario 3 (vSphere) into a single scenario since both platforms have identical self-signed ingress behavior and redundant proxyCA overrides. Renumber remaining scenarios accordingly. 
Signed-off-by: Min Zhang --- docs/ztvp-certificates.md | 45 ++++++++++++++++----------------------- 1 file changed, 18 insertions(+), 27 deletions(-) diff --git a/docs/ztvp-certificates.md b/docs/ztvp-certificates.md index fc5c1297..94d94f72 100644 --- a/docs/ztvp-certificates.md +++ b/docs/ztvp-certificates.md @@ -90,22 +90,28 @@ signed by a public CA. No platform override file is needed. The chart's default `values.yaml` handles this scenario out of the box. -### Scenario 2: Bare Metal with Self-Signed Ingress +### Scenario 2: Bare Metal / vSphere with Self-Signed Ingress -Bare metal clusters typically use self-signed certificates for the default -ingress. Since `proxyCA` is enabled by default (see Scenario 1), the ingress -CA is automatically injected cluster-wide. Workloads that verify TLS on routes -(e.g., ACS Central connecting to Keycloak) work without extra configuration. +Bare metal and vSphere clusters typically use self-signed certificates for the +default ingress. Since `proxyCA` is enabled by default (see Scenario 1), the +ingress CA is automatically injected cluster-wide. Workloads that verify TLS +on routes (e.g., ACS Central connecting to Keycloak) work without extra +configuration. + +**Platform overrides:** + +* `overrides/values-ztvp-certificates-BareMetal.yaml` +* `overrides/values-ztvp-certificates-VSphere.yaml` -**Platform override** (`overrides/values-ztvp-certificates-BareMetal.yaml`): +Both contain: ```yaml proxyCA: enabled: true ``` -> **Note:** This override is now redundant because the chart default is -> `proxyCA.enabled: true`. It is retained for clarity and backward +> **Note:** These overrides are now redundant because the chart default is +> `proxyCA.enabled: true`. They are retained for clarity and backward > compatibility with older chart versions. **Behavior is identical to Scenario 1** -- Phases 8.5 and 8.6 run by default: @@ -117,22 +123,7 @@ proxyCA: 3. 
The CNO propagates the merged bundle to every node, making the ingress CA trusted system-wide for all pods without explicit volume mounts. -### Scenario 3: vSphere with Self-Signed Ingress - -Identical behavior to Bare Metal. vSphere clusters also typically use -self-signed ingress certificates. - -**Platform override** (`overrides/values-ztvp-certificates-VSphere.yaml`): - -```yaml -proxyCA: - enabled: true -``` - -> **Note:** This override is also redundant; the chart default already enables -> `proxyCA`. - -### Scenario 4: Enterprise Custom CA +### Scenario 3: Enterprise Custom CA When the organization uses a private PKI (e.g., a corporate root CA that signed the cluster's ingress certificate), the administrator creates a @@ -165,7 +156,7 @@ oc create secret generic custom-ca-bundle \ 3. The combined bundle contains both the custom CA and the auto-detected certificates. -### Scenario 5: Multiple Additional CAs +### Scenario 4: Multiple Additional CAs When several external CAs are needed (e.g., corporate root CA, a partner CA, and an intermediate CA), use `additionalCertificates` via the @@ -195,7 +186,7 @@ customCA: 2. All additional certificates are combined with auto-detected and custom CAs in Phase 7. -### Scenario 6: Image Pull Trust for Built-In Registry +### Scenario 5: Image Pull Trust for Built-In Registry When an image registry (e.g., Quay or the embedded OpenShift registry) is exposed behind the cluster ingress with a self-signed or internal CA, kubelet @@ -220,7 +211,7 @@ image pulls fail with `x509: certificate signed by unknown authority`. The `additionalTrustedCA.name` to that ConfigMap. 4. The Machine Config Operator rolls the trust configuration out to all nodes. 
-### Scenario 7: Custom Source Locations +### Scenario 6: Custom Source Locations In non-standard environments where the ingress CA or service CA are stored in different locations, `customSource` overrides the default auto-detection From 0924d3323d220325ba56be9a889f8873cb22e336 Mon Sep 17 00:00:00 2001 From: Min Zhang Date: Fri, 8 May 2026 11:38:55 -0400 Subject: [PATCH 3/3] docs: address review feedback on ztvp-certificates doc - Add link to the chart directory - Fix "ArgoCD" to "Argo CD" - Remove hardcoded sync-wave numbers to avoid staleness - Renumber phases 8.5/8.6 to 8.1/8.2 - Clarify service CA is read from within the Job Pod - Add "ConfigMap" qualifier to ztvp-trusted-ca references - Link to ACM fromConfigMap documentation - Replace wave numbers with relative ordering in sync table Signed-off-by: Min Zhang --- docs/ztvp-certificates.md | 46 +++++++++++++++++++-------------------- 1 file changed, 23 insertions(+), 23 deletions(-) diff --git a/docs/ztvp-certificates.md b/docs/ztvp-certificates.md index 94d94f72..7a51644a 100644 --- a/docs/ztvp-certificates.md +++ b/docs/ztvp-certificates.md @@ -1,10 +1,9 @@ # ZTVP Certificates -The `ztvp-certificates` chart manages CA certificate extraction, validation, +The [`ztvp-certificates`](../charts/ztvp-certificates/) chart manages CA certificate extraction, validation, bundling, and distribution across the Zero Trust Validated Pattern. It runs as -an ArgoCD-managed application in the `openshift-config` namespace at sync-wave -**21**, ensuring certificates are available before any workload that needs TLS -verification. +an application managed by Argo CD in the `openshift-config` namespace, ensuring +certificates are available before any workload that needs TLS verification. ## Architecture @@ -35,7 +34,7 @@ verification. 
|---|---| | **ServiceAccount / RBAC** | Grants the extraction Job read access to secrets, configmaps, ingresscontrollers, and proxy across namespaces | | **ConfigMap (script)** | Holds the templated `extract-certificates.sh` script | -| **Job (initial)** | Runs once at first sync (sync-wave 23, `Prune=false`) to populate the CA bundle | +| **Job (initial)** | Runs once at first sync to populate the CA bundle | | **CronJob** | Runs on schedule (default daily at 02:00) for automatic rotation | | **ACM Policy + Placement** | Distributes the `ztvp-trusted-ca` ConfigMap into target namespaces via ACM governance | | **ManagedClusterSetBinding** | Binds the `default` ManagedClusterSet in `openshift-config` so the Placement can target `local-cluster` | @@ -56,8 +55,8 @@ configuration. | 6 -- Validation | `validation.enabled` | Checks minimum size and `openssl x509` parse for every `.crt` | | 7 -- Combine | always | Concatenates all `.crt` files into `tls-ca-bundle.pem`; fails if bundle < 100 bytes | | 8 -- ConfigMap | always | `oc apply` the `ztvp-trusted-ca` ConfigMap with annotations recording extraction metadata | -| 8.5 -- Proxy CA | `proxyCA.enabled` | Creates a separate ConfigMap with ingress + service CAs only | -| 8.6 -- Proxy Patch | `proxyCA.enabled` | Patches `proxy/cluster` to set `trustedCA` (only if not already set to another value) | +| 8.1 -- Proxy CA | `proxyCA.enabled` | Creates a separate ConfigMap with ingress + service CAs only | +| 8.2 -- Proxy Patch | `proxyCA.enabled` | Patches `proxy/cluster` to set `trustedCA` (only if not already set to another value) | | 9 -- Image Pull Trust | `imagePullTrust.enabled` | Creates a ConfigMap keyed by registry hostname and patches `image.config.openshift.io/cluster` | | 10 -- Rollout | `rollout.enabled` | Restarts Deployments/StatefulSets that consume the certificate bundle | @@ -79,12 +78,12 @@ signed by a public CA. 1. 
The Job auto-detects the ingress CA from each `IngressController`'s router secret in `openshift-ingress`. -2. The service CA is read from `openshift-service-ca.crt`. +2. The service CA is read from the `openshift-service-ca.crt` ConfigMap from within the Job Pod. 3. If a cluster-wide proxy bundle exists, it is included. -4. All certificates are combined into `ztvp-trusted-ca` and distributed via +4. All certificates are combined into the `ztvp-trusted-ca` ConfigMap and distributed via ACM Policy to target namespaces. 5. A proxy CA ConfigMap (`ztvp-proxy-ca`) is created with ingress + service - CAs and `proxy/cluster` is patched so the Cluster Network Operator injects + CAs, and the `proxy/cluster` resource is patched so the Cluster Network Operator injects these CAs into all workloads automatically. No platform override file is needed. The chart's default `values.yaml` handles @@ -114,11 +113,11 @@ proxyCA: > `proxyCA.enabled: true`. They are retained for clarity and backward > compatibility with older chart versions. -**Behavior is identical to Scenario 1** -- Phases 8.5 and 8.6 run by default: +**Behavior is identical to Scenario 1** -- Phases 8.1 and 8.2 run by default: -1. Phase 8.5 builds a proxy-specific bundle containing only the ingress and +1. Phase 8.1 builds a proxy-specific bundle containing only the ingress and service CAs (the Cluster Network Operator merges these with system CAs). -2. Phase 8.6 patches `proxy/cluster` to set `spec.trustedCA.name` to +2. Phase 8.2 patches `proxy/cluster` to set `spec.trustedCA.name` to `ztvp-proxy-ca`. 3. The CNO propagates the merged bundle to every node, making the ingress CA trusted system-wide for all pods without explicit volume mounts. @@ -244,7 +243,7 @@ openshift-config/ztvp-trusted-ca ---ACM Policy---> qtodo/ztvp-trusted-ca ...
``` -The policy uses `fromConfigMap` hub templates so that the ConfigMap data is +The policy uses [`fromConfigMap`](https://docs.redhat.com/en/documentation/red_hat_advanced_cluster_management_for_kubernetes/2.12/html-single/governance/index#fromConfigMap-function) hub templates so that the ConfigMap data is always sourced from the hub cluster's copy. Target namespaces are configured via `distribution.targetNamespaces`. @@ -276,17 +275,18 @@ metadata: ## Sync Wave Ordering -The chart's resources are ordered within the ArgoCD sync: +The chart's resources are ordered within the Argo CD sync: -| Wave | Resources | +| Order | Resources | |---|---| -| 22 | ServiceAccount, RBAC (Role, RoleBinding, ClusterRole, ClusterRoleBinding) | -| 23 | Initial Job, CronJob, ConfigMap (script) | -| 25 | ManagedClusterSetBinding | -| 26 | ACM Policy, PlacementBinding, Placement | - -The application itself sits at sync-wave **21** in `values-hub.yaml`, ensuring -it deploys before operators and workloads that depend on the CA bundle. +| 1st | ServiceAccount, RBAC (Role, RoleBinding, ClusterRole, ClusterRoleBinding) | +| 2nd | Initial Job, CronJob, ConfigMap (script) | +| 3rd | ManagedClusterSetBinding | +| 4th | ACM Policy, PlacementBinding, Placement | + +The application itself is deployed early in the overall sync order (via +`values-hub.yaml`), ensuring it runs before operators and workloads that depend +on the CA bundle. ## Configuration Reference