Description
The current OpenShift deployment configuration uses explicit certificate file paths (--metrics-cert-file and --metrics-key-file) for TLS on the metrics endpoint. With this approach the controller must be restarted whenever OpenShift's Service CA renews the certificates. Until that restart happens, the metrics endpoint keeps serving the outdated certificate, and the restart itself can cause downtime or a period in which metrics are unavailable.
Current Implementation
- Uses explicit certificate files with --metrics-cert-file=/etc/tls/metrics/tls.crt and --metrics-key-file=/etc/tls/metrics/tls.key
- OpenShift Service CA annotation generates certificates in the metrics-server-certs secret
- Certificates are mounted at /etc/tls/metrics/
- controller-runtime only loads these certificates at startup
Proposed Change (possibly)
Replace the explicit certificate file approach with the certificate directory approach:
- Update the OpenShift manager_metrics_patch.yaml to use --cert-dir=/etc/tls/metrics instead of explicit file paths
- Keep the same volume mount and OpenShift Service CA annotation configuration
- This will allow controller-runtime to monitor the certificate directory for changes and reload certificates automatically when they are renewed
Certificate Rotation Mechanism
Our implementation has two different approaches for handling TLS certificates:
- Current approach (OpenShift): Uses explicit certificate files with the --metrics-cert-file and --metrics-key-file flags. These certificates are loaded once at startup using tls.LoadX509KeyPair() and set directly in the TLS config. This approach does not support certificate rotation without a pod restart.
- Proposed approach: Use the --cert-dir flag instead, which leverages controller-runtime's built-in certificate handling:
  - When using --cert-dir, controller-runtime's metrics server automatically sets up a certificate watcher
  - The watcher uses fsnotify to monitor certificate files for changes
  - It implements a GetCertificate callback that's attached to the TLS config
  - When a new connection is established, the TLS stack calls this method to get the most current certificate
  - Certificate changes are detected and applied without requiring server restart
  - This is handled automatically by controller-runtime when using the --cert-dir flag
Our implementation already includes code for both approaches, but the OpenShift configuration is currently set to use the explicit file path approach rather than the more robust certificate directory approach.
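The difference between the two approaches can be illustrated with a minimal, standard-library-only sketch. It is not controller-runtime's code and not this project's code: certReloader, writeSelfSigned, serialOf, and rotationDemo are hypothetical names, the certificates are throwaway self-signed pairs, and the explicit Reload() call stands in for what an fsnotify-driven watcher would trigger on a file change.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"fmt"
	"math/big"
	"os"
	"path/filepath"
	"sync/atomic"
	"time"
)

// certReloader is a hypothetical stand-in for a certificate watcher.
type certReloader struct {
	certFile, keyFile string
	cert              atomic.Pointer[tls.Certificate]
}

// Reload re-reads the key pair; a real watcher would call it on file-change events.
func (r *certReloader) Reload() error {
	cert, err := tls.LoadX509KeyPair(r.certFile, r.keyFile)
	if err != nil {
		return err
	}
	r.cert.Store(&cert)
	return nil
}

// GetCertificate has the signature tls.Config.GetCertificate expects; it runs per handshake.
func (r *certReloader) GetCertificate(*tls.ClientHelloInfo) (*tls.Certificate, error) {
	return r.cert.Load(), nil
}

// check panics on error to keep the demo short.
func check(err error) {
	if err != nil {
		panic(err)
	}
}

// writeSelfSigned writes a throwaway self-signed pair as tls.crt and tls.key in dir.
func writeSelfSigned(dir string, serial int64) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	check(err)
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(serial),
		Subject:      pkix.Name{CommonName: "metrics.example"},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	check(err)
	keyDER, err := x509.MarshalECPrivateKey(key)
	check(err)
	check(os.WriteFile(filepath.Join(dir, "tls.crt"), pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), 0o600))
	check(os.WriteFile(filepath.Join(dir, "tls.key"), pem.EncodeToMemory(&pem.Block{Type: "EC PRIVATE KEY", Bytes: keyDER}), 0o600))
}

// serialOf parses the leaf certificate and returns its serial number.
func serialOf(c *tls.Certificate) int64 {
	leaf, err := x509.ParseCertificate(c.Certificate[0])
	check(err)
	return leaf.SerialNumber.Int64()
}

// rotationDemo returns the serials each config serves after the files are replaced.
func rotationDemo() (staticSerial, reloadedSerial int64) {
	dir, err := os.MkdirTemp("", "metrics-certs")
	check(err)
	defer os.RemoveAll(dir)
	certFile, keyFile := filepath.Join(dir, "tls.crt"), filepath.Join(dir, "tls.key")
	writeSelfSigned(dir, 1)
	// Startup-only load: the config holds a copy of the pair read at this moment.
	pair, err := tls.LoadX509KeyPair(certFile, keyFile)
	check(err)
	staticCfg := &tls.Config{Certificates: []tls.Certificate{pair}}
	// Callback-based config: each handshake asks the reloader for the current pair.
	r := &certReloader{certFile: certFile, keyFile: keyFile}
	check(r.Reload())
	reloadCfg := &tls.Config{GetCertificate: r.GetCertificate}
	writeSelfSigned(dir, 2) // simulate a renewal on disk
	check(r.Reload())       // what a file watcher would trigger
	current, err := reloadCfg.GetCertificate(nil)
	check(err)
	return serialOf(&staticCfg.Certificates[0]), serialOf(current)
}

func main() {
	static, reloaded := rotationDemo()
	fmt.Printf("static config serves serial %d, reloading config serves serial %d\n", static, reloaded)
}
```

After the simulated renewal, the static config still holds the certificate with serial 1, while the callback-based config returns the one with serial 2. This mirrors why the startup-only load needs a pod restart and the GetCertificate callback does not.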
Benefits
- Automatic certificate rotation without requiring pod restarts
- Improved reliability when certificates are renewed
- Consistent with Kubernetes best practices for certificate management
- Prevents potential security issues with expired certificates
Related Configuration Files
- /config/openshift/manager_metrics_patch.yaml
- /config/openshift/metrics_service.yaml
The controller-runtime metrics server supports certificate hot-reloading when using the directory-based approach, which is the preferred method when certificates might be rotated during runtime.