cluster-api: Enhancements - Advanced Actions & Prometheus Metrics #702
ChayanDass wants to merge 3 commits into
Pull request overview
This PR expands the Cluster API (CAPI) plugin with richer resource actions (scale + pause/resume + machine operations) and adds CAPI-focused Prometheus charts to visualize controller/runtime and kube-state-metrics signals in Headlamp’s details views.
Changes:
- Added a new CAPI metrics charting flow in the Prometheus plugin (reconcile rate/duration, workqueue, replicas, webhook, cache).
- Introduced expanded operational actions across CAPI detail pages (cluster scale dialog, pause/resume reconciliation, machine-level actions).
- Updated the shared Chart component to support optional non-stacked rendering and improved mixed “some plots have data” behavior.
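The chart categories listed above map naturally onto the standard controller-runtime and client-go workqueue metrics. The following is a minimal sketch of how such queries could be generated, not the PR's actual code: the helper names (`reconcileRateQuery`, `reconcileDurationQuery`, `workqueueDepthQuery`) and the default `5m` window are assumptions made for illustration, and the label values passed in depend on how the CAPI controllers are named in a given installation.

```typescript
// Hypothetical helpers. The metric names are the ones exposed by
// controller-runtime and the client-go workqueue; the queries used
// in the PR itself may differ.

/** Escapes a value for use inside a double-quoted PromQL label matcher. */
function escapeLabelValue(value: string): string {
  return value.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
}

/** Per-second reconcile rate for one controller, split by result label. */
function reconcileRateQuery(controller: string, window = '5m'): string {
  const c = escapeLabelValue(controller);
  return `sum by (result) (rate(controller_runtime_reconcile_total{controller="${c}"}[${window}]))`;
}

/** Reconcile duration quantile (0..1) computed from the histogram buckets. */
function reconcileDurationQuery(controller: string, quantile: number, window = '5m'): string {
  if (quantile < 0 || quantile > 1) {
    throw new RangeError('quantile must be between 0 and 1');
  }
  const c = escapeLabelValue(controller);
  return (
    `histogram_quantile(${quantile}, sum by (le) ` +
    `(rate(controller_runtime_reconcile_time_seconds_bucket{controller="${c}"}[${window}])))`
  );
}

/** Current depth of a named workqueue. */
function workqueueDepthQuery(queueName: string): string {
  return `sum(workqueue_depth{name="${escapeLabelValue(queueName)}"})`;
}
```

Keeping each query behind a small function makes the label escaping and the rate window explicit, which is easier to review than query strings assembled inline in chart components.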
Reviewed changes
Copilot reviewed 22 out of 22 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| prometheus/src/util.ts | Adds CAPI kinds support + CAPI chart config generators and PromQL queries. |
| prometheus/src/index.tsx | Wires CAPI kinds to the new CapiChart container in details view. |
| prometheus/src/components/Chart/Chart/Chart.tsx | Adds stacked option and improves multi-plot merge/state handling. |
| prometheus/src/components/Chart/CapiChart/CapiChart.tsx | New generic container UI for switching CAPI chart variants and timespan/resolution. |
| prometheus/src/components/Chart/CapiChart/CapiClusterCacheChart.tsx | New chart for CAPI cluster cache metrics. |
| prometheus/src/components/Chart/CapiChart/CapiQueueChart.tsx | New chart for controller workqueue depth/adds/retries. |
| prometheus/src/components/Chart/CapiChart/CapiReconcileChart.tsx | New chart for reconcile success/error rates. |
| prometheus/src/components/Chart/CapiChart/CapiReconcileDurationChart.tsx | New chart for reconcile duration quantiles. |
| prometheus/src/components/Chart/CapiChart/CapiReplicasChart.tsx | New chart for replicas/availability metrics from kube-state-metrics. |
| prometheus/src/components/Chart/CapiChart/CapiWebhookChart.tsx | New chart(s) for webhook request rates and latency. |
| prometheus/src/components/Chart/CapiChart/CapiWorkersChart.tsx | New chart for controller worker concurrency. |
| prometheus/locales/en/translation.json | Adds English strings for new CAPI chart labels (needs key alignment). |
| cluster-api/src/components/clusters/Detail.tsx | Switches to unified cluster actions provider. |
| cluster-api/src/components/kubeadmcontrolplanes/Detail.tsx | Adds actions (pause/resume) to KCP detail. |
| cluster-api/src/components/machinedeployments/Detail.tsx | Adds actions (pause/resume) to MachineDeployment detail. |
| cluster-api/src/components/machinesets/Detail.tsx | Adds actions (pause/resume) to MachineSet detail. |
| cluster-api/src/components/machinepools/Detail.tsx | Adds actions (pause/resume) to MachinePool detail. |
| cluster-api/src/components/machinehealthchecks/Detail.tsx | Adds actions (pause/resume) to MachineHealthCheck detail. |
| cluster-api/src/components/machines/Detail.tsx | Adds machine-specific actions on Machine detail. |
| cluster-api/src/components/machines/Actions.tsx | New machine action implementations (view node, replace, pause/resume, provider console). |
| cluster-api/src/components/actions/index.tsx | Expands action set (cluster connect, scale dialog, pause/resume helpers). |
| cluster-api/src/components/common/util.tsx | Minor styling tweak (mx). |
```diff
@@ -90,16 +101,406 @@ export function GetKubeconfigAction(props: GetKubeconfigActionProps) {

   const secretName = `${resource.metadata?.name}-kubeconfig`;
   const namespace = resource.metadata?.namespace || 'default';
-  const secretData = Secret.useGet(secretName, namespace);

+  const secretQuery = Secret.useGet(secretName, namespace);
+  const secret = secretQuery.data;
   return (
     <ActionButton
       description="Download Kubeconfig"
       longDescription="Download the Kubeconfig file for this cluster"
       icon={'mdi:cloud-download'}
       onClick={() => {
-        connectClusterToHeadlamp(resource, secretData, enqueueSnackbar, loadingRef);
+        connectClusterToHeadlamp(resource, secret, enqueueSnackbar, loadingRef);
       }}
     />
   );
```
```typescript
const ChartEnabledKinds = [
  'Pod',
  'Deployment',
  'StatefulSet',
  'DaemonSet',
  'ReplicaSet',
  'Job',
  'CronJob',
  'PersistentVolumeClaim',
  'ScaledObject',
  'ScaledJob',
  'NodePool',
  'NodeClaim',
  'Cluster',
  'MachineDeployment',
  'MachineSet',
  'Machine',
  'MachinePool',
  'KubeadmControlPlane',
];
```
hmm. This is a good point.
@ChayanDass what do you think?
@illume, PTAL

@ChayanDass check the capitalization of the commit messages
…art configurations and queries Signed-off-by: Chayan Das <01chayandas@gmail.com>
872ad3c to 0d986bd
- Introduced new action components for managing machines, including viewing, replacing, pausing, and resuming machine reconciliation.
- Updated existing components to utilize the new action handlers for better modularity and maintainability.
- Enhanced error handling and user feedback in the connect cluster functionality.
- Improved the kubeconfig retrieval process and added loading states for better user experience.
- Added Prometheus configuration for CAPI metrics scraping.

Signed-off-by: Chayan Das <01chayandas@gmail.com>
@illume, PTAL. Ready for merge!
```diff
@@ -171,12 +177,13 @@ export function supportsPrometheusMetrics(resource?: ResourceIdentity): boolean
   if (!kind || !ChartEnabledKinds.includes(kind)) {
     return false;
   }

-  const requiredApiVersion = resourceApiVersionRules[kind];
-  if (requiredApiVersion) {
-    return getResourceApiVersion(resource) === requiredApiVersion;
+  const apiVersionPattern = resourceApiVersionRules[kind];
+  if (apiVersionPattern) {
+    const apiVersion = getResourceApiVersion(resource);
+    if (!apiVersion || !apiVersionPattern.test(apiVersion)) {
+      return false;
+    }
   }
```
Version/group validation for Cluster API resources is implemented here!
Implemented a regex match because these resources exist in multiple API versions!
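To illustrate why a pattern is more convenient than an exact string here, below is a small self-contained sketch of regex-based apiVersion rules. It is an assumption-laden illustration rather than the PR's code: the `apiVersionRules` map, the `matchesApiVersionRule` function, and the exact patterns are hypothetical. The API groups `cluster.x-k8s.io` and `controlplane.cluster.x-k8s.io` are the ones Cluster API uses, and a generic kind name such as `Cluster` is also used by unrelated projects, which is what the group check guards against.

```typescript
// Hypothetical sketch: names and patterns are illustrative, not taken from the PR.
type ResourceIdentity = { kind?: string; apiVersion?: string };

// Matches v1, v1alpha4, v1beta1, v1beta2, ... after the group prefix.
const VERSION = 'v\\d+(?:(?:alpha|beta)\\d+)?';

// Kind -> pattern that the resource's apiVersion must match.
// A Map avoids accidental hits on Object.prototype keys such as "constructor".
const apiVersionRules = new Map<string, RegExp>([
  ['Cluster', new RegExp(`^cluster\\.x-k8s\\.io/${VERSION}$`)],
  ['Machine', new RegExp(`^cluster\\.x-k8s\\.io/${VERSION}$`)],
  ['KubeadmControlPlane', new RegExp(`^controlplane\\.cluster\\.x-k8s\\.io/${VERSION}$`)],
]);

/** Returns true when the kind has no rule, or when its apiVersion matches the rule. */
function matchesApiVersionRule(resource?: ResourceIdentity): boolean {
  const kind = resource?.kind;
  if (!kind) {
    return false;
  }
  const pattern = apiVersionRules.get(kind);
  if (!pattern) {
    return true; // kinds without a rule are not restricted by apiVersion
  }
  const apiVersion = resource?.apiVersion;
  return !!apiVersion && pattern.test(apiVersion);
}
```

With rules like these, a `Cluster` served as `cluster.x-k8s.io/v1beta1` or `cluster.x-k8s.io/v1beta2` passes, while a `Cluster` from another API group such as `postgresql.cnpg.io/v1` is rejected.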
…and query functions Signed-off-by: Chayan Das <01chayandas@gmail.com>
This PR introduces two major feature sets to the Cluster API (CAPI) plugin:
- Enhanced Operational Actions: A unified and expanded set of management actions for CAPI resources, including scaling, pausing/resuming, and machine-level operations.
- Prometheus Observability: Integration with the Prometheus plugin to provide real-time metrics visualization for CAPI controllers and resources.
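As a rough illustration of what the pause/resume and scale actions amount to at the API level, the sketch below builds JSON merge patch bodies. It is a sketch under stated assumptions, not the implementation in this PR: the function names (`pausePatch`, `isPaused`, `scalePatch`) are hypothetical. It relies on the Cluster API convention that the presence of the `cluster.x-k8s.io/paused` annotation pauses reconciliation of an object, and on merge-patch semantics where a `null` value removes a key.

```typescript
// Hypothetical helpers that only build patch bodies; sending them to the
// API server is left to whatever client the plugin uses.
const PAUSED_ANNOTATION = 'cluster.x-k8s.io/paused';

type AnnotationPatch = { metadata: { annotations: Record<string, string | null> } };
type ReplicasPatch = { spec: { replicas: number } };
type Annotated = { metadata?: { annotations?: Record<string, string> } };

/** Merge patch that adds the paused annotation, or removes it (null) to resume. */
function pausePatch(paused: boolean): AnnotationPatch {
  return { metadata: { annotations: { [PAUSED_ANNOTATION]: paused ? 'true' : null } } };
}

/** A resource counts as paused when the annotation is present, whatever its value. */
function isPaused(resource: Annotated): boolean {
  const annotations = resource.metadata?.annotations ?? {};
  return Object.prototype.hasOwnProperty.call(annotations, PAUSED_ANNOTATION);
}

/** Merge patch that sets spec.replicas; rejects negative or fractional counts. */
function scalePatch(replicas: number): ReplicasPatch {
  if (!Number.isInteger(replicas) || replicas < 0) {
    throw new RangeError('replicas must be a non-negative integer');
  }
  return { spec: { replicas } };
}
```

Separating patch construction from the request itself keeps the action buttons thin and lets the patch shapes be unit-tested without a cluster.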