Skip to content

Conversation

@Osselian
Copy link

@Osselian Osselian commented Dec 25, 2025

Refactored the code to separate Kubernetes API interactions from the factory package.
All Kubernetes API calls (such as client.Get and client.List) are now located in clusterResourceInteractions.go.
Existing tests for cluster resource creation and updates were moved to clusterResourceInteractions_test.go, and new tests were added to cover the remaining factory logic.

Summary by CodeRabbit

  • New Features
    • PVC expansion when the StorageClass permits online resizing.
    • Explicit retrieval of etcd client configuration with TLS support for secure cluster connections.
  • Bug Fixes
    • More robust creation/reconciliation of ConfigMaps, PodDisruptionBudgets, StatefulSets and Services, including safer owner/reference handling and quorum-based PDB defaults.
  • Refactor
    • Internal reorganization of factory/reconciliation boundaries to improve testability and clarity.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 25, 2025

📝 Walkthrough

Walkthrough

Moves resource-reconciliation logic into the controller package, adds TLS-aware GetClientConfigFromCluster to build a clientv3.Config from cluster secrets, converts factory CreateOrUpdate... functions into Get... that return constructed objects, and updates NewEtcdClientSet to accept a clientv3.Config.

Changes

Cohort / File(s) Summary
New controller helpers & tests
internal/controller/clusterResourceInteractions.go, internal/controller/clusterResourceInteractions_test.go
Adds controller-level helpers to build/reconcile ConfigMap, PodDisruptionBudget, StatefulSet, headless/client Services; PVC listing/resize and UpdatePersistentVolumeClaims; TLS secret parsing and GetClientConfigFromCluster; reconciliation/delete utilities. Adds comprehensive tests covering reconciliation, PVC expansion, and TLS scenarios.
Controller reconcile integration
internal/controller/etcdcluster_controller.go
Replaces direct factory reconciliation calls with local Get... functions and inserts explicit GetClientConfigFromCluster before calling factory.NewEtcdClientSet(cfg). Updates NewEtcdClientSet invocation to take a clientv3.Config.
Factory: return-object refactor
internal/controller/factory/configmap.go, internal/controller/factory/pdb.go, internal/controller/factory/svc.go, internal/controller/factory/statefulset.go, internal/controller/factory/..._test.go
Converts CreateOrUpdate* functions into Get* variants that return constructed resources (*ConfigMap, *PodDisruptionBudget, *Service, *StatefulSet) instead of reconciling them. Exposes helpers: PodLabels, GetServerProtocol, GenerateEtcdCommand/Args, and volume/volumeMount generators; adapts many tests to new signatures.
Factory: etcd client simplification & tests
internal/controller/factory/etcd_client.go, internal/controller/factory/etcd_client_test.go
Changes NewEtcdClientSet to accept a clientv3.Config (removes in-cluster secret/TLS discovery). Simplifies client construction and adds tests for nil-endpoints and multi-endpoint configs.
Factory: builders & PVC surface cleanup
internal/controller/factory/builders.go, internal/controller/factory/pvc.go, internal/controller/factory/suite_test.go
Removes reconcileOwnedResource/deleteOwnedResource helpers; makes contextWithGVK return errors; removes factory PVC helpers (PVCs, UpdatePersistentVolumeClaims); updates GetPVCName to trim names and default to "data"; adjusts test suite setup.
Factory tests & Service tests expanded
internal/controller/factory/svc.go, internal/controller/factory/svc_test.go, internal/controller/factory/*_test.go
Renames service factory functions to GetHeadlessService/GetClientService returning objects; expands tests to validate name/template merging, ports, selectors, ownership, and template merge/update behavior.
Widespread test updates
internal/controller/factory/*_test.go, internal/controller/*_test.go, internal/controller/factory/statefulset_test.go
Multiple test suites updated to assert on returned resource objects, label/port/name merging, PDB quorum defaults, configMap state, PVC label/name derivation, and exported helper behaviors.

Sequence Diagram

sequenceDiagram
    participant Reconciler as EtcdCluster Reconciler
    participant Ctrl as Controller (clusterResourceInteractions)
    participant K8s as Kubernetes API
    participant Factory as Factory (NewEtcdClientSet)
    participant Etcd as Etcd Cluster

    Reconciler->>Ctrl: GetClientConfigFromCluster(ctx, cluster)
    Ctrl->>K8s: GET Secrets / Services / Endpoints
    K8s-->>Ctrl: Secrets & endpoint info
    Ctrl->>Ctrl: parseTLSSecret / parseTLSSecretCA
    Ctrl-->>Reconciler: clientv3.Config

    Reconciler->>Factory: NewEtcdClientSet(clientv3.Config)
    Factory->>Etcd: create cluster client & per-endpoint clients
    Factory-->>Reconciler: cluster client + single clients

    Reconciler->>Ctrl: GetStatefulSet()/GetPdb()/GetHeadlessService()/GetClientService()/GetClusterStateConfigMap()
    Ctrl->>Ctrl: build resource objects
    Ctrl-->>Reconciler: resource objects
    Reconciler->>K8s: apply/patch created resources (reconcile)
    K8s-->>Reconciler: apply results / errors
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰
I hopped through secrets, seeds, and root,
Sewed configs snug from tls to boot,
I fetched the keys, then shaped each home,
Clients first, then nests to roam,
A burrow built — now time to reboot.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 8.11% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main refactoring objective: moving Kubernetes API calls from the factory package to the controller layer, which aligns with the PR's core changes.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.5.0)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
internal/controller/factory/pdb.go (1)

33-68: Nil pointer dereference risk when PodDisruptionBudgetTemplate is nil.

Lines 46-47 access cluster.Spec.PodDisruptionBudgetTemplate.Spec without checking if PodDisruptionBudgetTemplate is nil. According to the API definition, PodDisruptionBudgetTemplate is optional, so nil is a valid value. The current implementation will panic with a nil pointer dereference.

While the test in pdb_test.go (lines 215-217) documents this panic behavior, production code should validate inputs and return descriptive errors rather than panicking.

🔎 Proposed fix
 func GetPdb(
 	ctx context.Context,
 	cluster *etcdaenixiov1alpha1.EtcdCluster,
 	rclient client.Client,
 ) (*v1.PodDisruptionBudget, error) {
 	var err error
+
+	if cluster.Spec.PodDisruptionBudgetTemplate == nil {
+		return nil, fmt.Errorf("PodDisruptionBudgetTemplate is nil")
+	}
 
 	pdb := &v1.PodDisruptionBudget{
 		ObjectMeta: metav1.ObjectMeta{
internal/controller/factory/statefulset.go (1)

216-220: Dead code: len(value) == -1 is always false.

len() in Go never returns a negative value. This condition will never be true, making this code path unreachable. If the intent was to check for empty values, use len(value) == 0 or value == "".

🔎 Proposed fix (if intent is to handle empty values)
-		if len(value) == -1 {
+		if value == "" {
 			extraArgs = append(extraArgs, flag)
 
 			continue
 		}
🧹 Nitpick comments (6)
internal/controller/factory/configmap.go (1)

70-75: Improved readability with multi-line formatting.

The multi-line formatting of the clusterService construction improves code readability without changing functionality.

internal/controller/clusterResourceInteractions_test.go (3)

721-723: Empty test context for PVC - placeholder or incomplete?

The Context("when ensuring pvc" block is empty. If this is intentional as a placeholder for future tests, consider adding a TODO comment. Otherwise, this context can be removed.

🔎 Suggested change
 	Context("when ensuring pvc", func() {
+		// TODO: Add PVC-specific tests if needed
 	})

764-767: Remove unnecessary nil check with debug print.

The nil check for cluster with GinkgoWriter.Printf appears to be debug code that should be removed. The test setup in BeforeEach guarantees cluster is non-nil.

🔎 Proposed fix
 	It("should update PVC if the desired size is larger", func() {
-		if cluster == nil {
-			GinkgoWriter.Printf("Cluster in nil!\n")
-		}
 		pvc := &corev1.PersistentVolumeClaim{

1030-1031: Non-English comments detected.

Comments on lines 1030 and 1105 are in Russian ("tls.crt отсутствует", "ca.crt отсутствует"). For consistency and maintainability, consider translating to English.

🔎 Proposed fix
-				// tls.crt отсутствует
+				// tls.crt is missing
 				"tls.key": []byte("key"),
-				// ca.crt отсутствует
+				// ca.crt is missing

Also applies to: 1105-1106

internal/controller/clusterResourceInteractions.go (1)

225-230: Consider setting TLS MinVersion for improved security posture.

The tls.Config{} is created without specifying MinVersion. While Go defaults to TLS 1.2 for client connections, explicitly setting MinVersion: tls.VersionTLS12 (or TLS 1.3 if compatibility allows) documents the security intent and prevents accidental downgrades.

🔎 Proposed fix
-		etcdClient.TLS = &tls.Config{}
+		etcdClient.TLS = &tls.Config{
+			MinVersion: tls.VersionTLS12,
+		}
internal/controller/factory/statefulset.go (1)

206-211: Unusual condition: quota > -1 and make(map[string]string, 0).

  1. quota > -1 is equivalent to quota >= 0 but less readable. Since quota is derived from float64(size.Value()) * 0.95 and size.Value() returns 0 for zero quantities, the check should probably be quota > 0 to avoid setting a meaningless quota-backend-bytes=0.

  2. make(map[string]string, 0) allocates a map with zero capacity, which is valid but unusual. Omitting the capacity (make(map[string]string)) is more idiomatic.

🔎 Proposed improvements
-	if quota > -1 {
+	if quota > 0 {
 		if cluster.Spec.Options == nil {
-			cluster.Spec.Options = make(map[string]string, 0)
+			cluster.Spec.Options = make(map[string]string)
 		}
 		cluster.Spec.Options["quota-backend-bytes"] = strconv.FormatInt(int64(quota), 10)
 	}
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6537618 and a8456d0.

📒 Files selected for processing (18)
  • internal/controller/clusterResourceInteractions.go
  • internal/controller/clusterResourceInteractions_test.go
  • internal/controller/etcdcluster_controller.go
  • internal/controller/factory/builders.go
  • internal/controller/factory/configmap.go
  • internal/controller/factory/configmap_test.go
  • internal/controller/factory/etcd_client.go
  • internal/controller/factory/etcd_client_test.go
  • internal/controller/factory/pdb.go
  • internal/controller/factory/pdb_test.go
  • internal/controller/factory/pvc.go
  • internal/controller/factory/pvc_test.go
  • internal/controller/factory/statefulset.go
  • internal/controller/factory/statefulset_test.go
  • internal/controller/factory/suite_test.go
  • internal/controller/factory/svc.go
  • internal/controller/factory/svc_test.go
  • internal/controller/suite_test.go
💤 Files with no reviewable changes (2)
  • internal/controller/factory/pvc.go
  • internal/controller/factory/builders.go
🧰 Additional context used
🧬 Code graph analysis (10)
internal/controller/factory/etcd_client_test.go (1)
internal/controller/factory/etcd_client.go (1)
  • NewEtcdClientSet (25-45)
internal/controller/factory/configmap.go (3)
api/v1alpha1/etcdcluster_types.go (1)
  • EtcdCluster (91-97)
internal/log/logger.go (1)
  • Debug (82-84)
internal/controller/factory/svc.go (1)
  • GetHeadlessServiceName (42-48)
internal/controller/factory/configmap_test.go (2)
api/v1alpha1/etcdcluster_types.go (5)
  • EtcdCluster (91-97)
  • EtcdClusterSpec (28-53)
  • EtcdClusterStatus (83-85)
  • EtcdConditionReady (57-57)
  • EtcdCondTypeStatefulSetReady (68-68)
internal/controller/factory/configmap.go (1)
  • GetClusterStateConfigMap (37-60)
internal/controller/factory/pdb.go (2)
api/v1alpha1/etcdcluster_types.go (1)
  • EtcdCluster (91-97)
internal/log/logger.go (1)
  • Debug (82-84)
internal/controller/factory/svc_test.go (2)
api/v1alpha1/etcdcluster_types.go (5)
  • EtcdCluster (91-97)
  • EmbeddedService (271-278)
  • EmbeddedObjectMetadata (127-150)
  • EmbeddedMetadataResource (280-284)
  • EtcdClusterSpec (28-53)
internal/controller/factory/svc.go (4)
  • GetServiceName (34-40)
  • GetHeadlessServiceName (42-48)
  • GetHeadlessService (50-95)
  • GetClientService (97-140)
internal/controller/factory/pvc_test.go (2)
api/v1alpha1/etcdcluster_types.go (5)
  • EtcdCluster (91-97)
  • EtcdClusterSpec (28-53)
  • StorageSpec (176-184)
  • EmbeddedPersistentVolumeClaim (228-245)
  • EmbeddedObjectMetadata (127-150)
internal/controller/factory/pvc.go (2)
  • PVCLabels (23-29)
  • GetPVCName (31-37)
internal/controller/factory/pdb_test.go (2)
internal/controller/factory/pdb.go (1)
  • GetPdb (33-69)
api/v1alpha1/etcdcluster_types.go (3)
  • EtcdCluster (91-97)
  • EtcdClusterSpec (28-53)
  • EmbeddedPodDisruptionBudget (248-256)
internal/controller/clusterResourceInteractions.go (4)
internal/controller/factory/configmap.go (1)
  • GetClusterStateConfigMap (37-60)
internal/controller/factory/pdb.go (1)
  • GetPdb (33-69)
internal/controller/factory/statefulset.go (1)
  • GetStatefulSet (49-120)
internal/controller/factory/svc.go (3)
  • GetHeadlessService (50-95)
  • GetClientService (97-140)
  • GetHeadlessServiceName (42-48)
internal/controller/factory/statefulset.go (4)
api/v1alpha1/etcdcluster_types.go (2)
  • EtcdCluster (91-97)
  • PodTemplate (162-171)
internal/log/logger.go (1)
  • Debug (82-84)
internal/controller/factory/labels_builder.go (1)
  • NewLabelsBuilder (5-7)
internal/controller/factory/pvc.go (1)
  • GetPVCName (31-37)
internal/controller/etcdcluster_controller.go (3)
internal/controller/clusterResourceInteractions.go (7)
  • GetClientConfigFromCluster (179-186)
  • CreateOrUpdateClusterStateConfigMap (43-53)
  • CreateOrUpdateStatefulSet (79-89)
  • UpdatePersistentVolumeClaims (126-177)
  • CreateOrUpdateClientService (103-113)
  • CreateOrUpdateHeadlessService (91-101)
  • CreateOrUpdatePdb (55-77)
internal/controller/factory/etcd_client.go (1)
  • NewEtcdClientSet (25-45)
internal/log/logger.go (2)
  • Error (92-94)
  • Debug (82-84)
🪛 ast-grep (0.40.3)
internal/controller/clusterResourceInteractions.go

[warning] 225-225: MinVersionis missing from this TLS configuration. By default, TLS 1.2 is currently used as the minimum when acting as a client, and TLS 1.0 when acting as a server. General purpose web applications should default to TLS 1.3 with all other protocols disabled. Only where it is known that a web server must support legacy clients with unsupported an insecure browsers (such as Internet Explorer 10), it may be necessary to enable TLS 1.0 to provide support. AddMinVersion: tls.VersionTLS13' to the TLS configuration to bump the minimum version to TLS 1.3.
Context: tls.Config{}
Note: [CWE-327]: Use of a Broken or Risky Cryptographic Algorithm [OWASP A03:2017]: Sensitive Data Exposure [OWASP A02:2021]: Cryptographic Failures [REFERENCES]
https://owasp.org/Top10/A02_2021-Cryptographic_Failures

(missing-ssl-minversion-go)

🔇 Additional comments (28)
internal/controller/factory/suite_test.go (1)

29-29: LGTM: Clean removal of unused test client.

The removal of clientWithEmptyScheme from factory tests is consistent and complete. This aligns with the refactoring that reduces the factory package's dependency on Kubernetes client operations.

Also applies to: 47-49, 94-94

internal/controller/factory/etcd_client_test.go (1)

27-68: LGTM: Well-structured client creation tests.

The tests appropriately cover both the nil endpoints case and successful client creation with multiple endpoints. Proper cleanup by closing clients prevents resource leaks during test execution.

internal/controller/factory/configmap.go (1)

37-60: LGTM: Consistent refactoring to return constructed object.

The function signature change from CreateOrUpdateClusterStateConfigMap returning only an error to GetClusterStateConfigMap returning the constructed *corev1.ConfigMap aligns with the PR's goal of separating Kubernetes API interactions from the factory package. Error handling is properly maintained.

internal/controller/factory/etcd_client.go (1)

25-45: Excellent refactoring: separation of concerns achieved.

The new NewEtcdClientSet signature accepting clientv3.Config directly is a significant improvement:

  • Removes all Kubernetes API dependencies from the factory package
  • Makes the function more testable (no need to mock K8s client or secrets)
  • Better separation of concerns: config preparation happens in the controller layer
  • Proper error handling with cleanup (line 40 closes cluster client on failure)

This aligns perfectly with the PR's objective of moving Kubernetes API calls to the controller layer.

internal/controller/factory/pdb_test.go (3)

34-99: LGTM: Comprehensive test coverage for basic PDB creation.

The tests appropriately cover both pre-filled and empty PodDisruptionBudgetTemplate configurations, ensuring the factory function handles various input scenarios correctly.


102-197: Excellent test coverage for quorum calculation logic.

The test suite thoroughly validates:

  • Quorum calculation for different replica counts (1→1, 3→2, 5→3)
  • User-provided MinAvailable/MaxUnavailable overrides
  • Selector label construction
  • UnhealthyPodEvictionPolicy default

This comprehensive coverage ensures the PDB factory function behaves correctly across various configurations.


199-239: Edge case handling tests valuable, but nil template panic should be fixed.

The edge case tests are useful for documenting behavior. However, the test expecting a panic for nil PodDisruptionBudgetTemplate (lines 215-217) highlights a critical issue in the production code (see pdb.go review comment). The function should return an error instead of panicking.

The 0-replicas test (line 220-238) documents that MinAvailable defaults to 1 even when replicas=0, which may warrant separate validation of the quorum calculation logic.

internal/controller/factory/configmap_test.go (2)

48-75: LGTM - Test coverage for new cluster state.

The test correctly validates that GetClusterStateConfigMap returns a ConfigMap with ETCD_INITIAL_CLUSTER_STATE set to "new" for a freshly created cluster.


77-116: LGTM - Test coverage for existing cluster state.

The test properly sets up the cluster with a Ready condition and validates that GetClusterStateConfigMap returns "existing" as the cluster state.

internal/controller/factory/statefulset_test.go (2)

163-215: LGTM - Comprehensive volume generation tests.

Good coverage for EmptyDir vs PVC scenarios and TLS-related volumes.


321-363: LGTM - StatefulSet generation tests.

The integration test validates all required fields including name, namespace, replicas, service name, pod management policy, and container presence.

internal/controller/factory/svc.go (2)

50-95: LGTM - GetHeadlessService refactored with correct return signature.

The function properly returns (*corev1.Service, error) and handles all error paths consistently by returning (nil, err).


97-140: LGTM - GetClientService refactored with correct return signature.

Consistent with GetHeadlessService, this function properly returns the constructed service object on success.

internal/controller/clusterResourceInteractions_test.go (1)

841-1157: LGTM - Comprehensive TLS configuration tests.

Excellent coverage of various TLS scenarios including: without TLS, missing client secret, non-existent client secret, missing tls.crt, missing ca.crt, and missing endpoints.

internal/controller/factory/svc_test.go (3)

48-94: LGTM - Service name function tests.

Good test coverage for GetServiceName and GetHeadlessServiceName including custom templates and empty name fallback scenarios.


148-236: LGTM - Comprehensive service creation tests.

Excellent coverage including service type, ClusterIP, ports, owner references, and selector consistency between headless and client services.


259-303: LGTM - Template merging tests.

Good validation that custom metadata (name, labels, annotations), spec (type, ports), and port naming are properly merged.

internal/controller/etcdcluster_controller.go (3)

110-114: LGTM - Explicit client config retrieval before etcd client creation.

The refactoring correctly separates config retrieval (GetClientConfigFromCluster) from client instantiation (factory.NewEtcdClientSet), improving testability and separation of concerns.


284-299: LGTM - Conditional cluster object reconciliation using local helpers.

The controller now uses package-local CreateOrUpdateClusterStateConfigMap, CreateOrUpdateStatefulSet, and UpdatePersistentVolumeClaims functions, properly separating API interactions from resource generation.


650-667: LGTM - Unconditional object provisioning using local helpers.

The concurrent creation of client service, headless service, and PDB now uses local package functions, maintaining the separation introduced by this refactoring.

internal/controller/clusterResourceInteractions.go (4)

43-53: LGTM - Clean delegation pattern.

CreateOrUpdateClusterStateConfigMap properly delegates to factory.GetClusterStateConfigMap for object generation and reconcileOwnedResource for API interaction.


55-77: LGTM - PDB creation with proper nil handling.

Good handling of the case where PodDisruptionBudgetTemplate is nil - the function correctly deletes the existing PDB instead of creating one.


278-298: LGTM - Robust reconcileOwnedResource implementation.

The function correctly handles create vs update scenarios, merges annotations to preserve existing ones, and properly sets the resource version for updates.


126-177: LGTM - PVC expansion logic with proper safety checks.

Good implementation that:

  1. Filters PVCs by label and name prefix
  2. Checks if StorageClass supports volume expansion before patching
  3. Only patches when desired size is greater than current size
internal/controller/factory/statefulset.go (4)

49-120: LGTM - GetStatefulSet refactored with correct return signature.

The function properly returns (*appsv1.StatefulSet, error) and handles all error paths by returning (nil, err).


122-133: LGTM - PodLabels with proper merging.

The function correctly builds base labels and allows user-provided labels to override them, which is documented with a comment questioning this behavior (line 126).


240-316: LGTM - Volume generation with proper TLS secret handling.

The function correctly handles:

  • EmptyDir vs PVC data volume source
  • Conditional TLS volumes for peer, server, and client certificates

366-413: LGTM - Volume mount generation matches volume generation.

The volume mounts are correctly paired with the volumes generated in generateVolumes, with proper mount paths for data and TLS certificates.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (6)
internal/controller/suite_test.go (1)

97-99: LGTM - Error check now present.

The error check for clientWithEmptyScheme creation is now properly included on line 98. For consistency with k8sClient handling (line 102), consider also adding a nil check:

 clientWithEmptyScheme, err = client.New(cfg, client.Options{Scheme: kruntime.NewScheme()})
 Expect(err).NotTo(HaveOccurred())
+Expect(clientWithEmptyScheme).NotTo(BeNil())
internal/controller/factory/pvc.go (1)

33-39: Inconsistency between trim check and returned value.

The function trims the name for validation but returns the original untrimmed value. If the intent is to accept names with leading/trailing spaces, a name like " myname " would pass validation but return with spaces, potentially causing issues with Kubernetes resource naming (DNS-1123 subdomain rules).

Consider returning the trimmed name for consistency:

🔎 Proposed fix
 func GetPVCName(cluster *etcdaenixiov1alpha1.EtcdCluster) string {
-	if len(strings.Trim(cluster.Spec.Storage.VolumeClaimTemplate.Name, " ")) > 0 {
-		return cluster.Spec.Storage.VolumeClaimTemplate.Name
+	trimmedName := strings.TrimSpace(cluster.Spec.Storage.VolumeClaimTemplate.Name)
+	if len(trimmedName) > 0 {
+		return trimmedName
 	}
 	//nolint:goconst
 	return "data"
 }
internal/controller/factory/statefulset_test.go (1)

387-721: Consider removing or tracking the large commented-out test block.

This ~330 line commented-out block contains old tests for merging logic. The TODO comment indicates these need a rewrite. Consider either:

  1. Removing entirely and tracking in a separate issue
  2. Keeping only a brief TODO reference with an issue link

Large commented-out code blocks reduce readability and may become stale.

Would you like me to open an issue to track the test rewrite, so this commented code can be removed?

internal/controller/clusterResourceInteractions_test.go (3)

722-736: Variable shadowing: ctx shadows package-level variable.

Line 725 declares a local ctx that shadows the package-level ctx from suite_test.go. While this works, it can cause confusion in debugging and may lead to subtle bugs if developers expect to use the suite's context.

Consider renaming to avoid shadowing:

 var (
 	ns         *corev1.Namespace
-	ctx        context.Context
+	testCtx    context.Context
 	fakeClient client.Client
 	cluster    *etcdaenixiov1alpha1.EtcdCluster
 )

761-764: Redundant nil check for cluster.

The cluster is initialized in BeforeEach and will never be nil at this point. This check and debug print can be removed.

 It("should update PVC if the desired size is larger", func() {
-	if cluster == nil {
-		GinkgoWriter.Printf("Cluster in nil!\n")
-	}
 	pvc := &corev1.PersistentVolumeClaim{

1156-1162: Consider using k8s.io/utils/ptr instead of custom helpers.

The stringPointer and boolPointer helper functions can be replaced with ptr.To[T] from k8s.io/utils/ptr, which is already imported in this file (line 27).

-func stringPointer(s string) *string {
-	return &s
-}
-
-func boolPointer(b bool) *bool {
-	return &b
-}
+// Use ptr.To("string") and ptr.To(true) instead
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a8456d0 and 1e1e548.

📒 Files selected for processing (9)
  • internal/controller/clusterResourceInteractions_test.go
  • internal/controller/factory/configmap.go
  • internal/controller/factory/pdb.go
  • internal/controller/factory/pdb_test.go
  • internal/controller/factory/pvc.go
  • internal/controller/factory/pvc_test.go
  • internal/controller/factory/statefulset.go
  • internal/controller/factory/statefulset_test.go
  • internal/controller/suite_test.go
🧰 Additional context used
🧬 Code graph analysis (4)
internal/controller/factory/configmap.go (2)
api/v1alpha1/etcdcluster_types.go (1)
  • EtcdCluster (91-97)
internal/controller/factory/svc.go (1)
  • GetHeadlessServiceName (42-48)
internal/controller/factory/pvc_test.go (2)
api/v1alpha1/etcdcluster_types.go (4)
  • EtcdCluster (91-97)
  • StorageSpec (176-184)
  • EmbeddedPersistentVolumeClaim (228-245)
  • EmbeddedObjectMetadata (127-150)
internal/controller/factory/pvc.go (2)
  • PVCLabels (25-31)
  • GetPVCName (33-39)
internal/controller/factory/statefulset.go (4)
api/v1alpha1/etcdcluster_types.go (2)
  • EtcdCluster (91-97)
  • PodTemplate (162-171)
internal/log/logger.go (1)
  • Debug (82-84)
internal/controller/factory/labels_builder.go (1)
  • NewLabelsBuilder (5-7)
internal/controller/factory/pvc.go (1)
  • GetPVCName (33-39)
internal/controller/factory/pdb_test.go (2)
internal/controller/factory/pdb.go (1)
  • GetPdb (33-73)
api/v1alpha1/etcdcluster_types.go (3)
  • EtcdCluster (91-97)
  • EtcdClusterSpec (28-53)
  • EmbeddedPodDisruptionBudget (248-256)
🔇 Additional comments (13)
internal/controller/factory/statefulset_test.go (1)

48-75: LGTM - PodLabels tests are well structured.

The typo "PodLabesls" from the previous review has been corrected to "PodLabels". Test coverage for label merging and nil handling is appropriate.

internal/controller/factory/configmap.go (1)

37-60: LGTM - Clean refactoring to return constructed ConfigMap.

The function now returns the constructed ConfigMap for external handling, aligning with the PR's goal of separating resource construction from Kubernetes API interactions. Error handling properly returns (nil, err) on failures.

internal/controller/clusterResourceInteractions_test.go (1)

48-1154: LGTM - Comprehensive test coverage for cluster resource interactions.

The test file provides thorough coverage for the moved resource interaction functions including ConfigMap, PDB, StatefulSet, Services, PVC updates, and client config retrieval. The TLS scenarios and edge cases are well covered.

internal/controller/factory/statefulset.go (4)

98-101: Unresolved question in comment - clarify selector label behavior.

The comment on line 99 asks whether labels should come from pod template. This should be resolved and documented, or converted to a tracked TODO. Currently, the selector uses base labels regardless of user overrides, which could cause a mismatch with pods if users override base labels.


122-133: LGTM - PodLabels function with clear merging behavior.

The function correctly merges base labels with user-provided labels, allowing user overrides. The comment on line 126 notes this intentional behavior.


240-316: LGTM - Volume generation with proper TLS secret handling.

The generateVolumes function correctly handles EmptyDir vs PVC storage and adds appropriate TLS secret volumes when security is configured.


195-212: No nil pointer dereference—webhook validation prevents this scenario.

The Default() mutating webhook (api/v1alpha1/etcdcluster_webhook.go, lines 55-65) ensures that when EmptyDir is nil, Resources.Requests is always populated with at least "4Gi" before reconciliation. Line 202 safely accesses this field because the webhook pre-condition is guaranteed. The codebase consistently relies on this pattern throughout.

Likely an incorrect or invalid review comment.

internal/controller/factory/pdb_test.go (2)

102-197: LGTM - Comprehensive quorum-based MinAvailable tests.

Excellent test coverage for the PDB quorum calculation logic, including:

  • Correct MinAvailable for 1, 3, and 5 replicas
  • User-provided overrides taking precedence
  • MaxUnavailable handling
  • Selector label validation
  • UnhealthyPodEvictionPolicy default

199-237: LGTM - Edge case handling tests.

Good coverage for nil PodDisruptionBudgetTemplate (expecting error) and 0 replicas (graceful handling with MinAvailable=1).

internal/controller/factory/pdb.go (4)

33-37: LGTM! Function signature refactoring aligns with PR objectives.

The function rename from CreateOrUpdatePdb to GetPdb and the updated return signature correctly reflect the separation of concerns—the factory now constructs and returns the PodDisruptionBudget object, leaving Kubernetes API interactions to the controller layer.


44-61: LGTM! PDB construction and quorum-based defaulting are correct.

The PodDisruptionBudget construction properly sets up:

  • ObjectMeta with namespace and name
  • Spec with template values and label selector
  • UnhealthyPodEvictionPolicy set to IfHealthyBudget for maintaining cluster availability
  • Smart defaulting: when both MinAvailable and MaxUnavailable are nil, MinAvailable is calculated from quorum size, which is essential for etcd cluster stability

62-72: LGTM! Error handling follows Go best practices.

The error handling correctly follows the (nil, error) return pattern for all failure paths:

  • contextWithGVK errors propagated as (nil, err)
  • SetControllerReference errors wrapped with context as (nil, fmt.Errorf(...))
  • Success path returns (pdb, nil)

The debug logging at line 66 provides useful context for troubleshooting.


40-42: No further action needed. The controller properly handles the nil PodDisruptionBudgetTemplate case by checking it in CreateOrUpdatePdb and deleting any existing PDB before GetPdb is called. The nil template check in GetPdb (lines 40-42) serves as defensive programming for direct calls but is unreachable during normal controller operation, where the higher-level nil check prevents it from being invoked with a nil template.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
internal/controller/factory/pdb_test.go (1)

119-124: Previous issue resolved.

The cleanup logic now correctly checks err == nil before deleting, which is the proper behavior.

🧹 Nitpick comments (3)
internal/controller/factory/pdb_test.go (2)

175-187: Consider removing GenerateName when explicitly setting Name.

At line 177, you set etcdcluster.Name = "test-cluster" but the struct was initialized with GenerateName in BeforeEach. While Kubernetes will use the explicit Name and ignore GenerateName when both are present, this pattern is confusing. Consider either:

  1. Not setting GenerateName in tests that need a specific name, or
  2. Creating a fresh EtcdCluster in this specific test
🔎 Suggested improvement
 		It("should have correct selector for etcd pods", func() {
-			etcdcluster.Spec.Replicas = ptr.To(int32(3))
-			etcdcluster.Name = "test-cluster"
+			etcdcluster = etcdaenixiov1alpha1.EtcdCluster{
+				ObjectMeta: metav1.ObjectMeta{
+					Name:      "test-cluster",
+					Namespace: ns.GetName(),
+					UID:       types.UID(uuid.NewString()),
+				},
+				Spec: etcdaenixiov1alpha1.EtcdClusterSpec{
+					Replicas:                    ptr.To(int32(3)),
+					PodDisruptionBudgetTemplate: &etcdaenixiov1alpha1.EmbeddedPodDisruptionBudget{},
+				},
+			}
 			Expect(k8sClient.Create(ctx, &etcdcluster)).Should(Succeed())

199-237: Edge case tests look good, minor cleanup pattern inconsistency.

The edge case tests properly verify error handling for nil PodDisruptionBudgetTemplate and graceful handling of 0 replicas.

Minor note: These tests use defer k8sClient.Delete(ctx, &etcdcluster) (lines 213, 232) instead of DeferCleanup(k8sClient.Delete, &etcdcluster) used elsewhere in the file. Both work correctly, but DeferCleanup is the preferred Ginkgo idiom as it integrates better with the test lifecycle and provides better error reporting.

internal/controller/factory/pvc.go (1)

33-40: Consider using strings.TrimSpace for broader whitespace handling.

The current implementation only trims space characters (" "). If other whitespace characters (tabs, newlines) should also be trimmed, consider using strings.TrimSpace instead.

🔎 Proposed change (optional)
 func GetPVCName(cluster *etcdaenixiov1alpha1.EtcdCluster) string {
-	trimmedName := strings.Trim(cluster.Spec.Storage.VolumeClaimTemplate.Name, " ")
+	trimmedName := strings.TrimSpace(cluster.Spec.Storage.VolumeClaimTemplate.Name)
 	if len(trimmedName) > 0 {
 		return trimmedName
 	}
 	//nolint:goconst
 	return "data"
 }
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1e1e548 and 9f15aff.

📒 Files selected for processing (4)
  • internal/controller/clusterResourceInteractions_test.go
  • internal/controller/factory/pdb_test.go
  • internal/controller/factory/pvc.go
  • internal/controller/factory/pvc_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/controller/clusterResourceInteractions_test.go
🧰 Additional context used
🧬 Code graph analysis (2)
internal/controller/factory/pdb_test.go (2)
internal/controller/factory/pdb.go (1)
  • GetPdb (33-73)
api/v1alpha1/etcdcluster_types.go (3)
  • EtcdCluster (91-97)
  • EtcdClusterSpec (28-53)
  • EmbeddedPodDisruptionBudget (248-256)
internal/controller/factory/pvc_test.go (1)
internal/controller/factory/pvc.go (2)
  • PVCLabels (25-31)
  • GetPVCName (33-40)
🔇 Additional comments (5)
internal/controller/factory/pdb_test.go (3)

34-45: LGTM!

Good use of DeferCleanup for namespace cleanup, which ensures proper resource cleanup even if tests fail.


79-86: LGTM!

The cleanup logic correctly checks err == nil before attempting to delete the PDB, ensuring deletion only occurs when the resource exists.


126-151: Good test coverage for quorum calculations.

The tests correctly verify the quorum formula (n/2 + 1) for various replica counts. The expected values (1 for 1 replica, 2 for 3 replicas, 3 for 5 replicas) align with etcd's quorum requirements.

internal/controller/factory/pvc_test.go (2)

53-106: Good test coverage for PVCLabels function.

The tests comprehensively cover label merging, override priority, and edge cases (nil/empty labels).


108-136: LGTM on the GetPVCName test cases.

The tests correctly verify custom name retrieval, empty name fallback to "data", whitespace-only fallback to "data", and nil template handling. The past review issue (line 134) has been properly addressed.

Also applies to: 149-158

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
internal/controller/factory/pvc_test.go (2)

26-26: Update the test suite description to reflect the actual functions being tested.

The test suite is named "UpdatePersistentVolumeClaims" but tests PVCLabels and GetPVCName functions. According to the refactoring, UpdatePersistentVolumeClaims was removed. Update the suite name to accurately describe the test scope.

🔎 Proposed fix
-var _ = Describe("UpdatePersistentVolumeClaims", func() {
+var _ = Describe("PVC Helper Functions", func() {

or alternatively:

-var _ = Describe("UpdatePersistentVolumeClaims", func() {
+var _ = Describe("PVCLabels and GetPVCName", func() {

108-159: Excellent test coverage for GetPVCName with all edge cases.

The tests comprehensively cover name resolution, trimming behavior, and edge cases. Both previously flagged issues have been correctly addressed:

  • Line 134 now expects "data" for whitespace-only input
  • Line 138 has a distinct description for the whitespace-trimming scenario
Minor: Consider improving description clarity on line 138

The phrase "name with whitespaces" could be more precise:

-		When("VolumeClaimTemplate name with whitespaces", func() {
+		When("VolumeClaimTemplate name has leading/trailing whitespace", func() {

This makes it clearer that the test validates trimming behavior rather than just the presence of whitespace characters.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9f15aff and 52a9fdd.

📒 Files selected for processing (4)
  • internal/controller/clusterResourceInteractions_test.go
  • internal/controller/factory/pdb_test.go
  • internal/controller/factory/pvc.go
  • internal/controller/factory/pvc_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • internal/controller/clusterResourceInteractions_test.go
  • internal/controller/factory/pvc.go
🧰 Additional context used
🧬 Code graph analysis (2)
internal/controller/factory/pdb_test.go (2)
internal/controller/factory/pdb.go (1)
  • GetPdb (33-73)
api/v1alpha1/etcdcluster_types.go (3)
  • EtcdCluster (91-97)
  • EtcdClusterSpec (28-53)
  • EmbeddedPodDisruptionBudget (248-256)
internal/controller/factory/pvc_test.go (2)
api/v1alpha1/etcdcluster_types.go (3)
  • StorageSpec (176-184)
  • EmbeddedPersistentVolumeClaim (228-245)
  • EmbeddedObjectMetadata (127-150)
internal/controller/factory/pvc.go (2)
  • PVCLabels (25-31)
  • GetPVCName (33-40)
🔇 Additional comments (4)
internal/controller/factory/pvc_test.go (1)

53-106: Comprehensive test coverage for PVCLabels.

The test cases thoroughly cover label merging behavior, override precedence, and edge cases (nil/empty labels). The structure is clear and follows BDD best practices.

internal/controller/factory/pdb_test.go (3)

88-99: LGTM! Basic validation tests are correct.

These tests appropriately verify that GetPdb successfully returns a PDB object for both pre-filled and empty template scenarios. The detailed behavioral validation is properly covered in subsequent test contexts.


102-197: LGTM! Comprehensive quorum calculation test coverage.

This test suite thoroughly validates:

  • Correct quorum-based MinAvailable calculations for various replica counts
  • User override precedence over automatic quorum calculation
  • Proper mutual exclusivity between MinAvailable and MaxUnavailable
  • Selector label correctness
  • Appropriate UnhealthyPodEvictionPolicy default

All test logic and expectations are correct.


200-217: LGTM! Nil template validation is correct.

Properly validates that GetPdb returns an error when PodDisruptionBudgetTemplate is nil, preventing PDB creation with incomplete configuration.

Comment on lines +219 to 237
It("should handle 0 replicas gracefully", func() {
etcdcluster := etcdaenixiov1alpha1.EtcdCluster{
ObjectMeta: metav1.ObjectMeta{
GenerateName: "test-etcdcluster-",
Namespace: ns.GetName(),
UID: types.UID(uuid.NewString()),
},
Spec: etcdaenixiov1alpha1.EtcdClusterSpec{
Replicas: ptr.To(int32(0)),
PodDisruptionBudgetTemplate: &etcdaenixiov1alpha1.EmbeddedPodDisruptionBudget{},
},
}
Expect(k8sClient.Create(ctx, &etcdcluster)).Should(Succeed())
DeferCleanup(k8sClient.Delete, &etcdcluster)

pdbObj, err := GetPdb(ctx, &etcdcluster, k8sClient)
Expect(err).ShouldNot(HaveOccurred())
Expect(pdbObj.Spec.MinAvailable).To(Equal(ptr.To(intstr.FromInt32(1))))
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find CalculateQuorumSize method implementation

# First, search for CalculateQuorumSize in the codebase
rg "CalculateQuorumSize" -A 15 --type go

Repository: aenix-io/etcd-operator

Length of output: 5257


🏁 Script executed:

#!/bin/bash
# Search for Replicas validation and 0 replicas handling

echo "=== Searching for Replicas validation ==="
rg "Replicas.*nil|Replicas.*0" --type go -B 2 -A 2

echo ""
echo "=== Checking etcdcluster_webhook for validation ==="
rg "Replicas" internal/controller/factory/pdb.go api/v1alpha1/etcdcluster_webhook.go -B 1 -A 1

Repository: aenix-io/etcd-operator

Length of output: 4639


🏁 Script executed:

#!/bin/bash
# Check if GetPdb function has any guards for 0 replicas
rg "GetPdb" -A 30 internal/controller/factory/pdb.go | head -60

Repository: aenix-io/etcd-operator

Length of output: 1205


🏁 Script executed:

#!/bin/bash
# Check webhook validation and comments about 0 replicas
rg "0.*replica|replica.*0" -B 3 -A 3 --type go api/v1alpha1/etcdcluster_webhook.go api/v1alpha1/etcdcluster_webhook_test.go

Repository: aenix-io/etcd-operator

Length of output: 796


🏁 Script executed:

#!/bin/bash
# Check if there are any comments or documentation about handling 0 replicas in GetPdb or pdb_test
echo "=== GetPdb context around 0 replicas ==="
rg "0|CalculateQuorumSize" internal/controller/factory/pdb.go -B 2 -A 2

echo ""
echo "=== Check pdb_test for context about this test ==="
sed -n '210,240p' internal/controller/factory/pdb_test.go

Repository: aenix-io/etcd-operator

Length of output: 1777


The 0 replicas PDB configuration is semantically incorrect and requires fixing.

Setting MinAvailable=1 when Replicas=0 violates the PDB contract. The system intentionally allows 0 replicas (as confirmed by the webhook test comment: "User should have an opportunity to create cluster with 0 replicas"), but the PDB formula CalculateQuorumSize() returns 1 for 0 replicas, creating a contradiction where at least 1 pod must always be available when none exist.

This blocks valid operational scenarios like scaling down to zero for maintenance or cost optimization. The CalculateQuorumSize() formula (replicas/2 + 1) needs special handling for the 0 replicas edge case:

  • Either skip PDB creation when replicas=0
  • Or explicitly set MinAvailable=0 for this case
🤖 Prompt for AI Agents
internal/controller/factory/pdb_test.go lines 219-237: the test currently
asserts MinAvailable==1 for a cluster with Replicas=0 which is semantically
incorrect; update the PDB generation logic (CalculateQuorumSize or GetPdb) to
special-case replicas==0 and return MinAvailable=0 (or skip creating a PDB),
then update this test to expect MinAvailable==0 (or expect no PDB returned) so
the behavior is consistent with allowing 0 replicas.

@lllamnyp lllamnyp self-assigned this Dec 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants