Extend CacheRuntime phase 2.1: add test for component and remove unnecessary schema field for component #5809
Conversation
Signed-off-by: xliuqq <xlzq1992@gmail.com>
There was a problem hiding this comment.
Code Review
This pull request refactors the cache component management by removing the runtime.Scheme dependency from component managers and updating the CacheEngine to retrieve the scheme directly from the client. It also introduces comprehensive unit tests for the component managers and improves the robustness of StatefulSet construction by handling cases where the service configuration might be nil. Feedback was provided regarding the removal of the unused scheme field in DaemonSetManager and improving the accuracy of the status phase calculation for DaemonSet workloads.
| // DaemonSet always returns Ready phase regardless of actual status | ||
| Expect(status.Phase).To(Equal(datav1alpha1.RuntimePhaseReady)) |
There was a problem hiding this comment.
The test asserts that the DaemonSet phase is always RuntimePhaseReady, which reflects a hardcoded implementation in DaemonSetManager.ConstructComponentStatus. For better accuracy and consistency with the StatefulSetManager implementation, the phase should be calculated based on whether the desired number of pods are actually ready (e.g., checking if ds.Status.DesiredNumberScheduled == ds.Status.NumberReady).
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #5809 +/- ##
==========================================
+ Coverage 58.17% 58.89% +0.71%
==========================================
Files 478 479 +1
Lines 32485 32443 -42
==========================================
+ Hits 18899 19107 +208
+ Misses 12042 11784 -258
- Partials 1544 1552 +8 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
cheyang
left a comment
There was a problem hiding this comment.
Review Summary
Overall direction is correct — removing unnecessary runtime.Scheme dependency from component managers and adding unit tests. However, there are a few issues that should be addressed before merge.
🔴 Blocking Issues
1. DaemonSetManager.ConstructComponentStatus always returns RuntimePhaseReady
daemonset_manager.go — ConstructComponentStatus hard-codes Phase: datav1alpha1.RuntimePhaseReady regardless of actual pod readiness. This means a DaemonSet with 0 ready pods still reports "Ready", which is semantically wrong and inconsistent with StatefulSetManager (which correctly uses desiredReplicas == readyReplicas → Ready).
The test case "should return Ready phase even when some nodes are unavailable" also validates this behavior and comments "DaemonSet always returns Ready phase regardless of actual status", which confirms this is a known bug rather than intentional design.
Suggestion: Use NumberReady == DesiredNumberScheduled → Ready, otherwise NotReady, consistent with the StatefulSet logic.
2. StatefulSetManager.UnavailableReplicas can produce negative values
statefulset_manager.go:ConstructComponentStatus — UnavailableReplicas: sts.Status.CurrentReplicas - sts.Status.AvailableReplicas. When AvailableReplicas > CurrentReplicas (Kubernetes API allows this), the result is a negative int32.
Suggestion: Use max(0, Current-Available) or reconsider the calculation.
🟡 Non-blocking Risks
3. Copy-paste error in DaemonSetManager log messages
daemonset_manager.go — reconcileDaemonSet logs "start to reconciling dst workload" and "create sts workload succeed" but operates on a DaemonSet, not a StatefulSet. Suggest correcting to "ds workload" / "daemonset workload".
4. Scheme acquisition change risk in engine.go
e.Scheme is now obtained via ctx.Client.Scheme() instead of direct assignment. If ctx.Client is a wrapped/delegated client whose Scheme() returns nil, downstream code that still references e.Scheme could break. Low risk since component managers no longer use scheme directly, but worth a quick sanity check.
📋 Missing Tests
5. No test for DaemonSet ConstructComponentStatus with 0 ready pods
Current tests cover "all ready" and "partially unavailable but still Ready". Should add a test for NumberReady == 0 case, especially if the phase logic is fixed.
💡 Optional Improvements
6. Duplicate constructService code across managers
DaemonSetManager.constructService and StatefulSetManager.constructService are nearly identical — this is likely the source of the 30.6% duplication SonarQube flagged. Consider extracting to a shared helper function.
7. Repeated scheme+client setup in test BeforeEach blocks
The test initialization pattern is identical across StatefulSetManager and DaemonSetManager test sections. Could extract to a shared helper.
Signed-off-by: xliuqq <xlzq1992@gmail.com>
| readyReplicas := ds.Status.NumberReady | ||
|
|
||
| runtimePhase := datav1alpha1.RuntimePhaseNotReady | ||
| if desiredReplicas == readyReplicas { |
There was a problem hiding this comment.
Nit: the phase logic added here (and the equivalent in StatefulSetManager) is internally consistent, but worth noting that every current caller overrides the returned Phase — setupMasterInternal/SetupWorkerInternal/SetupClientInternal force it to RuntimePhaseNotReady, and status.go's setXxxComponentStatus methods recalculate it independently. So the phase value returned from ConstructComponentStatus is effectively dead code today.
Not a problem right now — if a future caller relies on the returned phase directly, the logic here is correct. Just flagging for awareness.
Also a minor edge case: when all nodes are tainted and DesiredNumberScheduled == 0, the 0 == 0 check evaluates to RuntimePhaseReady. This is consistent with how the StatefulSet manager and status.go treat DesiredReplicas == 0, so it's fine — just worth being aware of this semantic in corner cases.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Set ServiceName if service is configured | ||
| if component.Service != nil { | ||
| sts.Spec.ServiceName = component.Service.Name | ||
| } |
There was a problem hiding this comment.
For StatefulSet workloads, spec.serviceName is required by Kubernetes validation. With the new conditional, a nil component.Service will produce a StatefulSet with an empty ServiceName, which will be rejected by the API server at create time. Consider either (a) returning an explicit error when component.Service == nil for StatefulSets, or (b) defaulting/creating a headless service name and setting sts.Spec.ServiceName accordingly so the resource is valid.
| It("should handle nil service gracefully", func() { | ||
| component.Service = nil | ||
| err := manager.Reconciler(ctx, component) | ||
| Expect(err).NotTo(HaveOccurred()) | ||
|
|
||
| // Verify StatefulSet was created without ServiceName | ||
| sts := &appsv1.StatefulSet{} | ||
| err = manager.client.Get(ctx, types.NamespacedName{ | ||
| Name: "test-runtime-master", | ||
| Namespace: "fluid", | ||
| }, sts) | ||
| Expect(err).NotTo(HaveOccurred()) | ||
| Expect(sts.Spec.ServiceName).To(BeEmpty()) |
There was a problem hiding this comment.
This test asserts that a StatefulSet reconciliation succeeds when component.Service is nil, but in a real cluster a StatefulSet without spec.serviceName will fail validation. Updating this test to expect an error (or updating the production code to always supply a valid ServiceName for StatefulSets) will prevent false confidence from the fake client, which does not enforce API validation.
| It("should handle nil service gracefully", func() { | |
| component.Service = nil | |
| err := manager.Reconciler(ctx, component) | |
| Expect(err).NotTo(HaveOccurred()) | |
| // Verify StatefulSet was created without ServiceName | |
| sts := &appsv1.StatefulSet{} | |
| err = manager.client.Get(ctx, types.NamespacedName{ | |
| Name: "test-runtime-master", | |
| Namespace: "fluid", | |
| }, sts) | |
| Expect(err).NotTo(HaveOccurred()) | |
| Expect(sts.Spec.ServiceName).To(BeEmpty()) | |
| It("should return an error when service is nil for StatefulSet reconciliation", func() { | |
| component.Service = nil | |
| err := manager.Reconciler(ctx, component) | |
| Expect(err).To(HaveOccurred()) |
| It("should return Ready phase even when some nodes are unavailable", func() { | ||
| ds := &appsv1.DaemonSet{ | ||
| ObjectMeta: metav1.ObjectMeta{ | ||
| Name: "test-runtime-worker", | ||
| Namespace: "fluid", | ||
| }, | ||
| Status: appsv1.DaemonSetStatus{ | ||
| DesiredNumberScheduled: 3, | ||
| CurrentNumberScheduled: 3, | ||
| NumberAvailable: 2, | ||
| NumberUnavailable: 1, | ||
| NumberReady: 2, | ||
| }, | ||
| } | ||
| Expect(manager.client.Create(ctx, ds)).To(Succeed()) | ||
|
|
||
| status, err := manager.ConstructComponentStatus(ctx, component) | ||
| Expect(err).NotTo(HaveOccurred()) | ||
| Expect(status.DesiredReplicas).To(Equal(int32(3))) | ||
| Expect(status.ReadyReplicas).To(Equal(int32(2))) | ||
| Expect(status.AvailableReplicas).To(Equal(int32(2))) | ||
| Expect(status.UnavailableReplicas).To(Equal(int32(1))) | ||
| // DaemonSet should return NotReady when not all replicas are ready | ||
| Expect(status.Phase).To(Equal(datav1alpha1.RuntimePhaseNotReady)) | ||
| }) |
There was a problem hiding this comment.
The It description says "should return Ready phase" but the assertions (and the inline comment) expect RuntimePhaseNotReady. Please rename the test case string to match the expected behavior to avoid confusing future readers.
Signed-off-by: xliuqq <xlzq1992@gmail.com>
|
cheyang
left a comment
There was a problem hiding this comment.
LGTM. The Scheme removal is clean — no residual references remain in pkg/ddc/cache/. The two blocking bugs from the previous review (DaemonSet always-Ready status and StatefulSet negative UnavailableReplicas) are properly fixed. Service code deduplication into service.go is straightforward. Tests cover creation, idempotency, nil Service, and status phase logic for both manager types. CI is green across K8s 1.22–1.33.
/approve
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: cheyang The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |



Ⅰ. Describe what this PR does
add test for component and remove unnecessary schema field for component.
Ⅱ. Does this pull request fix one issue?
part of #5412
Ⅲ. List the added test cases (unit test/integration test) if any, please explain if no tests are needed.
unit test
Ⅳ. Describe how to verify it
Ⅴ. Special notes for reviews