Skip to content

Extend CacheRuntime phase 2.1: add test for component and remove unnecessary schema field for component #5809

Merged
cheyang merged 4 commits intofluid-cloudnative:masterfrom
xliuqq:cache_op_test
Apr 29, 2026
Merged

Extend CacheRuntime phase 2.1: add test for component and remove unnecessary schema field for component #5809
cheyang merged 4 commits intofluid-cloudnative:masterfrom
xliuqq:cache_op_test

Conversation

@xliuqq
Copy link
Copy Markdown
Collaborator

@xliuqq xliuqq commented Apr 27, 2026

Ⅰ. Describe what this PR does

add test for component and remove unnecessary schema field for component.

Ⅱ. Does this pull request fix one issue?

part of #5412

Ⅲ. List the added test cases (unit test/integration test) if any, please explain if no tests are needed.

unit test

Ⅳ. Describe how to verify it

Ⅴ. Special notes for reviews

Signed-off-by: xliuqq <xlzq1992@gmail.com>
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request refactors the cache component management by removing the runtime.Scheme dependency from component managers and updating the CacheEngine to retrieve the scheme directly from the client. It also introduces comprehensive unit tests for the component managers and improves the robustness of StatefulSet construction by handling cases where the service configuration might be nil. Feedback was provided regarding the removal of the unused scheme field in DaemonSetManager and improving the accuracy of the status phase calculation for DaemonSet workloads.

Comment thread pkg/ddc/cache/component/daemonset_manager.go
Comment on lines +428 to +429
// DaemonSet always returns Ready phase regardless of actual status
Expect(status.Phase).To(Equal(datav1alpha1.RuntimePhaseReady))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The test asserts that the DaemonSet phase is always RuntimePhaseReady, which reflects a hardcoded implementation in DaemonSetManager.ConstructComponentStatus. For better accuracy and consistency with the StatefulSetManager implementation, the phase should be calculated based on whether the desired number of pods are actually ready (e.g., checking if ds.Status.DesiredNumberScheduled == ds.Status.NumberReady).

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 27, 2026

Codecov Report

❌ Patch coverage is 78.02198% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 58.89%. Comparing base (82e490e) to head (68c1e81).
⚠️ Report is 6 commits behind head on master.

Files with missing lines Patch % Lines
pkg/ddc/cache/component/service.go 87.75% 4 Missing and 2 partials ⚠️
pkg/ddc/cache/engine/engine.go 0.00% 5 Missing ⚠️
pkg/ddc/cache/component/statefulset_manager.go 75.00% 2 Missing and 1 partial ⚠️
pkg/ddc/cache/engine/status.go 0.00% 3 Missing ⚠️
pkg/ddc/cache/engine/client.go 0.00% 1 Missing ⚠️
pkg/ddc/cache/engine/master.go 0.00% 1 Missing ⚠️
pkg/ddc/cache/engine/worker.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5809      +/-   ##
==========================================
+ Coverage   58.17%   58.89%   +0.71%     
==========================================
  Files         478      479       +1     
  Lines       32485    32443      -42     
==========================================
+ Hits        18899    19107     +208     
+ Misses      12042    11784     -258     
- Partials     1544     1552       +8     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Signed-off-by: xliuqq <xlzq1992@gmail.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link
Copy Markdown
Collaborator

@cheyang cheyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

Overall direction is correct — removing unnecessary runtime.Scheme dependency from component managers and adding unit tests. However, there are a few issues that should be addressed before merge.

🔴 Blocking Issues

1. DaemonSetManager.ConstructComponentStatus always returns RuntimePhaseReady

daemonset_manager.goConstructComponentStatus hard-codes Phase: datav1alpha1.RuntimePhaseReady regardless of actual pod readiness. This means a DaemonSet with 0 ready pods still reports "Ready", which is semantically wrong and inconsistent with StatefulSetManager (which correctly uses desiredReplicas == readyReplicas → Ready).

The test case "should return Ready phase even when some nodes are unavailable" also validates this behavior and comments "DaemonSet always returns Ready phase regardless of actual status", which confirms this is a known bug rather than intentional design.

Suggestion: Use NumberReady == DesiredNumberScheduled → Ready, otherwise NotReady, consistent with the StatefulSet logic.

2. StatefulSetManager.UnavailableReplicas can produce negative values

statefulset_manager.go:ConstructComponentStatusUnavailableReplicas: sts.Status.CurrentReplicas - sts.Status.AvailableReplicas. When AvailableReplicas > CurrentReplicas (Kubernetes API allows this), the result is a negative int32.

Suggestion: Use max(0, Current-Available) or reconsider the calculation.

🟡 Non-blocking Risks

3. Copy-paste error in DaemonSetManager log messages

daemonset_manager.goreconcileDaemonSet logs "start to reconciling dst workload" and "create sts workload succeed" but operates on a DaemonSet, not a StatefulSet. Suggest correcting to "ds workload" / "daemonset workload".

4. Scheme acquisition change risk in engine.go

e.Scheme is now obtained via ctx.Client.Scheme() instead of direct assignment. If ctx.Client is a wrapped/delegated client whose Scheme() returns nil, downstream code that still references e.Scheme could break. Low risk since component managers no longer use scheme directly, but worth a quick sanity check.

📋 Missing Tests

5. No test for DaemonSet ConstructComponentStatus with 0 ready pods

Current tests cover "all ready" and "partially unavailable but still Ready". Should add a test for NumberReady == 0 case, especially if the phase logic is fixed.

💡 Optional Improvements

6. Duplicate constructService code across managers

DaemonSetManager.constructService and StatefulSetManager.constructService are nearly identical — this is likely the source of the 30.6% duplication SonarQube flagged. Consider extracting to a shared helper function.

7. Repeated scheme+client setup in test BeforeEach blocks

The test initialization pattern is identical across StatefulSetManager and DaemonSetManager test sections. Could extract to a shared helper.

Signed-off-by: xliuqq <xlzq1992@gmail.com>
@cheyang cheyang self-requested a review April 28, 2026 07:00
readyReplicas := ds.Status.NumberReady

runtimePhase := datav1alpha1.RuntimePhaseNotReady
if desiredReplicas == readyReplicas {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: the phase logic added here (and the equivalent in StatefulSetManager) is internally consistent, but worth noting that every current caller overrides the returned PhasesetupMasterInternal/SetupWorkerInternal/SetupClientInternal force it to RuntimePhaseNotReady, and status.go's setXxxComponentStatus methods recalculate it independently. So the phase value returned from ConstructComponentStatus is effectively dead code today.

Not a problem right now — if a future caller relies on the returned phase directly, the logic here is correct. Just flagging for awareness.

Also a minor edge case: when all nodes are tainted and DesiredNumberScheduled == 0, the 0 == 0 check evaluates to RuntimePhaseReady. This is consistent with how the StatefulSet manager and status.go treat DesiredReplicas == 0, so it's fine — just worth being aware of this semantic in corner cases.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +105 to 108
// Set ServiceName if service is configured
if component.Service != nil {
sts.Spec.ServiceName = component.Service.Name
}
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For StatefulSet workloads, spec.serviceName is required by Kubernetes validation. With the new conditional, a nil component.Service will produce a StatefulSet with an empty ServiceName, which will be rejected by the API server at create time. Consider either (a) returning an explicit error when component.Service == nil for StatefulSets, or (b) defaulting/creating a headless service name and setting sts.Spec.ServiceName accordingly so the resource is valid.

Copilot uses AI. Check for mistakes.
Comment on lines +201 to +213
It("should handle nil service gracefully", func() {
component.Service = nil
err := manager.Reconciler(ctx, component)
Expect(err).NotTo(HaveOccurred())

// Verify StatefulSet was created without ServiceName
sts := &appsv1.StatefulSet{}
err = manager.client.Get(ctx, types.NamespacedName{
Name: "test-runtime-master",
Namespace: "fluid",
}, sts)
Expect(err).NotTo(HaveOccurred())
Expect(sts.Spec.ServiceName).To(BeEmpty())
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test asserts that a StatefulSet reconciliation succeeds when component.Service is nil, but in a real cluster a StatefulSet without spec.serviceName will fail validation. Updating this test to expect an error (or updating the production code to always supply a valid ServiceName for StatefulSets) will prevent false confidence from the fake client, which does not enforce API validation.

Suggested change
It("should handle nil service gracefully", func() {
component.Service = nil
err := manager.Reconciler(ctx, component)
Expect(err).NotTo(HaveOccurred())
// Verify StatefulSet was created without ServiceName
sts := &appsv1.StatefulSet{}
err = manager.client.Get(ctx, types.NamespacedName{
Name: "test-runtime-master",
Namespace: "fluid",
}, sts)
Expect(err).NotTo(HaveOccurred())
Expect(sts.Spec.ServiceName).To(BeEmpty())
It("should return an error when service is nil for StatefulSet reconciliation", func() {
component.Service = nil
err := manager.Reconciler(ctx, component)
Expect(err).To(HaveOccurred())

Copilot uses AI. Check for mistakes.
Comment on lines +409 to +433
It("should return Ready phase even when some nodes are unavailable", func() {
ds := &appsv1.DaemonSet{
ObjectMeta: metav1.ObjectMeta{
Name: "test-runtime-worker",
Namespace: "fluid",
},
Status: appsv1.DaemonSetStatus{
DesiredNumberScheduled: 3,
CurrentNumberScheduled: 3,
NumberAvailable: 2,
NumberUnavailable: 1,
NumberReady: 2,
},
}
Expect(manager.client.Create(ctx, ds)).To(Succeed())

status, err := manager.ConstructComponentStatus(ctx, component)
Expect(err).NotTo(HaveOccurred())
Expect(status.DesiredReplicas).To(Equal(int32(3)))
Expect(status.ReadyReplicas).To(Equal(int32(2)))
Expect(status.AvailableReplicas).To(Equal(int32(2)))
Expect(status.UnavailableReplicas).To(Equal(int32(1)))
// DaemonSet should return NotReady when not all replicas are ready
Expect(status.Phase).To(Equal(datav1alpha1.RuntimePhaseNotReady))
})
Copy link

Copilot AI Apr 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The It description says "should return Ready phase" but the assertions (and the inline comment) expect RuntimePhaseNotReady. Please rename the test case string to match the expected behavior to avoid confusing future readers.

Copilot uses AI. Check for mistakes.
Signed-off-by: xliuqq <xlzq1992@gmail.com>
@sonarqubecloud
Copy link
Copy Markdown

Copy link
Copy Markdown
Collaborator

@cheyang cheyang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The Scheme removal is clean — no residual references remain in pkg/ddc/cache/. The two blocking bugs from the previous review (DaemonSet always-Ready status and StatefulSet negative UnavailableReplicas) are properly fixed. Service code deduplication into service.go is straightforward. Tests cover creation, idempotency, nil Service, and status phase logic for both manager types. CI is green across K8s 1.22–1.33.

/approve

@fluid-e2e-bot
Copy link
Copy Markdown

fluid-e2e-bot Bot commented Apr 29, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cheyang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cheyang cheyang merged commit cfa6d47 into fluid-cloudnative:master Apr 29, 2026
17 of 18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants