[FA] Serialize UPDATER_TASK execution and fix no-op config version clearing by coignetp · Pull Request #2956 · DataDog/datadog-operator

coignetp · 2026-04-29T09:46:54Z

What does this PR do?

Ports three fixes from #2952 that are independent of the parseTaskNSN / namespace-from-task refactor.

1. Serialize UPDATER_TASK execution (`taskMu` + `handleTask`)

Adds a `taskMu sync.Mutex` to `Daemon` and extracts a `handleTask` method that holds the lock for the full lifecycle of one task (RUNNING → DONE/ERROR/INVALID_STATE). The inline block in `Start()` is replaced with `return d.handleTask(ctx, req)`.

Why: Multiple concurrent RC callbacks can race on setTaskState. This matches the datadog-agent's single-writer model.
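To illustrate the locking pattern described above, here is a minimal, self-contained sketch. It is not the operator's code: the `Request`, `transition`, `runConcurrent`, and `interleaved` names are hypothetical, `Daemon` is reduced to the two fields needed for the demo, and only the `taskMu`, `handleTask`, and `setTaskState` names come from this PR. The point is that holding one mutex across the whole RUNNING → terminal-state sequence keeps transitions from different callbacks from interleaving.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// TaskState uses the state names from the PR description.
type TaskState string

const (
	StateRunning TaskState = "RUNNING"
	StateDone    TaskState = "DONE"
	StateError   TaskState = "ERROR"
)

// Request is a hypothetical stand-in for a remote task payload.
type Request struct {
	ID   string
	Fail bool
}

// transition records one state change (hypothetical, for the demo only).
type transition struct {
	ID    string
	State TaskState
}

// Daemon is a simplified stand-in, not the operator's real type.
type Daemon struct {
	taskMu  sync.Mutex   // held for the full lifecycle of one task
	history []transition // only written while taskMu is held
}

// setTaskState is unsynchronized on its own; callers must hold taskMu.
func (d *Daemon) setTaskState(id string, s TaskState) {
	d.history = append(d.history, transition{ID: id, State: s})
}

// handleTask runs one task start to finish under taskMu.
func (d *Daemon) handleTask(ctx context.Context, req Request) error {
	d.taskMu.Lock()
	defer d.taskMu.Unlock()

	d.setTaskState(req.ID, StateRunning)
	if err := ctx.Err(); err != nil {
		d.setTaskState(req.ID, StateError)
		return err
	}
	if req.Fail {
		d.setTaskState(req.ID, StateError)
		return fmt.Errorf("task %s failed", req.ID)
	}
	d.setTaskState(req.ID, StateDone)
	return nil
}

// runConcurrent fires n callbacks at once and returns the recorded history.
func runConcurrent(n int) []transition {
	d := &Daemon{}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			req := Request{ID: fmt.Sprintf("task-%d", i), Fail: i%2 == 1}
			_ = d.handleTask(context.Background(), req)
		}(i)
	}
	wg.Wait()
	return d.history
}

// interleaved reports whether any RUNNING entry is not immediately
// followed by a terminal entry for the same task.
func interleaved(h []transition) bool {
	if len(h)%2 != 0 {
		return true
	}
	for i := 0; i < len(h); i += 2 {
		if h[i].State != StateRunning || h[i+1].State == StateRunning || h[i].ID != h[i+1].ID {
			return true
		}
	}
	return false
}

func main() {
	h := runConcurrent(8)
	fmt.Printf("transitions=%d interleaved=%v\n", len(h), interleaved(h))
}
```

Dropping the `Lock`/`Unlock` pair from this sketch makes the `append` in `setTaskState` a data race, which `go run -race` should flag.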

2. No-op stop clears experiment config version

When canStop returns isNoOp=true (e.g. spec already rolled back due to timeout), the experiment config version is now cleared before returning.

Why: Without this, the backend could not issue new tasks after a timeout-triggered rollback.

3. No-op promote clears experiment config version

Same fix for promoteDatadogAgentExperiment when the experiment is already in a terminal state.
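The two no-op fixes above can be sketched as follows. This is an illustrative model under stated assumptions, not the operator's implementation: the `Phase` values, the `experimentState` struct and its field names, and the `stopExperiment` / `promoteExperiment` functions are hypothetical, and the `canStop` signature is guessed from the description (only the `isNoOp` result is mentioned in this PR). It shows the one behavior that matters here: the no-op branches clear the experiment config version before returning.

```go
package main

import "fmt"

// Phase is a hypothetical experiment phase; the real names may differ.
type Phase string

const (
	PhaseRunning    Phase = "running"
	PhaseRolledBack Phase = "rolled_back"
	PhasePromoted   Phase = "promoted"
)

// experimentState is a hypothetical, simplified view of experiment status.
type experimentState struct {
	Phase                   Phase
	ExperimentConfigVersion string
	StableConfigVersion     string
}

// canStop reports whether a stop is a no-op (assumed signature).
func canStop(s *experimentState) (isNoOp bool, err error) {
	switch s.Phase {
	case PhaseRunning:
		return false, nil
	case PhaseRolledBack:
		// e.g. the spec was already rolled back after a timeout.
		return true, nil
	default:
		return false, fmt.Errorf("cannot stop experiment in phase %q", s.Phase)
	}
}

// stopExperiment rolls back a running experiment, or does nothing if the
// rollback already happened. Both paths clear the experiment version.
func stopExperiment(s *experimentState) error {
	isNoOp, err := canStop(s)
	if err != nil {
		return err
	}
	if isNoOp {
		// The fix: clear the stale version even though nothing else changes.
		s.ExperimentConfigVersion = ""
		return nil
	}
	s.Phase = PhaseRolledBack
	s.ExperimentConfigVersion = ""
	return nil
}

// promoteExperiment promotes a running experiment, or does nothing if the
// experiment is already terminal. Both paths clear the experiment version.
func promoteExperiment(s *experimentState) error {
	switch s.Phase {
	case PhasePromoted, PhaseRolledBack:
		// Already terminal: no-op, but still clear the stale version.
		s.ExperimentConfigVersion = ""
		return nil
	case PhaseRunning:
		s.StableConfigVersion = s.ExperimentConfigVersion
		s.ExperimentConfigVersion = ""
		s.Phase = PhasePromoted
		return nil
	default:
		return fmt.Errorf("cannot promote experiment in phase %q", s.Phase)
	}
}

func main() {
	s := &experimentState{Phase: PhaseRolledBack, ExperimentConfigVersion: "v2"}
	err := stopExperiment(s)
	fmt.Printf("err=%v experimentVersion=%q\n", err, s.ExperimentConfigVersion)
}
```

In this model, a leftover `ExperimentConfigVersion` after a no-op is the stale state that, per the description above, kept the backend from issuing new tasks.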

Checklist

PR has at least one valid label: bug, enhancement, refactoring, documentation, tooling, and/or dependencies
PR has a milestone or the qa/skip-qa label
All commits are signed (see: signing commits)

…earing

- Add taskMu mutex + handleTask method to serialize concurrent RC task callbacks, preventing races in setTaskState (matches datadog-agent single-writer model)
- Clear experiment config version on no-op stop/promote so the backend can issue new tasks after a timeout or already-terminal experiment state

Extracted from #2952 (excludes the parseTaskNSN / namespace-from-task refactor).

codecov-commenter · 2026-04-29T09:58:41Z

Codecov Report

❌ Patch coverage is 94.73684% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 41.45%. Comparing base (099f33f) to head (17d9deb).

Files with missing lines	Patch %	Lines
pkg/fleet/daemon.go	94.73%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2956      +/-   ##
==========================================
+ Coverage   41.38%   41.45%   +0.07%     
==========================================
  Files         327      327              
  Lines       28952    28961       +9     
==========================================
+ Hits        11982    12006      +24     
+ Misses      16109    16094      -15     
  Partials      861      861

Flag	Coverage Δ
unittests	`41.45% <94.73%> (+0.07%)`	⬆️

Files with missing lines	Coverage Δ
pkg/fleet/daemon.go	`71.83% <94.73%> (+6.37%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

datadog-prod-us1-6 · 2026-04-29T10:01:08Z

🎯 Code Coverage (details)
• Patch Coverage: 94.44%
• Overall Coverage: 41.59% (+0.07%)

🔗 Commit SHA: 17d9deb

github-actions Bot added the team/fleet label Apr 29, 2026

coignetp added the enhancement label Apr 29, 2026

coignetp added this to the v1.26.0 milestone Apr 29, 2026

coignetp marked this pull request as ready for review April 29, 2026 11:29

coignetp requested a review from a team April 29, 2026 11:29

coignetp requested a review from a team as a code owner April 29, 2026 11:29

Hitsuji-M approved these changes Apr 29, 2026

coignetp wants to merge 1 commit into main from paul/fa-fleet-state

3 participants
