fix(recommendations): apply per-account override when no global ServiceConfig exists (closes #202) #241

Merged: cristim merged 4 commits into feat/multicloud-web-frontend from fix/issue-202-override-without-global on May 3, 2026

Conversation

cristim (Member) commented May 3, 2026

Summary

  • Fixes the silent drop of a per-account AccountServiceOverride when no global ServiceConfig row exists for the same (provider, service) pair (closes #202, "fix(recommendations): per-account override silently ignored when no global ServiceConfig exists")
  • ResolveServiceConfig now handles a nil global by synthesising a safe default baseline (Enabled: true; Provider/Service from call parameters) before applying override fields
  • ResolveAccountConfigsForRecs refactored: the global-fetch logic is extracted into a globalConfigCache helper (which reduces cyclomatic complexity), the override lookup now always runs even when the global is absent, and a triple is skipped only when both global and override are nil
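
The nil-global handling summarised above could look roughly like the following Go sketch. The struct shapes are assumptions: only Enabled, Provider and Service are named in this PR, so the Regions field and the "nil means not overridden" convention on the override are hypothetical stand-ins, not the project's actual types.

```go
package main

// ServiceConfig is a trimmed-down, hypothetical stand-in for the real type.
type ServiceConfig struct {
	Provider string
	Service  string
	Enabled  bool
	Regions  []string // hypothetical slice field, for illustration only
}

// AccountServiceOverride is likewise hypothetical. A nil field is assumed to
// mean "this account does not override that setting".
type AccountServiceOverride struct {
	Enabled *bool
	Regions []string
}

// ResolveServiceConfig merges override onto global. When global is nil but an
// override exists, it starts from a synthesized baseline (Enabled: true plus
// the provider/service passed in) instead of dropping the override.
func ResolveServiceConfig(provider, service string, global *ServiceConfig, override *AccountServiceOverride) *ServiceConfig {
	if override == nil {
		return global // unchanged; nil when there is nothing to apply
	}

	var out ServiceConfig
	if global != nil {
		out = *global
		// Copy the slice so the caller's global config is never aliased.
		out.Regions = append([]string(nil), global.Regions...)
	} else {
		out = ServiceConfig{Enabled: true, Provider: provider, Service: service}
	}

	if override.Enabled != nil {
		out.Enabled = *override.Enabled
	}
	if override.Regions != nil {
		out.Regions = append([]string(nil), override.Regions...)
	}
	return &out
}
```

With these assumed types, calling ResolveServiceConfig("aws", "ec2", nil, override) yields a config carrying the override values on top of the synthesized baseline, while passing nil for both global and override yields nil.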

Test plan

  • TestResolveServiceConfig_NilGlobalWithOverride: override-without-global synthesises the correct baseline and applies override fields
  • TestResolveServiceConfig_BothNil: both nil returns nil (no panic)
  • TestResolveAccountConfigsForRecs_OverrideWithoutGlobal_OverrideApplied: regression test for #202; the entry lands in the map with correct values
  • TestResolveAccountConfigsForRecs_NoGlobalNoOverride_TripleSkipped: renamed, with semantics clarified (skip only when both are absent)
  • TestResolveAccountConfigsForRecs_OverrideWithGlobal_MergesCorrectly: regression guard for the existing #200 merge path (feat(recommendations): apply per-account overrides at read time (#196))
  • TestResolveAccountConfigsForRecs_GlobalAbsentCachedNegative: updated to expect overrideCalls=3 (one per account); the global is still fetched only once
  • All 483 config package tests pass; scheduler (71) and API (1049) packages clean; go vet clean; gofmt clean; cyclomatic complexity within limit

Summary by CodeRabbit

  • Refactor

    • Improved configuration resolution and caching to avoid redundant lookups and ensure per-account overrides are applied even when no global config exists.
    • Synthesized baseline configs so overrides merge reliably onto a consistent baseline.
  • Tests

    • Expanded coverage with regressions for override-without-global, merge correctness, and lookup deduplication.
  • Bug Fix

    • Clearer timeout reporting for bounded migrations when operations exceed the configured timeout.

…viceConfig absent (closes #202)

ResolveAccountConfigsForRecs previously skipped a triple entirely when
GetServiceConfig returned nil, even if a per-account AccountServiceOverride
existed. The override was silently dropped, causing the recommendation list
to appear unfiltered for that account — contrary to what the operator
configured.

Fix:
- ResolveServiceConfig now accepts provider and service as explicit parameters
  and handles a nil global by synthesising a safe baseline (Enabled: true)
  before applying the override. All scalar/slice fields from the override are
  merged as before.
- ResolveAccountConfigsForRecs no longer short-circuits the override lookup
  when the global is absent. The triple is only skipped when *both* global
  and override are nil (nothing to apply).

Tests:
- TestResolveServiceConfig_NilGlobalWithOverride — override without global
  synthesises correct baseline and applies override fields.
- TestResolveServiceConfig_BothNil — both nil returns nil.
- TestResolveAccountConfigsForRecs_OverrideWithoutGlobal_OverrideApplied —
  regression test for #202: override-without-global entry lands in the map
  with correct values.
- TestResolveAccountConfigsForRecs_NoGlobalNoOverride_TripleSkipped —
  renamed from GlobalAbsent; semantics clarified (skip only when both absent).
- TestResolveAccountConfigsForRecs_OverrideWithGlobal_MergesCorrectly —
  regression test for existing #200 merge path.
- TestResolveAccountConfigsForRecs_GlobalAbsentCachedNegative — updated to
  assert overrideCalls=3 (one per account; previously 0).
@cristim added labels on May 3, 2026: bug (Something isn't working), triaged (Item has been triaged), priority/p2 (Backlog-worthy), severity/high (Significant harm), urgency/this-quarter (Within the quarter), impact/many (Affects most users), effort/s (Hours), type/bug (Defect)

coderabbitai Bot commented May 3, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 09c9011b-a994-4e13-a48b-d050d8bd0b22

📥 Commits

Reviewing files that changed from the base of the PR and between 0751f5a and d5ce72a.

📒 Files selected for processing (1)
  • internal/server/app.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/server/app.go

Walkthrough

ResolveAccountConfigsForRecs now memoizes global ServiceConfig lookups, deduplicates by (account, provider, service), and only omits a triple when both global and per-account override are absent. ResolveServiceConfig now accepts provider/service and will synthesize a baseline global when an override exists but global is nil.

Changes

Resolve & Caching Flow

| Layer | File(s) | Summary |
|---|---|---|
| Data Shape / API | internal/config/resolver.go | ResolveServiceConfig signature changed to `ResolveServiceConfig(provider, service string, global *ServiceConfig, override *AccountServiceOverride) *ServiceConfig`. |
| Core Resolution | internal/config/resolver.go | When global==nil and override!=nil, synthesize baseline ServiceConfig{Enabled:true, Provider:provider, Service:service}, copy baseline slice fields, then apply scalar and slice overrides. If override==nil, return global unchanged (may be nil). |
| Lookup Caching | internal/config/recommendation_overrides.go | Added globalConfigCache that memoizes store.GetServiceConfig(ctx, provider, service) results (including negative misses); lookup() returns (nil, nil) when no global exists. |
| Iteration / Deduping | internal/config/recommendation_overrides.go | Use a seen set to dedupe by (accountID,provider,service). For each unique triple, fetch cached global then per-account override; skip only when both are absent. Call ResolveServiceConfig(rec.Provider, rec.Service, global, override). |
| Integration Tests | internal/config/recommendation_overrides_test.go | Renamed and tightened skip test; added tests ensuring overrides apply without a global, merge correctly when both exist, negative caching for globals, and deduped lookups when both absent. |
| Unit Tests | internal/config/resolver_test.go | Updated tests to pass provider/service args. Added NilGlobalWithOverride (synthesized baseline applied) and BothNil (returns nil). |
| Minor Runtime Change | internal/server/app.go | runMigrationsBounded now checks errors.Is(err, context.DeadlineExceeded) and returns timeout-wrapped errors based on the migrations context rather than using time.After. |
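
The "Lookup Caching" row can be illustrated with a minimal Go sketch of a memoizing cache that also remembers misses. Everything below is an assumption made for illustration: the GlobalReader interface, the key format, the countingStore test double and the demo helper are hypothetical, and the store is assumed to return (nil, nil) when no global row exists. Only the names globalConfigCache, lookup and GetServiceConfig come from the review text.

```go
package main

import "context"

// ServiceConfig is a trimmed-down, hypothetical stand-in for the real type.
type ServiceConfig struct {
	Provider, Service string
	Enabled           bool
}

// GlobalReader is a hypothetical narrow view of the config store.
type GlobalReader interface {
	GetServiceConfig(ctx context.Context, provider, service string) (*ServiceConfig, error)
}

// globalConfigCache memoizes global lookups per (provider, service),
// including negative results (a stored nil means "known to be absent").
type globalConfigCache struct {
	entries map[string]*ServiceConfig
}

func newGlobalConfigCache() *globalConfigCache {
	return &globalConfigCache{entries: make(map[string]*ServiceConfig)}
}

func (c *globalConfigCache) lookup(ctx context.Context, store GlobalReader, provider, service string) (*ServiceConfig, error) {
	key := provider + "\x00" + service // hypothetical key format
	if cfg, ok := c.entries[key]; ok {
		return cfg, nil // cache hit, possibly a remembered miss (cfg == nil)
	}
	cfg, err := store.GetServiceConfig(ctx, provider, service)
	if err != nil {
		return nil, err // errors are not cached, so a retry hits the store again
	}
	c.entries[key] = cfg
	return cfg, nil
}

// countingStore is a test double with no global rows that counts calls.
type countingStore struct{ calls int }

func (s *countingStore) GetServiceConfig(_ context.Context, _, _ string) (*ServiceConfig, error) {
	s.calls++
	return nil, nil
}

// demoNegativeCaching looks up the same absent pair three times and reports
// how often the store was actually consulted.
func demoNegativeCaching() (storeCalls int, last *ServiceConfig, err error) {
	store := &countingStore{}
	cache := newGlobalConfigCache()
	for i := 0; i < 3; i++ {
		if last, err = cache.lookup(context.Background(), store, "aws", "ec2"); err != nil {
			return store.calls, nil, err
		}
	}
	return store.calls, last, nil
}
```

The two-value map read (cfg, ok) is what separates "never looked up" from "looked up and absent", which is the property the GlobalAbsentCachedNegative test name suggests is being pinned.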

Sequence Diagram(s)

sequenceDiagram
    participant RecsLoop as Recs Loop
    participant Cache as globalConfigCache
    participant Store as Config Store
    participant Resolver as ResolveServiceConfig

    RecsLoop->>Cache: lookup(provider, service)
    alt cache miss
        Cache->>Store: GetServiceConfig(provider, service)
        Store-->>Cache: ServiceConfig or nil
    end
    RecsLoop->>Store: GetAccountServiceOverride(account, provider, service)
    RecsLoop->>Resolver: ResolveServiceConfig(provider, service, global?, override?)
    Resolver-->>RecsLoop: resolved ServiceConfig or nil

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Poem

🐰 I stitched a baseline from a whisper thin,
When globals hid, the override hopped in.
No more silent skips beneath the moon—
Each account's choice now hums a tune.
Carrots for configs, victory's bite!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: applying per-account overrides when no global ServiceConfig exists, directly addressing issue #202. |
| Linked Issues check | ✅ Passed | The PR implements Option 1 from issue #202: synthesizing a baseline ServiceConfig when global is nil and applying the override, with comprehensive test coverage and fixes the nil-deref risk in ResolveServiceConfig. |
| Out of Scope Changes check | ✅ Passed | Changes to runMigrationsBounded context timeout handling are a minor related fix flagged in PR objectives; all code changes directly support fixing the recommendation override bug. |

cristim commented May 3, 2026

@coderabbitai review

coderabbitai Bot commented May 3, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/config/recommendation_overrides.go (1)

102-125: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Dedup skipped triples independently of resolved.

Using resolved as the seen-set means Line 119 skips without recording the triple. For repeated recs with the same (account, provider, service) and no global/override, GetAccountServiceOverride will run once per rec instead of once per triple, which adds unnecessary store traffic on a hot path. A separate seen map fixes this and keeps the dedupe contract true even when nothing is written to the result.

Proposed fix
 func ResolveAccountConfigsForRecs(
 	ctx context.Context,
 	store AccountConfigReader,
 	recs []RecommendationRecord,
 ) (map[string]*ServiceConfig, error) {
 	resolved := make(map[string]*ServiceConfig)
+	seenTriples := make(map[string]struct{})
 	if len(recs) == 0 {
 		return resolved, nil
 	}
@@
 		accountID := *rec.CloudAccountID
 		key := AccountConfigKey(accountID, rec.Provider, rec.Service)
-		if _, seen := resolved[key]; seen {
+		if _, seen := seenTriples[key]; seen {
 			continue
 		}
+		seenTriples[key] = struct{}{}
 
 		global, err := cache.lookup(ctx, store, rec.Provider, rec.Service)
 		if err != nil {
 			return resolved, err
 		}
@@
 		if global == nil && override == nil {
 			continue
 		}
 
 		resolved[key] = ResolveServiceConfig(rec.Provider, rec.Service, global, override)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/config/recommendation_overrides.go` around lines 102 - 125, The loop
currently uses the result map `resolved` to deduplicate triples, so when a
triple has no global and no override it gets skipped without recording and can
be processed repeatedly; introduce a separate seen set (e.g., `seen :=
map[string]struct{}{}`) and replace the `if _, seen := resolved[key]; seen {
continue }` check with `if _, ok := seen[key]; ok { continue }`, and ensure you
set `seen[key] = struct{}{}` as soon as you decide to skip or process the triple
(i.e., before `continue` and also before calling `ResolveServiceConfig`/writing
into `resolved`), leaving calls to `cache.lookup`,
`store.GetAccountServiceOverride`, and `ResolveServiceConfig` unchanged except
for using the new dedupe map.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a0dad4b1-ad33-4706-bc58-0c5353b34a83

📥 Commits

Reviewing files that changed from the base of the PR and between c8a78cb and f70c451.

📒 Files selected for processing (4)
  • internal/config/recommendation_overrides.go
  • internal/config/recommendation_overrides_test.go
  • internal/config/resolver.go
  • internal/config/resolver_test.go

Using resolved as the seen-set meant triples with no global and no
override were never recorded, causing redundant store lookups on
duplicate recs for the same (account, provider, service) triple.

Introduce a dedicated seen map that records every triple on first visit,
regardless of whether it writes to resolved.

Also add TestResolveAccountConfigsForRecs_NoGlobalNoOverride_DedupedCorrectly
as a regression guard.
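
A simplified, self-contained Go sketch of the dedicated seen map described in this commit message follows. It is not the project's implementation: the key format, the trimmed-down types, the nil-account check, the inline merge, the emptyStore test double and the demo helper are all assumptions, and the global cache is left out so the dedup logic stays visible.

```go
package main

import "context"

// Trimmed-down, hypothetical stand-ins for the real types.
type ServiceConfig struct {
	Provider, Service string
	Enabled           bool
}
type AccountServiceOverride struct{ Enabled *bool }
type RecommendationRecord struct {
	CloudAccountID    *string
	Provider, Service string
}
type AccountConfigReader interface {
	GetServiceConfig(ctx context.Context, provider, service string) (*ServiceConfig, error)
	GetAccountServiceOverride(ctx context.Context, accountID, provider, service string) (*AccountServiceOverride, error)
}

// resolveForRecs visits each (account, provider, service) triple once. The
// seen set is separate from resolved, so triples that produce no entry are
// still remembered and never looked up again.
func resolveForRecs(ctx context.Context, store AccountConfigReader, recs []RecommendationRecord) (map[string]*ServiceConfig, error) {
	resolved := make(map[string]*ServiceConfig)
	seen := make(map[string]struct{})
	for _, rec := range recs {
		if rec.CloudAccountID == nil {
			continue // assumed: records without an account are ignored
		}
		key := *rec.CloudAccountID + "|" + rec.Provider + "|" + rec.Service // hypothetical key format
		if _, ok := seen[key]; ok {
			continue
		}
		seen[key] = struct{}{} // record before any early exit below

		global, err := store.GetServiceConfig(ctx, rec.Provider, rec.Service)
		if err != nil {
			return resolved, err
		}
		override, err := store.GetAccountServiceOverride(ctx, *rec.CloudAccountID, rec.Provider, rec.Service)
		if err != nil {
			return resolved, err
		}
		if global == nil && override == nil {
			continue // nothing to apply, but the triple is already marked seen
		}

		// Minimal inline merge standing in for ResolveServiceConfig.
		cfg := ServiceConfig{Provider: rec.Provider, Service: rec.Service, Enabled: true}
		if global != nil {
			cfg = *global
		}
		if override != nil && override.Enabled != nil {
			cfg.Enabled = *override.Enabled
		}
		resolved[key] = &cfg
	}
	return resolved, nil
}

// emptyStore is a test double with no globals and no overrides.
type emptyStore struct{ globalCalls, overrideCalls int }

func (s *emptyStore) GetServiceConfig(_ context.Context, _, _ string) (*ServiceConfig, error) {
	s.globalCalls++
	return nil, nil
}
func (s *emptyStore) GetAccountServiceOverride(_ context.Context, _, _, _ string) (*AccountServiceOverride, error) {
	s.overrideCalls++
	return nil, nil
}

// demoDedup feeds three recs for the same triple through resolveForRecs.
func demoDedup() (overrideCalls, resolvedLen int, err error) {
	acct := "acct-1"
	rec := RecommendationRecord{CloudAccountID: &acct, Provider: "aws", Service: "ec2"}
	store := &emptyStore{}
	resolved, err := resolveForRecs(context.Background(), store, []RecommendationRecord{rec, rec, rec})
	return store.overrideCalls, len(resolved), err
}
```

In this sketch the three identical records trigger a single override lookup and leave the result map empty, which mirrors the "3-rec/same-triple/assert-called-once" contract the regression test is described as pinning.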

cristim commented May 3, 2026

Fixed the CR finding: introduced a dedicated seen map to dedup triples independently of resolved. Added regression guard test.

@coderabbitai review

coderabbitai Bot commented May 3, 2026

@cristim: Thanks for the update! I'll kick off a new review to verify the fix with the dedicated seen map for deduplication and the new regression guard test.

[review]

Return the same timeout-shaped error when the migration goroutine observes
the bounded context deadline before the select timeout branch wins. This keeps
runMigrationsBounded deterministic under CI scheduling.

cristim commented May 3, 2026

@CodeRabbit review

coderabbitai Bot commented May 3, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

cristim commented May 3, 2026

Addressed the pre-commit failure in 0751f5a by normalizing runMigrationsBounded context deadline errors into the existing migration timeout error shape. @coderabbitai review

coderabbitai Bot commented May 3, 2026

@cristim: Thanks for the update! I'll kick off a new review to verify the normalization of runMigrationsBounded context deadline errors into the migration timeout error shape from commit 0751f5a70.

[review]

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
internal/server/app.go (1)

215-225: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use one timeout source here.

time.After(timeout) can still win the race and return the unwrapped timeout error, so this change does not fully normalize the result. Switching the select to migCtx.Done()/migCtx.Err() would make the timeout path deterministic and preserve the underlying deadline cause.

♻️ Suggested adjustment
   select {
   case err := <-done:
     if errors.Is(err, context.DeadlineExceeded) {
       return fmt.Errorf("migration timed out after %s: %w", timeout, err)
     }
     return err
-  case <-time.After(timeout):
-    cancelMig()
+  case <-migCtx.Done():
     <-done
-    return fmt.Errorf("migration timed out after %s", timeout)
+    return fmt.Errorf("migration timed out after %s: %w", timeout, migCtx.Err())
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/server/app.go` around lines 215 - 225, The select currently races
between done and time.After(timeout) causing an unwrapped timeout path; replace
the time.After branch with a branch that waits on migCtx.Done() and inspects
migCtx.Err() so the code deterministically uses the migration context deadline
as the timeout source. In the select that references the done channel and
cancelMig(), change the second case to <-migCtx.Done(), call cancelMig() if
needed, then read <-done and return migCtx.Err() (or wrap it like you do for
context.DeadlineExceeded) so the returned error preserves the underlying context
deadline error instead of an unwrapped timeout.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5afeff61-8485-4492-9d15-a97ecaa0d6af

📥 Commits

Reviewing files that changed from the base of the PR and between f70c451 and 0751f5a.

📒 Files selected for processing (3)
  • internal/config/recommendation_overrides.go
  • internal/config/recommendation_overrides_test.go
  • internal/server/app.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/config/recommendation_overrides_test.go

Select on the migration context deadline instead of a separate timer so
runMigrationsBounded consistently wraps the underlying deadline error.
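
As a rough illustration of selecting on the migration context instead of a separate timer, here is a self-contained Go sketch. The function signature, the migrate callback and the demo helper are assumptions; only the names runMigrationsBounded, migCtx, cancelMig, done and the error format string appear in the review thread.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// runMigrationsBounded runs migrate under a deadline derived from timeout.
// The context is the only timeout source, so both select branches report a
// timeout that wraps context.DeadlineExceeded.
func runMigrationsBounded(ctx context.Context, timeout time.Duration, migrate func(context.Context) error) error {
	migCtx, cancelMig := context.WithTimeout(ctx, timeout)
	defer cancelMig()

	done := make(chan error, 1) // buffered so the goroutine can always send
	go func() { done <- migrate(migCtx) }()

	select {
	case err := <-done:
		if errors.Is(err, context.DeadlineExceeded) {
			return fmt.Errorf("migration timed out after %s: %w", timeout, err)
		}
		return err
	case <-migCtx.Done():
		<-done // wait for the goroutine to observe cancellation and exit
		return fmt.Errorf("migration timed out after %s: %w", timeout, migCtx.Err())
	}
}

// demoBounded exercises the timeout path and the success path.
func demoBounded() (timedOut bool, okErr error) {
	slow := func(ctx context.Context) error {
		<-ctx.Done() // simulate a migration that outlives its deadline
		return ctx.Err()
	}
	fast := func(context.Context) error { return nil }

	err := runMigrationsBounded(context.Background(), 20*time.Millisecond, slow)
	timedOut = errors.Is(err, context.DeadlineExceeded)
	okErr = runMigrationsBounded(context.Background(), time.Second, fast)
	return timedOut, okErr
}
```

Two caveats apply to this sketch: the migrate callback must honor context cancellation, otherwise the `<-done` wait blocks indefinitely, and a cancellation of the parent context would also be reported with the "timed out" wording because the sketch does not distinguish the two causes.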

cristim commented May 3, 2026

Addressed latest CR comment in d5ce72a: runMigrationsBounded now uses migCtx.Done() as the timeout source and wraps migCtx.Err(), removing the separate time.After race. @coderabbitai review

coderabbitai Bot commented May 3, 2026

@cristim: Thanks for the update! Using migCtx.Done() as the single timeout source and wrapping migCtx.Err() is the idiomatic Go approach — it eliminates the time.After leak and removes the select-race between two independent timers. I'll kick off a new review to verify the change.

[review]

cristim commented May 3, 2026

@coderabbitai review

coderabbitai Bot commented May 3, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@cristim merged commit c6588bb into feat/multicloud-web-frontend on May 3, 2026 (4 checks passed)
@cristim deleted the fix/issue-202-override-without-global branch on May 3, 2026 at 20:35

cristim commented May 3, 2026

Both CR items are already addressed in prior commits on this branch:

Actionable item (recommendation_overrides.go dedup): Fixed in commit a94fdb8 — introduced a separate seen := make(map[string]struct{}) map so triples with global == nil && override == nil are recorded as seen on first encounter and don't re-trigger cache.lookup + GetAccountServiceOverride on duplicate recs. Test TestResolveAccountConfigsForRecs_NoGlobalNoOverride_DedupedCorrectly pins the 3-rec/same-triple/assert-called-once contract.

Outside-diff item (internal/server/app.go timeout race): Also already fixed — commit d5ce72a replaced time.After(timeout) with <-migCtx.Done() and wraps migCtx.Err(), making the timeout path deterministic. This was part of the CR loop on the fix(server) commits, not the original #202 scope, but it's in the diff now. No further action needed.

No new commits required — everything CR flagged is resolved.

@coderabbitai review

coderabbitai Bot commented May 3, 2026

@cristim: Thanks for the detailed summary! Both items look well-addressed:

  • Dedup (recommendation_overrides.go): The dedicated seen map correctly decouples deduplication from the resolved map, ensuring cache.lookup + GetAccountServiceOverride are not redundantly called for repeated (accountID, provider, service) triples when both global and override are absent. The pinning test TestResolveAccountConfigsForRecs_NoGlobalNoOverride_DedupedCorrectly is a solid regression guard.
  • Timeout race (internal/server/app.go): Using migCtx.Done() as the single timeout source and wrapping migCtx.Err() is the idiomatic Go approach — it eliminates the time.After leak and the non-deterministic select race between two independent timers.

Kicking off a fresh review now.

[review]
