Skip to content

fix(db/azure): flush stale monthly_cost cache + Azure recurring cost (closes #252)#256

Merged
cristim merged 3 commits into
feat/multicloud-web-frontendfrom
fix/issue-252-followup
May 4, 2026
Merged

fix(db/azure): flush stale monthly_cost cache + Azure recurring cost (closes #252)#256
cristim merged 3 commits into
feat/multicloud-web-frontendfrom
fix/issue-252-followup

Conversation

@cristim
Copy link
Copy Markdown
Member

@cristim cristim commented May 4, 2026

Summary

Follow-up to PR #254 which fixed the Go struct types and AWS parsers but left two gaps that keep the deployed app returning `monthly_cost: 0` for all 33 recs (17 AWS, 16 Azure):

Gap 1 — Stale cache (migration 000046): Pre-#254 rows in the `recommendations` table store `monthly_cost` as a literal JSON `0` (from the old `float64` field). Reading those rows post-deploy produces a pointer-to-zero, which the API re-emits as 0 — indistinguishable from a legitimate $0 recurring charge.

Fix: targeted in-place JSONB rewrite (not TRUNCATE) — sets `monthly_cost = null` for all rows where the current value is 0. This causes the frontend to render "—" (honest: data not yet collected) rather than "$0" (misleading). Existing rows stay intact; the daily scheduled collector will overwrite them with correct values on its next run.

Why not TRUNCATE + cold-start? Azure collection alone exceeds the API Lambda's 60s timeout (#257). A forced cold-start collect on the next request would cause a 502. Filed #257 as a separate perf issue.

Gap 2 — Azure parsers (ExtractedFields + 4 converters): PR #254 never updated the Azure parsers. Azure Reservation recommendations are always all-upfront (single payment, no recurring monthly charge), so `RecurringMonthlyCost` should be a non-nil `*float64` pointing to 0.0 — meaning "no recurring charge" rather than nil ("data not available"). Added `RecurringMonthlyCost *float64` to `ExtractedFields`, set to `float64Ptr(0)` in both `extractLegacy` and `extractModern`, and wired it through all four active service converters (compute, cache, database, cosmosdb).

GCP: Left nil — the GCP Recommender API does not expose commitment cost per period. Frontend renders "—" which is accurate.

Related: #257 (Azure Lambda timeout — separate perf issue, not fixed here)

Test plan

  • `providers/azure/internal/recommendations` tests: `TestExtract_Legacy_RecurringMonthlyCostIsZeroPointer` and `TestExtract_Modern_RecurringMonthlyCostIsZeroPointer` assert the field is a non-nil pointer-to-zero for both Legacy and Modern API response shapes
  • `providers/azure/services/compute/client_test.go`: `TestComputeClient_ConvertAzureVMRecommendation_PopulatesAllFields` now asserts `RecurringMonthlyCost != nil && *RecurringMonthlyCost == 0`
  • All 345 Azure tests pass, all 4261 root module tests pass, all 319 pkg tests pass
  • After deploy: migration 000046 runs, stale `monthly_cost: 0` rows are rewritten to `null`, frontend shows "—" for existing rows. The daily scheduler tick will then write correct values: AWS RI/SP recs get actual recurring cost, Azure recs get $0 (all-upfront), GCP shows "—" (data not available)

Closes #252

Summary by CodeRabbit

  • Bug Fixes

    • Corrected stale monthly cost values in recommendations that were previously cached as zero.
  • New Features

    • Added consistent recurring monthly cost handling for Azure reservation recommendations.
    • Enhanced Azure VM recommendations with SKU catalogue enrichment to automatically populate CPU and memory specifications.

cristim added 2 commits May 4, 2026 13:52
…gap 1)

Pre-PR-#254 rows in the recommendations table store monthly_cost as a
literal JSON 0 (old float64 field). After PR #254 changed the field to
*float64, reading those rows produces a pointer-to-zero, which the API
re-emits as 0 — indistinguishable from a legitimate all-upfront zero.

Migration 000046 truncates the recommendations table and resets
last_collected_at to NULL, which triggers a synchronous cold-start
collect on the next ListRecommendations call. The table is a pure
cache (scheduler re-fetches from cloud APIs on every collect), so
TRUNCATE has no data-loss risk.
…ons (closes #252)

Azure Reservation recommendations are always all-upfront (single payment,
no recurring monthly charge). PR #254 added RecurringMonthlyCost to the
common.Recommendation struct and wired it through the scheduler but never
updated the Azure parsers, so all 16 Azure recs still returned monthly_cost
null (renders as "—" after cache refresh) instead of the accurate value 0
("$0 — no recurring charge").

Changes:
- providers/azure/internal/recommendations: add RecurringMonthlyCost *float64
  to ExtractedFields with float64Ptr(0) set in both extractLegacy and
  extractModern. All Azure recs are all-upfront by API convention.
- providers/azure/services/{compute,cache,database,cosmosdb}: pass
  RecurringMonthlyCost: f.RecurringMonthlyCost through to common.Recommendation.
- Tests: assert RecurringMonthlyCost is non-nil pointer-to-zero for both Legacy
  and Modern fixtures in converter_test.go and for the compute converter in
  client_test.go.

GCP CUDs: RecurringMonthlyCost left nil — the GCP Recommender API does not
expose commitment cost per period (only savings), so nil ("—") is the accurate
frontend rendering.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 4, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0fc149a9-4693-482b-a111-80ce84099bc9

📥 Commits

Reviewing files that changed from the base of the PR and between bd331c4 and 8a94ad9.

📒 Files selected for processing (9)
  • internal/database/postgres/migrations/000046_invalidate_monthly_cost_cache.down.sql
  • internal/database/postgres/migrations/000046_invalidate_monthly_cost_cache.up.sql
  • providers/azure/internal/recommendations/converter.go
  • providers/azure/internal/recommendations/converter_test.go
  • providers/azure/services/cache/client.go
  • providers/azure/services/compute/client.go
  • providers/azure/services/compute/client_test.go
  • providers/azure/services/cosmosdb/client.go
  • providers/azure/services/database/client.go

📝 Walkthrough

Walkthrough

This PR fixes a bug where monthly cost data was hardcoded to 0 by explicitly tracking RecurringMonthlyCost in the Azure provider and invalidating stale monthly_cost values in the database. The change ensures reservations correctly distinguish between "no recurring monthly charge" (non-nil 0) and "data not available" (nil).

Changes

Monthly Cost Data Cleanup & Azure RecurringMonthlyCost

Layer / File(s) Summary
Database Migration
internal/database/postgres/migrations/000046_invalidate_monthly_cost_cache.{up,down}.sql
Migration rewrites recommendation payload.monthly_cost from 0 to null for stale rows and documents that rollback is intentionally skipped to avoid reintroducing the bug.
Data Shape
providers/azure/internal/recommendations/converter.go
ExtractedFields struct gains RecurringMonthlyCost *float64 field; helper float64Ptr() added to create non-nil zero pointers, distinguishing explicit zero from nil.
Core Extraction Logic
providers/azure/internal/recommendations/converter.go
Both extractLegacy and extractModern now set RecurringMonthlyCost to float64Ptr(0) since Azure reservations are all-upfront with no recurring monthly charge.
Service Converters
providers/azure/services/{cache,compute,cosmosdb,database}/client.go
Struct literals for returned common.Recommendation values reformatted (whitespace/alignment only); field assignments and values unchanged.
Tests & Validation
providers/azure/internal/recommendations/converter_test.go, providers/azure/services/compute/client_test.go
Added tests asserting RecurringMonthlyCost is non-nil with value 0 for both legacy and modern extractors; compute tests also validate SKU-catalogue enrichment (VCPU/MemoryGB from cached SKU lookup) and fallback behavior when catalogue unavailable.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • PR #254: Both PRs modify the same RecurringMonthlyCost propagation plumbing across provider parsers and scheduler, so they work on the same backend data-shape fix for issue #252.
  • PR #229: Both PRs modify Azure compute conversion and add SKU-catalogue enrichment tests in providers/azure/services/compute/, so they extend the same compute module.

Poem

🐰 A zero that's not null, but truly there,
marks reservations with transparent care.
No monthly cost? Be explicit, be true—
not unknown, but a pointer to zero!

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Linked Issues check ⚠️ Warning The PR partially addresses #252's backend objectives (adding RecurringMonthlyCost, targeted cache flush), but does not implement the frontend fixes required by #252 (null rendering, percentage sanity checks, unit test). Frontend changes are out of scope for this PR per the objectives; ensure #252 is not closed until frontend fixes are also complete and merged.
Docstring Coverage ⚠️ Warning Docstring coverage is 55.56% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: fixing stale monthly_cost cache and adding Azure recurring cost support, with explicit issue reference.
Out of Scope Changes check ✅ Passed All changes directly address #252 backend requirements and #257 follow-up. Cache migration, Azure extractor enhancements, and converter reformatting are all in scope.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/issue-252-followup

Review rate limit: 4/5 reviews remaining, refill in 12 minutes.

Comment @coderabbitai help to get the list of available commands and usage tips.

@cristim cristim added type/bug Defect priority/p1 Next up; this sprint severity/medium Moderate harm urgency/this-sprint Within the current sprint impact/many Affects most users effort/m Days triaged Item has been triaged labels May 4, 2026
…write

Azure recommendation collection alone takes >60s and exceeds the API Lambda
timeout — a forced cold-start collect (triggered by TRUNCATE + resetting
last_collected_at) would cause a 502 on the first request after deploy.

Switch to a targeted in-place payload rewrite:
  UPDATE recommendations
     SET payload = jsonb_set(payload, '{monthly_cost}', 'null'::jsonb)
   WHERE (payload->>'monthly_cost')::text = '0';

This converts stale pre-PR-#254 zeros to null so the frontend renders "—"
(honest: data not yet collected) rather than "$0" (misleading: implies $0
recurring). Existing rows stay in place; the daily scheduler tick writes
correct values on its next run.
@cristim
Copy link
Copy Markdown
Member Author

cristim commented May 4, 2026

@coderabbitai full review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 4, 2026

✅ Actions performed

Full review triggered.

@cristim cristim merged commit c8c3ba9 into feat/multicloud-web-frontend May 4, 2026
4 checks passed
@cristim
Copy link
Copy Markdown
Member Author

cristim commented May 4, 2026

CodeRabbit review pass 1 — summary

CodeRabbit's first review found zero actionable items. 🎉

Pre-merge warnings (both dismissed — out of scope):

  1. Linked Issues check ⚠️ — CR noted the PR partially addresses fix(frontend/recs): Monthly cost column always $0; Effective % shows 100% / negatives — scheduler hardcodes MonthlyCost=0 #252 (backend only, not frontend). This is intentional by design. The PR description explicitly states that frontend rendering of null monthly_cost is addressed by PR fix(frontend/scheduler): monthly cost shows $0 and effective % shows 100%/negatives (#252) #254 (already merged). This PR closes the remaining backend gaps.

  2. Docstring Coverage at 55.56% ⚠️ — This is a pre-existing repo-wide metric, not introduced by this PR. The two new functions added (float64Ptr helper, extractLegacy/extractModern updates) are in a package that already had coverage below 80%. Fixing docstring coverage across the entire providers/azure/ module is out of scope for this bug fix.

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 4, 2026

Tip

For best results, initiate chat on the files or code changes.

@cristim Thanks for the clear summary! The two dismissed warnings make sense in context:

  1. Linked Issues — The frontend null rendering was already handled in the merged PR #254, so this PR correctly scopes itself to the remaining backend gaps.
  2. Docstring Coverage — A pre-existing metric unrelated to this targeted bug fix.

Triggering a new review now.

[review]

cristim added a commit that referenced this pull request May 6, 2026
…isibility (closes #321) (#324)

* fix(db/aws): flush stale recs cache missing on_demand_cost (migration 000049)

PR #312 added on_demand_cost population for AWS SP rows but shipped no
cache-invalidation migration. Pre-#312 AWS rows in recommendations.payload
lack the on_demand_cost key, so the frontend falls back to the broken
reconstruction formula (which double-counts amortization), rendering
implausibly high Effective % values (e.g. ~86% instead of ~22%).

Migration 000049 follows #256's pattern: set monthly_cost to null on
AWS rows that lack on_demand_cost, so the frontend renders "—" until
the next scheduler tick repopulates with correct values. The WHERE clause
is scoped strictly to AWS rows missing the field (idempotent on re-run).

Closes #321

* fix(recommendations/aws): warn-log when on-demand cost inputs are nil

Add structured warn-logs in the RI and SP parsers when the AWS CE API
fields that populate OnDemandCost are absent from the response:

- parseAWSCostDetails (parser_ri.go): logs when EstimatedMonthlyOnDemandCost
  is nil, which causes OnDemandCost=0 → stored as nil by the scheduler's
  nonZeroPtr helper → frontend reconstruction fallback.

- parseSavingsPlanDetail (parser_sp.go): logs when
  CurrentAverageHourlyOnDemandSpend is nil, same downstream effect.

Both logs follow the respective file's existing logging style (fmt.Printf
for parser_ri.go, log.Printf for parser_sp.go). The logs make it
observable when the fallback path is taken, aiding diagnosis for issue #321
and future regression detection.

* fix: apply CodeRabbit auto-fixes

Fixed 1 file(s) based on 1 unresolved review comment.

Co-authored-by: CodeRabbit <noreply@coderabbit.ai>

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: CodeRabbit <noreply@coderabbit.ai>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

effort/m Days impact/many Affects most users priority/p1 Next up; this sprint severity/medium Moderate harm triaged Item has been triaged type/bug Defect urgency/this-sprint Within the current sprint

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant