Add anomaly detection — baseline comparison for acute deviations (#589) by erikdarlingdata · Pull Request #606 · erikdarlingdata/PerformanceMonitor

erikdarlingdata · 2026-03-17T15:37:23Z

Summary

- New AnomalyDetector compares the analysis window against a 24-hour baseline
- Detects CPU spikes (σ-based), wait spikes (ratio-based), blocking/deadlock spikes, and I/O latency anomalies
- Wired into the AnalysisService pipeline between fact collection and scoring
- Skips all detection when no baseline data exists (prevents false positives on new servers)

Detection types:

| Type | Signal | Scoring |
|---|---|---|
| `ANOMALY_CPU_SPIKE` | Peak CPU deviates from baseline mean | 2σ=0.5, 4σ=1.0 |
| `ANOMALY_WAIT_{type}` | Wait type 5x+ increase or new | 5x=0.5, 20x=1.0 |
| `ANOMALY_BLOCKING_SPIKE` | Blocking events 3x+ baseline | 3x=0.5, 10x=1.0 |
| `ANOMALY_DEADLOCK_SPIKE` | Deadlocks 3x+ baseline | 3x=0.5, 10x=1.0 |
| `ANOMALY_READ_LATENCY` | Read latency deviation | 2σ=0.5, 4σ=1.0 |
| `ANOMALY_WRITE_LATENCY` | Write latency deviation | 2σ=0.5, 4σ=1.0 |
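
The two anchor points per row could be turned into a 0 to 1 score roughly as follows. This is a minimal sketch, not the PR's `ScoreAnomalyFact`: the class and method names are hypothetical, and both the linear interpolation between anchors and the zero score below the first anchor are assumptions, since the description only states the anchor values.

```csharp
using System;

// Hypothetical helper for illustration; not the PR's FactScorer code.
public static class AnomalyScoreSketch
{
    // Returns 0 below lowAnchor, 0.5 at lowAnchor, 1.0 at or above highAnchor.
    // Linear interpolation in between is an assumption.
    public static double Scale(double value, double lowAnchor, double highAnchor)
    {
        if (value < lowAnchor) return 0.0;
        if (value >= highAnchor) return 1.0;
        return 0.5 + 0.5 * (value - lowAnchor) / (highAnchor - lowAnchor);
    }

    // CPU and I/O latency: deviation measured in standard deviations (2σ=0.5, 4σ=1.0).
    public static double ScoreDeviation(double sigmas) => Scale(sigmas, 2.0, 4.0);

    // Waits: ratio of analysis-window value to baseline value (5x=0.5, 20x=1.0).
    public static double ScoreWaitRatio(double ratio) => Scale(ratio, 5.0, 20.0);

    // Blocking and deadlocks: ratio to baseline count (3x=0.5, 10x=1.0).
    public static double ScoreBlockingRatio(double ratio) => Scale(ratio, 3.0, 10.0);
}
```

Under these assumptions a 3σ CPU deviation lands halfway between the anchors. How the real scorer treats a wait type that is new in the analysis window (no baseline to divide by) is not stated in the description, so the sketch leaves that case out.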

Test scenarios added:

- CPU spike anomaly (10% baseline → 95% spike)
- Blocking spike anomaly (0 baseline → 50 events + 10 deadlocks)
- Wait spike anomaly (minimal PAGEIOLATCH → 8M ms flood)

Test plan

- `dotnet build`: 0 errors
- `dotnet test`: 138 tests pass (131 existing + 7 new)
- Live test against sql2022 with HammerDB data

Closes #589

🤖 Generated with Claude Code

Summary by CodeRabbit

Release Notes

New Features
- Integrated anomaly detection into the analysis pipeline to automatically identify CPU spikes, wait-time anomalies, blocking and deadlock situations, and I/O latency issues by comparing current performance against a 24-hour baseline.
Tests
- Added comprehensive test coverage for anomaly detection scenarios including CPU spike, blocking spike, and wait spike anomalies with validation of detection accuracy and anomaly metadata.

New AnomalyDetector compares the analysis window against a 24-hour baseline period to detect acute deviations.

Detects:
- CPU spikes: peak CPU deviation from baseline mean (σ-based scoring)
- Wait spikes: wait types with 5x+ increase or new in analysis window
- Blocking spikes: blocking/deadlock counts 3x+ above baseline
- I/O latency anomalies: read/write latency deviation from baseline

Scoring: CPU/IO use standard deviation (2σ=0.5, 4σ=1.0). Waits use ratio (5x=0.5, 20x=1.0). Blocking uses ratio (3x=0.5, 10x=1.0).

Global safety: skips all anomaly detection when no baseline data exists (prevents everything looking anomalous on new servers). Uses strict boundary exclusion to prevent analysis window data leaking into baseline.

Wired into AnalysisService pipeline between fact collection and scoring. Tool recommendations added for all anomaly fact types.

Test scenarios: CPU spike anomaly (10% baseline → 95% spike), blocking spike anomaly (0 baseline → 50 events), wait spike anomaly (minimal PAGEIOLATCH baseline → 8M ms flood). 7 new tests, 138 total passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
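
The boundary exclusion and the no-baseline guard described in the commit message might look something like the sketch below. Everything here is illustrative: the type and method names are hypothetical, the baseline is assumed to be the 24 hours ending exactly at the start of the analysis window, and the use of a population standard deviation (with a flat baseline treated as "no usable deviation") is an assumption rather than the PR's documented behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch; not the PR's AnomalyDetector.
public static class BaselineWindowSketch
{
    // Assumed layout: baseline covers the 24 hours immediately before the analysis window.
    public static (DateTime Start, DateTime End) BaselineFor(DateTime analysisStart)
        => (analysisStart.AddHours(-24), analysisStart);

    // Keeps samples with Start <= t < End. The strict upper bound means a sample
    // stamped exactly at analysisStart stays out of the baseline.
    public static List<double> SelectBaseline(
        IEnumerable<(DateTime Time, double Value)> samples, DateTime analysisStart)
    {
        var (start, end) = BaselineFor(analysisStart);
        return samples
            .Where(s => s.Time >= start && s.Time < end)
            .Select(s => s.Value)
            .ToList();
    }

    // Returns null when there is no baseline, so the caller can skip detection.
    public static double? SigmasAboveBaseline(IReadOnlyList<double> baseline, double peak)
    {
        if (baseline.Count == 0) return null;            // global safety: nothing to compare against
        double mean = baseline.Average();
        double variance = baseline.Sum(v => (v - mean) * (v - mean)) / baseline.Count;
        double stdDev = Math.Sqrt(variance);
        if (stdDev == 0) return null;                    // assumption: flat baseline has no usable σ
        return (peak - mean) / stdDev;
    }
}
```

A real detector would likely need a different policy for a perfectly flat baseline (for example a minimum σ floor), since the CPU test scenario above describes a steady 10% baseline; the sketch only shows where such a decision would sit.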

coderabbitai · 2026-03-17T15:37:43Z

Walkthrough

This change implements anomaly detection by comparing runtime metrics against a 24-hour baseline, detecting CPU spikes, wait-time anomalies, blocking/deadlock events, and I/O latency deviations. The AnomalyDetector integrates into the analysis pipeline, produces anomaly facts with scored severity, and includes comprehensive test coverage with data seeding utilities.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Core Anomaly Detection: `Lite/Analysis/AnomalyDetector.cs` | New class implementing baseline-comparison anomaly detection for CPU, wait-time, blocking/deadlock, and I/O latency metrics. Uses 24-hour baseline with statistical thresholds (2σ for deviation, 3–5x for ratio) and returns list of Fact objects with detailed metadata. |
| Analysis Pipeline Integration: `Lite/Analysis/AnalysisService.cs` | Integrates AnomalyDetector into the main AnalyzeAsync flow; calls DetectAnomaliesAsync after fact collection and before scoring to enrich facts with anomaly detections. |
| Anomaly Scoring: `Lite/Analysis/FactScorer.cs` | Adds ScoreAnomalyFact private method to score CPU/latency anomalies via deviation-based scaling and wait/blocking anomalies via ratio-based scaling; includes "anomaly" in context sources for amplification. |
| Test Scenarios & Data Seeding: `Lite.Tests/ScenarioTests.cs`, `Lite/Analysis/TestDataSeeder.cs` | Adds seven scenario tests (CPU spike, blocking/deadlock spikes, wait floods) and supporting infrastructure: RunFullPipelineWithAnomaliesAsync helper, baseline window properties (BaselineStart, BaselineEnd), and range-seeding utilities (SeedCpuUtilizationInRangeAsync, SeedWaitStatsInRangeAsync). |
| Tool Mapping for Anomalies: `Lite/Mcp/McpAnalysisTools.cs` | Expands ToolRecommendations with new anomaly groups (ANOMALY_CPU, ANOMALY_WAIT, ANOMALY_BLOCKING, ANOMALY_IO); enhances GetForStoryPath to map dynamic anomaly-prefixed keys to corresponding tool groups. |
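
Mapping dynamic anomaly keys such as `ANOMALY_WAIT_{type}` to a fixed tool group could be done with a prefix check, as in the sketch below. The class and method are hypothetical and do not reproduce `GetForStoryPath`; the fact keys and group names come from this PR, but which group each key belongs to (in particular, deadlocks sharing the blocking group) is an assumption.

```csharp
using System;

// Hypothetical sketch; not the PR's McpAnalysisTools code.
public static class AnomalyToolGroupSketch
{
    // Resolves a fact key to a tool-recommendation group name.
    // Returns false when the key is not a recognized anomaly key.
    public static bool TryGetGroup(string factKey, out string group)
    {
        // Wait anomalies carry the wait type as a suffix, so match by prefix.
        if (factKey.StartsWith("ANOMALY_WAIT_", StringComparison.Ordinal))
        {
            group = "ANOMALY_WAIT";
            return true;
        }

        // Remaining anomaly keys are fixed strings; the grouping is assumed.
        group = factKey switch
        {
            "ANOMALY_CPU_SPIKE" => "ANOMALY_CPU",
            "ANOMALY_BLOCKING_SPIKE" or "ANOMALY_DEADLOCK_SPIKE" => "ANOMALY_BLOCKING",
            "ANOMALY_READ_LATENCY" or "ANOMALY_WRITE_LATENCY" => "ANOMALY_IO",
            _ => string.Empty
        };
        return group.Length > 0;
    }
}
```

For example, `TryGetGroup("ANOMALY_WAIT_PAGEIOLATCH_SH", out var g)` would resolve to the `ANOMALY_WAIT` group under these assumptions, while a non-anomaly key falls through and returns false.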

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main change: introducing anomaly detection with baseline comparison to identify acute deviations. |
| Linked Issues check | ✅ Passed | The implementation addresses all core requirements from `#589`: rolling baselines, spike detection via standard deviation thresholds, anomaly detection integrated into analysis pipeline, and comprehensive test coverage. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to implementing anomaly detection: detector class, pipeline integration, test additions, fact scoring, test data seeding, and MCP tool recommendations. |


coderabbitai

🧹 Nitpick comments (1)

Lite.Tests/ScenarioTests.cs (1)

434-464: Consider extracting shared logic between pipeline helpers.

RunFullPipelineWithAnomaliesAsync largely duplicates RunFullPipelineAsync (lines 335-360), differing only in the anomaly detection step (lines 447-450). Consider refactoring to reduce duplication:

♻️ Optional refactor to consolidate helpers

 private async Task<(List<AnalysisStory> Stories, Dictionary<string, Fact> Facts)> RunFullPipelineAsync(
-    Func<TestDataSeeder, Task> seedAction)
+    Func<TestDataSeeder, Task> seedAction, bool includeAnomalies = false)
 {
     await _duckDb.InitializeAsync();
     await _duckDb.InitializeAnalysisSchemaAsync();

     var seeder = new TestDataSeeder(_duckDb);
     await seedAction(seeder);

     var collector = new DuckDbFactCollector(_duckDb);
     var context = TestDataSeeder.CreateTestContext();
     var facts = await collector.CollectFactsAsync(context);

+    if (includeAnomalies)
+    {
+        var anomalyDetector = new AnomalyDetector(_duckDb);
+        var anomalies = await anomalyDetector.DetectAnomaliesAsync(context);
+        facts.AddRange(anomalies);
+    }
+
     var scorer = new FactScorer();
     scorer.ScoreAll(facts);

     var graph = new RelationshipGraph();
     var engine = new InferenceEngine(graph);
     var stories = engine.BuildStories(facts);

     var factsByKey = facts
         .Where(f => f.Severity > 0)
         .ToDictionary(f => f.Key, f => f);

     return (stories, factsByKey);
 }

-private async Task<(List<AnalysisStory> Stories, Dictionary<string, Fact> Facts)> RunFullPipelineWithAnomaliesAsync(
-    Func<TestDataSeeder, Task> seedAction)
-{
-    // ... duplicated code ...
-}
+private Task<(List<AnalysisStory> Stories, Dictionary<string, Fact> Facts)> RunFullPipelineWithAnomaliesAsync(
+    Func<TestDataSeeder, Task> seedAction)
+    => RunFullPipelineAsync(seedAction, includeAnomalies: true);

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@Lite.Tests/ScenarioTests.cs` around lines 434 - 464,
RunFullPipelineWithAnomaliesAsync duplicates most of RunFullPipelineAsync;
extract the shared pipeline steps (DuckDb.InitializeAsync /
InitializeAnalysisSchemaAsync, seeding via TestDataSeeder,
DuckDbFactCollector.CollectFactsAsync, FactScorer.ScoreAll,
RelationshipGraph/InferenceEngine.BuildStories and the final severity filter)
into a single helper method (e.g., RunFullPipelineCore or RunFullPipelineAsync
with an optional parameter or Func to inject anomaly detection), then have
RunFullPipelineWithAnomaliesAsync call that helper and only perform the
AnomalyDetector.DetectAnomaliesAsync step (and merging anomalies into facts) via
the injected behavior; update callers to use the consolidated helper to remove
the duplicated code in RunFullPipelineWithAnomaliesAsync and
RunFullPipelineAsync.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d7607e8d-11da-4e41-9e7d-83c44a955116

📥 Commits

Reviewing files that changed from the base of the PR and between 5c08471 and 3b43644.

📒 Files selected for processing (6)

Lite.Tests/ScenarioTests.cs
Lite/Analysis/AnalysisService.cs
Lite/Analysis/AnomalyDetector.cs
Lite/Analysis/FactScorer.cs
Lite/Analysis/TestDataSeeder.cs
Lite/Mcp/McpAnalysisTools.cs

coderabbitai Bot reviewed Mar 17, 2026

erikdarlingdata merged commit f1f3fe9 into dev Mar 17, 2026
5 checks passed

ClaudioESSilva mentioned this pull request Mar 17, 2026

[FEATURE] Enhancement - Growth Rates and VLF Counts #567

Closed

5 tasks

This was referenced Mar 17, 2026

Fix 6 verified Lite bugs from code review #611

Merged

Port ErikAI analysis engine to Dashboard (#590) #615

Merged

erikdarlingdata deleted the feature/anomaly-detection-589 branch April 9, 2026 00:32

erikdarlingdata merged 1 commit into dev from feature/anomaly-detection-589
