Skip ring-buffer records missing SystemHealth values in CPU collector (#989) #990
Merged
…#989)

Some RING_BUFFER_SCHEDULER_MONITOR records lack a complete SystemHealth block, so the ProcessUtilization / SystemIdle XML values extract as NULL.

The Dashboard collector inserts into NOT NULL columns, so a single bad record fails the whole INSERT atomically. Nothing is ever inserted, so @max_sample_time stays NULL, every run rescans the full 7-day window and re-hits the same bad records: the collector never recovers.

- install/18: extract ProcessUtilization/SystemIdle once via CROSS APPLY, filter out records where either is NULL. Valid rows now insert, @max_sample_time advances, recovery is immediate.
- Lite RemoteCollectorService.Cpu.cs: same CROSS APPLY + NULL filter. Lite's DuckDB columns are nullable so it never hard-failed, but it stored NULL samples that skew the CPU chart.

Chose to drop malformed records rather than ISNULL(...,0): a fabricated 0 reads as a real "0% CPU" sample and misleads the charts; a record with no SystemHealth block is not a CPU reading at all.

Verified: installed against SQL2022, collector runs clean; synthetic test confirms a record with an empty <SystemHealth/> is filtered out while a valid record passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
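The failure mode described above can be reproduced in isolation. This is a minimal sketch, not the collector's actual code: the table variable @samples and its columns are hypothetical stand-ins for the Dashboard table, and the three literal rows stand in for extracted ring-buffer values.

```sql
-- Hypothetical target with NOT NULL columns, standing in for the Dashboard table.
DECLARE @samples table
(
    sample_id int NOT NULL,
    sqlserver_cpu int NOT NULL
);

BEGIN TRY
    -- Row 2 carries a NULL, as a record without SystemHealth values would.
    INSERT @samples (sample_id, sqlserver_cpu)
    SELECT v.sample_id, v.sqlserver_cpu
    FROM
    (
        VALUES (1, 42), (2, NULL), (3, 17)
    ) AS v (sample_id, sqlserver_cpu);
END TRY
BEGIN CATCH
    -- The NOT NULL violation aborts the whole statement, not just the bad row.
    PRINT ERROR_MESSAGE();
END CATCH;

-- The two valid rows were not kept either.
SELECT COUNT(*) AS rows_inserted
FROM @samples;
```

Because the statement fails as a unit, the valid rows are lost along with the bad one, which is why a high-water mark derived from the target table would never advance.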
Summary
Fixes #989. Some RING_BUFFER_SCHEDULER_MONITOR records lack a complete SystemHealth block, so the ProcessUtilization/SystemIdle XML values extract as NULL.

The Dashboard collector inserts into NOT NULL columns, so a single malformed record fails the whole INSERT atomically. Nothing is ever inserted → @max_sample_time stays NULL → every run rescans the full 7-day window and re-hits the same bad records → the collector never recovers.

Changes
- install/18_collect_cpu_utilization_stats.sql: extract ProcessUtilization/SystemIdle once via CROSS APPLY, then filter out records where either is NULL. Valid rows now insert, @max_sample_time advances, recovery is immediate (not a 7-day wait).
- Lite/Services/RemoteCollectorService.Cpu.cs: same CROSS APPLY + NULL filter. Lite's DuckDB columns are nullable so it never hard-failed, but it stored NULL samples that skew the CPU chart.

Dropped malformed records rather than ISNULL(..., 0) (the reporter's suggestion): a fabricated 0 reads as a real "0% CPU" sample and misleads the charts; a record with no SystemHealth block is not a CPU reading at all.

Test plan
- Installed against SQL2022; EXEC collect.cpu_utilization_stats_collector @debug=1 runs clean.
- Synthetic test: a record with an empty <SystemHealth/> is filtered out while a valid 42/50 record passes.

🤖 Generated with Claude Code
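
The CROSS APPLY + NULL filter and the synthetic test can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in install/18: the table variables @ring_buffer and @cpu_samples, their column names, and the simplified record XML are all hypothetical, and the real collector reads from sys.dm_os_ring_buffers instead.

```sql
-- Hypothetical stand-in for the ring buffer: one valid record, one with an empty SystemHealth.
DECLARE @ring_buffer table
(
    record_id int NOT NULL,
    record xml NOT NULL
);

INSERT @ring_buffer (record_id, record)
VALUES
    (1, CONVERT(xml, N'<Record><SchedulerMonitorEvent><SystemHealth><ProcessUtilization>42</ProcessUtilization><SystemIdle>50</SystemIdle></SystemHealth></SchedulerMonitorEvent></Record>')),
    (2, CONVERT(xml, N'<Record><SchedulerMonitorEvent><SystemHealth/></SchedulerMonitorEvent></Record>'));

-- Hypothetical target with NOT NULL columns, like the Dashboard table.
DECLARE @cpu_samples table
(
    record_id int NOT NULL,
    sqlserver_cpu int NOT NULL,
    system_idle int NOT NULL
);

INSERT @cpu_samples (record_id, sqlserver_cpu, system_idle)
SELECT
    rb.record_id,
    v.process_utilization,
    v.system_idle
FROM @ring_buffer AS rb
CROSS APPLY
(
    -- Extract each value once; a missing node yields NULL.
    SELECT
        rb.record.value('(/Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') AS process_utilization,
        rb.record.value('(/Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int') AS system_idle
) AS v
-- Drop records that are not CPU readings rather than substituting 0.
WHERE v.process_utilization IS NOT NULL
AND   v.system_idle IS NOT NULL;

-- Only record 1 (42/50) should be present.
SELECT cs.record_id, cs.sqlserver_cpu, cs.system_idle
FROM @cpu_samples AS cs;
```

Extracting in the CROSS APPLY keeps each XPath expression in one place, so the SELECT list and the WHERE clause refer to the same computed columns instead of repeating the .value() calls.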