
Track connection panic metrics#281

Merged
leynos merged 3 commits into main from
codex/add-production-monitoring-for-connection-panics on Aug 8, 2025

Conversation

@leynos (Owner) commented Aug 8, 2025

Summary

  • count connection task panics via wireframe_connection_panics_total
  • surface panic count in tests and docs with guidance for Prometheus alerts

closes #217

Testing

  • make fmt
  • make markdownlint
  • make nixie (fails: too many arguments/failed to link packages)
  • make lint
  • make test

https://chatgpt.com/codex/tasks/task_e_68953c8ac6f08322a60414af8a1d1406

Summary by Sourcery

Instrument connection tasks to record panics as a new Prometheus metric and update tests and documentation to use and verify the new counter.

New Features:

  • Introduce wireframe_connection_panics_total counter and inc_connection_panics API to record panicking connection tasks.

Enhancements:

  • Increment the connection panic counter in spawn_connection_task when a task panics.

Documentation:

  • Update the production resilience guide to explain the panic metric, its usage in Prometheus, and alerting guidance.
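As an illustration of the alerting guidance described above, a Prometheus rule of roughly the following shape could alert on a sustained panic rate. The threshold, durations, and labels here are illustrative assumptions, not taken from the guide:

```yaml
groups:
  - name: wireframe
    rules:
      - alert: WireframeConnectionPanics
        # Any non-zero panic rate over five minutes, sustained for ten.
        expr: rate(wireframe_connection_panics_total[5m]) > 0
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: Connection tasks are panicking
```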

Tests:

  • Add unit tests and integration tests to verify that connection panics increment the new metric.

@sourcery-ai Bot (Contributor) commented Aug 8, 2025

Reviewer's Guide

This PR introduces a new Prometheus counter to track connection task panics, wires it into the connection task’s catch_unwind handler, adds unit and integration tests to verify the metric is recorded, and extends the production resilience guide with alerting recommendations.

ER diagram for Prometheus connection panic metric

erDiagram
    CONNECTION_TASK ||--o{ CONNECTION_PANIC_METRIC : triggers
    CONNECTION_PANIC_METRIC {
        string name
        int count
    }

Class diagram for new connection panic metric tracking

classDiagram
    class Metrics {
        +inc_connection_panics()
        +CONNECTION_PANICS : &str
    }
    class ConnectionTask {
        +spawn_connection_task()
    }
    ConnectionTask ..> Metrics : uses

File-Level Changes

Change: Define and implement a new counter for connection task panics
(src/metrics.rs)
  • Add CONNECTION_PANICS constant with HELP and TYPE comments
  • Implement feature-gated inc_connection_panics() function

Change: Instrument connection task wrapper to record panics
(src/server/connection.rs)
  • Invoke inc_connection_panics() in the catch_unwind panic branch

Change: Add tests to ensure panic metric increments
(src/server/connection.rs, tests/metrics.rs)
  • Add async rstest in connection module to snapshot and assert panic counter
  • Add unit test in tests/metrics.rs for inc_connection_panics()

Change: Update documentation with panic metric guidance
(docs/hardening-wireframe-a-guide-to-production-resilience.md)
  • Describe wireframe_connection_panics_total in production resilience guide
  • Add Prometheus rate() and recorder setup recommendations

Assessment against linked issues

  • #217: Add a metrics counter to track the number of panics in connection
    tasks.
  • #217: Ensure metrics collection can be disabled via configuration (e.g.,
    feature flag or recorder not installed).
  • #217: Document the new metric and provide guidance for operators on
    monitoring and alerting (e.g., Prometheus, Grafana).

Possibly linked issues



@coderabbitai Bot (Contributor) commented Aug 8, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Summary by CodeRabbit

  • New Features

    • Introduced a new metric to track connection panics, enabling improved monitoring and observability via Prometheus and Grafana dashboards.
  • Documentation

    • Updated guidance to include details on the new panic metric and instructions for enabling and visualising it.
  • Tests

    • Added tests to verify that the connection panic metric increments correctly when triggered.

Walkthrough

Introduce a new Prometheus-compatible metric, wireframe_connection_panics_total, to track panics in connection tasks. Integrate the metric increment into the panic-handling logic, update the documentation with observability instructions, and add tests to verify correct metric behaviour when a connection task panics.

Changes

  • Metric Definition & API (src/metrics.rs): Define CONNECTION_PANICS metric
    constant and add inc_connection_panics function to increment the counter.
  • Connection Panic Instrumentation & Testing (src/server/connection.rs):
    Increment the panic counter in the connection panic handler; add a test to
    verify the metric increments.
  • Metrics Test (tests/metrics.rs): Add a test to ensure the panic metric
    increments with the correct peer address label.
  • Documentation Update
    (docs/hardening-wireframe-a-guide-to-production-resilience.md): Document
    the new metric, its purpose, setup instructions, and dashboard integration
    guidance.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Server
    participant MetricsRecorder

    Client->>Server: Initiate connection
    Server->>Server: spawn_connection_task
    Server-->>Server: Connection task panics
    Server->>MetricsRecorder: inc_connection_panics(peer_addr)
    MetricsRecorder-->>MetricsRecorder: Increment wireframe_connection_panics_total
    Server->>Server: Handle panic (log, isolate)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~15 minutes

Assessment against linked issues

  • Metrics are collected when connection tasks panic (#217)
  • Metrics include relevant dimensions (peer_addr pattern, timestamp) (#217)
  • Metrics collection can be disabled via configuration (#217)
  • Documentation explains how to set up monitoring dashboards (#217)
  • Minimal performance impact on happy path (#217)

Assessment against linked issues: Out-of-scope changes

No out-of-scope changes detected.

Possibly related PRs

Suggested reviewers

  • codescene-delta-analysis

Poem

When panics arise in the server's domain,
A metric now counts each crash and each strain.
Peer address in tow, the numbers ascend,
Grafana and Prometheus, your new steadfast friends.
So monitor calmly, let dashboards delight—
For resilience is measured, both day and night!
📈✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4428c6e and 8155f33.

📒 Files selected for processing (3)
  • src/metrics.rs (2 hunks)
  • src/server/connection.rs (3 hunks)
  • tests/metrics.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs

📄 CodeRabbit Inference Engine (AGENTS.md)

**/*.rs: Function documentation must include clear examples demonstrating the usage and outcome of the function. Test documentation should omit examples where the example serves only to reiterate the test logic.
No single code file may be longer than 400 lines. Long switch statements or dispatch tables should be broken up by feature and constituents colocated with targets. Large blocks of test data should be moved to external data files.
Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error consider using Arc to reduce the amount of data returned.
Write unit and behavioural tests for new functionality. Run both before and after making any change.
Every module must begin with a module level (//! ) comment explaining the module's purpose and utility.
Document public APIs using Rustdoc comments (///) so documentation can be generated with cargo doc.
Prefer immutable data and avoid unnecessary mut bindings.
Handle errors with the Result type instead of panicking where feasible.
Avoid unsafe code unless absolutely necessary and document any usage clearly.
Place function attributes after doc comments.
Do not use return in single-line functions.
Use predicate functions for conditional criteria with more than two branches.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Prefer expect over allow.
Prefer .expect() over .unwrap().
Use concat!() to combine long string literals rather than escaping newlines with a backslash.
Prefer semantic error enums. Derive std::error::Error (via the thiserror crate) for any condition the caller might inspect, retry, or map to an HTTP status....

Files:

  • tests/metrics.rs
  • src/metrics.rs
  • src/server/connection.rs

⚙️ CodeRabbit Configuration File

**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.

  • Adhere to single responsibility and CQRS

  • Place function attributes after doc comments.

  • Do not use return in single-line functions.

  • Move conditionals with >2 branches into a predicate function.

  • Avoid unsafe unless absolutely necessary.

  • Every module must begin with a //! doc comment that explains the module's purpose and utility.

  • Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar

  • Lints must not be silenced except as a last resort.

    • #[allow] is forbidden.
    • Only narrowly scoped #[expect(lint, reason = "...")] is allowed.
    • No lint groups, no blanket or file-wide suppression.
    • Include FIXME: with link if a fix is expected.
  • Use rstest fixtures for shared setup and to avoid repetition between tests.

  • Replace duplicated tests with #[rstest(...)] parameterised cases.

  • Prefer mockall for mocks/stubs.

  • Prefer .expect() over .unwrap()

  • Ensure that any API or behavioural changes are reflected in the documentation in docs/

  • Ensure that any completed roadmap steps are recorded in the appropriate roadmap in docs/

  • Files must not exceed 400 lines in length

    • Large modules must be decomposed
    • Long match statements or dispatch tables should be decomposed by domain and collocated with targets
    • Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.

Files:

  • tests/metrics.rs
  • src/metrics.rs
  • src/server/connection.rs
🧬 Code Graph Analysis (1)
src/server/connection.rs (2)
src/metrics.rs (2)
  • inc_connection_panics (108-108)
  • inc_connection_panics (111-111)
tests/metrics.rs (1)
  • connection_panic_metric_increments (66-83)
🔍 MCP Research (1 server)

Deepwiki:

  • Issue Add metrics/monitoring for caught panics in connection tasks #217 specifies the need for a metrics counter to track panics in connection tasks, with minimal overhead and configurability to disable metrics collection if needed. Metrics should use standard libraries like Prometheus and avoid logging sensitive data. The metric should include relevant dimensions such as peer address and timestamp, and documentation should explain monitoring setup. (linked_issue Add metrics/monitoring for caught panics in connection tasks #217)

  • The wireframe project uses the metrics crate for instrumentation, with all metric functions defined in src/metrics.rs. Metrics are conditionally compiled with the metrics feature, which can be disabled to make metric functions no-ops. Metric names use a wireframe_ prefix and are defined as pub const strings. Counters are incremented using counter!(NAME).increment(value). Metrics can include labels for additional context. (src/metrics.rs)

  • Panic handling in connection tasks is implemented by wrapping connection futures in catch_unwind to isolate panics and prevent them from crashing the server. When a panic occurs, the panic message and peer address are logged as errors using tracing::error!. The server continues accepting new clients after a panic. (linked_issue Add metrics/monitoring for caught panics in connection tasks #217, src/server/connection.rs)

  • The new metric wireframe_connection_panics_total follows the naming convention and is incremented inside the panic handling block of the connection task wrapper, consistent with the team's approach to metrics and panic handling. (src/server/connection.rs)

🔇 Additional comments (3)
src/metrics.rs (1)

27-34: Metric addition meets standards

The constant, docs, and helper follow the existing style; compile-time gating and the label-free counter avoid the earlier cardinality concern. No further action required.

Also applies to: 91-111

tests/metrics.rs (1)

65-83: Test asserts the exact expected count

The tightened assert_eq!(1, count) prevents false positives. Implementation aligns with prior guidance.

src/server/connection.rs (1)

222-225: Avoid global recorder collisions

Installing a global recorder inside a test (recorder.install()) panics if any other test already set one and can race under parallel execution. Use metrics::with_local_recorder with a spawned single-thread runtime, gate the test with serial_test, or guard install() behind a Once to ensure exclusive registration.



@sourcery-ai Bot (Contributor) left a comment


Hey @leynos - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments

### Comment 1
<location> `docs/hardening-wireframe-a-guide-to-production-resilience.md:171` </location>
<code_context>
+Each occurrence also increments the `wireframe_connection_panics_total`
+counter, enabling alerts on unexpected spikes. Operators can chart
+`rate(wireframe_connection_panics_total[5m])` in Prometheus and create Grafana
+panels to visualise instability. To emit this metric, enable the `metrics`
+Cargo feature and install a recorder such as `metrics_exporter_prometheus`,
+which exposes an HTTP endpoint for scraping.
</code_context>

<issue_to_address>
'Visualise' uses en-GB spelling; en-GB-oxendict prefers 'visualize'.

The word 'visualise' should be spelled 'visualize' to conform to en-GB-oxendict spelling conventions.
</issue_to_address>


Comment thread docs/hardening-wireframe-a-guide-to-production-resilience.md Outdated
@leynos (Owner, Author) commented Aug 8, 2025

@coderabbitai review

@coderabbitai Bot (Contributor) commented Aug 8, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@leynos (Owner, Author) commented Aug 8, 2025

@coderabbitai review

@coderabbitai Bot (Contributor) commented Aug 8, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai Bot (Contributor) left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
docs/hardening-wireframe-a-guide-to-production-resilience.md (1)

168-174: Use Oxford “-ize” spelling

Replace “visualise” with “visualize” to follow en-GB-oxendict conventions.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ce4bce5 and 6c0dc4a.

📒 Files selected for processing (4)
  • docs/hardening-wireframe-a-guide-to-production-resilience.md (1 hunks)
  • src/metrics.rs (3 hunks)
  • src/server/connection.rs (3 hunks)
  • tests/metrics.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.md

⚙️ CodeRabbit Configuration File

**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")

  • Use en-GB-oxendict (-ize / -our) spelling and grammar
  • Paragraphs and bullets must be wrapped to 80 columns, except where a long URL would prevent this (in which case, silence MD013 for that line)
  • Code blocks should be wrapped to 120 columns.
  • Headings must not be wrapped.
  • Documents must start with a level 1 heading
  • Headings must correctly increase or decrease by no more than one level at a time
  • Use GitHub-flavoured Markdown style for footnotes and endnotes.
  • Numbered footnotes must be numbered by order of appearance in the document.

Files:

  • docs/hardening-wireframe-a-guide-to-production-resilience.md
**/*.rs

Files:

  • src/server/connection.rs
  • tests/metrics.rs
  • src/metrics.rs
🧬 Code Graph Analysis (1)
src/metrics.rs (1)
src/extractor.rs (1)
  • peer_addr (362-362)
🪛 LanguageTool
docs/hardening-wireframe-a-guide-to-production-resilience.md

[style] ~171-~171: Would you like to use the Oxford spelling “visualize”? The spelling ‘visualise’ is also correct.
Context: ...Prometheus and create Grafana panels to visualise instability. To emit this metric, enabl...

(OXFORD_SPELLING_Z_NOT_S)

🔍 MCP Research (1 server)

Deepwiki:

  • The WireframeServer architecture uses a factory pattern to spawn worker tasks that accept connections and spawn connection tasks wrapped with panic recovery using catch_unwind. When a panic occurs in a connection task, it is caught and logged with the peer address, and now also increments the wireframe_connection_panics_total metric. This ensures that panics do not crash the server and are observable via metrics. (docs/hardening-wireframe-a-guide-to-production-resilience.md)

  • The connection actor model and asynchronous messaging system use PushHandle and SessionRegistry with weak references to manage connection lifecycles and prevent resource leaks. Panic isolation and metric incrementing integrate with this architecture to provide resilience and observability. (docs/hardening-wireframe-a-guide-to-production-resilience.md)

  • Observability and monitoring documentation describes structured logging with tracing spans for connections and requests, and metrics collection integrated with Prometheus exporters. The new panic metric wireframe_connection_panics_total fits into this observability framework, enabling operators to monitor panic rates over time and create Grafana dashboards for visualization. (docs/hardening-wireframe-a-guide-to-production-resilience.md)

  • The production deployment documentation emphasizes the importance of enabling the metrics Cargo feature and installing a recorder like metrics_exporter_prometheus to expose an HTTP scraping endpoint for Prometheus. It also details how the panic metric counter supports operational monitoring and alerting for connection panics. (docs/hardening-wireframe-a-guide-to-production-resilience.md)

🔇 Additional comments (2)
tests/metrics.rs (1)

65-82: Test looks sound

The test correctly asserts that the counter increments under a local recorder.

src/server/connection.rs (1)

41-41: Instrumentation correctly placed

Incrementing the panic counter inside the catch_unwind branch is appropriate.

Comment thread src/metrics.rs
@coderabbitai Bot (Contributor) left a comment


Actionable comments posted: 5

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6c0dc4a and 4428c6e.

📒 Files selected for processing (4)
  • docs/hardening-wireframe-a-guide-to-production-resilience.md (1 hunks)
  • src/metrics.rs (2 hunks)
  • src/server/connection.rs (3 hunks)
  • tests/metrics.rs (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
docs/**/*.md

📄 CodeRabbit Inference Engine (docs/contents.md)

Conventions for writing project documentation as described in the documentation style guide

docs/**/*.md: Use British English based on the Oxford English Dictionary (en-oxendict) for documentation text.
The word "outwith" is acceptable in documentation.
Keep US spelling when used in an API, for example color.
Use the Oxford comma in documentation text.
Treat company names as collective nouns in documentation (e.g., "Lille Industries are expanding").
Write headings in sentence case in documentation.
Use Markdown headings (#, ##, ###, etc.) in order without skipping levels.
Follow markdownlint recommendations for Markdown files.
Provide code blocks and lists using standard Markdown syntax.
Always provide a language identifier for fenced code blocks; use plaintext for non-code text.
Use - as the first level bullet and renumber lists when items change.
Prefer inline links using [text](url) or angle brackets around the URL; avoid reference-style links like [foo][bar].
Ensure blank lines before and after bulleted lists and fenced blocks in Markdown.
Ensure tables have a delimiter line below the header row in Markdown.
Expand any uncommon acronym on first use, for example, Continuous Integration (CI).
Wrap paragraphs at 80 columns in documentation.
Wrap code at 120 columns in documentation.
Do not wrap tables in documentation.
Use sequentially numbered footnotes referenced with [^1] and place definitions at the end of the file.
Where it adds clarity, include Mermaid diagrams in documentation.
When embedding figures, use ![alt text](path/to/image) and provide concise alt text describing the content.
Add a brief description before each Mermaid diagram in documentation for screen readers.

Document examples showing how to deprecate old message versions gracefully

Write the official documentation for the new features. Create separate guides for "Duplex Messaging & Pushes", "Streaming Responses", and "Message Fragmentation". Each guide must include runna...

Files:

  • docs/hardening-wireframe-a-guide-to-production-resilience.md
docs/**/*.{md,rs}

📄 CodeRabbit Inference Engine (docs/multi-packet-and-streaming-responses-design.md)

docs/**/*.{md,rs}: The official documentation and examples must exclusively use the declarative Response model for handler responses.
The async-stream pattern must be documented as the canonical approach for dynamic stream generation.

Files:

  • docs/hardening-wireframe-a-guide-to-production-resilience.md
**/*.md

📄 CodeRabbit Inference Engine (AGENTS.md)

**/*.md: Documentation must use en-GB-oxendict spelling and grammar. (EXCEPTION: the naming of the LICENSE file, which is to be left unchanged for community consistency.)
Markdown paragraphs and bullet points must be wrapped at 80 columns.
Code blocks in Markdown files must be wrapped at 120 columns.
Tables and headings in Markdown files must not be wrapped.
Use dashes (-) for list bullets in Markdown files.
Use GitHub-flavoured Markdown footnotes ([^1]) for references and footnotes.

Files:

  • docs/hardening-wireframe-a-guide-to-production-resilience.md

⚙️ CodeRabbit Configuration File

**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")

  • Use en-GB-oxendict (-ize / -our) spelling and grammar
  • Headings must not be wrapped.
  • Documents must start with a level 1 heading
  • Headings must correctly increase or decrease by no more than one level at a time
  • Use GitHub-flavoured Markdown style for footnotes and endnotes.
  • Numbered footnotes must be numbered by order of appearance in the document.

Files:

  • docs/hardening-wireframe-a-guide-to-production-resilience.md
**/*.rs

📄 CodeRabbit Inference Engine (AGENTS.md)

**/*.rs: Function documentation must include clear examples demonstrating the usage and outcome of the function. Test documentation should omit examples where the example serves only to reiterate the test logic.
No single code file may be longer than 400 lines. Long switch statements or dispatch tables should be broken up by feature and constituents colocated with targets. Large blocks of test data should be moved to external data files.
Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function returns a large error type, consider wrapping it in Arc to reduce the amount of data returned.
Write unit and behavioural tests for new functionality. Run both before and after making any change.
Every module must begin with a module level (//! ) comment explaining the module's purpose and utility.
Document public APIs using Rustdoc comments (///) so documentation can be generated with cargo doc.
Prefer immutable data and avoid unnecessary mut bindings.
Handle errors with the Result type instead of panicking where feasible.
Avoid unsafe code unless absolutely necessary and document any usage clearly.
Place function attributes after doc comments.
Do not use return in single-line functions.
Use predicate functions for conditional criteria with more than two branches.
Lints must not be silenced except as a last resort.
Lint rule suppressions must be tightly scoped and include a clear reason.
Prefer expect over allow.
Prefer .expect() over .unwrap().
Use concat!() to combine long string literals rather than escaping newlines with a backslash.
Prefer semantic error enums. Derive std::error::Error (via the thiserror crate) for any condition the caller might inspect, retry, or map to an HTTP status....
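The semantic-error-enum guideline can be sketched without the thiserror derive; the hand-rolled equivalent of what `#[derive(thiserror::Error)]` would generate might look like the following (the enum and its variants are illustrative, not taken from the codebase):

```rust
use std::{error::Error, fmt, io};

// Hypothetical semantic error enum a caller can inspect, retry on, or map
// to a status code. With thiserror, both impl blocks below collapse into a
// single derive plus #[error("...")] attributes.
#[derive(Debug)]
enum ConnectionError {
    // A transient I/O failure the caller may choose to retry.
    Io(io::Error),
    // The peer closed the connection before the handshake completed.
    HandshakeAborted,
}

impl fmt::Display for ConnectionError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::Io(e) => write!(f, "I/O failure: {e}"),
            Self::HandshakeAborted => write!(f, "peer aborted the handshake"),
        }
    }
}

impl Error for ConnectionError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            Self::Io(e) => Some(e),
            Self::HandshakeAborted => None,
        }
    }
}
```

Because the variants carry meaning rather than a formatted string, callers can match on them instead of parsing error text.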

Files:

  • src/metrics.rs
  • src/server/connection.rs
  • tests/metrics.rs

⚙️ CodeRabbit Configuration File

**/*.rs: * Seek to keep the cyclomatic complexity of functions no more than 12.

  • Adhere to single responsibility and CQRS

  • Place function attributes after doc comments.

  • Do not use return in single-line functions.

  • Move conditionals with >2 branches into a predicate function.

  • Avoid unsafe unless absolutely necessary.

  • Every module must begin with a //! doc comment that explains the module's purpose and utility.

  • Comments and docs must follow en-GB-oxendict (-ize / -our) spelling and grammar

  • Lints must not be silenced except as a last resort.

    • #[allow] is forbidden.
    • Only narrowly scoped #[expect(lint, reason = "...")] is allowed.
    • No lint groups, no blanket or file-wide suppression.
    • Include FIXME: with link if a fix is expected.
  • Use rstest fixtures for shared setup and to avoid repetition between tests.

  • Replace duplicated tests with #[rstest(...)] parameterised cases.

  • Prefer mockall for mocks/stubs.

  • Prefer .expect() over .unwrap()

  • Ensure that any API or behavioural changes are reflected in the documentation in docs/

  • Ensure that any completed roadmap steps are recorded in the appropriate roadmap in docs/

  • Files must not exceed 400 lines in length

    • Large modules must be decomposed
    • Long match statements or dispatch tables should be decomposed by domain and collocated with targets
    • Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
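The `#[rstest(...)]` guidance above replaces copy-pasted tests with one declarative case table. The same effect can be approximated in std-only Rust by iterating a slice of cases; the function and values below are illustrative, and rstest expresses the same table declaratively via `#[rstest]` with `#[case]` attributes:

```rust
// Hypothetical function under test: a 4-byte length prefix plus payload.
fn frame_len(payload: &[u8]) -> usize {
    4 + payload.len()
}

fn main() {
    // One data-driven check instead of three near-identical test functions.
    let cases: &[(&[u8], usize)] = &[(b"", 4), (b"ping", 8), (b"hello", 9)];
    for (payload, expected) in cases {
        assert_eq!(frame_len(payload), *expected, "payload {payload:?}");
    }
    println!("all cases passed");
}
```

The loop makes each case a row of data, so adding a regression case is a one-line change rather than a new test function.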

Files:

  • src/metrics.rs
  • src/server/connection.rs
  • tests/metrics.rs
🔍 MCP Research (1 server)

Deepwiki:

  • The panic handling mechanism in connection tasks is robust and tested: panics in connection tasks are caught and logged without crashing the server, allowing continued acceptance of new connections. This is supported by existing tests and documented in the WireframeServer wiki and test suite. The new metric addition supplements this by providing observability into panic frequency for operational monitoring. (src/server.rs, tests/world.rs, docs/hardening-wireframe-a-guide-to-production-resilience.md)

  • The documentation specifies that to emit the panic metric, the metrics Cargo feature must be enabled and a recorder such as metrics_exporter_prometheus installed to expose an HTTP scraping endpoint. This enables operators to monitor panic rates over time and create Grafana dashboards for visualization and alerting. (docs/hardening-wireframe-a-guide-to-production-resilience.md)

⏰ Context from checks skipped due to timeout of 120000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: build-test
🔇 Additional comments (1)
src/server/connection.rs (1)

40-43: Good instrumentation.

Incrementing the panic counter here fulfils the monitoring objective without exposing labels.
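The pattern being praised here, catching a task panic and bumping a process-wide counter so the server keeps accepting connections, can be sketched in std-only Rust. An `AtomicU64` stands in for the `wireframe_connection_panics_total` metric; the real code records it via the metrics crate (`inc_connection_panics`), and the handler below is purely illustrative:

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::atomic::{AtomicU64, Ordering};

// Stand-in for the Prometheus counter; the production code instead calls
// metrics::inc_connection_panics() to record
// wireframe_connection_panics_total.
static CONNECTION_PANICS: AtomicU64 = AtomicU64::new(0);

// Run one connection handler, isolating any panic and counting it so a
// single bad connection cannot take down the accept loop.
fn run_connection(handler: impl FnOnce()) {
    if panic::catch_unwind(AssertUnwindSafe(handler)).is_err() {
        CONNECTION_PANICS.fetch_add(1, Ordering::Relaxed);
        eprintln!("connection task panicked; server keeps accepting");
    }
}

fn main() {
    // Silence the default panic backtrace for the demonstration.
    panic::set_hook(Box::new(|_| {}));
    run_connection(|| println!("healthy connection"));
    run_connection(|| panic!("boom"));
    assert_eq!(CONNECTION_PANICS.load(Ordering::Relaxed), 1);
}
```

Keeping the counter label-free, as the review notes, mirrors the cardinality and privacy reasoning in the hardening guide.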

@leynos leynos merged commit 7bc3f76 into main Aug 8, 2025
5 checks passed
@leynos leynos deleted the codex/add-production-monitoring-for-connection-panics branch August 8, 2025 22:54
leynos commented Aug 8, 2025

@sourcery-ai review


@sourcery-ai sourcery-ai Bot left a comment


Hey @leynos - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments

### Comment 1
<location> `src/server/connection.rs:219` </location>
<code_context>
+    /// Panics increment the connection panic counter.
+    #[rstest]
+    #[tokio::test]
+    async fn connection_panic_metric_increments(
+        factory: impl Fn() -> WireframeApp + Send + Sync + Clone + 'static,
+    ) {
</code_context>

<issue_to_address>
Consider replacing the complex async integration test with a simple unit test that directly checks the metric increment.

Instead of spinning up a real TCP listener, driver tasks and I/O just to verify that `inc_connection_panics()` was called, you can drop that entire async test and replace it with a tiny unit‐test that directly exercises the metric. For example:

```rust
#[cfg(test)]
mod metrics_tests {
    use metrics_util::debugging::{DebugValue, DebuggingRecorder};
    use crate::metrics::{self, CONNECTION_PANICS};

    #[test]
    fn inc_connection_panics_increments_counter() {
        // Arrange: install a debugging recorder
        let recorder = DebuggingRecorder::new();
        let snapshotter = recorder.snapshotter();
        recorder.install().expect("failed to install recorder");

        // Act: invoke the panic‐increment function directly
        metrics::inc_connection_panics();

        // Assert: we saw exactly one increment on the CONNECTION_PANICS counter
        let metrics = snapshotter.snapshot().into_vec();
        assert!(metrics.iter().any(|(key, _, _, value)| {
            key.key().name() == CONNECTION_PANICS
                && matches!(value, DebugValue::Counter(c) if *c == 1)
        }), "expected CONNECTION_PANICS == 1, got {metrics:#?}");
    }
}
```

Steps:
1. Remove the entire `async fn connection_panic_metric_increments` and its TCP boilerplate.
2. Add the above `inc_connection_panics_increments_counter` test.
3. Verify `cargo test` still passes and coverage for `inc_connection_panics()` remains.
</issue_to_address>

### Comment 2
<location> `docs/hardening-wireframe-a-guide-to-production-resilience.md:168` </location>
<code_context>
 Connection tasks are wrapped with `catch_unwind` to log and discard panics.
 Each panicking connection is isolated so it cannot terminate the entire server.

+Each occurrence also increments the `wireframe_connection_panics_total`
+counter, enabling alerts on unexpected spikes. The counter intentionally omits
+peer address labels to limit cardinality and protect personally identifiable
</code_context>

<issue_to_address>
This paragraph is not wrapped to 80 columns as required by the style guide.

Please wrap this paragraph so that no line exceeds 80 columns, in accordance with the documentation formatting standards.
</issue_to_address>

### Comment 3
<location> `docs/hardening-wireframe-a-guide-to-production-resilience.md:169` </location>
<code_context>
 Each panicking connection is isolated so it cannot terminate the entire server.

+Each occurrence also increments the `wireframe_connection_panics_total`
+counter, enabling alerts on unexpected spikes. The counter intentionally omits
+peer address labels to limit cardinality and protect personally identifiable
+information. Operators can chart `rate(wireframe_connection_panics_total[5m])`
</code_context>

<issue_to_address>
This line is part of a paragraph that exceeds the 80 column limit.

Ensure all lines in paragraphs and bullet points are wrapped to 80 columns for readability and consistency.
</issue_to_address>

### Comment 4
<location> `docs/hardening-wireframe-a-guide-to-production-resilience.md:170` </location>
<code_context>

+Each occurrence also increments the `wireframe_connection_panics_total`
+counter, enabling alerts on unexpected spikes. The counter intentionally omits
+peer address labels to limit cardinality and protect personally identifiable
+information. Operators can chart `rate(wireframe_connection_panics_total[5m])`
+in Prometheus and create Grafana panels to visualize instability. To emit this
</code_context>

<issue_to_address>
This line is part of a paragraph that is not wrapped to 80 columns.

Please reformat this paragraph so that each line does not exceed 80 columns.
</issue_to_address>

### Comment 5
<location> `docs/hardening-wireframe-a-guide-to-production-resilience.md:171` </location>
<code_context>
+Each occurrence also increments the `wireframe_connection_panics_total`
+counter, enabling alerts on unexpected spikes. The counter intentionally omits
+peer address labels to limit cardinality and protect personally identifiable
+information. Operators can chart `rate(wireframe_connection_panics_total[5m])`
+in Prometheus and create Grafana panels to visualize instability. To emit this
+metric, enable the `metrics` Cargo feature and install a recorder such as
</code_context>

<issue_to_address>
This line is part of a paragraph that is not wrapped to 80 columns.

Wrap this paragraph to 80 columns as per documentation standards.
</issue_to_address>

### Comment 6
<location> `docs/hardening-wireframe-a-guide-to-production-resilience.md:172` </location>
<code_context>
+counter, enabling alerts on unexpected spikes. The counter intentionally omits
+peer address labels to limit cardinality and protect personally identifiable
+information. Operators can chart `rate(wireframe_connection_panics_total[5m])`
+in Prometheus and create Grafana panels to visualize instability. To emit this
+metric, enable the `metrics` Cargo feature and install a recorder such as
+`metrics_exporter_prometheus`, which exposes an HTTP endpoint for scraping.
</code_context>

<issue_to_address>
This line is part of a paragraph that is not wrapped to 80 columns.

Please ensure all paragraph lines are wrapped to 80 columns.
</issue_to_address>

### Comment 7
<location> `docs/hardening-wireframe-a-guide-to-production-resilience.md:173` </location>
<code_context>
+peer address labels to limit cardinality and protect personally identifiable
+information. Operators can chart `rate(wireframe_connection_panics_total[5m])`
+in Prometheus and create Grafana panels to visualize instability. To emit this
+metric, enable the `metrics` Cargo feature and install a recorder such as
+`metrics_exporter_prometheus`, which exposes an HTTP endpoint for scraping.
+
</code_context>

<issue_to_address>
This line is part of a paragraph that is not wrapped to 80 columns.

Reformat this paragraph so that each line is wrapped to 80 columns.
</issue_to_address>

### Comment 8
<location> `docs/hardening-wireframe-a-guide-to-production-resilience.md:174` </location>
<code_context>
+information. Operators can chart `rate(wireframe_connection_panics_total[5m])`
+in Prometheus and create Grafana panels to visualize instability. To emit this
+metric, enable the `metrics` Cargo feature and install a recorder such as
+`metrics_exporter_prometheus`, which exposes an HTTP endpoint for scraping.
+
 ### 3.2 Leak-Proof Registries with `Weak`/`Arc`
</code_context>

<issue_to_address>
This line is part of a paragraph that is not wrapped to 80 columns.

Please wrap this paragraph to 80 columns for consistency with the documentation style guide.
</issue_to_address>


Successfully merging this pull request may close these issues.

Add metrics/monitoring for caught panics in connection tasks