fix(socket): round event-payload log truncation to UTF-8 boundary#1818
Conversation
…nyhumansai#1814) `handle_sio_event` truncated `data.to_string()` with a naked byte slice `[..data.to_string().len().min(500)]`, which panicked the core thread on multi-byte UTF-8 characters straddling byte 500 — real Sentry repro was a WhatsApp ingest payload with Cyrillic `'н'` at bytes 499..501 (OPENHUMAN-TAURI-KC). Bind the JSON string once and route the slice through `crate::openhuman::util::floor_char_boundary`, the same helper used by the body-preview fix in tinyhumansai#1751. Adds a regression test that constructs a payload straddling byte 500 and asserts no panic. Closes tinyhumansai#1814
📝 WalkthroughWalkthroughSerialize the incoming socket ChangesPayload logging and regression test
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/openhuman/socket/event_handlers.rs`:
- Around line 41-45: The debug log in event_handlers.rs currently prints raw
payload content via the log::debug! call with event_name and the sliced payload
(&payload[..floor_char_boundary(&payload, 500)]); change this to avoid emitting
any payload bytes and instead log only non-sensitive metadata such as
event_name, payload length (payload.len()), the floor_char_boundary value or a
truncated boolean, and any correlation IDs if available; remove or redact the
payload slice usage and ensure functions/variables referenced (event_name,
payload, floor_char_boundary) are used only to compute safe metadata rather than
printing raw content.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: dea599bb-decf-498f-8465-78b5b59c3b92
📒 Files selected for processing (1)
src/openhuman/socket/event_handlers.rs
…addresses @coderabbitai on event_handlers.rs:45) Remove the log::debug! line that emitted up to 500 bytes of raw Socket.IO event payload content. Webhook bodies, channel messages, and Composio trigger payloads can carry secrets, PII, and auth tokens — logging any slice of them violates the project's "never log secrets or full PII" rule. The log::info! line above already records event_name and data_bytes (the byte length), which is sufficient for throughput and size diagnostics. Replace the payload-content debug line with a metadata-only one that emits only the event name. Side effects: - `floor_char_boundary` import removed (no longer referenced). - Regression test renamed from `handle_sio_event_does_not_panic_on_multibyte_char_at_log_truncation_boundary` to `handle_sio_event_payload_redacted_no_panic_on_multibyte_boundary` to reflect the actual invariant: payload content is never sliced, so the multi-byte boundary panic site no longer exists. Test now additionally confirms via `catch_unwind` that the fixture would have panicked under the old slice-at-500 approach, making the regression guard self-validating.
sanil-23
left a comment
There was a problem hiding this comment.
CI failure diagnosis — pre-existing on main, not caused by this PR's diff
Two Rust CI checks are failing on this PR:
- Rust Core Tests + Quality (job 76185890715)
- Rust Core Coverage (cargo-llvm-cov) (job 76185891161)
Both fail on exactly one test:
---- openhuman::composio::auth_retry::tests::retries_once_only_even_when_second_call_still_errors ----
assertion left == right failed: compound retry: outer (auth_retry.rs, #1708) × inner (execute_tool_with_post_oauth_retry, #1707) = 4 gateway hits.
left: 2
right: 4
Root cause: this test lives in src/openhuman/composio/auth_retry_tests.rs:241 and was introduced by PRs #1707/#1708. This PR's only commit (6cea84eb) touches exclusively src/openhuman/socket/event_handlers.rs — zero overlap with composio/. The failure reproduces on the same commit hash that is also the base of this branch (git diff upstream/main..HEAD shows one file changed).
Action: maintainer should investigate auth_retry_tests.rs:241 on main independently. This PR is not the cause and cannot fix it without touching unrelated code.
PR Submission Checklist: the first run failed but subsequent runs (25920056478, 25920164796) both passed — the failure was transient/timing, not a real issue.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/openhuman/socket/event_handlers.rs (1)
35-40: ⚡ Quick winAdd correlation metadata to the new ingress logs.
These new pre-dispatch logs are safely redacted, but they omit correlation context, which weakens traceability for webhook/composio flows.
Suggested minimal update
let payload = data.to_string(); + let correlation_id = data + .get("correlationId") + .and_then(|v| v.as_str()) + .unwrap_or("unknown"); log::info!( - "[socket] event received: name={} data_bytes={}", + "[socket] event received: name={} correlation_id={} data_bytes={}", event_name, + correlation_id, payload.len() ); - log::debug!("[socket] event dispatch: name={}", event_name); + log::debug!( + "[socket] event dispatch: name={} correlation_id={}", + event_name, + correlation_id + );As per coding guidelines: "
src/**/*.rs: All new/changed behavior in Rust core must include verbose diagnostics logging with stable grep-friendly prefixes like[domain],[rpc]and correlation fields; never log secrets or full PII".🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/socket/event_handlers.rs` around lines 35 - 40, The ingress pre-dispatch logs use event_name and payload.len() but lack correlation metadata; update the log::info! and log::debug! calls in the event_handlers.rs handler to include stable, grep-friendly correlation fields (for example cor_id, trace_id, request_id or connection_id) drawn from the existing request/socket context (e.g., a context struct or header map available in scope) while preserving redaction rules; format them as key=value pairs (e.g., cor_id={:?} trace_id={:?}) so both log lines include the same correlation fields alongside event_name and payload.len().
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/openhuman/socket/event_handlers.rs`:
- Around line 35-40: The ingress pre-dispatch logs use event_name and
payload.len() but lack correlation metadata; update the log::info! and
log::debug! calls in the event_handlers.rs handler to include stable,
grep-friendly correlation fields (for example cor_id, trace_id, request_id or
connection_id) drawn from the existing request/socket context (e.g., a context
struct or header map available in scope) while preserving redaction rules;
format them as key=value pairs (e.g., cor_id={:?} trace_id={:?}) so both log
lines include the same correlation fields alongside event_name and
payload.len().
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 0501442b-52e9-4c27-9eb5-4ba1270d3b12
📒 Files selected for processing (1)
src/openhuman/socket/event_handlers.rs
graycyrus
left a comment
There was a problem hiding this comment.
Walkthrough
Clean, focused fix for a real Sentry-traced panic (#1814) where a multi-byte UTF-8 char straddling byte 500 crashed the debug-log truncation in handle_sio_event. The fix evolved across 3 commits — from floor_char_boundary safe-slicing to full payload redaction from logs (PII/secrets policy). The regression test is thorough and self-validating. No issues worth blocking on.
Change Summary
| File | Change type | Description |
|---|---|---|
src/openhuman/socket/event_handlers.rs |
Bug fix + security hardening | Binds data.to_string() once, removes raw payload from all log lines, adds comprehensive regression test |
Per-file Analysis
src/openhuman/socket/event_handlers.rs
Production code (lines 29–46)
Good changes:
data.to_string()called once instead of 3 times — eliminates redundant allocations- Payload content fully removed from logs — stronger than the issue asked for, aligns with project PII/secrets policy
[socket]log prefix maintained, consistent with sibling files (manager.rs,ws_loop.rs,mod.rs)- Comment block (lines 29–33) clearly explains the rationale for payload redaction
Test (lines 396–447)
Excellent regression test:
- Fixture precisely targets the failure mode (Cyrillic
'н'at bytes 499..501) catch_unwindvalidates the fixture itself would trigger the old panic — self-documenting- Uses
"anything.unhandled"event name to exercise the pre-match log path - No real names, emails, or PII in test data — uses constructed payloads
- Inline
mod testsplacement matches module convention
Notes
- CodeRabbit's correlation-ID nitpick (review 4298632896) is still unaddressed — not blocking, but worth a follow-up
- PR title says "round to UTF-8 boundary" but final code removes the payload entirely — cosmetic mismatch with the evolution across commits, not actionable
- All known-issues watchlist items clean: no bare
.unwrap(), no PII in logs,[socket]prefix present, test coverage solid
Summary
&data.to_string()[..data.to_string().len().min(500)]log-truncation slice inhandle_sio_eventthat aborted the core thread when an incoming Socket.IO payload contained a multi-byte UTF-8 char straddling byte 500.data.to_string()once and route the slice through the existingcrate::openhuman::util::floor_char_boundaryhelper — same helper fix(memory_tree, e2e tests ): deterministic query_topic ordering + robust CEF cleanup #1751 used for the equivalent body-preview bug.event_handlers::testsmodule that constructs a payload whose JSON serialization places a multi-byte char across byte 500 and asserts no panic.Problem
OPENHUMAN-TAURI-KC fires a
level=fatal,mechanism=panicevent whenever a Socket.IO event payload is at least 500 bytes long and contains a multi-byte UTF-8 character straddling byte 500. Real-world Sentry repro: a WhatsApp ingest payload with the 2-byte Cyrillic'н'at bytes 499..501 panicked the core thread on receipt.Root cause is one naked byte slice used purely for debug-log truncation in
src/openhuman/socket/event_handlers.rs:38:The bug also re-serializes the JSON twice (
data.to_string()is called twice per event on the hot path).Solution
Bind
let payload = data.to_string();once, then slice via&payload[..floor_char_boundary(&payload, 500)]. The helper already lives insrc/openhuman/util.rsand is documented as rounding a byte index down to the nearest UTF-8 boundary — exactly the semantics needed for a log preview that should never expand past its cap.Same helper
markdown_body_preview()switched to in #1751 after the equivalent panic family hit memory-tree ingest. The doubledata.to_string()call disappears as a side effect.Submission Checklist
event_handlers.rsis exercised by the new regression test plus the four pre-existinghandle_sio_event_*tests, all of which now run against the newlet payload = ...binding.## Related— N/A: no coverage-matrix feature IDs were changed.docs/RELEASE-MANUAL-SMOKE.md) — N/A: pure log-string fix, no release-cut surface touched.Closes #NNNin the## RelatedsectionImpact
data.to_string()is now called once instead of twice per event on the hot path.Related
policy.rscommand log truncation: Panic on multi-byte command in security policy log truncation #1813markdown_body_preview_respects_utf8_boundary_and_byte_captest insrc/openhuman/memory/tree/ingest.rs:501(fix(memory_tree, e2e tests ): deterministic query_topic ordering + robust CEF cleanup #1751).AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
fix/1814-socket-char-boundary-panic6cea84ebValidation Run
pnpm --filter openhuman-app format:check— N/A: noapp/files changed.pnpm typecheck— N/A: no TS files changed.cargo test --lib -- openhuman::socket::event_handlers→ 18 passed; 0 failed.cargo fmt --check -- event_handlers.rsclean;cargo checkclean (only pre-existing unrelated warnings inslack_backfill.rs).Validation Blocked
command:N/Aerror:N/Aimpact:N/ABehavior Changes
handle_sio_eventno longer panics when an incoming Socket.IO payload contains a multi-byte UTF-8 character at byte 500 of its JSON serialization.Parity Contract
ready,error,webhook:request,composio:trigger,*:message, default) is untouched.payloadbinding changed; thematch event_nameblock is identical.Duplicate / Superseded PR Handling
Summary by CodeRabbit
Bug Fixes
Logging
Tests