fix(channels): preserve overflow segments instead of dropping them (#1041) #1051

Merged
senamakel merged 2 commits into tinyhumansai:main from sanil-23:fix/1041-segmentation-overflow-drop on Apr 30, 2026

Conversation

Contributor

@sanil-23 sanil-23 commented Apr 30, 2026

Summary

  • Fixes issue #1041, "[Bug] UI Not Rendering Full LLM Response (List Items / Markdown Truncated": long agent replies were missing roughly half their content in the UI (bullet lists and trailing paragraphs disappeared).
  • Root cause was in segment_for_delivery (src/openhuman/channels/providers/presentation.rs): .take(MAX_SEGMENTS) silently dropped every paragraph past the fifth, i.e. past the MAX_SEGMENTS cap.
  • New cap_segments(segments, max, joiner) helper preserves all content by merging overflow into the final segment instead of discarding it.
  • Pre-existing MAX_SEGMENTS = 5 cap retained — bubble count is still bounded; no inter-bubble delay regression.
  • New regression test runs the exact agent reply text from the issue's transcript screenshot and asserts every bullet and every trailing paragraph survives.

Problem

The web channel applies a "human-feeling" presentation layer that splits long replies into multiple chat bubbles. The split was implemented as paragraphs.into_iter().take(MAX_SEGMENTS).collect() (and the same pattern on the sentence-fallback path).

Once a reply produced more than MAX_SEGMENTS = 5 natural paragraphs, every paragraph past the cap was silently dropped before the segments were ever published to the frontend. The final chat_done event still carried full_response with the complete text, but the frontend's onDone handler in ChatRuntimeProvider skips the full-response branch when segment_total > 0 (it trusts segments already delivered everything) — so the truncation was unrecoverable on the client side.
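
To make the failure mode concrete, here is a minimal sketch of the truncating pattern. The names truncate_segments and count_missing are hypothetical and exist only for illustration; the real logic sits inside segment_for_delivery and may differ in detail.

```rust
/// Hypothetical stand-in for the pre-fix behaviour: keep only the first
/// `max` segments and silently drop everything after them.
fn truncate_segments(segments: Vec<String>, max: usize) -> Vec<String> {
    segments.into_iter().take(max).collect()
}

/// Counts how many input paragraphs are absent from the delivered segments.
/// Comparison is by whole-element equality, not substring search.
fn count_missing(input: &[String], delivered: &[String]) -> usize {
    input.iter().filter(|p| !delivered.contains(*p)).count()
}
```

With 11 input paragraphs and a cap of 5, truncate_segments returns 5 segments and count_missing reports 6 paragraphs that never reach the frontend.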

The reporter's screenshots show exactly this: an 11-paragraph draft email rendered as only its first 5 bubbles, followed by the "2m ago" timestamp and the end of the message. The bullets and the entire sign-off were gone.

Solution

Replace .take(MAX_SEGMENTS) with a cap_segments helper that:

  • Returns the input unchanged when it already fits under the cap.
  • Otherwise keeps the first max - 1 segments as-is and concatenates the remaining tail into a single trailing segment with the appropriate joiner (\n\n for paragraph strategy, " " for sentence strategy).
  • Treats max == 0 as a no-op rather than panicking.

Applied at both call sites in segment_for_delivery. The MAX_SEGMENTS constant and the inter-bubble delay model (segment_delay) are unchanged — bubble count is still bounded, the user just no longer loses content past the cap.
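
The behaviour listed above can be sketched as follows. This is a minimal, dependency-free illustration with debug logging omitted, and is not necessarily the exact code that was merged.

```rust
/// Caps `segments` at `max` entries without losing content: the first
/// `max - 1` segments are kept as-is and the remainder is merged into a
/// single trailing segment using `joiner`.
fn cap_segments(segments: Vec<String>, max: usize, joiner: &str) -> Vec<String> {
    // `max == 0` is treated as a no-op, and inputs that already fit
    // under the cap are returned unchanged.
    if max == 0 || segments.len() <= max {
        return segments;
    }
    let mut iter = segments.into_iter();
    // Keep the first `max - 1` segments untouched.
    let mut result: Vec<String> = iter.by_ref().take(max - 1).collect();
    // Everything left over becomes one merged tail segment.
    let tail: Vec<String> = iter.collect();
    result.push(tail.join(joiner));
    result
}
```

For seven one-letter segments "a" through "g", a cap of 5 and the paragraph joiner, the result has five entries and the last one is "e\n\nf\n\ng". Merging into the tail rather than raising the cap keeps the bubble count, and therefore the inter-bubble delay, unchanged.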

The previously-passing test max_segments_respected was renamed to max_segments_respected_without_dropping_content and now also asserts that every input paragraph appears in the joined output — the assertion that would have caught this bug originally.

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) per docs/TESTING-STRATEGY.md — 4 new tests + 1 enhanced existing test in presentation_tests.rs
  • N/A: behaviour-only change — coverage matrix unchanged (no new/renamed feature row)
  • N/A: behaviour-only change — no new feature IDs
  • N/A — no new external network dependencies introduced
  • N/A — no release-cut surface touched (purely internal segmentation logic; observable as "long replies render fully" but doesn't change UX surface)
  • Linked issue closed via Closes #1041 below

Impact

  • Runtime: web-channel chat delivery only. No other channel paths or domains touched.
  • Performance: identical to baseline — same bubble count published, same inter-bubble delay, just one of the bubbles is now potentially longer (the tail).
  • Compatibility: backward-compatible. Existing replies that fit under the cap behave identically. Replies that previously hit the cap now deliver their full content instead of partial.
  • Migration: none.

Related

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes

    • Prevented content loss when segment limits are exceeded by capping delivered segments and merging any overflow into the final segment so all original content is retained.
  • Tests

    • Added unit and end-to-end regression tests to verify capped delivery preserves all input content and handles edge cases (including zero max).

…inyhumansai#1041)

`segment_for_delivery` capped delivered segments at MAX_SEGMENTS=5 via
`.take(MAX_SEGMENTS)`, silently discarding every paragraph past the
fifth. Long agent replies (e.g. drafts with bullet lists + several
trailing paragraphs) lost roughly half their content before reaching
the UI — exactly the symptom reported in tinyhumansai#1041 (UI shows the first 5
bubbles, bullets and tail prose are gone, message ends mid-thought).

New `cap_segments(segments, max, joiner)` helper keeps the first
`max - 1` segments intact and concatenates any overflow into the
final segment with the appropriate joiner (`\n\n` for paragraph
strategy, `" "` for sentence strategy). Bounded number of bubbles
preserved; zero content loss.

Tests: `issue_1041_transcript_preserves_bullets_and_trailing_paragraphs`
runs the exact agent reply text from the issue's transcript screenshot
and asserts every bullet + every trailing paragraph survives delivery.
The previously-passing `max_segments_respected` test was renamed to
`max_segments_respected_without_dropping_content` and now also asserts
that all 10 input paragraphs appear in the joined output — the
assertion that would have caught this bug originally.

Closes tinyhumansai#1041

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sanil-23 sanil-23 requested a review from a team April 30, 2026 14:20
Contributor

coderabbitai Bot commented Apr 30, 2026

Walkthrough

The presentation segmentation now uses a cap_segments helper to merge overflow segments into a single final segment instead of truncating extras, preserving all original content when paragraph or sentence splitting would exceed MAX_SEGMENTS (fixes lost list/trailing content).

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Segmentation Logic: src/openhuman/channels/providers/presentation.rs | Adds cap_segments helper; replaces .take(MAX_SEGMENTS) truncation with logic that preserves the first max-1 segments and merges remaining overflow into one tail segment (joiner "\n\n" for paragraphs, " " for sentences). Adds debug logging for overflow merges. |
| Test Coverage: src/openhuman/channels/providers/presentation_tests.rs | Updates MAX_SEGMENTS regression to assert no paragraph content is lost. Adds unit tests for cap_segments (passthrough under cap, overflow merge behavior, max_segments=0 no-op) and an end-to-end transcript regression covering bullet lists and trailing content (issue #1041). |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I stitched the segments, gentle and neat,
Overflow bunched into one cozy seat.
Bullets and paragraphs sing without fear,
Every line saved — hop, cheer, cheer, cheer! 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix(channels): preserve overflow segments instead of dropping them (#1041)' accurately summarizes the main change: replacing truncation logic with content-preserving overflow handling. |
| Linked Issues check | ✅ Passed | The PR fully addresses issue #1041 by implementing cap_segments to preserve all overflow content instead of dropping it, verified with regression tests using the exact failing transcript. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to presentation.rs and its tests, directly implementing the fix for issue #1041 without unrelated modifications. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
src/openhuman/channels/providers/presentation_tests.rs (1)

75-80: ⚡ Quick win

Strengthen the no-drop assertion to check full paragraph text, not just markers.

Current checks can pass even if paragraph bodies are partially truncated. Using each original paras[i] as the assertion target gives tighter regression protection.

Proposed patch
-    for i in 0..10 {
-        assert!(
-            joined.contains(&format!("Paragraph number {} ", i)),
-            "paragraph {} missing from delivered segments",
-            i
-        );
-    }
+    for (i, para) in paras.iter().enumerate() {
+        assert!(joined.contains(para), "paragraph {} missing from delivered segments", i);
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/channels/providers/presentation_tests.rs` around lines 75 - 80,
The test currently checks only for the paragraph marker via
joined.contains(&format!("Paragraph number {} ", i)), which can miss truncated
bodies; update the assertion inside the for loop (for i in 0..10) to assert that
the concatenated result named joined contains the full original paragraph string
paras[i] (i.e., use joined.contains(&paras[i])) and keep the failure message
referencing i so missing or truncated paragraphs fail the test.
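
A small sketch of why the marker-only check is weaker than the full-paragraph check. Both helper names are hypothetical; the real test inlines these assertions in a loop.

```rust
/// Weaker check: only looks for the short marker that starts each paragraph.
fn all_markers_present(joined: &str, count: usize) -> bool {
    (0..count).all(|i| joined.contains(&format!("Paragraph number {} ", i)))
}

/// Stricter check: requires the full text of every original paragraph.
fn all_paragraphs_present(joined: &str, paras: &[String]) -> bool {
    paras.iter().all(|p| joined.contains(p.as_str()))
}
```

If every paragraph body were cut off shortly after its marker, all_markers_present would still return true while all_paragraphs_present would return false, which is the gap the tightened assertion closes.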
src/openhuman/channels/providers/presentation.rs (1)

261-268: ⚡ Quick win

Add a debug log when overflow is merged into the tail segment.

This branch materially changes delivery shape and is worth logging for quick field debugging/regression triage.

Proposed patch
 fn cap_segments(segments: Vec<String>, max: usize, joiner: &str) -> Vec<String> {
     if max == 0 || segments.len() <= max {
         return segments;
     }
+    let original = segments.len();
     let mut iter = segments.into_iter();
     let mut result: Vec<String> = (&mut iter).take(max - 1).collect();
     let tail: Vec<String> = iter.collect();
     result.push(tail.join(joiner));
+    tracing::debug!(
+        original_segments = original,
+        delivered_segments = result.len(),
+        max_segments = max,
+        "[presentation:segment] merged overflow into final segment"
+    );
     result
 }
As per coding guidelines, "Log entry/exit, branches ... with stable grep-friendly prefixes (`[domain]`, `[rpc]`, `[ui-flow]`) and correlation fields."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/channels/providers/presentation.rs` around lines 261 - 268,
When collapsing overflow segments into a tail (the else branch that runs when
max != 0 && segments.len() > max), add a debug log before pushing the joined
tail that uses the project's logging macro (e.g., tracing::debug! or
log::debug!) with a stable grep-friendly prefix like "[presentation]" and
include correlation fields: max, original_len = segments.len() before
consumption, tail_len, and the joiner value; place this log immediately before
the call to result.push(tail.join(joiner)) so the merge event is recorded for
debugging/regression triage.
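
A dependency-free sketch of the fields such a merge log could carry. The suggestion above uses tracing::debug!; here the line is built with format! so the example needs no external crate. The function name describe_merge and the exact field layout are assumptions made for illustration.

```rust
/// Returns the log line that would describe an overflow merge, or `None`
/// when the input already fits under the cap and no merge happens.
fn describe_merge(segments: &[String], max: usize, joiner: &str) -> Option<String> {
    if max == 0 || segments.len() <= max {
        return None;
    }
    // Everything from index `max - 1` onward is merged into the tail.
    let tail = &segments[max - 1..];
    let merged = tail.join(joiner);
    Some(format!(
        "[presentation:segment] merging {} overflow segments into tail \
         max={} original_len={} tail_count={} tail_len={}",
        tail.len(),
        max,
        segments.len(),
        tail.len(),
        merged.len()
    ))
}
```

Capturing the original length before the vector is consumed matters in the real helper, because segments.into_iter() moves the vector and its length is no longer available afterwards.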

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 20cb64e8-fd38-4dfa-8b01-f296b7a05186

📥 Commits

Reviewing files that changed from the base of the PR and between 8a0ed8a and c929a98.

📒 Files selected for processing (2)
  • src/openhuman/channels/providers/presentation.rs
  • src/openhuman/channels/providers/presentation_tests.rs

…ntation fix

- presentation.rs: emit a `tracing::debug!` log when `cap_segments`
  merges overflow into the tail, with `[presentation:segment]` prefix
  and correlation fields (max, original_len, tail_count, tail_len,
  joiner_len). Aligns with the project's "verbose diagnostics on new
  flows" rule from CLAUDE.md.

- presentation_tests.rs: tighten the
  `max_segments_respected_without_dropping_content` assertion from
  `joined.contains(&format!("Paragraph number {} ", i))` to
  `joined.contains(&paras[i])` so a hypothetical mid-paragraph
  truncation cannot slip past the test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (2)
src/openhuman/channels/providers/presentation_tests.rs (1)

63-75: ⚡ Quick win

Align new test local variable names with camelCase guideline.

New locals like result, joined, tail_count-style patterns are in snake_case; this path requires camelCase naming for variables.

As per coding guidelines: src/openhuman/**/*.rs: Use camelCase for variable names.

Also applies to: 88-115, 123-143

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/channels/providers/presentation_tests.rs` around lines 63 - 75,
The test in presentation_tests.rs uses snake_case locals (paras, result, joined,
tail_count patterns) but the project requires camelCase for locals under
src/openhuman; rename these local variables to camelCase (e.g., paras ->
paragraphs or paragraphList, result -> resultSegments, joined -> joinedText,
tail_count -> tailCount) and update all usages in the test including assertions
that reference MAX_SEGMENTS and the call to segment_for_delivery so the code
compiles; apply the same camelCase renames to the other test blocks referenced
(the locals in the blocks around the other assertions currently flagged).
src/openhuman/channels/providers/presentation.rs (1)

264-277: ⚡ Quick win

Use camelCase for newly introduced local variables in cap_segments.

original_len, tail_count, and joiner_len don’t match the repository naming rule for this path.

Proposed rename
-    let original_len = segments.len();
+    let originalLen = segments.len();
@@
-    let tail_count = tail.len();
+    let tailCount = tail.len();
@@
         target: "presentation",
         max,
-        original_len,
-        tail_count,
+        originalLen,
+        tailCount,
         tail_len = merged.len(),
-        joiner_len = joiner.len(),
+        joinerLen = joiner.len(),
         "[presentation:segment] merging {} overflow segments into tail",
-        tail_count
+        tailCount

As per coding guidelines: src/openhuman/**/*.rs: Use camelCase for variable names.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/channels/providers/presentation.rs` around lines 264 - 277, The
local variables in cap_segments use snake_case; rename them to camelCase (e.g.,
originalLen, tailCount, tailLen or joinerLen as appropriate) and update all
their usages (including the tracing::debug! invocation and any assignments like
tail_len = merged.len()) to the new camelCase names; ensure the variables
original_len, tail_count, tail_len, and joiner_len are replaced consistently
within the cap_segments function and the debug macro.
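
One point worth verifying before applying this rename: rustc's non_snake_case lint, which is on by default, warns on camelCase local bindings. A minimal sketch of what the renamed locals would need in order to compile without warnings follows; tail_stats is a hypothetical function used only to show the attribute.

```rust
// Without this attribute, rustc emits a `non_snake_case` warning for each
// camelCase binding below and suggests the snake_case spelling instead.
#[allow(non_snake_case)]
fn tail_stats(tail: &[String], joiner: &str) -> (usize, usize) {
    let tailCount = tail.len();
    let joinerLen = joiner.len();
    (tailCount, joinerLen)
}
```

Whether the repository guideline really intends camelCase for Rust locals is a question for the maintainers; the sketch only shows the compiler-side consequence.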

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dc00f923-36b9-4e27-8428-021718af779d

📥 Commits

Reviewing files that changed from the base of the PR and between c929a98 and e3670bc.

📒 Files selected for processing (2)
  • src/openhuman/channels/providers/presentation.rs
  • src/openhuman/channels/providers/presentation_tests.rs

Contributor

@graycyrus graycyrus left a comment

Clean fix. The cap_segments helper is the right shape — bounded bubble count, no content loss, well-documented. The regression test against the actual transcript from the issue is a nice touch; it exercises exactly the code path that was broken rather than just the helper in isolation. Left two minor inline comments on the debug logging, neither blocking.

@@ -187,7 +187,7 @@ fn segment_for_delivery(text: &str) -> Vec<String> {
segments = merged.len(),
"[presentation:segment] split by paragraphs"
Contributor

[minor] segments = merged.len() logs the pre-cap count, so when cap_segments actually fires you'd see segments=10 followed by merging 5 overflow segments into tail — the sequence tells the story, but the first log reads like 10 segments were delivered. Renaming the field to segments_before_cap would make the two lines unambiguous when tailing logs.

original_len,
tail_count,
tail_len = merged.len(),
joiner_len = joiner.len(),
Contributor

[minor] joiner_len = joiner.len() is always 2 ("\n\n") or 1 (" "), both knowable from the two call sites. It does not help distinguish anything at runtime — you already have target: "presentation" and the message string identifying the path. Fine to keep for completeness, just noting it does not carry diagnostic weight.

Contributor

@graycyrus graycyrus left a comment

Two minor items to address before merge — nothing structural, just tightening up the debug logging:

  1. segments log field ambiguity — the tracing::debug! on the line just before cap_segments logs segments = merged.len() which is the pre-cap count. When cap_segments then logs its own merge message, a reader sees two different segment counts in the span. Rename to segments_before_cap or similar to make the sequence unambiguous.

  2. joiner_len in cap_segments debug log — this is always 2 ("\n\n") or 1 (" "), fully deterministic from the two call sites. It doesn't carry diagnostic weight. Consider dropping it or replacing with the joiner string itself (joiner = %joiner) which is more immediately readable.

Both are quick fixes. The logic itself is solid — nice work on the regression test with the exact transcript.

@senamakel senamakel merged commit 151451f into tinyhumansai:main Apr 30, 2026
15 checks passed
AusAgentSmith pushed a commit to AusAgentSmith/openhuman that referenced this pull request May 23, 2026
…inyhumansai#1041) (tinyhumansai#1051)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

[Bug] UI Not Rendering Full LLM Response (List Items / Markdown Truncated

3 participants