fix(truncation): accurate overflow counts and omission indicators#833
fix(truncation): accurate overflow counts and omission indicators#833pszymkowiak merged 4 commits intodevelopfrom
Conversation
The overflow message "+N more" in condense_unified_diff was lying. The `changes` vec was capped at 15 entries, so `changes.len() - 10` could only reach 5 — reporting "+5 more" when 190 lines were actually truncated. Fix: track `added + removed` directly and compute true overflow as `(added + removed) - 10`. Adds 7 accuracy tests across 4 modules to lock in correct overflow reporting: - diff_cmd: overflow count matches true total (200 changes → "+190 more") - diff_cmd: no spurious overflow message for 8 changes - git: format_status overflow count is exact (25 staged → "+10 more") - git: compact_diff recovery hint present when hunk is truncated - grep: documents uncapped-vector invariant that prevents the diff bug - filter: smart_truncate kept + reported_more == total_lines Closes the test gap flagged in testing-patterns.md backlog for diff_cmd.rs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Florian BRUNIAUX <florian@bruniaux.com>
Two silent truncation bugs in git.rs: 1. compact_diff: hunk truncation showed "... (truncated)" with no count. An LLM had no way to know if 5 or 450 lines were dropped. Replace with "... (N lines truncated)" by tracking hunk_skipped separately from hunk_shown — flush the count at each hunk/file boundary. 2. filter_log_output: commit body beyond 3 lines was silently dropped. A 20-line body explaining a breaking change would show 3 lines with no indicator that 17 were omitted. Now appends "[+N lines omitted]" when body_omitted > 0. Adds 2 RED-then-GREEN tests to lock in the new behavior. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Florian BRUNIAUX <florian@bruniaux.com>
There was a problem hiding this comment.
Pull request overview
This PR improves the accuracy and clarity of truncation/overflow reporting in RTK’s diff/log condensation paths, and adds tests to prevent regressions in overflow counts and omission indicators.
Changes:
- Fix
condense_unified_diffoverflow reporting to use true total changes (added + removed) instead of a capped vector length. - Enhance
git::compact_diffto report per-hunk/file truncation with explicit line counts, and enhancefilter_log_outputto indicate omitted commit body lines. - Add “truncation accuracy” unit tests across multiple modules to lock in correct overflow/omission behavior.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/grep_cmd.rs | Adds a new truncation/overflow-focused unit test. |
| src/git.rs | Updates diff compaction truncation reporting and git log body omission signaling; adds related tests. |
| src/filter.rs | Adds a unit test asserting smart-truncate overflow accounting. |
| src/diff_cmd.rs | Fixes unified diff overflow computation and adds regression tests for accurate counts. |
| } else if hunk_shown < max_hunk_lines && !line.starts_with("\\") { | ||
| // Context line | ||
| if hunk_lines > 0 { | ||
| if hunk_shown > 0 { | ||
| result.push(format!(" {}", line)); | ||
| hunk_lines += 1; | ||
| hunk_shown += 1; | ||
| } | ||
| } |
There was a problem hiding this comment.
compact_diff increments hunk_skipped only for added/removed lines beyond max_hunk_lines. Context lines (and any other non +/- lines in the hunk) that occur after the limit are currently ignored without incrementing hunk_skipped, so the reported "(N lines truncated)" can undercount the true number of lines dropped. Consider incrementing hunk_skipped for any additional hunk lines once hunk_shown >= max_hunk_lines (excluding the "\ No newline" marker if desired) so the count matches the actual truncation.
| let all_body_lines: Vec<&str> = lines | ||
| .map(|l| l.trim()) | ||
| .filter(|l| { | ||
| !l.is_empty() | ||
| && !l.starts_with("Signed-off-by:") | ||
| && !l.starts_with("Co-authored-by:") | ||
| }) | ||
| .take(3) | ||
| .collect(); | ||
| let body_omitted = all_body_lines.len().saturating_sub(3); | ||
| let body_lines = &all_body_lines[..all_body_lines.len().min(3)]; | ||
|
|
There was a problem hiding this comment.
filter_log_output now collects all body lines into all_body_lines to compute body_omitted. For long commit messages this can allocate and retain a large Vec<&str> unnecessarily. You can keep the first 3 body lines while streaming through the iterator and count the remainder (or count total without storing) to compute the omission indicator with O(1) extra memory.
| // 25 staged files, default status_max_files = 15 | ||
| // Should show 15, overflow = 25 - 15 = 10, report "+10 more" | ||
| let mut porcelain = String::from("## main...origin/main\n"); | ||
| for i in 0..25 { | ||
| porcelain.push_str(&format!("M staged_file_{}.rs\n", i)); | ||
| } | ||
| let result = format_status_output(&porcelain); | ||
| assert!( | ||
| result.contains("+10 more"), | ||
| "Expected '+10 more' for 25 staged files (max_files=15), got:\n{}", |
There was a problem hiding this comment.
test_format_status_overflow_count_exact asserts +10 more based on an assumed default status_max_files = 15. Since format_status_output reads this limit from config::limits(), this test will start failing if the configured default changes. Consider deriving the expected overflow from config::limits().status_max_files (e.g., 25 - max_files) and asserting on that value instead of a hard-coded 10.
| // 25 staged files, default status_max_files = 15 | |
| // Should show 15, overflow = 25 - 15 = 10, report "+10 more" | |
| let mut porcelain = String::from("## main...origin/main\n"); | |
| for i in 0..25 { | |
| porcelain.push_str(&format!("M staged_file_{}.rs\n", i)); | |
| } | |
| let result = format_status_output(&porcelain); | |
| assert!( | |
| result.contains("+10 more"), | |
| "Expected '+10 more' for 25 staged files (max_files=15), got:\n{}", | |
| // 25 staged files, respect configured status_max_files from config::limits() | |
| // Should show up to max_files, overflow = 25 - max_files, report "+<overflow> more" | |
| let max_files = config::limits().status_max_files; | |
| let total_files = 25; | |
| let mut porcelain = String::from("## main...origin/main\n"); | |
| for i in 0..total_files { | |
| porcelain.push_str(&format!("M staged_file_{}.rs\n", i)); | |
| } | |
| let result = format_status_output(&porcelain); | |
| let expected_overflow = total_files - max_files; | |
| let expected_overflow_str = format!("+{} more", expected_overflow); | |
| assert!( | |
| result.contains(&expected_overflow_str), | |
| "Expected '{}' for {} staged files (max_files={}), got:\n{}", | |
| expected_overflow_str, | |
| total_files, | |
| max_files, |
| #[test] | ||
| fn test_grep_overflow_uses_uncapped_total() { | ||
| // Confirm the grep overflow invariant: matches vec is never capped before overflow calc. | ||
| // If total_matches > per_file, overflow = total_matches - per_file (not capped). | ||
| // This documents that grep_cmd.rs avoids the diff_cmd bug (cap at N then compute N-10). | ||
| let per_file = config::limits().grep_max_per_file; | ||
| let total_matches = per_file + 42; | ||
| let overflow = total_matches - per_file; | ||
| assert_eq!(overflow, 42, "overflow must equal true suppressed count"); | ||
| // Demonstrate why capping before subtraction is wrong: | ||
| let hypothetical_cap = per_file + 5; | ||
| let capped = total_matches.min(hypothetical_cap); | ||
| let wrong_overflow = capped - per_file; | ||
| assert_ne!( | ||
| wrong_overflow, overflow, | ||
| "capping before subtraction gives wrong overflow" | ||
| ); | ||
| } |
There was a problem hiding this comment.
test_grep_overflow_uses_uncapped_total doesn't exercise any production code paths in grep_cmd—it only asserts arithmetic on local variables, so it would still pass even if the implementation started capping matches before computing overflow. Consider removing this test or replacing it with a test that validates the actual formatted output/overflow behavior (e.g., by extracting the per-file formatting/overflow calculation into a helper that can be unit-tested deterministically).
pszymkowiak
left a comment
There was a problem hiding this comment.
LGTM — all three truncation bugs fixed correctly, 9 tests added.
One minor note: compact_diff can emit two consecutive truncation messages when both per-hunk (100 lines) and global max_lines limits fire simultaneously. Rare in practice — can fix later.
Keep truncation accuracy fix (HEAD) — accurate overflow counts in condense_unified_diff. Drop test_condense_unified_no_truncation from develop (incompatible). Add test_no_truncation_large_diff and test_long_lines_not_truncated from develop (test compute_diff, no conflict). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Run cargo fmt on diff_cmd.rs and init.rs (line length violations) - Add Unreleased section to CHANGELOG.md with the two bug fixes from this PR (required by doc review CI gate) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> Signed-off-by: Florian BRUNIAUX <florian@bruniaux.com>
Summary
fix(diff):
condense_unified_diffreported+5 morewhen 190 lines were truncated —changesvec was capped at 15, sochanges.len() - 10maxed at 5. Fix: compute overflow as(added + removed) - 10directly.fix(git):
compact_diffhunk truncation showed vague... (truncated)with no count. Fix: trackhunk_skippedseparately, flush... (N lines truncated)at each hunk/file boundary.fix(git):
filter_log_outputsilently dropped commit body lines beyond 3. A 20-line breaking-change description showed 3 lines with no indicator. Fix: append[+N lines omitted]when body is cut.Adds 9 accuracy tests across 4 modules locking in correct overflow behavior.
Root cause pattern: cap a collection before computing the overflow count, then report
capped_len - showninstead oftrue_total - shown.Test plan
cargo test— 1122 passed, 0 regressionscargo clippy --all-targets— 0 errorsrtk git diffon a large diff confirms count in truncation linertk git logon a commit with long body shows[+N lines omitted]🤖 Generated with Claude Code