fix(parsers): recover Kimi K2 tool calls when section_end is missing by keivenchang · Pull Request #8208 · ai-dynamo/dynamo

keivenchang · 2026-04-15T02:13:19Z

Overview:

Kimi K2 tool calls were getting silently dropped whenever the model hit max_tokens before emitting the section_end marker. The parser just threw away everything if that closing tag was missing -- same bug exists in Moonshot's reference Python parser and got copied into vLLM and SGLang too.

Details:

- The root cause is in extract_tool_calls (kimi_k2_parser.rs): the old code discarded the entire tool call section when section_end wasn't there. It now treats a missing section_end as "section goes to EOF" and pulls out whatever complete individual calls it can find.
- The streaming jail (jail.rs) had a related issue: find_tool_call_end_position_kimi_k2 couldn't tell "found section_end" apart from "section_end is missing", so it would early-exit and swallow parallel calls. The return type is now Option<usize>, so None skips the early-exit path and finalize() recovers all the calls at stream end.
- Tests: 4 new parser unit tests and 3 new streaming integration tests that reproduce the customer-reported scenario (DIS-1765).
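
To make the "section goes to EOF" rule concrete, here is a minimal sketch of the lenient extraction. It is not the code from kimi_k2_parser.rs: the `RawToolCall` type and the function name are hypothetical, and the section-begin and argument-begin marker strings are assumed for illustration (only section_end, tool_call_begin and tool_call_end are named in this PR).

```rust
// Marker strings. SECTION_BEGIN and ARG_BEGIN are assumptions made for this
// sketch; the other three markers are the ones named in the PR.
const SECTION_BEGIN: &str = "<|tool_calls_section_begin|>";
const SECTION_END: &str = "<|tool_calls_section_end|>";
const CALL_BEGIN: &str = "<|tool_call_begin|>";
const ARG_BEGIN: &str = "<|tool_call_argument_begin|>";
const CALL_END: &str = "<|tool_call_end|>";

/// Hypothetical simplified result type: call id plus raw JSON arguments.
#[derive(Debug, PartialEq)]
pub struct RawToolCall {
    pub id: String,
    pub arguments: String,
}

/// Extracts every complete call. A missing SECTION_END is treated as
/// "the section runs to the end of the string" instead of "no calls".
pub fn extract_tool_calls_lenient(output: &str) -> Vec<RawToolCall> {
    let Some(start) = output.find(SECTION_BEGIN) else {
        return Vec::new();
    };
    let body_start = start + SECTION_BEGIN.len();
    // Lenient rule: fall back to EOF when the closing marker is absent.
    let body_end = output[body_start..]
        .find(SECTION_END)
        .map(|i| body_start + i)
        .unwrap_or(output.len());
    let mut body = &output[body_start..body_end];

    let mut calls = Vec::new();
    while let Some(b) = body.find(CALL_BEGIN) {
        let after_begin = &body[b + CALL_BEGIN.len()..];
        // A call without its closing marker is incomplete: stop and drop it.
        let Some(e) = after_begin.find(CALL_END) else {
            break;
        };
        let inner = &after_begin[..e];
        if let Some((id, args)) = inner.split_once(ARG_BEGIN) {
            calls.push(RawToolCall {
                id: id.trim().to_string(),
                arguments: args.trim().to_string(),
            });
        }
        body = &after_begin[e + CALL_END.len()..];
    }
    calls
}
```

With a truncated input that holds one complete call followed by a half-emitted second call, this sketch returns only the first call; the half-emitted one is dropped rather than guessed at.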

Where should the reviewer start?

docs/agents/kimi_k2.md, then kimi_k2_parser.rs

Related Issues:

Relates to DIS-1765

/coderabbit profile chill

coderabbitai · 2026-04-15T02:18:48Z

Walkthrough

Refactors the Kimi K2 tool-call parser to handle truncated output gracefully by making the section-end marker optional. Updates the parser API to return Option<usize> instead of usize, modifying underlying implementations and callers to treat missing section-end markers as incomplete rather than errors.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Documentation `docs/agents/kimi_k2.md` | New documentation page describing the tool-call truncation bug and documenting the Python regex fix and Rust-side implementation status, including TODO items to upstream changes. |
| Kimi K2 Parser `lib/parsers/src/tool_calling/xml/kimi_k2_parser.rs` | Changed `find_tool_call_end_position_kimi_k2` return type from `usize` to `Option<usize>`; returns `None` when `section_end` marker is missing. Updated `extract_tool_calls` to extract complete tool calls from truncated output lacking `section_end`. |
| Parser API `lib/parsers/src/tool_calling/parsers.rs` | Updated `find_tool_call_end_position` return type from `usize` to `Option<usize>`. All parser branches now wrap results in `Some(...)`. Added doc comment clarifying that `None` indicates unclosed/incomplete tool-call sections. |
| Stream Handling `lib/llm/src/protocols/openai/chat_completions/jail.rs` | Modified `JailedStream::should_end_jail` early-exit logic to treat `find_tool_call_end_position` as optional; returns `(false, accumulated_content.len())` when `None` is returned instead of unconditionally splitting. |
| Test Coverage `lib/llm/tests/test_streaming_tool_parsers.rs` | Added `make_chunk` test helper and three integration-style tests: baseline case with complete markers, DIS-1765 reproduction with truncated output (no `section_end`), and multi-tool-call variant with truncation. |
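
The `usize` to `Option<usize>` change and its effect on the early-exit decision can be sketched roughly as follows. This is an illustration, not the code in parsers.rs or jail.rs: both function names are hypothetical, and the `(true, end)` result for the `Some` case is an assumption. Only the `(false, accumulated_content.len())` result for `None` comes from the summary above.

```rust
const SECTION_END: &str = "<|tool_calls_section_end|>";

/// Returns the byte offset just past the section-end marker, or `None`
/// when the marker is absent (section still open, or output truncated).
pub fn find_section_end(text: &str) -> Option<usize> {
    text.find(SECTION_END).map(|pos| pos + SECTION_END.len())
}

/// Hypothetical caller with the `(should_end, split_position)` shape
/// described for should_end_jail.
pub fn should_end(accumulated: &str) -> (bool, usize) {
    match find_section_end(accumulated) {
        // Marker found: safe to stop jailing and split at the marker's end.
        Some(end) => (true, end),
        // Marker missing: keep accumulating; later calls may still arrive.
        None => (false, accumulated.len()),
    }
}
```

With a plain `usize`, "marker ends exactly at the end of the buffer" and "marker not found, fall back to the buffer length" can produce the same number, which is the ambiguity the `Option` removes.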

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and specifically summarizes the main change: fixing Kimi K2 tool call recovery when the section_end marker is missing. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Description check | ✅ Passed | PR description follows the template with all required sections: Overview explains the bug clearly, Details describe specific changes and affected files, Where should the reviewer start identifies key files, and Related Issues references DIS-1765. |

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

lib/llm/src/protocols/openai/chat_completions/jail.rs (1)

803-824: ⚠️ Potential issue | 🟠 Major

Preserve the terminal finish reason on the new finalize() recovery path.

These Kimi None cases now intentionally defer emission to finalize(), but the parsed tool-call branch in create_tool_call_choice() always builds the emitted chunk with finish_reason: None. For a truncated stream ending with FinishReason::Length, the recovered tool call comes out without any terminal reason, and fix_finish_reason() cannot restore it because it only rewrites existing Some(Stop) values. Please thread base_choice.finish_reason through the successful parsed-tool-call path so recovered Kimi responses still terminate with Length (or get rewritten from Stop to ToolCalls).

Suggested fix

-                    let choice = create_choice_stream(
-                        choice_index,
-                        Some(Role::Assistant),
-                        normal_text.as_deref().unwrap_or(""),
-                        Some(tool_call_chunks),
-                        None,
-                        None,
-                        None,
-                    );
+                    let choice = create_choice_stream(
+                        choice_index,
+                        Some(Role::Assistant),
+                        normal_text.as_deref().unwrap_or(""),
+                        Some(tool_call_chunks),
+                        base_choice.finish_reason,
+                        base_choice.stop_reason.clone(),
+                        base_choice.logprobs.clone(),
+                    );
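
The reasoning behind this suggestion can be illustrated with a small standalone model. Everything here is a simplified stand-in: the enum, `EmittedChoice`, and `emit_recovered` are hypothetical and do not match the real types in jail.rs. The only behavior taken from the comment above is that the rewrite step touches `Some(Stop)` and nothing else.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum FinishReason {
    Stop,
    Length,
    ToolCalls,
}

/// Simplified stand-in for the rewrite step: only `Some(Stop)` is rewritten
/// (to ToolCalls, when tool calls are present). `None` stays `None`.
pub fn fix_finish_reason(
    reason: Option<FinishReason>,
    has_tool_calls: bool,
) -> Option<FinishReason> {
    match reason {
        Some(FinishReason::Stop) if has_tool_calls => Some(FinishReason::ToolCalls),
        other => other,
    }
}

/// Hypothetical emitted chunk, reduced to the two fields that matter here.
#[derive(Debug)]
pub struct EmittedChoice {
    pub tool_call_count: usize,
    pub finish_reason: Option<FinishReason>,
}

/// `propagate = false` models the current behavior (finish_reason: None);
/// `propagate = true` models threading base_choice.finish_reason through.
pub fn emit_recovered(
    tool_call_count: usize,
    base_finish_reason: Option<FinishReason>,
    propagate: bool,
) -> EmittedChoice {
    let raw = if propagate { base_finish_reason } else { None };
    EmittedChoice {
        tool_call_count,
        finish_reason: fix_finish_reason(raw, tool_call_count > 0),
    }
}
```

In this model a truncated stream (`Length`) loses its terminal reason without propagation, keeps `Length` with it, and a `Stop` is still rewritten to `ToolCalls`.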

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@lib/llm/src/protocols/openai/chat_completions/jail.rs` around lines 803 -
824, create_tool_call_choice() currently constructs the emitted tool-call chunk
with finish_reason: None causing recovered Kimi tool-call responses (from
finalize()) to lose terminal reasons like FinishReason::Length; update the
parsed-tool-call branch in create_tool_call_choice() to propagate the original
base_choice.finish_reason into the emitted chunk (use base_choice.finish_reason
instead of None) so finalize() recovers tool calls with the correct terminal
reason and fix_finish_reason() can continue to rewrite Stop→ToolCalls as before;
ensure any helper paths that build the emitted choice (the parsed tool-call
success branch where try_tool_call_parse_aggregate(...) and
find_tool_call_end_position(...) return) use base_choice.finish_reason and keep
existing behavior for non-parsed branches.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b62536c8-0684-494a-b37f-c0a1eaf514d8

📥 Commits

Reviewing files that changed from the base of the PR and between 45364b5 and c0f12a8.

📒 Files selected for processing (5)

docs/agents/kimi_k2.md
lib/llm/src/protocols/openai/chat_completions/jail.rs
lib/llm/tests/test_streaming_tool_parsers.rs
lib/parsers/src/tool_calling/parsers.rs
lib/parsers/src/tool_calling/xml/kimi_k2_parser.rs

When the model hits max_tokens or EOS before emitting <|tool_calls_section_end|>, the parser now treats the rest of the string as the section body and extracts any complete individual tool calls (those with <|tool_call_begin|> + args + <|tool_call_end|>). This matches the one-line regex fix proposed upstream for Moonshot's reference Python parser, where `re.findall` silently returns [] when section_end is absent.

Parser change (kimi_k2_parser.rs):
- extract_tool_calls: missing section_end → section extends to EOF
- find_tool_call_end_position_kimi_k2: returns Option<usize>, None when section_end absent (signals incomplete section to the streaming jail)

Streaming jail change (jail.rs):
- should_end_jail: when find_tool_call_end_position returns None, skip early-exit so parallel tool calls keep accumulating until stream end; finalize() recovers them via the lenient parser

Tests:
- 4 new parser unit tests for truncation scenarios
- 3 new streaming integration tests (complete section, single call truncated, multiple calls truncated)

Docs:
- docs/agents/kimi_k2.md: reference parser bug analysis with actual and proposed Python code, upstream TODO checklist

DIS-1765

Signed-off-by: Keiven Chang <keivenchang@users.noreply.github.com>
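
The streaming half of this commit (defer on a missing section_end, recover at stream end) can be modeled with a toy accumulator. This is a sketch under assumptions, not the JailedStream implementation: the `Jail` type, its methods, and `count_complete_calls` are hypothetical, and the "recovery" here only counts complete calls instead of parsing them.

```rust
const SECTION_END: &str = "<|tool_calls_section_end|>";
const CALL_BEGIN: &str = "<|tool_call_begin|>";
const CALL_END: &str = "<|tool_call_end|>";

/// Counts calls that have both an opening and a closing marker.
pub fn count_complete_calls(text: &str) -> usize {
    let mut rest = text;
    let mut n = 0;
    while let Some(b) = rest.find(CALL_BEGIN) {
        let after = &rest[b + CALL_BEGIN.len()..];
        match after.find(CALL_END) {
            Some(e) => {
                n += 1;
                rest = &after[e + CALL_END.len()..];
            }
            // Opening marker without a close: incomplete, stop here.
            None => break,
        }
    }
    n
}

/// Toy stand-in for the streaming jail: buffers chunks while a section is open.
#[derive(Default)]
pub struct Jail {
    buffer: String,
}

impl Jail {
    /// Feeds one chunk. Returns the buffered section once section_end shows
    /// up; otherwise keeps accumulating and returns `None` (no early exit).
    pub fn push(&mut self, chunk: &str) -> Option<String> {
        self.buffer.push_str(chunk);
        if self.buffer.contains(SECTION_END) {
            Some(std::mem::take(&mut self.buffer))
        } else {
            None
        }
    }

    /// Stream end: recover whatever complete calls are still buffered.
    pub fn finalize(self) -> usize {
        count_complete_calls(&self.buffer)
    }
}
```

Feeding a truncated stream with two complete calls and no section_end, every `push` returns `None`, and `finalize()` reports both calls, which is the shape of the "multiple calls truncated" scenario listed under Tests.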

Reviewer feedback: this file captures troubleshooting and repro steps rather than user-facing docs. Moved to ~/notes/tool-calling/kimi_k2.md. Signed-off-by: Keiven Chang <keivenc@nvidia.com>

Add universal test cases that apply to every tool-call parser, filling in the gaps in the per-parser coverage by surfacing the matrix as live cargo-test output instead of a static doc.

This first revision covers two cases for all 18 registered parsers:
- case_1_single_call -- single tool-call happy path (18/18 pass)
- case_5_missing_end_token_recovery (PR #8208 generalized) -- 2 pass (kimi_k2, mistral), 5 N/A, 11 KnownBroken (each tagged for follow-up work to generalize the PR #8208 fix)

The framework is a four-state FixtureCase enum: Sample (parser handles correctly), KnownBroken (parser drops the call today; the test asserts the broken state and fails when a future fix actually adds recovery, forcing the fixture to be upgraded), NotApplicable (format genuinely lacks the concept; reason printed), Unimplemented (CI hard-fail). The four states distinguish honest gaps from silent ones, which is the matrix-sparseness problem this scaffold is meant to solve.

Layout:
- lib/parsers/src/tool_calling/test_cases/{mod,normalize,tests}.rs
- lib/parsers/src/tool_calling/test_cases/fixtures/<parser>.rs (x18)

Existing 134 inline parser tests are untouched; each test module gets a TODO breadcrumb pointing at the contract suite for future trimming once the suite stabilizes.

Future revisions: more universal cases (parallel calls, malformed args, finish_reason, etc.), adversarial streaming chunkings, and a CI matrix-report block.

Signed-off-by: Keiven Chang <keivenchang@users.noreply.github.com>
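
A rough sketch of how a four-state fixture enum like this might be evaluated is shown below. The variant names come from the commit message; the payload fields, the `Outcome` type, and the `evaluate` function are assumptions made for illustration and are not taken from the test_cases module.

```rust
/// The four states named in the commit message. Payload shapes are assumed.
pub enum FixtureCase {
    /// Parser handles the input correctly.
    Sample { input: &'static str, expected_calls: usize },
    /// Parser drops the call today; passes only while it stays broken.
    KnownBroken { input: &'static str, expected_calls: usize, tracking: &'static str },
    /// The format genuinely lacks the concept.
    NotApplicable { reason: &'static str },
    /// Nobody has written the fixture yet: hard failure.
    Unimplemented,
}

#[derive(Debug, PartialEq)]
pub enum Outcome {
    Pass,
    Skip,
    Fail(String),
}

/// Evaluates one fixture against a parser, modeled here as a function that
/// returns the number of tool calls it recovered from the input.
pub fn evaluate(case: &FixtureCase, parse: impl Fn(&str) -> usize) -> Outcome {
    match case {
        FixtureCase::Sample { input, expected_calls } => {
            let got = parse(*input);
            if got == *expected_calls {
                Outcome::Pass
            } else {
                Outcome::Fail(format!("expected {expected_calls} calls, got {got}"))
            }
        }
        FixtureCase::KnownBroken { input, expected_calls, tracking } => {
            if parse(*input) == *expected_calls {
                // The parser was fixed: force the fixture to be upgraded.
                Outcome::Fail(format!("now recovers; upgrade to Sample ({tracking})"))
            } else {
                Outcome::Pass
            }
        }
        FixtureCase::NotApplicable { reason } => {
            println!("N/A: {reason}");
            Outcome::Skip
        }
        FixtureCase::Unimplemented => Outcome::Fail("fixture not implemented".to_string()),
    }
}
```

The design point is the KnownBroken arm: it inverts the assertion, so a fix that lands without updating the fixture shows up as a test failure rather than going unnoticed.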

CASE.16 was misleading: it sat next to CASE.1–CASE.15 generic test categories but is actually a per-incident bookkeeping marker, not a contract every parser must cover.

Renames in this commit:
- TEST_CASES.md: CASE.16 retired; REPORT.<n> introduced as a separate taxonomy with explicit semantics.
- xml/kimi_k2_parser.rs: CASE.16 (PR #8208) → REPORT.8208.
- test_streaming_tool_parsers.rs: CASE.16 → REPORT.8208 (same incident).
- reasoning/base_parser.rs: drops the CASE.16 marker from two tests that had no incident reference; only CASE.8 (the real category) remains.
- dsml/parser.rs chart: notes that DSv4 has no REPORT.<n> tests yet because no customer bugs have been filed against V4.

Linear DIS-1842 description updated with the same rename so the chart and the in-code labels stay consistent.

Signed-off-by: Keiven Chang <keivenchang@users.noreply.github.com>

…omments

REPORT.<n> was duplicating information already conveyed by the (PR #N) parenthetical inside the same #[test] line.

Reverts the taxonomy:
- TEST_CASES.md: drops REPORT.<n> bullet and section; documents the inline (PR #N) / (DIS-NNNN) convention instead.
- xml/kimi_k2_parser.rs: REPORT.8208 → just (PR #8208) in CASE.5 comment.
- test_streaming_tool_parsers.rs: same.
- dsml/parser.rs chart: drops REPORT.<n> reference; notes V4 has no customer incidents yet.

CASE.16 stays retired (no longer referenced anywhere).

Signed-off-by: Keiven Chang <keivenchang@users.noreply.github.com>

keivenchang requested a review from a team as a code owner April 15, 2026 02:13

pull-request-size Bot added the size/L label Apr 15, 2026

github-actions Bot added the fix, documentation (Improvements or additions to documentation), and frontend (`python -m dynamo.frontend` and `dynamo-run in=http|text|grpc`) labels Apr 15, 2026

keivenchang self-assigned this Apr 15, 2026

coderabbitai Bot reviewed Apr 15, 2026

copy-pr-bot Bot temporarily deployed to GITLAB April 15, 2026 05:03 Inactive

keivenchang force-pushed the keivenchang/fix-kimi-k2-tool-call-boundary branch from d87bb6f to c78fac3 Compare April 15, 2026 05:04

copy-pr-bot Bot temporarily deployed to GITLAB April 15, 2026 05:04 Inactive

keivenchang requested review from ayushag-nv and richardhuo-nv April 15, 2026 05:06

copy-pr-bot Bot temporarily deployed to GITLAB April 15, 2026 08:52 Inactive

grahamking approved these changes Apr 15, 2026

rmccorm4 reviewed Apr 20, 2026

Comment thread docs/agents/kimi_k2.md Outdated

keivenchang added 3 commits April 20, 2026 14:38

style: cargo fmt on test_streaming_tool_parsers.rs and parsers.rs

385c6be

Signed-off-by: Keiven Chang <keivenchang@users.noreply.github.com>

chore: remove kimi_k2.md from repo, move to private notes

118a390

keivenchang force-pushed the keivenchang/fix-kimi-k2-tool-call-boundary branch from c78fac3 to 118a390 Compare April 20, 2026 21:40

copy-pr-bot Bot temporarily deployed to GITLAB April 20, 2026 21:40 Inactive

richardhuo-nv approved these changes Apr 20, 2026

rmccorm4 enabled auto-merge (squash) April 20, 2026 22:28

rmccorm4 merged commit 47cfbd4 into main Apr 20, 2026
90 checks passed

rmccorm4 deleted the keivenchang/fix-kimi-k2-tool-call-boundary branch April 20, 2026 22:28

copy-pr-bot Bot had a problem deploying to GITLAB April 20, 2026 22:29 Failure

keivenchang mentioned this pull request Apr 25, 2026

test(parsers): cross-parser contract test scaffold (DIS-1842 part 1) #8721

Closed

This was referenced Apr 25, 2026

test(parsers): add cross-parser contract test scaffold (part 1) #8724

Closed

test(parsers): unified cases × all parsers, coverage matrix (step 1) #8728

Closed

keivenchang mentioned this pull request Apr 27, 2026

docs(parsers): clear rustdoc invalid_html_tags warnings on doc comments #8775

Merged

rmccorm4 merged 3 commits into main from keivenchang/fix-kimi-k2-tool-call-boundary
