chore(parsers): Mapping vllm parser tests to new PARSER_CASES.md taxonomy by zhongdaor-nv · Pull Request #9290 · ai-dynamo/dynamo

zhongdaor-nv · 2026-05-07T23:32:45Z

Overview

Output of DIS-1926 — bidirectional diff between vLLM and Dynamo tool-parser test corpora, mapped onto the new PARSER_CASES.md taxonomy (PR #9127). Doc-only changes; no source touched.

Unblocks DIS-1906 (cross-impl parser parity harness) by giving it a stable, accurate label set with per-test bucket assignments.

What's in this PR

`lib/parsers/PARSER_CASES.md` (+96 / −12)

Taxonomy refinements driven by gaps the audit surfaced:

- Split PARSER.fmt.5 out of PARSER.fmt.1. Old CASE.21 (and an earlier draft of PARSER.fmt.1) conflated function-name surface concerns with argument-envelope shape concerns. Now:
  - PARSER.fmt.1: function-name surface only (allowed identifier chars, functions.NAME vs bare NAME, malformed-ID rejection).
  - PARSER.fmt.5: argument-envelope shape, covering native call-ID preservation (Kimi K2 PR #32768), JSON field-order tolerance ({name, arguments} vs {arguments, name}), and the arguments ↔ parameters key alias.
- Broaden PARSER.fmt.3 examples: beyond Kimi K2 singular vs plural section tokens, document Mistral pre-v11 vs v11+ wire format, Llama 3 with vs without <|python_tag|>, and the Hermes qwen25 registry alias.
- New Known production gaps section: flags Mistral v11+ wire format ([TOOL_CALLS]name{...args}, name-then-object) as a parser-implementation gap. Dynamo's ToolCallConfig::mistral() currently only handles pre-v11 (JSON-array body); vLLM tests v11 extensively. v11 is the current Mistral-Small / Mistral-Large production path.
- Promote regex-timeout / failure containment to Universal Gaps: vLLM has explicit test_regex_timeout_handling for llama3_json / llama4_pythonic / pythonic and *_streaming_exception_returns_none for Mistral; Dynamo relies on Rust regex linear-time guarantees but does not pin failure-containment paths.
- Cross-ref PARSER.batch.1 happy-path → PARSER.fmt.5 native-ID sub-axis.
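
The three PARSER.fmt.5 sub-axes named above can be illustrated with a minimal sketch. It is written in Python for brevity and is not Dynamo's Rust implementation; the function name `normalize_envelope`, the `id` key, and the output shape are assumptions made for illustration only.

```python
import json
from typing import Any

# Hypothetical alias list: the PR names only the arguments <-> parameters alias.
ARGUMENT_KEYS = ("arguments", "parameters")


def normalize_envelope(raw: str) -> dict[str, Any]:
    """Normalize one JSON tool-call envelope into {id, name, arguments}."""
    # Key lookup on the decoded object is order-independent, so
    # {name, arguments} and {arguments, name} are treated the same.
    obj = json.loads(raw)
    if not isinstance(obj, dict) or not isinstance(obj.get("name"), str):
        raise ValueError("envelope must be a JSON object with a string 'name'")
    args: Any = {}
    for key in ARGUMENT_KEYS:
        if key in obj:
            args = obj[key]  # accept either spelling of the argument key
            break
    # Preserve a model-emitted call ID when present instead of inventing one
    # (the "id" key name is an assumption of this sketch).
    return {"id": obj.get("id"), "name": obj["name"], "arguments": args}
```

A parity harness could then assert that both field orders and both key spellings normalize to the same value, and that an emitted ID survives the round trip.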

Summary by CodeRabbit

Documentation
- Updated internal parser implementation guidelines and test coverage documentation to clarify format-conditional variants and argument-envelope shape conventions across different model formats.

github-actions · 2026-05-07T23:34:52Z

🌿 Fern Docs Preview: https://nvidia-preview-f6f1e86d-21a8-4cef-857d-b5a960dd113e.docs.buildwithfern.com/dynamo/dev

zhongdaor-nv · 2026-05-07T23:36:36Z

Note: lib/parsers/VLLM_TEST_AUDIT.md is only for reviewers to check correctness. I will remove this file before merging.

coderabbitai · 2026-05-07T23:38:59Z

Walkthrough

This PR expands the tool-call parser taxonomy documentation by introducing PARSER.fmt.5 for argument-envelope shape conventions, adding detailed wire-format variant coverage to PARSER.fmt.3, clarifying the scope of PARSER.fmt.1, and updating all cross-references and applicability checklists to reflect the new taxonomy dimension.

Changes

Parser Taxonomy: Argument-Shape & Wire-Format

| Layer / File(s) | Summary |
|---|---|
| Introduce PARSER.fmt.5 & Universal Gaps `lib/parsers/PARSER_CASES.md` | First mention of `PARSER.fmt.5` argument-envelope conventions; expands universal gaps section with regex-timeout/exception guidance and Mistral v11+ production gap note. |
| Define PARSER.fmt.5 Argument-Shape Conventions `lib/parsers/PARSER_CASES.md` | Full `PARSER.fmt.5` section defining three argument-envelope sub-axes: native call-ID preservation, JSON field-order tolerance (including `arguments` key named `name` edge case), and argument-key aliasing (`arguments` vs `parameters`), with references to named parametrized tests. |
| Expand PARSER.fmt.3 Wire-Format Variants `lib/parsers/PARSER_CASES.md` | Detailed `PARSER.fmt.3` documentation enumerating multiple wire-format spellings across Kimi K2, Mistral (pre-v11 vs v11+), Llama 3 (python_tag fence presence), and Hermes (`qwen25` alias), emphasizing active-config registration constraints. |
| Clarify Format-Related Scope & Batch.1 Reference `lib/parsers/PARSER_CASES.md` | Clarifies that `PARSER.fmt.1` covers only function-name surface and excludes argument-envelope shape (covered by `PARSER.fmt.5`); updates `PARSER.batch.1` happy-path requirements to assert native `ToolCall.id` preservation with `PARSER.fmt.5` cross-reference. |
| Update Applicability Summary & New-Parser Checklist `lib/parsers/PARSER_CASES.md` | Updates applicability summary table and new-parser-checklist to include `PARSER.fmt.{1..5}` coverage; explicitly calls out `PARSER.fmt.5` requirement for JSON-family parsers. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ⚠️ Warning | The PR title mentions 'vllm parser tests' and 'taxonomy', but the actual changes are documentation-only updates to PARSER_CASES.md taxonomy definitions themselves, not mapping of vLLM tests to the taxonomy. | Revise the title to reflect that this is a documentation update to the PARSER_CASES.md taxonomy (e.g., 'docs(parsers): Update PARSER_CASES taxonomy with fmt.5 and audit findings' or similar). |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description check | ✅ Passed | The description includes all required template sections: Overview (with linked issue context), detailed breakdown of What's in this PR with specific taxonomy changes, and implicit related-issue reference through DIS-1926/PR #9127. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


coderabbitai

🧹 Nitpick comments (1)

lib/parsers/PARSER_CASES.md (1)
121-126: ⚡ Quick win

Add direct GitHub links for cited PR references.

You reference incidents as PR #...; please also include clickable GitHub URLs for those references to improve audit traceability in docs.

As per coding guidelines "**/*.md: Markdown documentation may reference Linear tickets for internal context, but should prefer to also include the matching GitHub link when one exists."

Also applies to: 329-333, 429-432
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/parsers/PARSER_CASES.md` around lines 121 - 126, Update the Markdown
references that cite PR numbers to include direct clickable GitHub URLs: replace
occurrences like "PR `#32768`" (and the other cited ranges around lines 329-333
and 429-432) with the full GitHub PR link for the corresponding repository/PR,
keeping the existing text (e.g., "vLLM Kimi K2 PR `#32768`") but appending or
replacing with "([vLLM#32768](https://github.com/vllm-project/vllm/pull/32768))" or
the correct repo/PR URL; ensure the sentences mentioning PARSER.fmt.5, Kimi K2,
and any other PR references consistently include the clickable link to satisfy
the Markdown guideline.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 75224add-ba0e-4dc8-a6ae-d509f3be65f2

📥 Commits

Reviewing files that changed from the base of the PR and between 73bc969 and e03a365.

📒 Files selected for processing (2)

lib/parsers/PARSER_CASES.md
lib/parsers/VLLM_TEST_AUDIT.md

ayushag-nv

Please remove VLLM_TEST_AUDIT.md. LGTM.

keivenchang

thanks so much for the research/comparisons! Our next step would be to look at the differences and fill in the gap (if/when possible).

…audit

Output of DIS-1926 (research vLLM parser test coverage gaps). Doc-only change to `lib/parsers/PARSER_CASES.md`; no source touched. `cargo check -p dynamo-parsers --tests` passes.

Refinements driven by gaps surfaced during a bidirectional diff against vLLM `tests/tool_parsers/*` at commit b53c507bc91f87e28b03e9b54bbff7c76e97d58b:

- Split PARSER.fmt.1 (function-name surface) from new PARSER.fmt.5 (argument-envelope shape: native call-ID preservation, JSON field-order tolerance, arguments↔parameters key alias). The old CASE.21 (and an earlier draft of PARSER.fmt.1) conflated both axes.
- Broaden PARSER.fmt.3 examples beyond Kimi K2's singular vs plural section tokens to include Mistral pre-v11 vs v11+ wire formats, Llama 3 with vs without `<|python_tag|>`, Hermes `qwen25` registry alias.
- Add `Known production gaps` section flagging Mistral v11+ wire format (`[TOOL_CALLS]name{...args}`, name-then-object): Dynamo's `ToolCallConfig::mistral()` only handles pre-v11 (JSON-array body), while vLLM tests v11 extensively. v11 is the current Mistral-Small / Mistral-Large production path. Largest single Dynamo parser gap surfaced by the audit.
- Promote regex-timeout / parser-exception containment to Universal Gaps (vLLM has explicit `test_regex_timeout_handling` for llama3_json / llama4_pythonic / pythonic and `*_streaming_exception_returns_none` for Mistral; Dynamo relies on Rust regex linear-time guarantees but does not pin failure-containment paths).
- Cross-ref PARSER.batch.1 happy-path → PARSER.fmt.5 native-ID sub-axis.
- Update Applicability summary and `Adding a new parser` minimum viable set to cover fmt.{1..5}.

The full per-test bidirectional audit (493 test rows across 36 parser families, mapped onto the new taxonomy) lives outside this commit. It informed every refinement above; the audit itself is not committed because it's a working artifact rather than a stable reference doc.

Top-3 P0 gap status from the audit:

1. Mistral v11 wire format: STILL OPEN (parser doesn't exist; flagged in the new `Known production gaps` section).
2. PARSER.stream.{1..4} parser-tier: partial; DSv4 (#8946) and Gemma 4 (#8852) added coverage; Kimi K2 / Qwen3 / Hermes / Pythonic / Mistral parser-tier streaming tests still gap.
3. CASE.25 / FRONTEND.3 (`adjust_request`): CLOSED for 7 families via 28 new tests in `lib/llm/tests/tool_choice.rs` (#8946 + #9035).

Coverage PRs since 2026-05-05 baseline: #8888 (silent-drop recoveries), #8946 (DSv4 + Kimi K2 coverage), #9035 (top-N CASE.6+ quartet), #8852 (Gemma 4 family), #9127 (taxonomy rename).

Signed-off-by: zhongdaor <zhongdaor@nvidia.com>
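
As an illustration of what "pinning a failure-containment path" could look like, here is a minimal Python sketch modeled on the behavior implied by the vLLM test name `*_streaming_exception_returns_none`. The names `contain_failures` and `fragile_parse` are hypothetical; this is not vLLM's or Dynamo's code, only the shape of the contract a test would assert.

```python
from typing import Callable, Optional

ParseStep = Callable[[str], Optional[dict]]


def contain_failures(parse_delta: ParseStep) -> ParseStep:
    """Wrap a streaming parse step so any exception becomes a None delta."""
    def safe(delta_text: str) -> Optional[dict]:
        try:
            return parse_delta(delta_text)
        except Exception:
            # Contain the failure: the caller sees "no delta" instead of a crash.
            return None
    return safe


def fragile_parse(delta_text: str) -> Optional[dict]:
    """Hypothetical parser step that fails on a malformed fragment."""
    if "{{" in delta_text:
        raise ValueError("malformed tool-call fragment")
    return {"content": delta_text}
```

A test pinning this path would feed a fragment known to make the inner step fail and assert that the wrapped step returns `None` while well-formed input still passes through.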

Companion artifact to PR #9290 (PARSER_CASES.md taxonomy refinement). Adds the full per-test bidirectional audit that informed every change in that PR: every vLLM tool-parser test mapped onto the new (PR #9127) taxonomy with a clickable source link.

`lib/parsers/VLLM_TEST_AUDIT.md` (new file, 906 lines, 493 distinct test rows):

- **Source**: vLLM `main` at commit b53c507bc91f87e28b03e9b54bbff7c76e97d58b (`vllm/tool_parsers/*`, `tests/tool_parsers/*`, `tests/tool_use/*`, `tests/entrypoints/openai/tool_parsers/*`).
- **Scope**: 421 explicit test functions + 72 inherited common-suite rows from `ToolParserTests`.
- **Bucketing**: every row carries one or more `PARSER_CASES.md` / `REASONING_CASES.md` / `PIPELINE_CASES.md` / `FRONTEND_CASES.md` tags, plus a one-line behavioral note.

Re-bucketing transformations applied (vs the original CASE.* labels the audit was first written against, before PR #9127):

- 244 streaming rows split per-row into PARSER.stream.{1,2,3,4} (single-call assembly / multi-call assembly / partial-token chunking / streaming termination)
- 26 fmt rows split per-row into PARSER.fmt.1 (function-name) vs PARSER.fmt.5 (argument-shape: native ID, JSON field-order, arguments↔parameters alias)
- Out-of-PARSER-scope buckets relocated to sibling docs: CASE.{11,18,25} → FRONTEND.{1,3,5,6}; CASE.12 → PIPELINE.finish_reason; CASE.{9,10,17} → REASONING.batch.{1,2}; CASE.20 → `// helper`; CASE.16 → inline-regression annotation; CASE.26 dissolved into PARSER.batch.4 impl-defined recovery contract

Two mis-bucketings caught and fixed during review:

- FunctionGemma::test_multiple_tool_calls and Gemma4::TestExtractToolCalls.test_multiple_tool_calls were both labeled CASE.1 but assert len(tool_calls) == 2; corrected to PARSER.batch.2.

Four bucket-assignment refinements caught by review:

- test_unique_tool_call_ids (DSv3.2) drops fmt.5 (no native call-ID surface; just parallel-call distinctness).
- test_invalid_funcall_id_skipped (Kimi K2) moves fmt.5 → fmt.1 (validation, not preservation).
- 3 Mistral `argument_before_name*` parametrized rows gain fmt.5 (canonical field-order swap test set referenced by PARSER_CASES.md).

A staleness banner at the top documents the re-bucketing transformation and mis-bucket fixes for traceability.

Top findings the audit informed (already addressed in PR #9290 or flagged for follow-up):

1. Mistral v11+ wire format: STILL OPEN (parser doesn't exist; flagged in PARSER_CASES.md "Known production gaps").
2. PARSER.stream.{1..4} parser-tier coverage gap in 5 families (Kimi K2 / Qwen3 / Hermes / Pythonic / Mistral): partial closure via DSv4 (#8946) and Gemma 4 (#8852).
3. CASE.25 / FRONTEND.3 (`adjust_request`): CLOSED for 7 families via 28 new tests in `lib/llm/tests/tool_choice.rs` (#8946 + #9035).

Signed-off-by: zhongdaor <zhongdaor@nvidia.com>
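
To make the open Mistral gap concrete, the following Python sketch distinguishes the two wire formats as they are described in this PR: a JSON-array body (pre-v11) versus `[TOOL_CALLS]name{...args}` (v11+). It is an assumption-laden illustration, not Dynamo's `ToolCallConfig::mistral()` and not vLLM's parser: the function name is hypothetical, only the first v11+ call is read, and any additional control tokens a real v11+ output may carry are not modeled.

```python
import json

TOOL_CALLS = "[TOOL_CALLS]"


def parse_mistral_tool_calls(text: str) -> list[dict]:
    """Parse tool calls after the [TOOL_CALLS] marker in either wire format."""
    _, marker, body = text.partition(TOOL_CALLS)
    if not marker:
        return []  # no tool-call section in this output
    body = body.strip()
    if body.startswith("["):
        # Pre-v11: the body is a JSON array of {name, arguments} objects.
        return [
            {"name": call["name"], "arguments": call.get("arguments", {})}
            for call in json.loads(body)
        ]
    # v11+: name-then-object, e.g. get_weather{"city": "Paris"}.
    brace = body.find("{")
    if brace <= 0:
        raise ValueError("expected NAME{...} after [TOOL_CALLS]")
    name = body[:brace].strip()
    # raw_decode parses one JSON value starting at the brace and ignores the rest.
    args, _ = json.JSONDecoder().raw_decode(body, brace)
    return [{"name": name, "arguments": args}]
```

Under these assumptions both spellings of the same call normalize to one result, which is the parity property a cross-implementation harness would check once a v11+ parser exists.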

pull-request-size Bot added the size/XL label May 7, 2026

github-actions Bot added chore documentation Improvements or additions to documentation labels May 7, 2026

zhongdaor-nv marked this pull request as ready for review May 7, 2026 23:37

zhongdaor-nv changed the title ~~chore(parsers): DIS-1926 — bidirectional vLLM↔Dynamo audit + taxonomy…~~ chore(parsers): Mapping vllm parser tests to new PARSER_CASES.md taxonomy May 7, 2026

coderabbitai Bot reviewed May 7, 2026


zhongdaor-nv requested a review from a team as a code owner May 8, 2026 00:21

pull-request-size Bot added size/XXL and removed size/XL labels May 8, 2026

copy-pr-bot Bot temporarily deployed to GITLAB May 8, 2026 00:21 Inactive

zhongdaor-nv force-pushed the zhongdaor/dis-1926-research-vllm-parser-test-coverage-gaps branch from 55fb950 to e03a365 Compare May 8, 2026 00:34

pull-request-size Bot added size/XL and removed size/XXL labels May 8, 2026

ayushag-nv approved these changes May 8, 2026


dynamo-ops approved these changes May 8, 2026


keivenchang approved these changes May 8, 2026


zhongdaor-nv force-pushed the zhongdaor/dis-1926-research-vllm-parser-test-coverage-gaps branch from e03a365 to 5e06371 Compare May 8, 2026 17:20

pull-request-size Bot added size/M and removed size/XL labels May 8, 2026

dynamo-ops reviewed May 8, 2026


Comment thread lib/parsers/PARSER_CASES.md

zhongdaor-nv merged commit 3d451f2 into main May 8, 2026
56 checks passed

zhongdaor-nv deleted the zhongdaor/dis-1926-research-vllm-parser-test-coverage-gaps branch May 8, 2026 18:18

zhongdaor-nv mentioned this pull request May 8, 2026

docs(parsers): add DIS-1926 vLLM tool-parser test audit #9329

Draft

5 tasks

zhongdaor-nv merged 1 commit into main from zhongdaor/dis-1926-research-vllm-parser-test-coverage-gaps
