Skip to content

Add structured-output migration repro test#1

Closed
fergusfinn wants to merge 171 commits into
mainfrom
repro/structured-output-migration
Closed

Add structured-output migration repro test#1
fergusfinn wants to merge 171 commits into
mainfrom
repro/structured-output-migration

Conversation

@fergusfinn
Copy link
Copy Markdown

Summary

  • add a focused fault-tolerance repro for structured-output request migration with real local dynamo.frontend and real dynamo.vllm workers
  • extend the migration test helper to accept a payload override so the repro can inject an OpenAI response_format JSON schema request
  • kill the serving worker mid-stream and assert the resumed response still parses as JSON

What this reproduces

On the current v1.0.1-based environment, request migration is happening, but the resumed structured-output stream becomes invalid JSON.

Observed in the targeted test run on gotenks:

  • frontend logs Stream disconnected... recreating stream...
  • migration metrics report ongoing_request: 1, new_request: 0
  • the resumed response contains nested restarted JSON content, for example:
{
  "animals": [
    {
      "name": "Lion",
      "habitat": "{
        "animals": [
          {
  • json.loads(...) then fails with:
json.decoder.JSONDecodeError: Invalid control character at: line 7 column 20 (char 66)

Repro command

source /tmp/dynamo-install-mv3o75/.venv/bin/activate
cd /tmp/dynamo-install-mv3o75
python -m pytest -q tests/fault_tolerance/migration/test_vllm_structured.py::test_request_migration_vllm_aggregated_structured_output -s

Latest result

FAILED tests/fault_tolerance/migration/test_vllm_structured.py::test_request_migration_vllm_aggregated_structured_output[tcp]
1 failed in 49.31s

jasonqinzhou and others added 30 commits February 26, 2026 18:13
Signed-off-by: Julien Mancuso <jmancuso@nvidia.com>
Signed-off-by: Julien Mancuso <jmancuso@nvidia.com>
Signed-off-by: alec-flowers <aflowers@nvidia.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…i-dynamo#6662)

Signed-off-by: Dan Gil <dagil@nvidia.com>
Signed-off-by: dagil-nvidia <dagil@nvidia.com>
Signed-off-by: Harrison Saturley-Hall <hsaturleyhal@nvidia.com>
Co-authored-by: Harrison Saturley-Hall <hsaturleyhal@nvidia.com>
Signed-off-by: ashnamehrotra <ashnamehrotra@gmail.com>
Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
…es (ai-dynamo#6682)

Signed-off-by: Anant Sharma <anants@nvidia.com>
…metadata to backends (ai-dynamo#6692) (ai-dynamo#6718)

Signed-off-by: Julien Mancuso <jmancuso@nvidia.com>
)

Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
Signed-off-by: Julien Mancuso <jmancuso@nvidia.com>
Signed-off-by: hongkuanz <hongkuanz@nvidia.com>
Signed-off-by: Dan Gil <dagil@nvidia.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Harrison Saturley-Hall <hsaturleyhal@nvidia.com>
Signed-off-by: Harrison Saturley-Hall <hsaturleyhal@nvidia.com>
…amo#6753)

Signed-off-by: Indrajit Bhosale <iamindrajitb@gmail.com>
…age type is pvc (ai-dynamo#6752) (ai-dynamo#6755)

Signed-off-by: Julien Mancuso <jmancuso@nvidia.com>
…iation (ai-dynamo#6651) (ai-dynamo#6776)

Signed-off-by: Guan Luo <41310872+GuanLuo@users.noreply.github.com>
Signed-off-by: Qi Wang <qiwa@nvidia.com>
…rker (ai-dynamo#6765)

Signed-off-by: Krishnan Prashanth <kprashanth@nvidia.com>
…s (http://nvbugs/5936491/1) (ai-dynamo#6772)

Signed-off-by: Matej Kosec <mkosec@nvidia.com>
dagil-nvidia and others added 18 commits March 12, 2026 17:53
…mo#7306)

Signed-off-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
Signed-off-by: Anant Sharma <anants@nvidia.com>
Co-authored-by: Anant Sharma <anants@nvidia.com>
Signed-off-by: Hannah Zhang <hannahz@nvidia.com>
Signed-off-by: Dan Gil <dagil@nvidia.com>
Co-authored-by: hhzhang16 <54051230+hhzhang16@users.noreply.github.com>
Signed-off-by: Dan Gil <dagil@nvidia.com>
Signed-off-by: akshatha-k <33278067+akshatha-k@users.noreply.github.com>
Co-authored-by: akshatha-k <33278067+akshatha-k@users.noreply.github.com>
…7312) (ai-dynamo#7332)

Signed-off-by: Dmitry Tokarev <dtokarev@nvidia.com>
Signed-off-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
Co-authored-by: Dmitry Tokarev <dtokarev@nvidia.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
…ynamo#7336, ai-dynamo#7350, ai-dynamo#7352) (ai-dynamo#7354)

Signed-off-by: Dan Gil <dagil@nvidia.com>
Signed-off-by: dagil-nvidia <dagil@nvidia.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Neal Vaidya <nealv@nvidia.com>
ai-dynamo#7404)

Signed-off-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
Co-authored-by: Biswa Panda <biswa.panda@gmail.com>
…7410)

Signed-off-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
Signed-off-by: Harrison Saturley-Hall <hsaturleyhal@nvidia.com>
Signed-off-by: Dan Gil <dagil@nvidia.com>
Signed-off-by: Neal Vaidya <nealv@nvidia.com>
Signed-off-by: athreesh <anish.maddipoti@utexas.edu>
Signed-off-by: Anish <80174047+athreesh@users.noreply.github.com>
Signed-off-by: akshatha-k <akshutk@gmail.com>
Signed-off-by: Nikhar Maheshwari <nikharm@nvidia.com>
Signed-off-by: Keiven Chang <keivenchang@users.noreply.github.com>
Signed-off-by: Dmitry Tokarev <dtokarev@nvidia.com>
Co-authored-by: Neal Vaidya <nealv@nvidia.com>
Co-authored-by: Anish <80174047+athreesh@users.noreply.github.com>
Co-authored-by: akshatha-k <33278067+akshatha-k@users.noreply.github.com>
Co-authored-by: akshatha-k <akshutk@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: nikharm <nikharm@nvidia.com>
Co-authored-by: Keiven C <213854356+keivenchang@users.noreply.github.com>
Co-authored-by: Keiven Chang <keivenchang@users.noreply.github.com>
Co-authored-by: Dmitry Tokarev <dtokarev@nvidia.com>
Signed-off-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
…to be use Kimi's tokenizer and fix tiktoken multi-byte handling (ai-dynamo#7424)
…mo#7433)

Signed-off-by: Dan Gil <dagil@nvidia.com>
Co-authored-by: Ben Hamm <ben.hamm@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
…o#7412) (ai-dynamo#7429)

Signed-off-by: Harrison King Saturley-Hall <hsaturleyhal@nvidia.com>
Co-authored-by: Biswa Panda <biswa.panda@gmail.com>
Signed-off-by: Dan Gil <dagil@nvidia.com>
@fergusfinn fergusfinn force-pushed the repro/structured-output-migration branch from a10f970 to 43f38e4 Compare March 25, 2026 15:25
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a10f9700e4

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@pytest.mark.vllm
@pytest.mark.gpu_1
@pytest.mark.e2e
@pytest.mark.post_merge
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Mark the repro test non-blocking until migration fix lands

This new test is tagged post_merge, and the post-merge workflow selects all vllm and gpu_1 tests with (pre_merge or post_merge) markers, so it will run in nightly CI; because the test asserts json.loads(response_text) succeeds after forced migration (the known repro path), it will keep the post-merge pipeline red on environments where the migration bug is still present. Please gate it with xfail/skip (or remove the post_merge marker) until the product fix is merged.

Useful? React with 👍 / 👎.

@fergusfinn
Copy link
Copy Markdown
Author

Reran the repro on the rebased main branch in a fresh Python 3.11 environment (/tmp/dynamo-install-mv3o75/.venv-main) with current vllm deps and rebuilt local bindings.

Command:

source /tmp/dynamo-install-mv3o75/.venv-main/bin/activate
cd /tmp/dynamo-install-mv3o75
python -m pytest -q tests/fault_tolerance/migration/test_vllm_structured.py::test_request_migration_vllm_aggregated_structured_output -s

Latest result on main:

FAILED tests/fault_tolerance/migration/test_vllm_structured.py::test_request_migration_vllm_aggregated_structured_output[tcp]
1 failed in 60.17s

Observed behavior differs from the earlier v1.0.1 snapshot run:

  • worker 2 is killed as intended
  • frontend logs Stream disconnected... recreating stream...
  • worker 1 then hits a vLLM engine crash / segfault during the resumed request
  • frontend logs Cannot recreate stream: no instances found for endpoint dynamo/backend/generate
  • the test fails in validate_response(...) because the client sees an event: error SSE and a 12.293s delay instead of a completed migrated response

So on rebased main, this repro is still failing, but the failure mode is now migration collapse / backend crash rather than malformed resumed JSON.

@github-actions
Copy link
Copy Markdown

This PR is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions Bot added the Stale label May 10, 2026
@github-actions
Copy link
Copy Markdown

This PR has been closed due to inactivity. If you believe this PR is still relevant, please feel free to reopen it with additional context or information.

@github-actions github-actions Bot closed this May 15, 2026
@github-actions github-actions Bot deleted the repro/structured-output-migration branch May 15, 2026 11:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.