fix(sglang): stop re-encoding routed_experts from sglang 0.5.11+ by KrishnanPrash · Pull Request #9657 · ai-dynamo/dynamo

KrishnanPrash · 2026-05-16T09:33:57Z

What

--enable-return-routed-experts crashes on the first decoded token against any non-DSv4 MoE model.

docker run ... --enable-return-routed-experts ...
curl localhost:8000/v1/chat/completions ...

Before:

File ".../decode_handler.py", line 649, in _process_token_stream
    routed_experts.numpy().tobytes()
AttributeError: 'str' object has no attribute 'numpy'

After:

HTTP 200, `nvext.routed_experts` is a base64 UTF-8 string. Recover ids with `np.frombuffer(b64decode(routed_experts), dtype=np.int32)`.
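
A minimal sketch of the client-side decode, written with the standard library only so it runs without NumPy. It assumes the payload is a flat run of native-byte-order int32 values, which is what `np.frombuffer(..., dtype=np.int32)` reads. The helper name `decode_routed_experts` and the sample ids are hypothetical.

```python
import base64
import struct


def decode_routed_experts(routed_experts: str) -> list[int]:
    """Decode a base64 string of packed int32 values into a list of ints."""
    raw = base64.b64decode(routed_experts)
    if len(raw) % 4 != 0:
        raise ValueError("payload length is not a multiple of 4 bytes")
    count = len(raw) // 4
    # "=" selects native byte order with standard sizes, so "i" is 4 bytes.
    return list(struct.unpack(f"={count}i", raw))


# Round-trip with made-up expert ids to show the encoding is reversible.
sample_ids = [3, 17, 42, 5]
packed = struct.pack(f"={len(sample_ids)}i", *sample_ids)
encoded = base64.b64encode(packed).decode("utf-8")
decoded = decode_routed_experts(encoded)
```

The result is a flat list. Reshaping it per token or per layer depends on layout details that this description does not state.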

Why

sgl-project/sglang#21634 (in v0.5.11) moved the `b64encode(t.numpy().tobytes())` call for `routed_experts` into `tokenizer_manager`. The decode handler still ran the same encode on a value that is now already a `str`. Both emit sites had the same bug.

DSv4 unaffected: _resolve_routed_experts_kwargs keeps the code path dormant on forks that lack return_routed_experts on async_generate.
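
The failure mode can be reproduced without SGLang. The sketch below assumes a hypothetical `legacy_encode` that mirrors the `b64encode(t.numpy().tobytes())` expression quoted above, with a stand-in `FakeTensor` class in place of a real tensor.

```python
import base64


class FakeTensor:
    """Stand-in for a tensor: numpy() returns an object exposing tobytes()."""

    def __init__(self, payload: bytes):
        self._payload = payload

    def numpy(self):
        return self

    def tobytes(self) -> bytes:
        return self._payload


def legacy_encode(routed_experts):
    # Pre-fix behavior: assumes a tensor-like input and encodes it.
    return base64.b64encode(routed_experts.numpy().tobytes()).decode("utf-8")


# With a tensor-like input the encode succeeds and yields a str.
already_encoded = legacy_encode(FakeTensor(b"\x01\x00\x00\x00"))

# Feeding that str back through the same encode reproduces the crash.
try:
    legacy_encode(already_encoded)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
```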

What changed

decode_handler.py: pass the string through at both emit sites, drop the now-unused pybase64 import.
_compat.py: short note in the module docstring on the wire-format contract.
New test_routed_experts_passthrough.py: two cases pinning the pass-through.
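
A minimal sketch of the behavior the pass-through is meant to pin, not the actual handler code. `attach_routed_experts` is a hypothetical helper. The `disaggregated_params` key, and the rule that it is absent when no `routed_experts` is supplied, are taken from the review summary of this PR.

```python
def attach_routed_experts(out: dict, meta_info: dict) -> dict:
    """Forward an already-encoded routed_experts string without touching it."""
    routed_experts = meta_info.get("routed_experts")
    if routed_experts is not None:
        # No b64encode here: the string is forwarded exactly as received.
        out["disaggregated_params"] = {"routed_experts": routed_experts}
    return out


# Sample payloads are made up for illustration.
with_experts = attach_routed_experts({"token_ids": [1]}, {"routed_experts": "AQAAAA=="})
without_experts = attach_routed_experts({"token_ids": [1]}, {})
```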

Test

Test red against pre-fix code: traceback at line 747, matches ticket.
After fix: 2 passed.
pytest test_sglang_unit.py: 27 passed, no regressions.
Not tested end-to-end on H100. Bug is pure Python serialization with no model-state dependency.

Resolves DYN-3046

coderabbitai · 2026-05-16T09:40:31Z

Walkthrough

Updated routed_experts wire-format handling to accept pre-encoded base64 strings directly from SGLang >= 0.5.11. Removed redundant base64-encoding in the decode handler for both token and text streaming modes, removed the pybase64 dependency, updated the compatibility contract documentation, and added regression tests validating passthrough behavior.

Changes

Routed Experts Passthrough for SGLang 0.5.11+

| Layer / File(s) | Summary |
| --- | --- |
| Compatibility contract documentation `components/src/dynamo/sglang/_compat.py` | Module docstring extended with release-specific guidance that SGLang >= 0.5.11 provides `routed_experts` as a pre-encoded base64 UTF-8 string, requiring passthrough without re-encoding. |
| Decode handler passthrough implementation `components/src/dynamo/sglang/request_handlers/llm/decode_handler.py` | Removed `pybase64` import and replaced base64-encoding logic in `_process_token_stream` and `_process_text_stream` to pass `routed_experts` strings from `meta_info` directly into `disaggregated_params` and `nvext` payloads. |
| Routed experts passthrough regression tests `components/src/dynamo/sglang/tests/test_routed_experts_passthrough.py` | New test module with token-stream and text-stream tests verifying that pre-encoded `routed_experts` strings are forwarded verbatim to downstream fields, plus a test asserting that `disaggregated_params` is absent when `routed_experts` is not provided. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 66.67%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: fixing re-encoding of routed_experts from sglang 0.5.11+, which is the core issue addressed across all modified files. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The PR description covers all required template sections: Overview (What), Details (Why and What changed), and Related Issues (Resolves DYN-3046). It provides a clear problem statement, root cause analysis, and solution summary. |


coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@components/src/dynamo/sglang/request_handlers/llm/decode_handler.py`:
- Around line 644-650: Normalize routed_experts before assigning to
out["disaggregated_params"]: if routed_experts is a str, pass it through
unchanged; if it appears tensor-like (has attributes/methods .detach, .cpu, and
.numpy), convert it to a base64 string by detaching, moving to CPU, calling
.numpy(), getting its raw bytes and base64-encoding that result, then assign
{"routed_experts": <base64_str>}; for any other type set {"routed_experts":
None} to avoid leaking non-serializable objects; add/adjust unit tests to cover
string pass-through and tensor-like normalization (use a small mock object or
actual tensor) and the fallback-to-None case.

In `@components/src/dynamo/sglang/tests/test_routed_experts_passthrough.py`:
- Around line 4-15: Replace internal Linear ticket identifiers in the added
docstring and any test IDs (e.g., "DYN-3046" and "dyn-3046-*") with a public
GitHub-style issue reference (for example "GH-9657"); update the module
docstring top comment and any test function or test name strings that include
those identifiers so they no longer contain "DYN-3046" or "dyn-3046-*" but
instead use the chosen GH-#### token, ensuring references in the docstring and
test identifiers (search for the literal "DYN-3046" and "dyn-3046-") are all
replaced consistently.
- Around line 26-32: The pytest module-level pytestmark list is missing a
required component marker and should be made immutable; update the pytestmark
definition used in this file (pytestmark) to include exactly one component
marker from {multimodal, router, kvbm, core} (e.g., pytest.mark.router) and
convert pytestmark from a mutable list to an immutable tuple so tests adhere to
marker policy.
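
The normalization proposed in the first inline comment above could look roughly like the sketch below. It illustrates the reviewer's suggestion and is not code from this PR. `normalize_routed_experts` and `FakeTensor` are hypothetical names, and the tensor check is duck-typed on the three attributes the comment lists.

```python
import base64


def normalize_routed_experts(value):
    """Return a base64 str for str or tensor-like input, otherwise None."""
    if isinstance(value, str):
        # Already encoded upstream: pass through unchanged.
        return value
    if all(hasattr(value, name) for name in ("detach", "cpu", "numpy")):
        raw = value.detach().cpu().numpy().tobytes()
        return base64.b64encode(raw).decode("utf-8")
    # Anything else is dropped so non-serializable objects do not leak out.
    return None


class FakeTensor:
    """Small mock exposing the methods the normalizer probes for."""

    def __init__(self, payload: bytes):
        self._payload = payload

    def detach(self):
        return self

    def cpu(self):
        return self

    def numpy(self):
        return self

    def tobytes(self) -> bytes:
        return self._payload


from_str = normalize_routed_experts("AQAAAA==")
from_tensor = normalize_routed_experts(FakeTensor(b"\x01\x00\x00\x00"))
from_other = normalize_routed_experts(12345)
```

The three calls correspond to the three test cases the comment asks for: string pass-through, tensor-like normalization, and the fallback to `None`.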


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0f2ffdeb-f61b-4279-b2b5-85941b2fd36a

📥 Commits

Reviewing files that changed from the base of the PR and between d6240e6 and 4de3bff.

📒 Files selected for processing (3)

components/src/dynamo/sglang/_compat.py
components/src/dynamo/sglang/request_handlers/llm/decode_handler.py
components/src/dynamo/sglang/tests/test_routed_experts_passthrough.py

fix(sglang): stop re-encoding routed_experts from sglang 0.5.11+

4de3bff

KrishnanPrash requested review from a team as code owners May 16, 2026 09:33

pull-request-size Bot added the size/L label May 16, 2026

github-actions Bot added fix backend::sglang Relates to the sglang backend labels May 16, 2026

coderabbitai Bot reviewed May 16, 2026


Comment thread components/src/dynamo/sglang/request_handlers/llm/decode_handler.py

Comment thread components/src/dynamo/sglang/tests/test_routed_experts_passthrough.py Outdated

Comment thread components/src/dynamo/sglang/tests/test_sglang_routed_experts_passthrough.py

test(sglang): trim routed_experts regression test and call-site comments

a5f0d56

copy-pr-bot Bot temporarily deployed to GITLAB May 16, 2026 09:41 Inactive

dynamo-ops approved these changes May 16, 2026


copy-pr-bot Bot temporarily deployed to GITLAB May 16, 2026 09:41 Inactive

test(sglang): rename test file to match sglang conftest skip pattern

69486f7

copy-pr-bot Bot temporarily deployed to GITLAB May 16, 2026 09:57 Inactive

copy-pr-bot Bot temporarily deployed to GITLAB May 16, 2026 09:58 Inactive

KrishnanPrash wants to merge 3 commits into main from kprashanth/dyn-3046
