fix(planner): normalize model_name case in KubernetesConnector comparisons by tedzhouhk · Pull Request #8384 · ai-dynamo/dynamo

tedzhouhk · 2026-04-20T18:21:11Z

Summary

Fixes a pair of case-sensitivity bugs in KubernetesConnector.get_model_name(). The first caused Planner to enter CrashLoopBackOff in active mode when the deployment model name retained its original casing (e.g. Qwen/Qwen3-0.6B); the second is a latent instance of the same pattern in the prefill/decode comparison.

Supersedes #8360 — includes brluo's original fix plus one additional normalization for the sibling prefill/decode comparison.

Root cause

self.user_provided_model_name is lowercased in __init__ (line 58).
The model_name retrieved from the DGD at line 227, however, was compared as-is, which raised a spurious UserProvidedModelNameMismatchError.
The sibling comparison at line 206 (prefill vs decode) had the same case-sensitive pattern, which would fail later if prefill and decode ever reported the same model with different casing (e.g. MDC display_name vs container-arg parsing).
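The mismatch described above can be reproduced in isolation. The following is a minimal sketch, not the actual KubernetesConnector code: `ConnectorSketch` and `check_pre_fix` are hypothetical names, and the exception class is a local stand-in for the one named in this PR. It only assumes what the root-cause notes state: the user-provided name is lowercased in `__init__`, and the deployment-derived name was compared as-is.

```python
class UserProvidedModelNameMismatchError(Exception):
    """Local stand-in for the Planner exception named in the PR."""


class ConnectorSketch:
    """Hypothetical, stripped-down model of the pre-fix behaviour."""

    def __init__(self, model_name: str) -> None:
        # Mirrors the normalization the PR says happens in __init__.
        self.user_provided_model_name = model_name.lower()

    def check_pre_fix(self, deployment_model_name: str) -> str:
        # Pre-fix pattern: the deployment-derived name is compared as-is,
        # so any uppercase character makes the comparison fail.
        if deployment_model_name != self.user_provided_model_name:
            raise UserProvidedModelNameMismatchError(
                f"{deployment_model_name!r} != {self.user_provided_model_name!r}"
            )
        return deployment_model_name


# The user and the deployment agree on the model, differing only in case.
connector = ConnectorSketch("Qwen/Qwen3-0.6B")
try:
    connector.check_pre_fix("Qwen/Qwen3-0.6B")
    raised = False
except UserProvidedModelNameMismatchError:
    raised = True
```

In this sketch `raised` ends up `True` even though both sides name the same model, which matches the spurious error described in the root cause.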

Changes

Commit 1 (brluo): normalize model_name.lower() before comparing with user_provided_model_name.
Commit 2: normalize both sides of the prefill/decode comparison.
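The two commits can be sketched together as a standalone function. This is an illustration under stated assumptions rather than the code in kubernetes.py: `resolve_model_name` is a hypothetical helper, both exception classes are local stand-ins for the ones named in this PR, and returning the lowercased name is an assumption the PR text does not confirm.

```python
from typing import Optional


class DeploymentModelNameMismatchError(Exception):
    """Local stand-in: prefill and decode report different models."""


class UserProvidedModelNameMismatchError(Exception):
    """Local stand-in: deployment model differs from the configured one."""


def resolve_model_name(
    prefill_model_name: str,
    decode_model_name: str,
    user_provided_model_name: Optional[str],
) -> str:
    """Hypothetical helper showing both case-insensitive comparisons.

    user_provided_model_name is expected to be already lowercased,
    as the PR says happens in __init__.
    """
    # Commit 2: normalize both sides of the prefill/decode comparison.
    if prefill_model_name.lower() != decode_model_name.lower():
        raise DeploymentModelNameMismatchError(
            f"{prefill_model_name!r} vs {decode_model_name!r}"
        )

    # Commit 1: normalize the deployment-derived name before comparing it
    # with the (already lowercased) user-provided name.
    model_name = decode_model_name.lower()
    if user_provided_model_name is not None and model_name != user_provided_model_name:
        raise UserProvidedModelNameMismatchError(
            f"{model_name!r} vs {user_provided_model_name!r}"
        )

    # Assumption: the normalized name is what gets returned.
    return model_name


# Casing differences alone no longer raise.
resolved = resolve_model_name("Qwen/Qwen3-0.6B", "qwen/qwen3-0.6b", "qwen/qwen3-0.6b")

# A genuinely different model still raises.
try:
    resolve_model_name("Qwen/Qwen3-0.6B", "Qwen/Qwen3-0.6B", "qwen/qwen3-8b")
    still_strict = False
except UserProvidedModelNameMismatchError:
    still_strict = True
```

Normalizing at the comparison site keeps the change small. `str.casefold()` would be a stricter alternative for non-ASCII names, but `.lower()` matches the normalization the PR says is already applied in `__init__`.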

Test Plan

Validated end-to-end on a single-node Kubernetes cluster (L20 GPU), per brluo's original validation:

Planner enters Running state (no CrashLoopBackOff) with model_name: "Qwen/Qwen3-0.6B" + scaling_mode: "active"
Under load, Planner scales VllmDecodeWorker from 1→2
Advisory mode continues to work unchanged

Closes #8359

Summary by CodeRabbit

Bug Fixes

Model name validation is now case-insensitive. Validation errors will no longer occur when model names differ solely in capitalization between prefill and decode model configurations, or when comparing user-provided model names against deployment-derived values. This provides greater flexibility and reduces potential errors when managing model configurations across different deployment scenarios.

…ith user-provided name

When model_name is provided in Planner config, it is normalized to lowercase in __init__ (self.user_provided_model_name = model_name.lower()), but the model name retrieved from the deployment was not normalized before comparison. This caused a spurious UserProvidedModelNameMismatchError in active mode when the deployment model name retained its original casing (e.g. Qwen/Qwen3-0.6B vs qwen/qwen3-0.6b).

Fixes #8359

Signed-off-by: hongkuanz <hongkuanz@nvidia.com>

…comparison

The sibling comparison at get_model_name() was also case-sensitive. If the prefill and decode services in a DGD ever report the same model with different casing (e.g. MDC display_name vs container-arg parsing), it would spuriously raise DeploymentModelNameMismatchError. Normalize both sides to match the user-provided-name comparison already fixed in the previous commit.

Signed-off-by: hongkuanz <hongkuanz@nvidia.com>

coderabbitai · 2026-04-20T18:24:06Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 03f6dd06-5f72-4e5b-ab96-2682d0240baf

📥 Commits

Reviewing files that changed from the base of the PR and between 2618812 and d1466c2.

📒 Files selected for processing (1)

components/src/dynamo/planner/connectors/kubernetes.py

Walkthrough

Updated KubernetesConnector.get_model_name() to perform case-insensitive model name comparisons. Added .lower() calls to two equality checks, enabling the method to correctly handle model names with differing capitalization without raising spurious validation errors.

Changes

Cohort / File(s)	Summary
Case-insensitive model name validation `components/src/dynamo/planner/connectors/kubernetes.py`	Added `.lower()` calls to two model name equality comparisons in `get_model_name()`: (1) when comparing `prefill_model_name` vs `decode_model_name`, and (2) when comparing deployment-derived `model_name` vs `user_provided_model_name`. Aligns comparison logic with the lowercase normalization already applied during initialization.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change: normalizing model_name case in KubernetesConnector comparisons.
Description check	✅ Passed	The description covers all required sections: Summary (root cause), Changes (what was fixed), and Test Plan (validation results). Includes issue reference.
Linked Issues check	✅ Passed	The PR fully addresses issue `#8359`: implements case-insensitive comparisons via .lower() on both model_name sources, matching the root cause analysis and required fixes.
Out of Scope Changes check	✅ Passed	All changes are scoped to KubernetesConnector.get_model_name() case-sensitivity fixes; no unrelated modifications detected.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.



brluobt and others added 2 commits April 20, 2026 11:16

tedzhouhk requested review from a team as code owners April 20, 2026 18:21

pull-request-size Bot added the size/XS label Apr 20, 2026

github-actions Bot added fix planner labels Apr 20, 2026

tedzhouhk mentioned this pull request Apr 20, 2026

fix(planner): normalize deployment model_name case before comparing with user-provided name #8360

Closed

4 tasks

PeaBrane approved these changes Apr 20, 2026


tedzhouhk enabled auto-merge (squash) April 20, 2026 18:24

Merge branch 'main' into hzhou/dyn-2747-model-name-case

60867b4

copy-pr-bot Bot temporarily deployed to GITLAB April 20, 2026 18:28 Inactive

tedzhouhk merged commit 073da7b into main Apr 20, 2026
65 checks passed

tedzhouhk deleted the hzhou/dyn-2747-model-name-case branch April 20, 2026 18:58

tedzhouhk mentioned this pull request Apr 20, 2026

fix(planner): normalize model_name case in KubernetesConnector comparisons (cherry-pick of #8384) #8401

Merged

3 tasks

copy-pr-bot Bot had a problem deploying to GITLAB April 20, 2026 20:17 Failure

nv-nmailhot pushed a commit that referenced this pull request Apr 20, 2026

fix(planner): normalize model_name case in KubernetesConnector compar…

7834378

…isons (cherry-pick of #8384) (#8401) Signed-off-by: hongkuanz <hongkuanz@nvidia.com> Co-authored-by: brluobt <brluo@nvidia.com>

tedzhouhk mentioned this pull request Apr 22, 2026

fix(planner): match MDC component field against backend default, not DGD key #8489

Merged

4 tasks

brluobt mentioned this pull request May 7, 2026

feat(planner): Add Advisory Mode for Scaling Decisions #8114

Closed

5 tasks

tedzhouhk merged 3 commits into main from hzhou/dyn-2747-model-name-case
