[chart] Update InferenceService template for PD mode support by slin1237 · Pull Request #430 · ome-projects/ome

slin1237 · 2025-12-12T23:19:27Z

Remove runtime spec (auto-selected by operator)
Add PD mode auto-detection (model names ending with "-pd")
PD mode: requires engine, decoder, and router
Non-PD mode: requires engine, optional router (no decoder)
Support both flat (minReplicas) and nested (engine.minReplicas) config
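
The rules above can be modelled outside Helm as a small Python sketch. It is not the chart's template code. The function names, the default of 1 replica, and the choice that a nested `engine.minReplicas` wins over a flat `minReplicas` are assumptions made for illustration, since the PR does not state the precedence. The model name with a `-pd` suffix is likewise only an example.

```python
from typing import Any

PD_SUFFIX = "-pd"  # suffix used for auto-detection, per the PR description


def is_pd_model(model_name: str) -> bool:
    """Return True when the model name ends with the PD suffix."""
    return model_name.endswith(PD_SUFFIX)


def required_components(model_name: str, config: dict[str, Any]) -> list[str]:
    """List the InferenceService components to render for a model.

    PD mode: engine, decoder, and router are all required.
    Non-PD mode: engine is required, router only if configured, no decoder.
    """
    if is_pd_model(model_name):
        return ["engine", "decoder", "router"]
    components = ["engine"]
    if "router" in config:
        components.append("router")
    return components


def engine_min_replicas(config: dict[str, Any], default: int = 1) -> int:
    """Resolve minReplicas for the engine from flat or nested config.

    Assumption: the nested value (engine.minReplicas) takes precedence over
    the flat one (minReplicas); the default of 1 is also hypothetical.
    """
    nested = (config.get("engine") or {}).get("minReplicas")
    if nested is not None:
        return nested
    return config.get("minReplicas", default)
```

For example, `required_components("mistral-7b-instruct", {"router": {}})` yields engine plus router, while any name ending in `-pd` always yields all three components.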

Checklist

Tests added/updated (if applicable)
Docs updated (if applicable)
make test passes locally


gemini-code-assist · 2025-12-12T23:19:30Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!


* [chart] Update InferenceService template for PD mode support
  - Remove runtime spec (auto-selected by operator)
  - Add PD mode auto-detection (model names ending with "-pd")
  - PD mode: requires engine, decoder, and router
  - Non-PD mode: requires engine, optional router (no decoder)
  - Support both flat (minReplicas) and nested (engine.minReplicas) config
* [chart] Replace -pd model entries with pdMode configuration option
  - Remove 10 -pd model entries from values.yaml (they're not separate models)
  - Remove 10 -pd entries from model registry in _helpers.tpl
  - Add pdMode field: when true, InferenceService includes decoder and router
  - Update InferenceService to use explicit pdMode flag (not model name suffix)
  - Update README with pdMode documentation and supported PD models list
  - Update model count from 176 to 165

  PD mode supported models: kimi-k2-instruct, deepseek-rdma, llama-3-1-70b-instruct, llama-3-2-1b-instruct, llama-3-2-3b-instruct, llama-3-3-70b-instruct, llama-4-maverick-17b-128e-instruct-fp8, llama-4-scout-17b-16e-instruct, mistral-7b-instruct, mixtral-8x7b-instruct
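
The second commit's switch from suffix detection to an explicit flag can be illustrated with a short Python sketch. It only models the decision described in the commit message and is not the chart's template. The function name and the shape of the per-model config dictionary are hypothetical; only the `pdMode` field name and the component names come from the PR.

```python
from typing import Any


def components_with_pd_flag(config: dict[str, Any]) -> list[str]:
    """Choose components from an explicit pdMode flag, ignoring the model name.

    pdMode true: the InferenceService includes engine, decoder, and router.
    Otherwise: engine, plus router only when a router section is configured.
    """
    if config.get("pdMode", False):
        return ["engine", "decoder", "router"]
    components = ["engine"]
    if "router" in config:
        components.append("router")
    return components


# Hypothetical per-model values; the model name no longer carries a "-pd" suffix.
llama_values = {"pdMode": True, "engine": {"minReplicas": 1}}
selected = components_with_pd_flag(llama_values)
```

With this approach one model entry serves both modes, which matches the commit's note that the `-pd` entries were not separate models.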

github-actions bot added the "helm" label (Helm chart changes) on Dec 12, 2025

github-actions bot added the "documentation" label (Documentation changes) on Dec 13, 2025

slin1237 merged commit 9466e5e into main Dec 13, 2025
27 checks passed

slin1237 deleted the helm-n/1 branch December 13, 2025 18:21

slin1237 merged 2 commits into main from helm-n/1
