Skip to content

docs(profiler): clarify when planner-profile-data ConfigMap is emitted [DYN-2751]#8486

Merged
tedzhouhk merged 1 commit into
mainfrom
hzhou/dyn-2751
Apr 22, 2026
Merged

docs(profiler): clarify when planner-profile-data ConfigMap is emitted [DYN-2751]#8486
tedzhouhk merged 1 commit into
mainfrom
hzhou/dyn-2751

Conversation

@tedzhouhk
Copy link
Copy Markdown
Contributor

@tedzhouhk tedzhouhk commented Apr 22, 2026

Summary

Documents the conditions under which planner-profile-data-XXXX is emitted, and flags the agg + throughput-scaling gap that showed up in QA ticket DYN-2751.

  • assemble_final_config — profile-data ConfigMap only lands when the picked config is disagg AND NPZ sweep ran (thorough only). Rapid deployments never emit it.
  • build_aic_interpolation_spec — spec only carries prefill_pick + decode_pick, so agg picks on rapid have no AIC bootstrap fallback. Planner currently relies solely on DYN_BENCHMARK_MODE runtime endpoint.
  • profile_sla.pyTODO at the agg-skip site pointing at the wider fix (extend AICInterpolationSpec + run_interpolation with an agg_pick).

No behavior change — docstrings and TODO only. The bigger rework (plumb agg through the profiler + planner AIC handoff + mocker worker flags) is deliberately out of scope for a release QA bug fix.

Related

  • QA ticket: DYN-2751 (release/1.1.0)
  • Planned cherry-pick to release/1.1.0 after this lands.

Test plan

  • pre-commit run on the two touched files passes (isort / black / flake8 / ruff / codespell).
  • Lint-only PR — no runtime change, no unit tests added.

Open in Devin Review

Summary by CodeRabbit

  • Documentation

    • Clarified profiler behavior for interpolation handling across different deployment configurations, specifying when interpolation data is generated and the coverage limitations for disaggregated vs. aggregated deployments.
  • Chores

    • Updated internal profiler documentation with implementation constraints and TODO notes for future enhancement.

Document the conditions under which `planner-profile-data-XXXX` is
generated, and flag the agg + throughput-scaling gap with a TODO:

- assemble_final_config: the profile-data ConfigMap only lands when the
  picked config is disagg AND NPZ sweep ran (thorough only); rapid
  deployments never emit it.
- build_aic_interpolation_spec: the spec carries prefill_pick +
  decode_pick only, so agg picks on rapid have no AIC fallback and the
  planner relies on DYN_BENCHMARK_MODE runtime endpoint only.
- profile_sla.py: TODO at the agg-skip site pointing at the wider
  fix (extend AICInterpolationSpec + run_interpolation with an
  agg_pick).

No behavior change — docstrings and TODO only.

Signed-off-by: hongkuanz <hongkuanz@nvidia.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 22, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 135583e6-07c9-4359-83ed-a52d5079dfc1

📥 Commits

Reviewing files that changed from the base of the PR and between 77a0051 and 596d491.

📒 Files selected for processing (2)
  • components/src/dynamo/profiler/profile_sla.py
  • components/src/dynamo/profiler/utils/dgd_generation.py

Walkthrough

Two profiler files updated to clarify interpolation handling: a conditional branch added to skip interpolation building for aggregated configurations, and docstrings expanded to document when interpolation-data ConfigMaps are emitted and the coverage limitations of interpolation specs.

Changes

Cohort / File(s) Summary
Profiler Interpolation and Configuration Logic
components/src/dynamo/profiler/profile_sla.py, components/src/dynamo/profiler/utils/dgd_generation.py
Added conditional branch to skip interpolation for aggregated configurations with clarifying TODO comment. Expanded docstrings for assemble_final_config and build_aic_interpolation_spec to document when interpolation-data ConfigMaps are emitted, spec coverage constraints (prefill and decode picks only), and fallback behavior when rapid AIC picks aggregated configs.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main change: clarifying documentation about when the planner-profile-data ConfigMap is emitted, with a reference to the related QA ticket DYN-2751.
Description check ✅ Passed The description includes all required template sections (Overview/Summary, Details, Where to start, Related Issues) with comprehensive explanations of changes, the test plan, and related QA ticket reference.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@tedzhouhk tedzhouhk enabled auto-merge (squash) April 22, 2026 00:06
@tedzhouhk tedzhouhk merged commit 5c35805 into main Apr 22, 2026
69 checks passed
@tedzhouhk tedzhouhk deleted the hzhou/dyn-2751 branch April 22, 2026 00:31
nv-nmailhot pushed a commit that referenced this pull request Apr 22, 2026
DYN-2751] (cherry-pick of #8486) (#8516)

Signed-off-by: hongkuanz <hongkuanz@nvidia.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants