fix: do not override the event_buffer_max_size if user provided one #9284
Walkthrough
The PR refines KV cache configuration handling in LLM worker initialization. When metrics publishing is enabled, the function now preserves existing
Changes
KV Cache Event Buffering Configuration
Estimated code review effort: 2 (Simple), ~10 minutes
Pre-merge checks: 5 passed
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@components/src/dynamo/trtllm/workers/llm_worker.py`:
- Around line 324-327: The log call using logging.info with format "%d" for the
variable existing is unsafe because existing comes from kv_cache_config and may
not be an int; update the logging call in llm_worker.py (where logging.info(...)
logs "Using existing event_buffer_max_size=%d from kv_cache_config") to use a
type-safe formatter such as "%r" or "%s" (or explicitly cast existing to int
after validation) so non-integer config values do not raise formatting errors.
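A minimal sketch of the formatting concern, using plain `%` formatting outside of the `logging` module. The helper names `format_existing` and `format_existing_unsafe` are hypothetical and exist only for illustration; the message text is the one quoted in the comment above.

```python
MESSAGE_SAFE = "Using existing event_buffer_max_size=%r from kv_cache_config"
MESSAGE_UNSAFE = "Using existing event_buffer_max_size=%d from kv_cache_config"


def format_existing(existing):
    # "%r" accepts any object, so a str or None coming from a YAML config is fine.
    return MESSAGE_SAFE % (existing,)


def format_existing_unsafe(existing):
    # "%d" requires a real number; a str such as "2048" raises TypeError.
    return MESSAGE_UNSAFE % (existing,)


def unsafe_format_fails(existing):
    # Returns True when the "%d" variant cannot format the given value.
    try:
        format_existing_unsafe(existing)
    except TypeError:
        return True
    return False
```

With the standard `logging` module the same failure typically does not propagate to the caller; it is reported on stderr by the handler and the intended log line is lost, which is still worth avoiding.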
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d000ac6f-3312-45b0-a33f-e122e7e8a606
📒 Files selected for processing (1)
components/src/dynamo/trtllm/workers/llm_worker.py
indrajit96
left a comment
LGTM on the buffer-related changes.
Non-blocking, bigger question: is there a way to catch these config-mismatch regressions programmatically? We have seen quite a few of them.
Codex suggested a few ideas:
▎ 1. Snapshot-and-diff the user inputs vs. the final arg_map before constructing the engine. Log (or fail) when any user-supplied key was modified by Dynamo. The existing _warn_override_collisions (llm_worker.py:98) is the same pattern; apply it to the whole pipeline, not just override_engine_args.
▎ 2. Track key provenance in a parallel dict ({"event_buffer_max_size": "user_yaml", "backend": "dynamo_default"}) so we have a single audit trail of who set what.
▎ 3. Pin regression tests that feed a YAML with non-default values for the historically fragile fields (kv_cache_config.*, return_perf_metrics, backend, enable_iter_perf_stats) and assert they survive end-to-end into the final arg_map.
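Idea 1 could look roughly like the sketch below. It is not the existing `_warn_override_collisions` code; `flatten` and `find_overridden_user_keys` are hypothetical helpers, and the config values in the usage lines are made up for illustration.

```python
import copy


def flatten(config, prefix=""):
    """Flatten nested dicts into {"a.b": value} pairs for easy comparison."""
    flat = {}
    for key, value in config.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{path}."))
        else:
            flat[path] = value
    return flat


def find_overridden_user_keys(user_snapshot, final_arg_map):
    """Return {path: (user_value, final_value)} for user keys that changed.

    A user key that disappeared from the final map is reported with None
    as its final value.
    """
    user_flat = flatten(user_snapshot)
    final_flat = flatten(final_arg_map)
    missing = object()
    changed = {}
    for path, user_value in user_flat.items():
        final_value = final_flat.get(path, missing)
        if final_value is missing:
            changed[path] = (user_value, None)
        elif final_value != user_value:
            changed[path] = (user_value, final_value)
    return changed


# Usage: snapshot the user input, simulate an override, then diff.
user_args = {"kv_cache_config": {"event_buffer_max_size": 4096}, "backend": "pytorch"}
snapshot = copy.deepcopy(user_args)
final_arg_map = copy.deepcopy(user_args)
final_arg_map["kv_cache_config"]["event_buffer_max_size"] = 1024  # simulated override
changed = find_overridden_user_keys(snapshot, final_arg_map)
```

The caller can then decide whether a non-empty `changed` should only be logged or should fail worker startup.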
Yeah, good points. I think (1 or 2) and 3 combined would be ideal: we would have an audit log for debugging and test coverage for gatekeeping.
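A rough sketch of how a provenance trail (idea 2) and a pinned regression check (idea 3) might fit together. `apply_defaults` is a hypothetical stand-in for the real arg_map construction, it only handles a flat dict, and the values are illustrative; the provenance labels are the ones from the suggestion above.

```python
def apply_defaults(user_args, defaults):
    """Merge defaults underneath user values; return (arg_map, provenance)."""
    arg_map = dict(user_args)
    provenance = {key: "user_yaml" for key in user_args}
    for key, value in defaults.items():
        if key not in arg_map:
            # Only fill in keys the user did not set, and record who set them.
            arg_map[key] = value
            provenance[key] = "dynamo_default"
    return arg_map, provenance


def test_user_values_survive():
    # Non-default values for fields that have been overridden in the past.
    user_args = {"event_buffer_max_size": 4096, "return_perf_metrics": False}
    defaults = {"event_buffer_max_size": 1024, "backend": "pytorch"}
    arg_map, provenance = apply_defaults(user_args, defaults)
    for key, value in user_args.items():
        assert arg_map[key] == value
        assert provenance[key] == "user_yaml"
    assert provenance["backend"] == "dynamo_default"
```

The provenance dict serves as the audit log, while the test acts as the gate: it fails as soon as a default starts overwriting a user-supplied key.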
indrajit96
left a comment
LGTM!
Created a "good first issue" on GH for hardening configs based on the discussion in this PR: #9288
Overall LGTM; how to deal with "return_perf_metrics" is still an open question.
Overview:
Do not override the event_buffer_max_size if the user provided one.
Details:
dynamo.trtllm was overriding event_buffer_max_size to 1024 regardless of the value set in the engine config. This might affect the rate at which the worker publishes events.
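The intended behavior can be sketched as follows. This is not the code in `llm_worker.py`: the helper name `resolve_event_buffer_max_size` is hypothetical, `kv_cache_config` is assumed to be a plain dict, and treating only a missing or `None` value as "not provided" is an assumption.

```python
DEFAULT_EVENT_BUFFER_MAX_SIZE = 1024  # the value that was previously always forced


def resolve_event_buffer_max_size(kv_cache_config):
    """Return a copy of kv_cache_config with event_buffer_max_size filled in.

    A value already present in the config is kept; the default is applied
    only when the key is missing or None (assumption for this sketch).
    """
    resolved = dict(kv_cache_config)
    if resolved.get("event_buffer_max_size") is None:
        resolved["event_buffer_max_size"] = DEFAULT_EVENT_BUFFER_MAX_SIZE
    return resolved
```

For example, a config containing `{"event_buffer_max_size": 4096}` keeps 4096, while an empty config receives 1024.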