[codex] gate nvext response metadata behind extra_fields by AmeenP · Pull Request #8250 · ai-dynamo/dynamo

AmeenP · 2026-04-15T22:09:23Z

Summary

restore opt-in gating for nvext response fields in OpenAI-compatible chat/completions responses
keep RequestTracker always enabled so internal per-worker metrics still work
add a small doc clarification for routed_experts
add one smoke test in each delta generator to cover the default no-extra_fields path

Root Cause

The February 4, 2026 per-worker metrics change made request tracking unconditional in the chat/completions delta generators. Response shaping still emitted nvext whenever tracker-backed metadata was present, so plain OpenAI-compatible requests could leak worker_id and timing data by default.
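The leak described above can be illustrated with a minimal sketch. It assumes hypothetical stand-in types (`TrackedMetadata`, `NvExtSketch`) and hypothetical function names; the real RequestTracker and nvext response types in dynamo-llm are not shown in this PR thread and will differ.

```rust
/// Hypothetical stand-in for metadata the always-on tracker collects.
#[derive(Debug, Clone, PartialEq)]
pub struct TrackedMetadata {
    pub worker_id: Option<u64>,
    pub total_time_ms: Option<f64>,
}

/// Hypothetical stand-in for the `nvext` object attached to a response.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct NvExtSketch {
    pub worker_id: Option<u64>,
    pub total_time_ms: Option<f64>,
}

/// Regressed shaping: emits nvext whenever the tracker has data,
/// regardless of what the client asked for.
pub fn shape_unconditional(meta: &TrackedMetadata) -> Option<NvExtSketch> {
    let nvext = NvExtSketch {
        worker_id: meta.worker_id,
        total_time_ms: meta.total_time_ms,
    };
    if nvext == NvExtSketch::default() { None } else { Some(nvext) }
}

/// Gated shaping: each field is copied only when it was requested,
/// so a plain request yields no nvext even though tracking still ran.
pub fn shape_gated(
    meta: &TrackedMetadata,
    want_worker_id: bool,
    want_timing: bool,
) -> Option<NvExtSketch> {
    let nvext = NvExtSketch {
        worker_id: meta.worker_id.filter(|_| want_worker_id),
        total_time_ms: meta.total_time_ms.filter(|_| want_timing),
    };
    if nvext == NvExtSketch::default() { None } else { Some(nvext) }
}
```

In this sketch the tracker data is identical in both paths; only the shaping step decides what reaches the client.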

What Changed

introduced a shared NvExtResponseFieldSelection helper to compute which response fields are allowed for a request
gated worker_id, timing, routed_experts, and token_ids independently from tracker existence
preserved the query_instance_id exception for worker_id and token_ids
left record_finish() unconditional so timing/ITL accounting and Prometheus metrics do not regress
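A rough sketch of the selection helper as submitted is given here for orientation. The struct and field names come from this thread; the `from_parts` constructor and its parameters are assumptions made for illustration (the PR's own constructor is `from_nvext`, whose signature is not shown here), and the `token_ids` and `routed_experts` rules are inferred from the review comments rather than from the diff itself.

```rust
/// Which nvext response fields a request is allowed to receive.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct NvExtResponseFieldSelection {
    pub worker_id: bool,
    pub timing: bool,
    pub token_ids: bool,
    pub routed_experts: bool,
}

impl NvExtResponseFieldSelection {
    /// Hypothetical constructor: `extra_fields` mirrors nvext.extra_fields,
    /// `query_instance_id` is true when that annotation is present.
    pub fn from_parts(extra_fields: Option<&[&str]>, query_instance_id: bool) -> Self {
        let has_extra_field = |name: &str| {
            extra_fields.map_or(false, |fields| fields.iter().any(|f| *f == name))
        };

        Self {
            // query_instance_id is the documented exception for these two fields.
            worker_id: has_extra_field("worker_id") || query_instance_id,
            token_ids: query_instance_id,
            // As submitted, timing and routed_experts are strictly opt-in.
            timing: has_extra_field("timing"),
            routed_experts: has_extra_field("routed_experts"),
        }
    }
}
```

Because the selection is computed from the request alone, it does not depend on whether a tracker exists, which is what lets tracking stay unconditional.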

Impact

plain /v1/chat/completions and /v1/completions requests no longer return nvext by default
nvext.worker_id, nvext.timing, and nvext.routed_experts remain opt-in via nvext.extra_fields
query_instance_id behavior is preserved

Closes #8249.

Validation

cargo check -p dynamo-llm --no-default-features --lib --tests

copy-pr-bot · 2026-04-15T22:09:28Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

github-actions · 2026-04-15T22:09:33Z

👋 Hi AmeenP! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: The NVIDIA Test Github Validation CI runs an essential subset of the testing framework to quickly catch errors. Your PR reviewers may elect to test the changes comprehensively before approving your changes.

🚀

biswapanda · 2026-04-15T22:25:55Z

+
+        Self {
+            worker_id: has_extra_field("worker_id") || query_instance_id,
+            timing: has_extra_field("timing"),


Bug: query_instance_id no longer auto-enables timing in the response

The original (pre-regression) behavior of enable_tracking was:

let enable_tracking = timing_in_extra_fields || has_query_instance_id;

But NvExtResponseFieldSelection::from_nvext sets:

timing: has_extra_field("timing"), // no query_instance_id exception

This means GAIE Stage 1 (query_instance_id) requests will no longer receive timing info in the response. The original code intentionally auto-enabled timing for query_instance_id.

If this is an intentional behavior change, please call it out explicitly in the PR description so reviewers can verify the GAIE Stage 1 contract doesn't depend on auto-returned timing.

If unintentional, the fix is:

    Self {
        worker_id: has_extra_field("worker_id") || query_instance_id,
        timing: has_extra_field("timing") || query_instance_id,
        token_ids: query_instance_id,
        routed_experts: has_extra_field("routed_experts"),
    }

The unit test test_nvext_response_field_selection_query_instance_id_exception asserts !selection.timing, so it would need updating too.

biswapanda · 2026-04-15T22:25:55Z

+
+    pub fn any(&self) -> bool {
+        self.worker_id || self.timing || self.token_ids || self.routed_experts
+    }


Nit: any() is dead code

This method is defined but never called anywhere in the diff (or the existing codebase). Consider removing it to avoid dead-code warnings, or add a #[allow(dead_code)] with a comment explaining the intended use if it's planned for future use.

biswapanda · 2026-04-15T22:25:55Z

            enable_logprobs: self.inner.logprobs.unwrap_or(false)
                || self.inner.top_logprobs.unwrap_or(0) > 0,
-            enable_tracking,
+            response_fields,


Suggestion: add positive-path tests for each gated field

The smoke test test_plain_request_without_extra_fields_omits_nvext is a great regression test, but there are no tests verifying the opt-in paths actually work. I'd recommend at minimum:

extra_fields: ["worker_id"] returns worker_id (and NOT timing)

extra_fields: ["timing"] returns timing (and NOT worker_id)

extra_fields: ["routed_experts"] returns routed_experts

query_instance_id annotation returns worker_id + token_ids without extra_fields

Multiple extra_fields combined work together

Without positive-path tests, a future refactor could accidentally break the opt-in behavior and the test suite wouldn't catch it.
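The positive-path cases listed above could look roughly like the following. This is a sketch under stated assumptions: `select` is a hypothetical free function standing in for `NvExtResponseFieldSelection::from_nvext`, the test names are invented, and the helper assumes the reviewer's proposed fix (query_instance_id also enables timing) has been applied.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct NvExtResponseFieldSelection {
    pub worker_id: bool,
    pub timing: bool,
    pub token_ids: bool,
    pub routed_experts: bool,
}

/// Hypothetical stand-in for `from_nvext`, with the proposed timing fix applied.
pub fn select(extra_fields: &[&str], query_instance_id: bool) -> NvExtResponseFieldSelection {
    let has = |name: &str| extra_fields.iter().any(|f| *f == name);
    NvExtResponseFieldSelection {
        worker_id: has("worker_id") || query_instance_id,
        timing: has("timing") || query_instance_id,
        token_ids: query_instance_id,
        routed_experts: has("routed_experts"),
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn worker_id_only_does_not_enable_timing() {
        let expected = NvExtResponseFieldSelection { worker_id: true, ..Default::default() };
        assert_eq!(select(&["worker_id"], false), expected);
    }

    #[test]
    fn timing_only_does_not_enable_worker_id() {
        let expected = NvExtResponseFieldSelection { timing: true, ..Default::default() };
        assert_eq!(select(&["timing"], false), expected);
    }

    #[test]
    fn routed_experts_only() {
        let expected = NvExtResponseFieldSelection { routed_experts: true, ..Default::default() };
        assert_eq!(select(&["routed_experts"], false), expected);
    }

    #[test]
    fn query_instance_id_without_extra_fields() {
        let expected = NvExtResponseFieldSelection {
            worker_id: true,
            timing: true,
            token_ids: true,
            routed_experts: false,
        };
        assert_eq!(select(&[], true), expected);
    }

    #[test]
    fn combined_extra_fields() {
        let expected = NvExtResponseFieldSelection {
            worker_id: true,
            timing: true,
            token_ids: false,
            routed_experts: true,
        };
        assert_eq!(select(&["worker_id", "timing", "routed_experts"], false), expected);
    }
}
```

Comparing whole struct literals means each test also asserts that the fields it did not request stay off, which covers the "and NOT" half of each case.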

…tive-path tests

Address review feedback on PR #8250:

- Restore the pre-regression behavior where query_instance_id automatically enables timing in the response (was lost during the refactor to NvExtResponseFieldSelection).
- Remove dead any() method that had no callers.
- Add positive-path unit tests for each gated field: worker_id, timing, routed_experts, query_instance_id, and combined fields.
- Update stale doc comment on NvExtResponse.timing to mention query_instance_id auto-enablement.
- Use assert_eq! with struct literals in tests for consistency and more informative failure output.

github-actions · 2026-05-16T09:58:55Z

This PR is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

fix: gate nvext response metadata behind extra_fields

674341b

pull-request-size Bot added the size/L label Apr 15, 2026

github-actions Bot added the external-contribution, documentation, and frontend labels Apr 15, 2026

biswapanda reviewed Apr 15, 2026

github-actions Bot added the Stale label May 16, 2026

AmeenP wants to merge 1 commit into ai-dynamo:main from AmeenP:fix/nvext-response-gating

AmeenP commented Apr 15, 2026 (edited by biswapanda)
