[codex] gate nvext response metadata behind extra_fields#8250
Conversation
|
👋 Hi AmeenP! Thank you for contributing to ai-dynamo/dynamo. Just a reminder: The 🚀 |
|
|
||
| Self { | ||
| worker_id: has_extra_field("worker_id") || query_instance_id, | ||
| timing: has_extra_field("timing"), |
There was a problem hiding this comment.
Bug: query_instance_id no longer auto-enables timing in the response
The original (pre-regression) behavior of enable_tracking was:
let enable_tracking = timing_in_extra_fields || has_query_instance_id;But NvExtResponseFieldSelection::from_nvext sets:
timing: has_extra_field("timing"), // no query_instance_id exceptionThis means GAIE Stage 1 (query_instance_id) requests will no longer receive timing info in the response. The original code intentionally auto-enabled timing for query_instance_id.
If this is an intentional behavior change, please call it out explicitly in the PR description so reviewers can verify the GAIE Stage 1 contract doesn't depend on auto-returned timing.
If unintentional, the fix is:
Self {
worker_id: has_extra_field("worker_id") || query_instance_id,
timing: has_extra_field("timing") || query_instance_id,
token_ids: query_instance_id,
routed_experts: has_extra_field("routed_experts"),
}The unit test test_nvext_response_field_selection_query_instance_id_exception asserts !selection.timing, so it would need updating too.
|
|
||
| pub fn any(&self) -> bool { | ||
| self.worker_id || self.timing || self.token_ids || self.routed_experts | ||
| } |
There was a problem hiding this comment.
Nit: any() is dead code
This method is defined but never called anywhere in the diff (or the existing codebase). Consider removing it to avoid dead-code warnings, or add a #[allow(dead_code)] with a comment explaining the intended use if it's planned for future use.
| enable_logprobs: self.inner.logprobs.unwrap_or(false) | ||
| || self.inner.top_logprobs.unwrap_or(0) > 0, | ||
| enable_tracking, | ||
| response_fields, |
There was a problem hiding this comment.
Suggestion: add positive-path tests for each gated field
The smoke test test_plain_request_without_extra_fields_omits_nvext is a great regression test, but there are no tests verifying the opt-in paths actually work. I'd recommend at minimum:
extra_fields: ["worker_id"]returns worker_id (and NOT timing)extra_fields: ["timing"]returns timing (and NOT worker_id)extra_fields: ["routed_experts"]returns routed_expertsquery_instance_idannotation returns worker_id + token_ids withoutextra_fields- Multiple
extra_fieldscombined work together
Without positive-path tests, a future refactor could accidentally break the opt-in behavior and the test suite wouldn't catch it.
…tive-path tests Address review feedback on PR #8250: - Restore the pre-regression behavior where query_instance_id automatically enables timing in the response (was lost during the refactor to NvExtResponseFieldSelection). - Remove dead any() method that had no callers. - Add positive-path unit tests for each gated field: worker_id, timing, routed_experts, query_instance_id, and combined fields. - Update stale doc comment on NvExtResponse.timing to mention query_instance_id auto-enablement. - Use assert_eq! with struct literals in tests for consistency and more informative failure output.
…tive-path tests Address review feedback on PR #8250: - Restore the pre-regression behavior where query_instance_id automatically enables timing in the response (was lost during the refactor to NvExtResponseFieldSelection). - Remove dead any() method that had no callers. - Add positive-path unit tests for each gated field: worker_id, timing, routed_experts, query_instance_id, and combined fields. - Update stale doc comment on NvExtResponse.timing to mention query_instance_id auto-enablement. - Use assert_eq! with struct literals in tests for consistency and more informative failure output.
…tive-path tests Address review feedback on PR #8250: - Restore the pre-regression behavior where query_instance_id automatically enables timing in the response (was lost during the refactor to NvExtResponseFieldSelection). - Remove dead any() method that had no callers. - Add positive-path unit tests for each gated field: worker_id, timing, routed_experts, query_instance_id, and combined fields. - Update stale doc comment on NvExtResponse.timing to mention query_instance_id auto-enablement. - Use assert_eq! with struct literals in tests for consistency and more informative failure output.
…tive-path tests Address review feedback on PR #8250: - Restore the pre-regression behavior where query_instance_id automatically enables timing in the response (was lost during the refactor to NvExtResponseFieldSelection). - Remove dead any() method that had no callers. - Add positive-path unit tests for each gated field: worker_id, timing, routed_experts, query_instance_id, and combined fields. - Update stale doc comment on NvExtResponse.timing to mention query_instance_id auto-enablement. - Use assert_eq! with struct literals in tests for consistency and more informative failure output.
…tive-path tests Address review feedback on PR #8250: - Restore the pre-regression behavior where query_instance_id automatically enables timing in the response (was lost during the refactor to NvExtResponseFieldSelection). - Remove dead any() method that had no callers. - Add positive-path unit tests for each gated field: worker_id, timing, routed_experts, query_instance_id, and combined fields. - Update stale doc comment on NvExtResponse.timing to mention query_instance_id auto-enablement. - Use assert_eq! with struct literals in tests for consistency and more informative failure output.
|
This PR is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days. |
Summary
nvextresponse fields in OpenAI-compatible chat/completions responsesRequestTrackeralways enabled so internal per-worker metrics still workrouted_expertsextra_fieldspathRoot Cause
The February 4, 2026 per-worker metrics change made request tracking unconditional in the chat/completions delta generators. Response shaping still emitted
nvextwhenever tracker-backed metadata was present, so plain OpenAI-compatible requests could leakworker_idandtimingdata by default.What Changed
NvExtResponseFieldSelectionhelper to compute which response fields are allowed for a requestworker_id,timing,routed_experts, andtoken_idsindependently from tracker existencequery_instance_idexception forworker_idandtoken_idsrecord_finish()unconditional so timing/ITL accounting and Prometheus metrics do not regressImpact
/v1/chat/completionsand/v1/completionsrequests no longer returnnvextby defaultnvext.worker_id,nvext.timing, andnvext.routed_expertsremain opt-in vianvext.extra_fieldsquery_instance_idbehavior is preservedCloses #8249.
Validation
cargo check -p dynamo-llm --no-default-features --lib --tests