refactor: unify on_response, rename request_id fields, deprecate x-dynamo-request-id#7834
Conversation
WalkthroughThis change updates request identifier handling across the HTTP service layer. The deprecated Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~70 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
lib/runtime/src/logging.rs (2)
254-290:⚠️ Potential issue | 🟠 MajorDon't let an invalid
request-idsuppress the deprecated-header fallback.Both paths choose
request-idfirst and validate only after selection. If that header is present but malformed whilex-dynamo-request-idis valid, propagation is dropped instead of falling back, which breaks the backward-compatible rollout path.💡 Keep fallback after per-header validation
- if let Some(header_value) = headers.get("request-id") { - request_id = Some(header_value.to_string()); - } else if let Some(header_value) = headers.get("x-dynamo-request-id") { - request_id = Some(header_value.to_string()); - } - - // Validate UUID format - let request_id = request_id.filter(|id| uuid::Uuid::parse_str(id).is_ok()); + let request_id = headers + .get("request-id") + .filter(|id| Uuid::parse_str(id).is_ok()) + .map(|id| id.to_string()) + .or_else(|| { + headers + .get("x-dynamo-request-id") + .filter(|id| Uuid::parse_str(id).is_ok()) + .map(|id| id.to_string()) + });Apply the same validate-then-fallback ordering in
make_handle_payload_span_from_tcp_headers.Also applies to: 465-478
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/runtime/src/logging.rs` around lines 254 - 290, The current TraceParent::from_headers logic reads "request-id" first and only validates after selection, which prevents falling back to "x-dynamo-request-id" when "request-id" is present but malformed; change the flow to validate each candidate before choosing it (i.e., parse/validate headers.get("request-id") and only set request_id if uuid::Uuid::parse_str(...) succeeds, otherwise try headers.get("x-dynamo-request-id") and validate it), and apply the same validate-then-fallback pattern to make_handle_payload_span_from_tcp_headers so a malformed primary header does not suppress the deprecated fallback.
743-829:⚠️ Potential issue | 🟠 MajorChild spans still drop the inherited
request_id.You capture and store
request_idhere, but the parent-inheritance branch below still copies onlytrace_id,parent_id, andtracestate. Any nested#[instrument]span will therefore end up withrequest_id = None, so code callingget_distributed_tracing_context()under child spans can stop forwarding the original request ID.💡 Mirror the existing trace inheritance for
request_idif let Some(parent_tracing_context) = parent_ext.get::<DistributedTraceContext>() { trace_id = Some(parent_tracing_context.trace_id.clone()); parent_id = Some(parent_tracing_context.span_id.clone()); tracestate = parent_tracing_context.tracestate.clone(); + if request_id.is_none() { + request_id = parent_tracing_context.request_id.clone(); + } }Also applies to: 855-913
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/runtime/src/logging.rs` around lines 743 - 829, In on_new_span, PendingDistributedTraceContext captures request_id/x_request_id but the parent-inheritance branch that reads parent_ext.get::<DistributedTraceContext>() only copies trace_id, parent_id and tracestate, so child spans lose request_id; update that branch (inside on_new_span where you access ctx.current_span() and parent_ext.get::<DistributedTraceContext>()) to also copy parent_tracing_context.request_id and parent_tracing_context.x_request_id into the local request_id and x_request_id before inserting PendingDistributedTraceContext (and mirror the same change in the similar inheritance block later that finalizes the context).
🧹 Nitpick comments (1)
lib/runtime/src/logging.rs (1)
1315-1359: Please lock the JSONL field rename down with one regression assertion.
test_json_log_captureaccepts additional properties today, so CI will not catch a missingrequest_idor a reintroducedx_dynamo_request_id. A couple of targeted asserts in that existing test would make this log contract sticky.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/runtime/src/logging.rs` around lines 1315 - 1359, Update the existing test_json_log_capture test to assert the new JSONL contract: after capturing the logged JSON object (the map produced by visitor.fields in logging.rs via DistributedTraceContext handling), add a regression assertion that "request_id" is present and equals the expected request id value and another assertion that "x_dynamo_request_id" is absent (or null) to prevent regressions; locate the test by the function name test_json_log_capture and update its captured JSON checks to include these two targeted asserts so CI will fail if request_id is removed or x_dynamo_request_id is reintroduced.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lib/llm/src/http/service/openai.rs`:
- Around line 307-329: The code currently narrows the deprecated
DYNAMO_REQUEST_ID_HEADER to UUIDs when building validated_header, dropping
non-UUID legacy IDs; update the logic so after retrieving
headers.get(DYNAMO_REQUEST_ID_HEADER) you still emit the deprecation warning but
accept any valid UTF-8 string as the preserved request id (i.e., remove the
uuid::Uuid::parse_str(s) validation branch), only treat non-UTF-8 as invalid
with a warning, and return Ok(s).to_string() for all other strings so legacy IDs
like "job-42" are preserved; update the match in the validated_header
construction accordingly.
In `@lib/llm/src/http/service/service_v2.rs`:
- Around line 537-560: The system TraceLayer is applied to system_router before
merging the OpenAPI docs route, so openapi_route won't get the
make_system_request_span/on_response instrumentation; move the
TraceLayer::new_for_http().make_span_with(make_system_request_span).on_response(on_response)
invocation to after the system_router = system_router.merge(openapi_route) call
(i.e., apply the layer to the merged system_router returned by
super::openapi_docs::openapi_router) so that system_router (including
openapi_route) is wrapped with the system tracing and response logging.
In `@lib/runtime/src/pipeline/network/ingress/http_endpoint.rs`:
- Around line 311-318: The current header lookup uses
headers.get("request-id").or_else(|| headers.get("x-dynamo-request-id")) before
calling to_str(), so a present but non-UTF8 "request-id" will block the
fallback; change the logic to try headers.get("request-id") and attempt to
to_str() on that result, and only if that to_str() fails or the header is
missing then try headers.get("x-dynamo-request-id") and to_str() it; set
traceparent.request_id from the first successful to_str() result (referencing
headers.get("request-id"), headers.get("x-dynamo-request-id"), to_str(), and
traceparent.request_id).
---
Outside diff comments:
In `@lib/runtime/src/logging.rs`:
- Around line 254-290: The current TraceParent::from_headers logic reads
"request-id" first and only validates after selection, which prevents falling
back to "x-dynamo-request-id" when "request-id" is present but malformed; change
the flow to validate each candidate before choosing it (i.e., parse/validate
headers.get("request-id") and only set request_id if uuid::Uuid::parse_str(...)
succeeds, otherwise try headers.get("x-dynamo-request-id") and validate it), and
apply the same validate-then-fallback pattern to
make_handle_payload_span_from_tcp_headers so a malformed primary header does not
suppress the deprecated fallback.
- Around line 743-829: In on_new_span, PendingDistributedTraceContext captures
request_id/x_request_id but the parent-inheritance branch that reads
parent_ext.get::<DistributedTraceContext>() only copies trace_id, parent_id and
tracestate, so child spans lose request_id; update that branch (inside
on_new_span where you access ctx.current_span() and
parent_ext.get::<DistributedTraceContext>()) to also copy
parent_tracing_context.request_id and parent_tracing_context.x_request_id into
the local request_id and x_request_id before inserting
PendingDistributedTraceContext (and mirror the same change in the similar
inheritance block later that finalizes the context).
---
Nitpick comments:
In `@lib/runtime/src/logging.rs`:
- Around line 1315-1359: Update the existing test_json_log_capture test to
assert the new JSONL contract: after capturing the logged JSON object (the map
produced by visitor.fields in logging.rs via DistributedTraceContext handling),
add a regression assertion that "request_id" is present and equals the expected
request id value and another assertion that "x_dynamo_request_id" is absent (or
null) to prevent regressions; locate the test by the function name
test_json_log_capture and update its captured JSON checks to include these two
targeted asserts so CI will fail if request_id is removed or x_dynamo_request_id
is reintroduced.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 92fca7be-82fe-4f4a-b93a-392ba22bd295
📒 Files selected for processing (5)
lib/llm/src/http/service/anthropic.rslib/llm/src/http/service/openai.rslib/llm/src/http/service/service_v2.rslib/runtime/src/logging.rslib/runtime/src/pipeline/network/ingress/http_endpoint.rs
eb1df1f to
140696c
Compare
140696c to
8457eee
Compare
|
Thanks @jh-nv — good catch on the missing UUID validation in |
8457eee to
7956fad
Compare
7956fad to
db13cf6
Compare
…namo-request-id Follow-up to #7733. Changes missed from the clean rebuild: - Unify on_response callback (error for 4xx/5xx, info for success) for both system and inference routes - Rename TraceParent/DistributedTraceContext field x_dynamo_request_id → request_id - Rename internal propagation header from x-dynamo-request-id to request-id (with backward compat fallback) - Add UUID validation on TCP header path - get_or_create_request_id returns String (warns on invalid UUID instead of 400) - Add deprecation warning (DEP #7812) for x-dynamo-request-id header - Add echo_request_id_header middleware - make_system_request_span preserves trace context + generates request_id - Remove duplicate x_dynamo_request_id span field from client_request spans Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
db13cf6 to
9a9e8ff
Compare
|
Fixed — split into separate |
…namo-request-id (ai-dynamo#7834) Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Follow-up to #7733 with improvements from the clean rebuild:
on_responsecallback for both system and inference routes (error for 4xx/5xx, info for success)x_dynamo_request_id→request_idinTraceParent,DistributedTraceContext, span fields, and JSONL outputx-dynamo-request-idtorequest-id(with backward compat fallback)get_or_create_request_idreturnsString(warns on invalid UUID instead of 400)x-dynamo-request-idheaderecho_request_id_headermiddleware to copyx-request-idfrom request to responsemake_system_request_spannow preserves trace context + generates request_id (same structure as inference spans)Test plan
cargo clippy --workspace -- -D warningscleancargo fmt --checkclean🤖 Generated with Claude Code
Summary by CodeRabbit
request-idheader format while maintaining backward compatibility with legacy header format