
refactor: unify on_response, rename request_id fields, deprecate x-dynamo-request-id#7834

Merged
nnshah1 merged 1 commit into main from nnshah1/DIS-1643-pr1-followup
Apr 2, 2026

Conversation

Contributor

@nnshah1 nnshah1 commented Apr 2, 2026

Summary

Follow-up to #7733 with improvements from the clean rebuild:

  • Unify on_response callback for both system and inference routes (error for 4xx/5xx, info for success)
  • Rename x_dynamo_request_id → request_id in TraceParent, DistributedTraceContext, span fields, and JSONL output
  • Rename internal propagation header from x-dynamo-request-id to request-id (with backward compat fallback)
  • Add UUID validation on TCP header path
  • get_or_create_request_id returns String (warns on invalid UUID instead of 400)
  • Add deprecation warning (DEP #7812: Deprecate client-controlled request IDs in favor of server-generated IDs) for x-dynamo-request-id header
  • Add echo_request_id_header middleware to copy x-request-id from request to response
  • make_system_request_span now preserves trace context + generates request_id (same structure as inference spans)
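
To make the "warn instead of 400" behavior concrete, here is a minimal, std-only sketch of how a non-failing `get_or_create_request_id` could look. It is not the code from this PR: the `looks_like_uuid` helper and the `generate` parameter are hypothetical stand-ins for `uuid::Uuid::parse_str` and the real ID generator, and `eprintln!` stands in for whatever logging macro the service uses.

```rust
/// Hypothetical stand-in for `uuid::Uuid::parse_str(..).is_ok()`.
/// Only the hyphenated 8-4-4-4-12 hex form is accepted here; the real
/// `uuid` crate also accepts other encodings.
fn looks_like_uuid(s: &str) -> bool {
    let groups: Vec<&str> = s.split('-').collect();
    let lens = [8usize, 4, 4, 4, 12];
    groups.len() == lens.len()
        && groups
            .iter()
            .zip(lens.iter())
            .all(|(g, &n)| g.len() == n && g.bytes().all(|b| b.is_ascii_hexdigit()))
}

/// Returns the client-supplied ID when it is a valid UUID, otherwise a
/// freshly generated one. It never fails: an invalid value only warns.
/// `generate` is an assumed parameter so the sketch needs no external crate.
fn get_or_create_request_id(header: Option<&str>, generate: impl FnOnce() -> String) -> String {
    match header {
        Some(value) if looks_like_uuid(value) => {
            // The header is deprecated, so even a valid value produces a warning.
            eprintln!("warn: x-dynamo-request-id is deprecated");
            value.to_string()
        }
        Some(value) => {
            eprintln!("warn: ignoring invalid x-dynamo-request-id {value:?}");
            generate()
        }
        None => generate(),
    }
}
```

Because the return type is `String` rather than `Result<String, String>`, callers no longer need an error branch that maps to HTTP 400.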

Test plan

  • cargo clippy --workspace -- -D warnings clean
  • cargo fmt --check clean
  • CI checks pass

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Improvements
    • Enhanced request ID handling to gracefully continue processing when invalid headers are encountered, instead of rejecting requests
    • Migrated to standard request-id header format while maintaining backward compatibility with legacy header format
    • Improved request tracing with warnings emitted when deprecated headers are used
    • Refined response header handling for better request tracking and observability

@nnshah1 nnshah1 requested a review from a team as a code owner April 2, 2026 20:36
@github-actions github-actions Bot added refactor frontend `python -m dynamo.frontend` and `dynamo-run in=http|text|grpc` labels Apr 2, 2026
Contributor

coderabbitai Bot commented Apr 2, 2026

Walkthrough

This change updates request identifier handling across the HTTP service layer. The deprecated x-dynamo-request-id header is now best-effort with warnings; invalid values are silently ignored. Request IDs are renamed from x_dynamo_request_id to request_id throughout, and response headers echo back incoming request IDs.
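
As an illustration of the echo behavior described above, the following sketch copies `x-request-id` from request to response. It assumes plain `HashMap`s with lowercase keys in place of the real HTTP header types, so it shows the logic only, not the middleware signature used in `service_v2.rs`.

```rust
use std::collections::HashMap;

/// Copies `x-request-id` from the request headers to the response headers.
/// Plain maps with lowercase keys are an assumption made for this sketch.
fn echo_request_id_header(
    request: &HashMap<String, String>,
    response: &mut HashMap<String, String>,
) {
    if let Some(id) = request.get("x-request-id") {
        response.insert("x-request-id".to_string(), id.clone());
    }
    // When the request carries no `x-request-id`, the response is left untouched.
}
```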

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Request ID field renaming**<br>`lib/runtime/src/logging.rs`, `lib/runtime/src/pipeline/network/ingress/http_endpoint.rs` | Renamed distributed tracing field x_dynamo_request_id to request_id in DistributedTraceContext, PendingDistributedTraceContext, and TraceParent structures. Updated header reading logic to prefer request-id and fall back to deprecated x-dynamo-request-id. Modified span creation functions to emit request_id with UUID validation. |
| **Service handler request ID resolution**<br>`lib/llm/src/http/service/anthropic.rs`, `lib/llm/src/http/service/openai.rs` | Changed get_or_create_request_id from returning Result<String, String> to String directly. Request validation failures now emit warnings and continue processing instead of returning HTTP 400. Deprecated x-dynamo-request-id header is now best-effort with silent fallback to generated UUID on invalid values. |
| **Router refactoring and middleware**<br>`lib/llm/src/http/service/service_v2.rs` | Split HTTP routes into system and inference routers with separate tracing layers. Added echo_request_id_header middleware to echo incoming x-request-id in responses. Consolidated response status and latency logging through a shared callback. |
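
The shared response callback can be pictured roughly as follows. This is a sketch under assumptions: the `Level` enum, the function names, and the formatted string are hypothetical, and statuses outside 4xx/5xx are all treated as "info" because the PR description only distinguishes errors from success.

```rust
use std::time::Duration;

/// Hypothetical log level type; the real code presumably uses `tracing` levels.
#[derive(Debug, PartialEq)]
enum Level {
    Info,
    Error,
}

/// 4xx and 5xx responses are logged as errors, everything else as info.
fn response_log_level(status: u16) -> Level {
    if (400..600).contains(&status) {
        Level::Error
    } else {
        Level::Info
    }
}

/// One callback for both system and inference routes: it picks the level
/// from the status code and records the latency alongside it.
fn on_response(status: u16, latency: Duration) -> String {
    let level = response_log_level(status);
    format!("{:?} status={} latency_ms={}", level, status, latency.as_millis())
}
```

Routing both routers through a single function like this keeps the level policy in one place instead of duplicating it per tracing layer.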

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~70 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly summarizes the three main changes: unifying on_response callback, renaming request_id fields, and deprecating x-dynamo-request-id header. |
| Description check | ✅ Passed | The description provides comprehensive context with a summary, detailed explanation of changes, test plan status, and a clear link to related issue #7733. It follows the template structure with Overview, Details, and Related Issues sections. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
lib/runtime/src/logging.rs (2)

254-290: ⚠️ Potential issue | 🟠 Major

Don't let an invalid request-id suppress the deprecated-header fallback.

Both paths choose request-id first and validate only after selection. If that header is present but malformed while x-dynamo-request-id is valid, propagation is dropped instead of falling back, which breaks the backward-compatible rollout path.

💡 Keep fallback after per-header validation
-        if let Some(header_value) = headers.get("request-id") {
-            request_id = Some(header_value.to_string());
-        } else if let Some(header_value) = headers.get("x-dynamo-request-id") {
-            request_id = Some(header_value.to_string());
-        }
-
-        // Validate UUID format
-        let request_id = request_id.filter(|id| uuid::Uuid::parse_str(id).is_ok());
+        let request_id = headers
+            .get("request-id")
+            .filter(|id| Uuid::parse_str(id).is_ok())
+            .map(|id| id.to_string())
+            .or_else(|| {
+                headers
+                    .get("x-dynamo-request-id")
+                    .filter(|id| Uuid::parse_str(id).is_ok())
+                    .map(|id| id.to_string())
+            });

Apply the same validate-then-fallback ordering in make_handle_payload_span_from_tcp_headers.

Also applies to: 465-478
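
For reference, the validate-then-fallback ordering can be exercised in isolation. The sketch below assumes a `HashMap` in place of the real header type and takes the validity check as a parameter (`is_valid`, a hypothetical stand-in for `Uuid::parse_str(..).is_ok()`), so it needs no external crates.

```rust
use std::collections::HashMap;

/// Tries `request-id` first, then the deprecated `x-dynamo-request-id`.
/// Each candidate is validated before it is chosen, so a malformed
/// primary header does not block the fallback.
fn resolve_request_id(
    headers: &HashMap<String, String>,
    is_valid: impl Fn(&str) -> bool,
) -> Option<String> {
    ["request-id", "x-dynamo-request-id"]
        .iter()
        .filter_map(|name| headers.get(*name))
        .find(|value| is_valid(value.as_str()))
        .cloned()
}
```

With a malformed `request-id` and a valid `x-dynamo-request-id`, this returns the deprecated header's value, whereas the select-then-validate ordering would return `None`.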

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/runtime/src/logging.rs` around lines 254 - 290, The current
TraceParent::from_headers logic reads "request-id" first and only validates
after selection, which prevents falling back to "x-dynamo-request-id" when
"request-id" is present but malformed; change the flow to validate each
candidate before choosing it (i.e., parse/validate headers.get("request-id") and
only set request_id if uuid::Uuid::parse_str(...) succeeds, otherwise try
headers.get("x-dynamo-request-id") and validate it), and apply the same
validate-then-fallback pattern to make_handle_payload_span_from_tcp_headers so a
malformed primary header does not suppress the deprecated fallback.

743-829: ⚠️ Potential issue | 🟠 Major

Child spans still drop the inherited request_id.

You capture and store request_id here, but the parent-inheritance branch below still copies only trace_id, parent_id, and tracestate. Any nested #[instrument] span will therefore end up with request_id = None, so code calling get_distributed_tracing_context() under child spans can stop forwarding the original request ID.

💡 Mirror the existing trace inheritance for request_id
                 if let Some(parent_tracing_context) = parent_ext.get::<DistributedTraceContext>() {
                     trace_id = Some(parent_tracing_context.trace_id.clone());
                     parent_id = Some(parent_tracing_context.span_id.clone());
                     tracestate = parent_tracing_context.tracestate.clone();
+                    if request_id.is_none() {
+                        request_id = parent_tracing_context.request_id.clone();
+                    }
                 }

Also applies to: 855-913
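
The inheritance rule suggested above can be checked with a small standalone model. The struct and function names here are hypothetical simplifications of `DistributedTraceContext` and the `on_new_span` logic, with plain `String` fields assumed for all IDs.

```rust
/// Simplified stand-in for the context stored on a parent span.
#[derive(Clone, Debug, PartialEq)]
struct TraceContext {
    trace_id: String,
    span_id: String,
    tracestate: Option<String>,
    request_id: Option<String>,
}

/// Values a new child span takes over from its parent.
#[derive(Debug, PartialEq)]
struct Inherited {
    trace_id: Option<String>,
    parent_id: Option<String>,
    tracestate: Option<String>,
    request_id: Option<String>,
}

/// Mirrors the suggested diff: trace fields always come from the parent,
/// while `request_id` is inherited only when the span did not set its own.
fn inherit_from_parent(parent: Option<&TraceContext>, own_request_id: Option<String>) -> Inherited {
    let mut inherited = Inherited {
        trace_id: None,
        parent_id: None,
        tracestate: None,
        request_id: own_request_id,
    };
    if let Some(p) = parent {
        inherited.trace_id = Some(p.trace_id.clone());
        inherited.parent_id = Some(p.span_id.clone());
        inherited.tracestate = p.tracestate.clone();
        if inherited.request_id.is_none() {
            inherited.request_id = p.request_id.clone();
        }
    }
    inherited
}
```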

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/runtime/src/logging.rs` around lines 743 - 829, In on_new_span,
PendingDistributedTraceContext captures request_id/x_request_id but the
parent-inheritance branch that reads parent_ext.get::<DistributedTraceContext>()
only copies trace_id, parent_id and tracestate, so child spans lose request_id;
update that branch (inside on_new_span where you access ctx.current_span() and
parent_ext.get::<DistributedTraceContext>()) to also copy
parent_tracing_context.request_id and parent_tracing_context.x_request_id into
the local request_id and x_request_id before inserting
PendingDistributedTraceContext (and mirror the same change in the similar
inheritance block later that finalizes the context).
🧹 Nitpick comments (1)
lib/runtime/src/logging.rs (1)

1315-1359: Please lock the JSONL field rename down with one regression assertion.

test_json_log_capture accepts additional properties today, so CI will not catch a missing request_id or a reintroduced x_dynamo_request_id. A couple of targeted asserts in that existing test would make this log contract sticky.
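
A possible shape for those targeted asserts, written as a standalone helper so it runs without the logging test harness. The `BTreeMap` of captured fields and the helper name are assumptions for this sketch; the real test would assert on whatever JSON value type `test_json_log_capture` already parses.

```rust
use std::collections::BTreeMap;

/// Checks the renamed log contract on one captured JSONL record:
/// `request_id` must be present with the expected value, and the old
/// `x_dynamo_request_id` key must not reappear.
fn check_request_id_contract(
    fields: &BTreeMap<String, String>,
    expected: &str,
) -> Result<(), String> {
    match fields.get("request_id") {
        Some(actual) if actual.as_str() == expected => {}
        Some(actual) => {
            return Err(format!("request_id was {actual:?}, expected {expected:?}"));
        }
        None => return Err("request_id is missing".to_string()),
    }
    if fields.contains_key("x_dynamo_request_id") {
        return Err("x_dynamo_request_id was reintroduced".to_string());
    }
    Ok(())
}
```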

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/runtime/src/logging.rs` around lines 1315 - 1359, Update the existing
test_json_log_capture test to assert the new JSONL contract: after capturing the
logged JSON object (the map produced by visitor.fields in logging.rs via
DistributedTraceContext handling), add a regression assertion that "request_id"
is present and equals the expected request id value and another assertion that
"x_dynamo_request_id" is absent (or null) to prevent regressions; locate the
test by the function name test_json_log_capture and update its captured JSON
checks to include these two targeted asserts so CI will fail if request_id is
removed or x_dynamo_request_id is reintroduced.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/llm/src/http/service/openai.rs`:
- Around line 307-329: The code currently narrows the deprecated
DYNAMO_REQUEST_ID_HEADER to UUIDs when building validated_header, dropping
non-UUID legacy IDs; update the logic so after retrieving
headers.get(DYNAMO_REQUEST_ID_HEADER) you still emit the deprecation warning but
accept any valid UTF-8 string as the preserved request id (i.e., remove the
uuid::Uuid::parse_str(s) validation branch), only treat non-UTF-8 as invalid
with a warning, and return Ok(s.to_string()) for all other strings so legacy IDs
like "job-42" are preserved; update the match in the validated_header
construction accordingly.

In `@lib/llm/src/http/service/service_v2.rs`:
- Around line 537-560: The system TraceLayer is applied to system_router before
merging the OpenAPI docs route, so openapi_route won't get the
make_system_request_span/on_response instrumentation; move the
TraceLayer::new_for_http().make_span_with(make_system_request_span).on_response(on_response)
invocation to after the system_router = system_router.merge(openapi_route) call
(i.e., apply the layer to the merged system_router returned by
super::openapi_docs::openapi_router) so that system_router (including
openapi_route) is wrapped with the system tracing and response logging.

In `@lib/runtime/src/pipeline/network/ingress/http_endpoint.rs`:
- Around line 311-318: The current header lookup uses
headers.get("request-id").or_else(|| headers.get("x-dynamo-request-id")) before
calling to_str(), so a present but non-UTF8 "request-id" will block the
fallback; change the logic to try headers.get("request-id") and attempt to
to_str() on that result, and only if that to_str() fails or the header is
missing then try headers.get("x-dynamo-request-id") and to_str() it; set
traceparent.request_id from the first successful to_str() result (referencing
headers.get("request-id"), headers.get("x-dynamo-request-id"), to_str(), and
traceparent.request_id).
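
The difference between the two orderings in the `http_endpoint.rs` comment can be seen side by side in this sketch. It assumes raw header bytes stored in a `HashMap` rather than the real header map type, and both function names are hypothetical.

```rust
use std::collections::HashMap;

/// Problematic ordering: the header is selected first and decoded afterwards,
/// so a present but non-UTF-8 `request-id` blocks the fallback.
fn lookup_before_decoding(headers: &HashMap<String, Vec<u8>>) -> Option<String> {
    headers
        .get("request-id")
        .or_else(|| headers.get("x-dynamo-request-id"))
        .and_then(|raw| std::str::from_utf8(raw).ok())
        .map(str::to_owned)
}

/// Suggested ordering: each header is decoded on its own, and the deprecated
/// header is tried whenever the primary one is missing or fails to decode.
fn decode_then_fall_back(headers: &HashMap<String, Vec<u8>>) -> Option<String> {
    let decode = |name: &str| {
        headers
            .get(name)
            .and_then(|raw| std::str::from_utf8(raw).ok())
            .map(str::to_owned)
    };
    decode("request-id").or_else(|| decode("x-dynamo-request-id"))
}
```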


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 92fca7be-82fe-4f4a-b93a-392ba22bd295

📥 Commits

Reviewing files that changed from the base of the PR and between 3d8e85e and eb1df1f.

📒 Files selected for processing (5)
  • lib/llm/src/http/service/anthropic.rs
  • lib/llm/src/http/service/openai.rs
  • lib/llm/src/http/service/service_v2.rs
  • lib/runtime/src/logging.rs
  • lib/runtime/src/pipeline/network/ingress/http_endpoint.rs

Comment thread lib/llm/src/http/service/openai.rs
Comment thread lib/llm/src/http/service/service_v2.rs Outdated
Comment thread lib/runtime/src/pipeline/network/ingress/http_endpoint.rs
Comment thread lib/runtime/src/pipeline/network/ingress/http_endpoint.rs
@nnshah1 nnshah1 force-pushed the nnshah1/DIS-1643-pr1-followup branch from eb1df1f to 140696c Compare April 2, 2026 20:59
@nnshah1 nnshah1 force-pushed the nnshah1/DIS-1643-pr1-followup branch from 140696c to 8457eee Compare April 2, 2026 21:49
Contributor Author

nnshah1 commented Apr 2, 2026

Thanks @jh-nv — good catch on the missing UUID validation in from_axum_headers. Added .filter(|s| uuid::Uuid::parse_str(s).is_ok()) on both the request-id and x-dynamo-request-id lookups for consistency with the other code paths.

@nnshah1 nnshah1 force-pushed the nnshah1/DIS-1643-pr1-followup branch from 8457eee to 7956fad Compare April 2, 2026 22:02
@nnshah1 nnshah1 force-pushed the nnshah1/DIS-1643-pr1-followup branch from 7956fad to db13cf6 Compare April 2, 2026 22:04
…namo-request-id

Follow-up to #7733. Changes missed from the clean rebuild:
- Unify on_response callback (error for 4xx/5xx, info for success) for
  both system and inference routes
- Rename TraceParent/DistributedTraceContext field x_dynamo_request_id → request_id
- Rename internal propagation header from x-dynamo-request-id to request-id
  (with backward compat fallback)
- Add UUID validation on TCP header path
- get_or_create_request_id returns String (warns on invalid UUID instead of 400)
- Add deprecation warning (DEP #7812) for x-dynamo-request-id header
- Add echo_request_id_header middleware
- make_system_request_span preserves trace context + generates request_id
- Remove duplicate x_dynamo_request_id span field from client_request spans

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@nnshah1 nnshah1 force-pushed the nnshah1/DIS-1643-pr1-followup branch from db13cf6 to 9a9e8ff Compare April 2, 2026 22:11
@nnshah1 nnshah1 enabled auto-merge (squash) April 2, 2026 22:22
Contributor Author

nnshah1 commented Apr 2, 2026

Re: fallback logic comment

Fixed — split into separate if/else if branches so a malformed request-id correctly falls through to x-dynamo-request-id. Also added UUID validation on both branches for consistency with the other code paths (per jh-nv's feedback).

@nnshah1 nnshah1 merged commit c376655 into main Apr 2, 2026
92 of 93 checks passed
@nnshah1 nnshah1 deleted the nnshah1/DIS-1643-pr1-followup branch April 2, 2026 23:00
yao531441 pushed a commit to yao531441/dynamo that referenced this pull request May 13, 2026
…namo-request-id (ai-dynamo#7834)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Labels

frontend `python -m dynamo.frontend` and `dynamo-run in=http|text|grpc` refactor size/L
