[otel-advisor] OTel improvement: fix observability summary to show the real OTLP trace ID instead of workflow_call_id #24665

@github-actions

📡 OTel Instrumentation Improvement: Fix trace ID in job summaries

Analysis Date: 2026-04-05
Priority: High
Effort: Small (< 2h)

Problem

The observability summary written to GitHub Actions job summaries displays the wrong trace ID. In generate_observability_summary.cjs:63, collectObservabilityData() reads awInfo.context.workflow_call_id as the trace ID:

// Current: actions/setup/js/generate_observability_summary.cjs (line 63)
const traceId = awInfo.context && typeof awInfo.context.workflow_call_id === "string"
  ? awInfo.context.workflow_call_id
  : "";

workflow_call_id is constructed in aw_context.cjs as:

```
workflow_call_id: `${process.env.GITHUB_RUN_ID ?? context.runId ?? ""}-${process.env.GITHUB_RUN_ATTEMPT ?? "1"}`,
// → produces values like "12345678901-1"
```

But the actual OTLP trace ID used in spans (the one that exists in Sentry / Honeycomb / Datadog) is a 32-character hex string generated by sendJobSetupSpan() and stored in GITHUB_AW_OTEL_TRACE_ID and awInfo.context.otel_trace_id.

A DevOps engineer investigating a failed workflow cannot answer "Where is this trace in my observability backend?" because the ID shown in the summary does not match any trace ID in those backends.

Why This Matters (DevOps Perspective)

When a workflow fails, the first place an engineer looks is the GitHub Actions job summary. The current summary shows:

```
- **trace id**: 12345678901-1
```

The engineer copies this value and pastes it into Sentry/Honeycomb/Datadog — and finds nothing, because the real trace ID is something like a3f2c8d1e4b7091f6a5c2e3d8f401b72. This silent mismatch adds unnecessary debugging time to every incident. Fixing it means a single copy-paste from the summary to the backend immediately surfaces the full trace.

Current Behavior

// actions/setup/js/generate_observability_summary.cjs (line 63)
const traceId = awInfo.context && typeof awInfo.context.workflow_call_id === "string"
  ? awInfo.context.workflow_call_id  // ❌ "12345678901-1" — NOT an OTLP trace ID
  : "";

The same awInfo.context object also carries otel_trace_id (set by buildAwContext() from GITHUB_AW_OTEL_TRACE_ID), which is the real 32-char hex trace ID:

// actions/setup/js/aw_context.cjs (line 151)
otel_trace_id: process.env.GITHUB_AW_OTEL_TRACE_ID || "",
// → produces values like "a3f2c8d1e4b7091f6a5c2e3d8f401b72"
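
To make the difference between the two fields concrete, here is a minimal sketch of a format check. It assumes the usual OpenTelemetry convention that a trace ID is 16 bytes rendered as 32 hex characters and that an all-zero ID is invalid. The helper name isOtlpTraceId is hypothetical and does not exist in the repository.

```javascript
// Hypothetical helper (not part of the repository): checks whether a value
// has the shape of an OTLP trace ID, i.e. 32 hex characters, not all zeros.
const TRACE_ID_PATTERN = /^[0-9a-f]{32}$/i;
const ALL_ZERO_TRACE_ID = "0".repeat(32);

function isOtlpTraceId(value) {
  return (
    typeof value === "string" &&
    TRACE_ID_PATTERN.test(value) &&
    value !== ALL_ZERO_TRACE_ID
  );
}

// The example values are the ones quoted in this issue.
console.log(isOtlpTraceId("a3f2c8d1e4b7091f6a5c2e3d8f401b72")); // true
console.log(isOtlpTraceId("12345678901-1")); // false
```

A value shaped like workflow_call_id can never pass this check, which is why pasting it into a tracing backend returns nothing.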

Proposed Change

Prefer otel_trace_id over workflow_call_id, falling back to workflow_call_id only when the OTLP trace ID is not available:

// Proposed: actions/setup/js/generate_observability_summary.cjs (line 63)
const traceId = awInfo.context
  ? (awInfo.context.otel_trace_id || awInfo.context.workflow_call_id || "")
  : "";
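
One thing to weigh: the current code checks typeof ... === "string", and the one-line proposal above drops that guard. If keeping the guard is desirable, the same preference order could be sketched as below. The function name resolveTraceId is an assumption for illustration only, and how it would be wired into collectObservabilityData() is not shown here.

```javascript
// Hypothetical helper (name and shape are illustrative, not repository code).
// Returns the first non-empty string among otel_trace_id and
// workflow_call_id, in that order, or "" when neither is usable.
function resolveTraceId(awInfo) {
  const context = awInfo && awInfo.context;
  if (!context || typeof context !== "object") {
    return "";
  }
  for (const key of ["otel_trace_id", "workflow_call_id"]) {
    const value = context[key];
    if (typeof value === "string" && value !== "") {
      return value;
    }
  }
  return "";
}

// Both fields present: the OTLP trace ID wins.
console.log(
  resolveTraceId({
    context: {
      otel_trace_id: "a3f2c8d1e4b7091f6a5c2e3d8f401b72",
      workflow_call_id: "12345678901-1",
    },
  })
);

// OTLP trace ID empty (for example, GITHUB_AW_OTEL_TRACE_ID unset): fall back.
console.log(
  resolveTraceId({ context: { otel_trace_id: "", workflow_call_id: "12345678901-1" } })
);
```

Compared with the one-liner, this variant also ignores non-string values, so a malformed context cannot leak a number or object into the summary.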

Expected Outcome

After this change:

  • In Grafana / Honeycomb / Datadog / Sentry: Engineers can copy the trace ID directly from the job summary and look up the full trace end-to-end with no manual translation.
  • In the JSONL mirror: No change; the mirror already stores the correct OTLP trace ID inside the payload.
  • For on-call engineers: Trace correlation goes from "impossible without digging through logs" to "one copy-paste."

Implementation Steps

  • Update collectObservabilityData() in actions/setup/js/generate_observability_summary.cjs to prefer awInfo.context.otel_trace_id over awInfo.context.workflow_call_id
  • Update actions/setup/js/generate_observability_summary.test.cjs to add a test case where otel_trace_id is present and assert it is shown instead of workflow_call_id; also update the existing test to include otel_trace_id in the context fixture and assert the summary shows it
  • Run make test-unit (or cd actions/setup/js && npx vitest run) to confirm tests pass
  • Run make fmt to ensure formatting
  • Open a PR referencing this issue

Evidence from Live Sentry Data

Sentry MCP tools were not available in this analysis run, so live span data could not be sampled. The gap is confirmed by static code analysis:

  • generate_observability_summary.cjs:63 reads workflow_call_id
  • aw_context.cjs:151 shows otel_trace_id is a separate, correctly-formatted 32-char hex field on the same context object
  • The existing test (generate_observability_summary.test.cjs:42) uses { workflow_call_id: "trace-123" } with no otel_trace_id, which means the wrong field has been silently baked into the test coverage as well

Related Files

  • actions/setup/js/generate_observability_summary.cjs — contains the bug (line 63)
  • actions/setup/js/generate_observability_summary.test.cjs — test needs updating
  • actions/setup/js/aw_context.cjs — defines both workflow_call_id and otel_trace_id
  • actions/setup/js/send_otlp_span.cjs — generates and exports the real OTLP trace ID
  • actions/setup/js/action_setup_otlp.cjs — writes the trace ID to GITHUB_AW_OTEL_TRACE_ID

Generated by the Daily OTel Instrumentation Advisor workflow

  • expires on Apr 12, 2026, 4:32 AM UTC
