[rollout_trace] Trace tool and code-mode boundaries by cassirer-openai · Pull Request #18878 · openai/codex

cassirer-openai · 2026-04-21T20:16:46Z

Summary

Extends rollout tracing across tool dispatch and code-mode runtime boundaries. This records canonical tool-call lifecycle events and links code-mode execution/wait operations back to the model-visible calls that caused them.

Stack

This is PR 3/5 in the rollout trace stack.

#18876: Add rollout trace crate
#18877: Record core session rollout traces
#18878: Trace tool and code-mode boundaries
#18879: Trace sessions and multi-agent edges
#18880: Add debug trace reduction command

Review Notes

This PR is about attribution. Reviewers should focus on whether direct tool calls, code-mode-originated tool calls, waits, outputs, and cancellation boundaries are recorded with enough source information for deterministic reduction without coupling the reducer to live runtime internals.

The stack remains valid after this layer: tool and code-mode traces reduce through the existing crate model, while the broader session and multi-agent relationships are added in the next PR.

jif-oai

Same comments as for the previous PR

jif-oai · 2026-04-21T21:36:08Z

 }

 #[derive(Debug, PartialEq)]
+pub enum WaitResponse {


I don't have enough context to review but it looks sus to add this in code-mode

There is an explicit tool call wait where you wait on a code cell to complete/write outputs. It is important that we capture this one so that we can track which code cell the main loop is being blocked on and what it yielded back to the model.
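To make the attribution concern concrete, here is a minimal sketch of what recording a wait could look like. The PR's actual `WaitResponse` variants are not shown in this thread, so `WaitOutcome`, `WaitTraceEvent`, `record_wait`, and `blocking_cell` are all hypothetical names used only for illustration.

```rust
/// Hypothetical outcome of an explicit wait on a code cell.
/// (Assumption: the real `WaitResponse` enum may have different variants.)
#[derive(Debug, PartialEq)]
pub enum WaitOutcome {
    /// The cell wrote output and handed control back while still running.
    Yielded { output: String },
    /// The cell ran to completion.
    Completed { output: String },
    /// The wait ended without the cell producing anything new.
    TimedOut,
}

/// Hypothetical trace record tying a wait to the model-visible call that
/// caused it and to the code cell the main loop was blocked on.
#[derive(Debug, PartialEq)]
pub struct WaitTraceEvent {
    pub model_call_id: String,
    pub cell_id: String,
    pub outcome: WaitOutcome,
}

/// Append one wait event to an in-memory trace.
pub fn record_wait(
    events: &mut Vec<WaitTraceEvent>,
    model_call_id: &str,
    cell_id: &str,
    outcome: WaitOutcome,
) {
    events.push(WaitTraceEvent {
        model_call_id: model_call_id.to_owned(),
        cell_id: cell_id.to_owned(),
        outcome,
    });
}

/// Answer "which cell was this model-visible call blocked on?" from the
/// recorded events alone, without consulting any live runtime state.
pub fn blocking_cell<'a>(events: &'a [WaitTraceEvent], model_call_id: &str) -> Option<&'a str> {
    events
        .iter()
        .find(|event| event.model_call_id == model_call_id)
        .map(|event| event.cell_id.as_str())
}
```

Because each event carries both ids and the outcome, a reducer working only from the recorded data can reconstruct which cell blocked the loop and what was yielded back to the model.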

jif-oai · 2026-04-21T21:37:47Z

+/// affect the trace lifecycle. Keeping the trace eligibility and event writes
+/// behind this helper makes those paths say what happened instead of repeating
+/// the Direct/CodeMode/JsRepl/first-class-object policy at each branch.
+struct DispatchTrace {


this should move

It becomes very awkward to move this into the tracing module since it would either introduce heavy dependencies on ToolInvocation and AnyToolHandler or force us to define awkward generic types inside of the module. All the heavy lifting is already handled by the RolloutTraceRecorder. This class only does the minimal wiring.

I'm going to move as much as I can.

I mean this should move in a dedicated file. This has nothing to do with the registry

And ideally most of them become some impl From<>...

This is addressed in spirit by moving the dispatch adapter out of registry.rs; the remaining conversions are now isolated in tool_dispatch_trace.rs. I don’t think impl From is clearly better for all of them: the main invocation conversion is intentionally optional because JsRepl should not create a dispatch trace, and the result conversion depends on both source and output formatting. ToolPayload -> ToolDispatchPayload is the one plausible From candidate, but I’d prefer not to add that unless we see it materially improves the clone/ownership cleanup.
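A rough sketch of the trade-off described above, using simplified stand-in types. The variant sets, `ToolCallSource`, `DispatchTraceInput`, and `dispatch_trace_input` are assumptions for illustration and are not the definitions in `tool_dispatch_trace.rs`. The payload conversion is total, so `From` fits; the invocation conversion can legitimately produce nothing, so it stays a function returning `Option`.

```rust
/// Simplified stand-in for the handler-facing payload (assumed shape).
#[derive(Debug, Clone, PartialEq)]
pub enum ToolPayload {
    Function { arguments: String },
    Custom { input: String },
}

/// Simplified stand-in for the trace-facing payload (assumed shape).
#[derive(Debug, PartialEq)]
pub enum ToolDispatchPayload {
    Function { arguments: String },
    Custom { input: String },
}

/// Total conversion: every payload has a trace representation, and taking
/// the payload by value moves the strings instead of cloning them.
impl From<ToolPayload> for ToolDispatchPayload {
    fn from(payload: ToolPayload) -> Self {
        match payload {
            ToolPayload::Function { arguments } => ToolDispatchPayload::Function { arguments },
            ToolPayload::Custom { input } => ToolDispatchPayload::Custom { input },
        }
    }
}

/// Where a tool call originated (hypothetical enum built from the names
/// mentioned in the doc comment under review).
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ToolCallSource {
    Direct,
    CodeMode,
    JsRepl,
}

/// Hypothetical input handed to the trace recorder.
#[derive(Debug, PartialEq)]
pub struct DispatchTraceInput {
    pub source: ToolCallSource,
    pub payload: ToolDispatchPayload,
}

/// Partial conversion: JsRepl calls intentionally produce no dispatch
/// trace, so this is an `Option`-returning function rather than `From`.
pub fn dispatch_trace_input(
    source: ToolCallSource,
    payload: ToolPayload,
) -> Option<DispatchTraceInput> {
    match source {
        ToolCallSource::JsRepl => None,
        ToolCallSource::Direct | ToolCallSource::CodeMode => Some(DispatchTraceInput {
            source,
            payload: payload.into(),
        }),
    }
}
```

`TryFrom` would be another way to express the partial case, but a skipped JsRepl call is a policy decision rather than an error, which is one argument for keeping `Option`.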

## Summary

Adds the standalone `codex-rollout-trace` crate, which defines the raw trace event format, replay/reduction model, writer, and reducer logic for reconstructing model-visible conversation/runtime state from recorded rollout data.

The crate-level design is documented in [`codex-rs/rollout-trace/README.md`](https://github.com/openai/codex/blob/codex/rollout-trace-crate/codex-rs/rollout-trace/README.md).

## Stack

This is PR 1/5 in the rollout trace stack.

- [#18876](#18876): Add rollout trace crate
- [#18877](#18877): Record core session rollout traces
- [#18878](#18878): Trace tool and code-mode boundaries
- [#18879](#18879): Trace sessions and multi-agent edges
- [#18880](#18880): Add debug trace reduction command

## Review Notes

This PR intentionally does not wire tracing into live Codex execution. It establishes the data model and reducer contract first, with crate-local tests covering conversation reconstruction, compaction boundaries, tool/session edges, and code-cell lifecycle reduction. Later PRs emit into this model.

The README is the best entry point for reviewing the intended trace format and reduction semantics before diving into the reducer modules.

## Summary

Wires rollout trace recording into `codex-core` session and turn execution. This records the core model request/response, compaction, and session lifecycle boundaries needed for replay without yet tracing every nested runtime/tool boundary.

## Stack

This is PR 2/5 in the rollout trace stack.

- [#18876](#18876): Add rollout trace crate
- [#18877](#18877): Record core session rollout traces
- [#18878](#18878): Trace tool and code-mode boundaries
- [#18879](#18879): Trace sessions and multi-agent edges
- [#18880](#18880): Add debug trace reduction command

## Review Notes

This layer is the first live integration point. The important review question is whether trace recording is isolated from normal session behavior: trace failures should not become user-visible execution failures, and recording should preserve the existing turn/session lifecycle semantics.

The PR depends on the reducer/data model from the first stack entry and only introduces the core recorder surface that later PRs use for richer runtime and relationship events.

jif-oai · 2026-04-22T19:03:19Z

+/// Code mode owns the per-cell runtime id. Hosts should preserve it for
+/// provenance/debugging, but should still assign their own runtime tool call id
+/// if their tool-call graph requires globally unique ids.
+pub struct CodeModeToolInvocation {


Just to be sure, this does not exist anywhere else? This looks a bit like a sub-structure that should already be available

It didn't really exist in a form that we could import, but I've reorganized the code a little bit so we don't have to repeat it like this.
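The doc comment under review distinguishes the per-cell runtime id owned by code mode from a host-assigned id that is unique across the whole tool-call graph. A minimal sketch of that split follows; the field names, `HostToolCall`, and `HostIdAllocator` are hypothetical, since the real fields of `CodeModeToolInvocation` are not shown in this thread.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Stand-in for the struct under review (field names are assumptions).
pub struct CodeModeToolInvocation {
    pub cell_id: String,
    /// Owned by code mode; only guaranteed unique within one cell.
    pub runtime_tool_call_id: String,
    pub tool_name: String,
}

/// Hypothetical host-side record: a fresh globally unique id, plus the
/// code-mode ids preserved for provenance and debugging.
pub struct HostToolCall {
    pub host_call_id: String,
    pub code_mode_cell_id: String,
    pub code_mode_runtime_id: String,
    pub tool_name: String,
}

/// Hypothetical allocator handing out process-wide unique call ids.
pub struct HostIdAllocator {
    next: AtomicU64,
}

impl HostIdAllocator {
    pub fn new() -> Self {
        Self { next: AtomicU64::new(0) }
    }

    /// Assign a host id while keeping the code-mode ids untouched.
    pub fn adopt(&self, invocation: CodeModeToolInvocation) -> HostToolCall {
        let n = self.next.fetch_add(1, Ordering::Relaxed);
        HostToolCall {
            host_call_id: format!("host-call-{n}"),
            code_mode_cell_id: invocation.cell_id,
            code_mode_runtime_id: invocation.runtime_tool_call_id,
            tool_name: invocation.tool_name,
        }
    }
}
```

Two cells can both report a runtime id such as `tool-0`; the host ids still differ, so the tool-call graph stays unambiguous while the original ids remain available for debugging.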

jif-oai · 2026-04-22T19:31:24Z

+impl ToolDispatchPayload {
+    fn log_payload(&self) -> String {
+        match self {
+            ToolDispatchPayload::Function { arguments } => arguments.clone(),


Side comment, but the code contains tons of clones. For this kind of high-throughput feature, this can have an impact on latency

Yeah you are right. I've taken a pass over the code and cleaned/avoided all clones I could and also changed the API/implementation slightly so that the construction of larger objects (mainly raw requests) never happens when the tracer is disabled.
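One common way to get the "never constructed when disabled" property is to pass the recorder a closure that builds the payload, rather than the payload itself. The sketch below illustrates that pattern only; `TraceRecorder` and `record_with` are hypothetical and are not the `RolloutTraceRecorder` API.

```rust
/// Hypothetical recorder: `None` means tracing is disabled.
pub struct TraceRecorder {
    sink: Option<Vec<String>>,
}

impl TraceRecorder {
    pub fn disabled() -> Self {
        Self { sink: None }
    }

    pub fn enabled() -> Self {
        Self { sink: Some(Vec::new()) }
    }

    /// Record an event whose payload is built lazily. When tracing is
    /// disabled the closure is never called, so no large object (for
    /// example a serialized raw request) is allocated or cloned.
    pub fn record_with<F>(&mut self, build: F)
    where
        F: FnOnce() -> String,
    {
        if let Some(sink) = self.sink.as_mut() {
            sink.push(build());
        }
    }

    /// Events recorded so far (empty when tracing is disabled).
    pub fn events(&self) -> &[String] {
        self.sink.as_deref().unwrap_or(&[])
    }
}
```

A call site would then look like `recorder.record_with(|| expensive_serialization(&request))`, where `expensive_serialization` is a placeholder: the closure borrows the request and pays the serialization cost only when a sink exists.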


This was referenced Apr 21, 2026

[rollout_trace] Record core session rollout traces #18877

Merged

[rollout_trace] Trace sessions and multi-agent edges #18879

Merged

[rollout_trace] Add rollout trace crate #18876

Merged

[rollout_trace] Add debug trace reduction command #18880

Merged

cassirer-openai marked this pull request as ready for review April 21, 2026 20:29

cassirer-openai requested a review from a team as a code owner April 21, 2026 20:29

cassirer-openai assigned jif-oai Apr 21, 2026

cassirer-openai force-pushed the codex/rollout-trace-core-recorder branch from 3844251 to 582bf74 Compare April 21, 2026 21:25

cassirer-openai force-pushed the codex/rollout-trace-tool-code-mode branch from d22b579 to 47d822c Compare April 21, 2026 21:25

jif-oai reviewed Apr 21, 2026

cassirer-openai force-pushed the codex/rollout-trace-core-recorder branch from 582bf74 to 899bb99 Compare April 21, 2026 22:40

Base automatically changed from codex/rollout-trace-core-recorder to main April 22, 2026 17:00

cassirer-openai force-pushed the codex/rollout-trace-tool-code-mode branch from 47d822c to 6a5ab49 Compare April 22, 2026 18:27

cassirer-openai enabled auto-merge (squash) April 22, 2026 19:27

cassirer-openai disabled auto-merge April 22, 2026 19:27

jif-oai reviewed Apr 22, 2026

jif-oai approved these changes Apr 23, 2026

Comment thread codex-rs/core/src/tools/registry_tests.rs Outdated

cassirer-openai force-pushed the codex/rollout-trace-tool-code-mode branch from b131784 to f1de340 Compare April 23, 2026 16:29

Trace tool and code-mode runtime boundaries

3aaacfc

cassirer-openai force-pushed the codex/rollout-trace-tool-code-mode branch from f1de340 to 3aaacfc Compare April 23, 2026 16:53

cassirer-openai merged commit 6d09b67 into main Apr 23, 2026
25 checks passed

cassirer-openai deleted the codex/rollout-trace-tool-code-mode branch April 23, 2026 19:22

github-actions Bot locked and limited conversation to collaborators Apr 23, 2026
