[rollout_trace] Trace sessions and multi-agent edges#18879
[rollout_trace] Trace sessions and multi-agent edges#18879cassirer-openai merged 3 commits intomainfrom
Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0689ef66b5
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
d22b579 to
47d822c
Compare
0689ef6 to
e3bfbd5
Compare
## Summary Adds the standalone `codex-rollout-trace` crate, which defines the raw trace event format, replay/reduction model, writer, and reducer logic for reconstructing model-visible conversation/runtime state from recorded rollout data. The crate-level design is documented in [`codex-rs/rollout-trace/README.md`](https://github.com/openai/codex/blob/codex/rollout-trace-crate/codex-rs/rollout-trace/README.md). ## Stack This is PR 1/5 in the rollout trace stack. - [#18876](#18876): Add rollout trace crate - [#18877](#18877): Record core session rollout traces - [#18878](#18878): Trace tool and code-mode boundaries - [#18879](#18879): Trace sessions and multi-agent edges - [#18880](#18880): Add debug trace reduction command ## Review Notes This PR intentionally does not wire tracing into live Codex execution. It establishes the data model and reducer contract first, with crate-local tests covering conversation reconstruction, compaction boundaries, tool/session edges, and code-cell lifecycle reduction. Later PRs emit into this model. The README is the best entry point for reviewing the intended trace format and reduction semantics before diving into the reducer modules.
## Summary Wires rollout trace recording into `codex-core` session and turn execution. This records the core model request/response, compaction, and session lifecycle boundaries needed for replay without yet tracing every nested runtime/tool boundary. ## Stack This is PR 2/5 in the rollout trace stack. - [#18876](#18876): Add rollout trace crate - [#18877](#18877): Record core session rollout traces - [#18878](#18878): Trace tool and code-mode boundaries - [#18879](#18879): Trace sessions and multi-agent edges - [#18880](#18880): Add debug trace reduction command ## Review Notes This layer is the first live integration point. The important review question is whether trace recording is isolated from normal session behavior: trace failures should not become user-visible execution failures, and recording should preserve the existing turn/session lifecycle semantics. The PR depends on the reducer/data model from the first stack entry and only introduces the core recorder surface that later PRs use for richer runtime and relationship events.
47d822c to
6a5ab49
Compare
## Summary Adds the standalone `codex-rollout-trace` crate, which defines the raw trace event format, replay/reduction model, writer, and reducer logic for reconstructing model-visible conversation/runtime state from recorded rollout data. The crate-level design is documented in [`codex-rs/rollout-trace/README.md`](https://github.com/openai/codex/blob/codex/rollout-trace-crate/codex-rs/rollout-trace/README.md). ## Stack This is PR 1/5 in the rollout trace stack. - [openai#18876](openai#18876): Add rollout trace crate - [openai#18877](openai#18877): Record core session rollout traces - [openai#18878](openai#18878): Trace tool and code-mode boundaries - [openai#18879](openai#18879): Trace sessions and multi-agent edges - [openai#18880](openai#18880): Add debug trace reduction command ## Review Notes This PR intentionally does not wire tracing into live Codex execution. It establishes the data model and reducer contract first, with crate-local tests covering conversation reconstruction, compaction boundaries, tool/session edges, and code-cell lifecycle reduction. Later PRs emit into this model. The README is the best entry point for reviewing the intended trace format and reduction semantics before diving into the reducer modules.
## Summary Wires rollout trace recording into `codex-core` session and turn execution. This records the core model request/response, compaction, and session lifecycle boundaries needed for replay without yet tracing every nested runtime/tool boundary. ## Stack This is PR 2/5 in the rollout trace stack. - [openai#18876](openai#18876): Add rollout trace crate - [openai#18877](openai#18877): Record core session rollout traces - [openai#18878](openai#18878): Trace tool and code-mode boundaries - [openai#18879](openai#18879): Trace sessions and multi-agent edges - [openai#18880](openai#18880): Add debug trace reduction command ## Review Notes This layer is the first live integration point. The important review question is whether trace recording is isolated from normal session behavior: trace failures should not become user-visible execution failures, and recording should preserve the existing turn/session lifecycle semantics. The PR depends on the reducer/data model from the first stack entry and only introduces the core recorder surface that later PRs use for richer runtime and relationship events.
f1de340 to
3aaacfc
Compare
e3bfbd5 to
5e2a4ef
Compare
## Summary Extends rollout tracing across tool dispatch and code-mode runtime boundaries. This records canonical tool-call lifecycle events and links code-mode execution/wait operations back to the model-visible calls that caused them. ## Stack This is PR 3/5 in the rollout trace stack. - [#18876](#18876): Add rollout trace crate - [#18877](#18877): Record core session rollout traces - [#18878](#18878): Trace tool and code-mode boundaries - [#18879](#18879): Trace sessions and multi-agent edges - [#18880](#18880): Add debug trace reduction command ## Review Notes This PR is about attribution. Reviewers should focus on whether direct tool calls, code-mode-originated tool calls, waits, outputs, and cancellation boundaries are recorded with enough source information for deterministic reduction without coupling the reducer to live runtime internals. The stack remains valid after this layer: tool and code-mode traces reduce through the existing crate model, while the broader session and multi-agent relationships are added in the next PR.
5e2a4ef to
b98bcf9
Compare
b98bcf9 to
0dd8afa
Compare
|
@codex Review. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0dd8afa04f
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| let _ = self.thread_created_tx.send(thread_id); | ||
| } | ||
|
|
||
| async fn parent_rollout_thread_trace_for_source( |
There was a problem hiding this comment.
Not for @jif-oai : simplify with the unification of agent ID
## Summary Adds the debug CLI entry point for reducing recorded rollout traces. This gives developers a direct way to inspect whether the emitted trace stream reduces into the expected conversation/runtime model. ## Stack This is PR 5/5 in the rollout trace stack. - [#18876](#18876): Add rollout trace crate - [#18877](#18877): Record core session rollout traces - [#18878](#18878): Trace tool and code-mode boundaries - [#18879](#18879): Trace sessions and multi-agent edges - [#18880](#18880): Add debug trace reduction command ## Review Notes This PR is intentionally last: it depends on the trace crate, core recorder, runtime/tool events, and session/agent edge data all existing. The command should remain a debug/developer tool and avoid adding new runtime behavior. The useful review question is whether the CLI exposes the reducer in the smallest practical way for local inspection without turning the debug command into a supported user-facing workflow.
Summary
Adds the remaining session and multi-agent edge wiring needed to reconstruct rollout relationships across spawned agents, resumed sessions, and parent/child message delivery.
Stack
This is PR 4/5 in the rollout trace stack.
Review Notes
This is the stack layer that makes traces useful for multi-threaded agent workflows. The main invariant is that reconstructed relationships should come from durable rollout data rather than transient in-memory manager state wherever possible.
The PR is intentionally small relative to the preceding layers: it uses the recorder and reducer contracts already established by the stack and only adds the session/agent relationship events needed by later debug reduction.