[pull] main from openai:main by pull[bot] · Pull Request #58 · kontext-dev/codex

pull · 2026-03-12T00:25:27Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)


Addresses #13586
This doesn't affect our CI scripts. It was user-reported.
Summary
- add `wiremock::ResponseTemplate` and `body_string_contains` imports behind `#[cfg(not(debug_assertions))]` in `codex-rs/core/tests/suite/view_image.rs` so release builds only pull the helpers they actually use

Switch the Unix shell lookup path in `codex-rs/core/src/shell.rs` from `libc::getpwuid()` to `libc::getpwuid_r()` when resolving the current user's shell.
Why:
- `getpwuid()` can return pointers into libc-managed shared storage
- on the musl static Linux build, concurrent callers can race on that storage
- this matches the crash pattern reported in tmux/Linux sessions with parallel shell activity
Refs:
- Fixes #13842

There are some bug investigations that currently require us to ask users for their user ID even though they've already uploaded logs and session details via `/feedback`. This frustrates users and increases the time for diagnosis. This PR includes the ChatGPT user ID in the metadata uploaded for `/feedback` (both the TUI and app-server).

Summary
- document output types for the various tool handlers and registry so the API exposes richer descriptions
- update unified execution helpers and client tests to align with the new output metadata
- clean up unused helpers across tool dispatch paths
Testing
- Not run (not requested)

## Summary
- run the split stdout/stderr PTY test through the normal shell helper on every platform
- use a Windows-native command string instead of depending on Python to emit split streams
- assert CRLF line endings on Windows explicitly
## Why this fixes the flake
The earlier PTY split-output test used a Python one-liner on Windows while the rest of the file exercised shell-command behavior. That made the test depend on runner-local Python availability and masked the real Windows shell output shape. Using a native cmd-compatible command and asserting the actual CRLF output makes the split stdout/stderr coverage deterministic on Windows runners.

We already have a type to represent the MCP tool output, reuse it instead of the custom McpHandlerOutput

Fixes a Codex app bug where quitting the app mid-run could leave the reopened thread stuck in progress and non-interactable. On cold thread resume, app-server could return an idle thread with a replayed turn still marked in progress. This marks incomplete replayed turns as interrupted unless the thread is actually active.
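The resume fix described above can be pictured with a minimal sketch. The `Turn` type, the status strings, and `normalize_replayed_turns` are hypothetical names chosen for illustration; the actual app-server code is Rust and may be structured differently.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Turn:
    # Hypothetical shape: a turn id plus a status string.
    turn_id: str
    status: str  # "completed", "in_progress", or "interrupted"


def normalize_replayed_turns(turns: List[Turn], thread_is_active: bool) -> List[Turn]:
    """Mark replayed in-progress turns as interrupted unless the thread is active."""
    if thread_is_active:
        # A live thread may really have a turn in flight, so leave it alone.
        return list(turns)
    return [
        replace(turn, status="interrupted") if turn.status == "in_progress" else turn
        for turn in turns
    ]
```

With an idle thread, a turn replayed as `in_progress` comes back as `interrupted`, so the client has no reason to treat the thread as busy.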

This updates the `skill-creator` sample skill to explicitly cover forward-testing as part of the skill authoring workflow. The guidance now treats subagent-based validation as a first-class step for complex or fragile skills, with an emphasis on preserving evaluation integrity and avoiding leaked context. The sample initialization script is also updated so newly created skills point authors toward forward-testing after validation. Together, these changes make the sample more opinionated about how skills should be iterated on once the initial implementation is complete. - Add new guidance to `SKILL.md` on protecting validation integrity, when to use subagents for forward-testing, and how to structure realistic test prompts without leaking expected answers. - Expand the skill creation workflow so iteration explicitly includes forward-testing for complex skills, including approval guidance for expensive or risky validation runs.

## Summary - add the OpenAI Docs skill under codex-rs/skills/src/assets/samples/openai-docs - include the skill metadata, assets, and GPT-5.4 upgrade reference files - exclude the test harness and test fixtures ## Testing - not run (skill-only asset copy)

## Summary - add ARC monitor support for MCP tool calls by serializing MCP approval requests into the ARC action shape and sending the relevant conversation/policy context to the `/api/codex/safety/arc` endpoint - route ARC outcomes back into MCP approval flow so `ask-user` falls back to a user prompt and `steer-model` blocks the tool call, with guardian/ARC tests covering the new request shape - update the TUI approval copy from “Approve Once” to “Allow” / “Allow for this session” and refresh the related snapshots --------- Co-authored-by: Fouad Matin <fouad@openai.com> Co-authored-by: Fouad Matin <169186268+fouad-openai@users.noreply.github.com>

Add `forceRemoteSync` to `plugin/list`. When it is set to true, the local plugin status is synced with the remote one (`backend-api/plugins/list`).

- add `model` and `reasoning_effort` to the `spawn_agent` schema so the values pass through - validate requested models against `model.model` and only check that the selected model supports the requested reasoning effort --------- Co-authored-by: Codex <noreply@openai.com>

The image-gen feature now has the model save images to `/tmp` by default and at all times.

…Reject (#14191)
## Summary
This change makes `AskForApproval::Reject` gate correctly anywhere it appears inside otherwise-stable app-server protocol types.
Previously, experimental gating for `approval_policy: Reject` was handled with request-specific logic in `ClientRequest` detection. That covered a few request params types, but it did not generalize to other nested uses such as `ProfileV2`, `Config`, `ConfigReadResponse`, or `ConfigRequirements`.
This PR replaces that ad hoc handling with a generic nested experimental propagation mechanism.
## Testing
Seen when running `app-server-test-client` without the experimental API enabled:
```
initialize response: InitializeResponse { user_agent: "codex-toy-app-server/0.0.0 (Mac OS 26.3.1; arm64) vscode/2.4.36 (codex-toy-app-server; 0.0.0)" }
> {
>   "id": "50244f6a-270a-425d-ace0-e9e98205bde7",
>   "method": "thread/start",
>   "params": {
>     "approvalPolicy": {
>       "reject": {
>         "mcp_elicitations": false,
>         "request_permissions": true,
>         "rules": false,
>         "sandbox_approval": true
>       }
>     },
>     "baseInstructions": null,
>     "config": null,
>     "cwd": null,
>     "developerInstructions": null,
>     "dynamicTools": null,
>     "ephemeral": null,
>     "experimentalRawEvents": false,
>     "mockExperimentalField": null,
>     "model": null,
>     "modelProvider": null,
>     "persistExtendedHistory": false,
>     "personality": null,
>     "sandbox": null,
>     "serviceName": null
>   }
> }
< {
<   "error": {
<     "code": -32600,
<     "message": "askForApproval.reject requires experimentalApi capability"
<   },
<   "id": "50244f6a-270a-425d-ace0-e9e98205bde7"
< }
[verified] thread/start rejected approvalPolicy=Reject without experimentalApi
```
---------
Co-authored-by: celia-oai <celia@openai.com>
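As a rough illustration of nested propagation, the sketch below walks an arbitrary JSON payload and reports an experimental variant wherever it is nested. The rule table `EXPERIMENTAL_VARIANTS` and both function names are assumptions made for this example; only the error code and message text are taken from the log above, and the real mechanism operates on Rust protocol types rather than raw JSON.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical rule table: (field name, variant key) -> capability label.
EXPERIMENTAL_VARIANTS: Dict[Tuple[str, str], str] = {
    ("approvalPolicy", "reject"): "askForApproval.reject",
}


def find_experimental_use(value: Any) -> Optional[str]:
    """Return the label of the first experimental variant found at any depth."""
    if isinstance(value, dict):
        for key, child in value.items():
            if isinstance(child, dict):
                for variant in child:
                    reason = EXPERIMENTAL_VARIANTS.get((key, variant))
                    if reason is not None:
                        return reason
            found = find_experimental_use(child)
            if found is not None:
                return found
    elif isinstance(value, list):
        for item in value:
            found = find_experimental_use(item)
            if found is not None:
                return found
    return None


def check_request(params: Any, experimental_api_enabled: bool) -> Optional[dict]:
    """Return a JSON-RPC style error when a gated variant is used without the capability."""
    reason = find_experimental_use(params)
    if reason is not None and not experimental_api_enabled:
        return {"code": -32600, "message": f"{reason} requires experimentalApi capability"}
    return None
```

Because the walk is generic, a `reject` policy nested inside a config or profile object is caught the same way as one at the top level of `thread/start` params.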

…de (#14236)
Summary
- drop `McpToolOutput` in favor of `CallToolResult`, moving its helpers to keep MCP tooling focused on the final result shape
- wire the new schema definitions through code mode, context, handlers, and spec modules so MCP tools serialize the exact output shape expected by the model
- extend code mode tests to cover multiple MCP call scenarios and ensure the serialized data matches the new schema
- refresh JS runner helpers and protocol models alongside the schema changes
Testing
- Not run (not requested)

Summary
- document that `@openai/code_mode` exposes `set_max_output_tokens_per_exec_call` and that `code_mode` truncates the final Rust-side output when the budget is exceeded
- enforce the configured budget in the Rust tool runner, reusing truncation helpers so text-only outputs follow the unified-exec wrapper and mixed outputs still fit within the limit
- ensure the new behavior is covered by a code-mode integration test and string spec update
Testing
- Not run (not requested)

- raise the sdk workflow job timeout from 10 to 15 minutes to reduce false cancellations near the current limit --------- Co-authored-by: Codex <noreply@openai.com>

- clarify the `close_agent` tool description so it nudges models to close agents they no longer need - keep the change scoped to the tool spec text only Co-authored-by: Codex <noreply@openai.com>

Summary
- document how code-mode can import `output_text`/`output_image` and ensure `add_content` stays compatible
- add a synthetic `@openai/code_mode` module that appends content items and validates inputs
- cover the new behavior with integration tests for structured text and image outputs
Testing
- Not run (not requested)

### Summary This PR adds first-class ephemeral support to thread/fork, bringing it in line with thread/start. The goal is to support one-off completions on full forked threads without persisting them as normal user-visible threads. ### Testing

…ontacts, Reminders (#14155) Add additional macOS Sandbox Permissions levers for the following: - Launch Services - Contacts - Reminders

Pass more params to /compact. This should give us parity with the /responses endpoint to improve caching. I'm torn about the MCP await. Blocking will give us parity but it seems like we explicitly don't block on MCPs. Happy either way

adds support for transferring state across code mode invocations.

## Summary
- add `skill_approval` to `RejectConfig` and the app-server v2 `AskForApproval::Reject` payload so skill-script prompts can be configured independently from sandbox and rule-based prompts
- update Unix shell escalation to reject prompts based on the actual decision source, keeping prefix rules tied to `rules`, unmatched command fallbacks tied to `sandbox_approval`, and skill scripts tied to `skill_approval`
- regenerate the affected protocol/config schemas and expand unit/integration coverage for the new flag and skill approval behavior
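A minimal sketch of the source-to-flag mapping described above. The field names `rules`, `sandbox_approval`, and `skill_approval` come from the summary; the `DecisionSource` enum, its members, and `should_reject_prompt` are hypothetical names, and the real logic is implemented in Rust.

```python
from dataclasses import dataclass
from enum import Enum


class DecisionSource(Enum):
    # Hypothetical labels for where an approval prompt originated.
    PREFIX_RULE = "prefix_rule"
    UNMATCHED_COMMAND_FALLBACK = "unmatched_command_fallback"
    SKILL_SCRIPT = "skill_script"


@dataclass(frozen=True)
class RejectConfig:
    # Each flag, when true, auto-rejects prompts from the matching source.
    rules: bool = False
    sandbox_approval: bool = False
    skill_approval: bool = False


def should_reject_prompt(config: RejectConfig, source: DecisionSource) -> bool:
    """Pick the flag that corresponds to the actual decision source."""
    flag_by_source = {
        DecisionSource.PREFIX_RULE: config.rules,
        DecisionSource.UNMATCHED_COMMAND_FALLBACK: config.sandbox_approval,
        DecisionSource.SKILL_SCRIPT: config.skill_approval,
    }
    return flag_by_source[source]
```

Keeping one flag per source means enabling `skill_approval` does not change how rule-based or sandbox fallback prompts are handled.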

## What changed
- keep the explicit stdin-close behavior after writing so the child still receives EOF deterministically
- on Windows, stop using `python -c` for the round-trip assertion and instead run a native `cmd.exe` pipeline that reads one line from stdin with `set /p` and echoes it back
- send `\r\n` on Windows so the stdin payload matches the platform-native line ending the shell reader expects
## Why this fixes flakiness
The failing branch-local flake was not in `spawn_pipe_process` itself. The child exited cleanly, but the Windows ARM runner sometimes produced an empty stdout string when the test used Python as the stdin consumer. That makes the test sensitive to Python startup and stdin-close timing rather than the pipe primitive we actually want to validate.
Switching the Windows path to a native `cmd.exe` reader keeps the assertion focused on our pipe behavior: bytes written to stdin should come back on stdout before EOF closes the process. The explicit `\r\n` write removes line-ending ambiguity on Windows.
## Scope
- test-only
- no production logic change

## Summary - update the guardian prompting - clarify the guardian rejection message so an action may still proceed if the user explicitly approves it after being informed of the risk ## Testing - cargo run on selected examples

Summary
- update the code-mode handler, runner, instructions, and error text to refer to the `exec` tool name everywhere that used to say `code_mode`
- ensure generated documentation strings and tool specs describe `exec` and rely on the shared `PUBLIC_TOOL_NAME`
- refresh the suite tests so they invoke `exec` instead of the old name
Testing
- Not run (not requested)

- include the requested sub-agent model and reasoning effort in the spawn begin event
- render that metadata next to the spawned agent name and role in the TUI transcript
---------
Co-authored-by: Codex <noreply@openai.com>

Adds an environment crate with environment and file system abstractions. An environment is the combination of attributes and services specific to the environment the agent is connected to: file system, process management, OS, and default shell. The goal is to move most of the agent logic that assumes a particular environment so that it works through the environment abstraction.

## Summary
- stop `codex sandbox` from forcing legacy `sandbox_mode` when active `[permissions]` profiles are configured
- keep the legacy `read-only` / `workspace-write` fallback for legacy configs and reject `--full-auto` for profile-based configs
- use split filesystem and network policies in the macOS/Linux debug sandbox helpers and add regressions for the config-loading behavior
assuming "codex/docs/private/secret.txt" = "none"
```
codex -c 'default_permissions="limited-read-test"' sandbox macos -- <command> ...

codex sandbox macos -- cat codex/docs/private/secret.txt >/dev/null; echo EXIT:$?
cat: codex/docs/private/secret.txt: Operation not permitted
EXIT:1
```
---------
Co-authored-by: celia-oai <celia@openai.com>

…14610) ## Summary - support legacy `ReadOnlyAccess::Restricted` on Windows in the elevated setup/runner backend - keep the unelevated restricted-token backend on the legacy full-read model only, and fail closed for restricted read-only policies there - keep the legacy full-read Windows path unchanged while deriving narrower read roots only for elevated restricted-read policies - honor `include_platform_defaults` by adding backend-managed Windows system roots only when requested, while always keeping helper roots and the command `cwd` readable - preserve `workspace-write` semantics by keeping writable roots readable when restricted read access is in use in the elevated backend - document the current Windows boundary: legacy `SandboxPolicy` is supported on both backends, while richer split-only carveouts still fail closed instead of running with weaker enforcement ## Testing - `cargo test -p codex-windows-sandbox` - `cargo check -p codex-windows-sandbox --tests --target x86_64-pc-windows-msvc` - `cargo clippy -p codex-windows-sandbox --tests --target x86_64-pc-windows-msvc -- -D warnings` - `cargo test -p codex-core windows_restricted_token_` ## Notes - local `cargo test -p codex-windows-sandbox` on macOS only exercises the non-Windows stubs; the Windows-targeted compile and clippy runs provide the local signal, and GitHub Windows CI exercises the runtime path

Remove all flags and model settings. --------- Co-authored-by: Codex <noreply@openai.com>

- close live realtime sessions on errors, ctrl-c, and active meter removal - centralize TUI realtime cleanup and avoid duplicate follow-up close info --------- Co-authored-by: Codex <noreply@openai.com> Co-authored-by: Ahmed Ibrahim <219906144+aibrahim-oai@users.noreply.github.com>

## Summary - move `guardian_developer_instructions` from managed config into workspace-managed `requirements.toml` - have guardian continue using the override when present and otherwise fall back to the bundled local guardian prompt - keep the generalized prompt-quality improvements in the shared guardian default prompt - update requirements parsing, layering, schema, and tests for the new source of truth ## Context This replaces the earlier managed-config / MDM rollout plan. The intended rollout path is workspace-managed requirements, including cloud enterprise policies, rather than backend model metadata, Statsig, or Jamf-managed config. That keeps the default/fallback behavior local to `codex-rs` while allowing faster policy updates through the enterprise requirements plane. This is intentionally an admin-managed policy input, not a user preference: the guardian prompt should come either from the bundled `codex-rs` default or from enterprise-managed `requirements.toml`, and normal user/project/session config should not override it. ## Updating The OpenAI Prompt After this lands, the OpenAI-specific guardian prompt should be updated through the workspace Policies UI at `/codex/settings/policies` rather than through Jamf or codex-backend model metadata. Operationally: - open the workspace Policies editor as a Codex admin - edit the default `requirements.toml` policy, or a higher-precedence group-scoped override if we ever want different behavior for a subset of users - set `guardian_developer_instructions = """..."""` to the full OpenAI-specific guardian prompt text - save the policy; codex-backend stores the raw TOML and `codex-rs` fetches the effective requirements file from `/wham/config/requirements` When updating the OpenAI-specific prompt, keep it aligned with the shared default guardian policy in `codex-rs` except for intentional OpenAI-only additions. 
## Testing - `cargo check --tests -p codex-core -p codex-config -p codex-cloud-requirements --message-format short` - `cargo run -p codex-core --bin codex-write-config-schema` - `cargo fmt` - `git diff --check` Co-authored-by: Codex <noreply@openai.com>

)
- this allows blocking the user's prompts from executing, and also prevents them from entering history
- handles the edge case where you can both prevent the user's prompt AND add n amount of additionalContexts
- refactors some old code into common.rs where hooks overlap functionality
- refactors additionalContext being previously added to user messages, instead we use developer messages for them
- handles queued messages correctly
Sample hook for testing - if you write "[block-user-submit]" this hook will stop the thread:
example run
```
› sup

• Running UserPromptSubmit hook: reading the observatory notes

UserPromptSubmit hook (completed)
  warning: wizard-tower UserPromptSubmit demo inspected: sup
  hook context: Wizard Tower UserPromptSubmit demo fired. For this reply only, include the exact phrase 'observatory lanterns lit' exactly once near the end.

• Just riding the cosmic wave and ready to help, my friend. What are we building today? observatory lanterns lit

› and [block-user-submit]

• Running UserPromptSubmit hook: reading the observatory notes

UserPromptSubmit hook (stopped)
  warning: wizard-tower UserPromptSubmit demo blocked the prompt on purpose.
  stop: Wizard Tower demo block: remove [block-user-submit] to continue.
```
.codex/config.toml
```
[features]
codex_hooks = true
```
.codex/hooks.json
```
{
  "hooks": {
    "UserPromptSubmit": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "/usr/bin/python3 .codex/hooks/user_prompt_submit_demo.py",
            "timeoutSec": 10,
            "statusMessage": "reading the observatory notes"
          }
        ]
      }
    ]
  }
}
```
.codex/hooks/user_prompt_submit_demo.py
```
#!/usr/bin/env python3

import json
import sys
from pathlib import Path


def prompt_from_payload(payload: dict) -> str:
    prompt = payload.get("prompt")
    if isinstance(prompt, str) and prompt.strip():
        return prompt.strip()

    event = payload.get("event")
    if isinstance(event, dict):
        user_prompt = event.get("user_prompt")
        if isinstance(user_prompt, str):
            return user_prompt.strip()

    return ""


def main() -> int:
    payload = json.load(sys.stdin)
    prompt = prompt_from_payload(payload)
    cwd = Path(payload.get("cwd", ".")).name or "wizard-tower"

    if "[block-user-submit]" in prompt:
        print(
            json.dumps(
                {
                    "systemMessage": (
                        f"{cwd} UserPromptSubmit demo blocked the prompt on purpose."
                    ),
                    "decision": "block",
                    "reason": (
                        "Wizard Tower demo block: remove [block-user-submit] to continue."
                    ),
                }
            )
        )
        return 0

    prompt_preview = prompt or "(empty prompt)"
    if len(prompt_preview) > 80:
        prompt_preview = f"{prompt_preview[:77]}..."

    print(
        json.dumps(
            {
                "systemMessage": (
                    f"{cwd} UserPromptSubmit demo inspected: {prompt_preview}"
                ),
                "hookSpecificOutput": {
                    "hookEventName": "UserPromptSubmit",
                    "additionalContext": (
                        "Wizard Tower UserPromptSubmit demo fired. "
                        "For this reply only, include the exact phrase "
                        "'observatory lanterns lit' exactly once near the end."
                    ),
                },
            }
        )
    )
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

### Motivation
- Pinning the action to an immutable commit SHA reduces the risk of arbitrary code execution in runners with repository access and secrets.
### Description
- Replaced `uses: mlugg/setup-zig@v2` with `uses: mlugg/setup-zig@d1434d0 # v2` in three workflow files.
- Updated the following files: `.github/workflows/rust-ci.yml`, `.github/workflows/rust-release.yml`, and `.github/workflows/shell-tool-mcp.yml` to reference the immutable SHA while preserving the original `v2` intent in a trailing comment.
### Testing
- No automated tests were run because this is a workflow-only change and does not affect repository source code, so CI validation will occur on the next workflow execution.
------
[Codex Task](https://chatgpt.com/codex/tasks/task_i_69763f570234832d9c67b1b66a27c78d)

## Summary If a subagent requests approval, and the user persists that approval to the execpolicy, it should (by default) propagate. We'll need to rethink this a bit in light of coming Permissions changes, though I think this is closer to the end state that we'd want, which is that execpolicy changes to one permissions profile should be synced across threads. ## Testing - [x] Added integration test --------- Co-authored-by: Codex <noreply@openai.com>

- [x] Support configuration tool suggest allowlist. Supports both plugins and connectors.

Client side to come

Because we don't want prompt collisions.

Allows model to send an out-of-band notification. The notification is injected as another tool call output for the same call_id.

1. Use requirement-resolved config.features as the plugin gate. 2. Guard plugin/list, plugin/read, and related flows behind that gate. 3. Skip bad marketplace.json files instead of failing the whole list. 4. Simplify plugin state and caching.
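Item 3 above (skip bad files instead of failing the whole list) might look roughly like the following. This is a sketch only: `parse_marketplaces` and its input shape (a mapping from path to raw file text) are assumptions for illustration, not the plugin manager's actual API.

```python
import json
import logging
from typing import Any, Dict

logger = logging.getLogger(__name__)


def parse_marketplaces(raw_by_path: Dict[str, str]) -> Dict[str, Any]:
    """Parse every marketplace.json that is valid and skip the ones that are not."""
    parsed: Dict[str, Any] = {}
    for path, raw in raw_by_path.items():
        try:
            parsed[path] = json.loads(raw)
        except json.JSONDecodeError as err:
            # One malformed file should not hide the remaining marketplaces.
            logger.warning("skipping invalid marketplace file %s: %s", path, err)
    return parsed
```

The caller still gets a result for every well-formed file, and the warning leaves a trace of what was skipped.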

## Problem
The app-server TUI (`tui_app_server`) lacked composer history support. Pressing Up/Down to recall previous prompts hit a stub that logged a warning and displayed "Not available in app-server TUI yet." New submissions were silently dropped from the shared history file, so nothing persisted for future sessions.
## Mental model
Codex maintains a single, append-only history file (`$CODEX_HOME/history.jsonl`) shared across all TUI processes on the same machine. The legacy (in-process) TUI already reads/writes this file through `codex_core::message_history`. The app-server TUI delegates most operations to a separate process over RPC, but history is intentionally *not* an RPC concern; it's a client-local file.
This PR makes the app-server TUI access the same history file directly, bypassing the app-server process entirely. The composer's Up/Down navigation and submit-time persistence now follow the same code paths as the legacy TUI, with the only difference being *where* the call is dispatched (locally in `App`, rather than inside `CodexThread`).
The branch is rebuilt directly on top of `upstream/main`, so it keeps the existing app-server restore architecture intact. `AppServerStartedThread` still restores transcript history from the server `Thread` snapshot via `thread_snapshot_events`; this PR only adds composer-history support.
## Non-goals
- Adding history support to the app-server protocol. History remains client-local.
- Changing the on-disk format or location of `history.jsonl`.
- Surfacing history I/O errors to the user (failures are logged and silently swallowed, matching the legacy TUI).
## Tradeoffs
| Decision | Why | Risk |
|----------|-----|------|
| Widen `message_history` from `pub(crate)` to `pub` | Avoids duplicating file I/O logic; the module already has a clean, minimal API surface. | Other workspace crates can now call these functions, so the contract is no longer crate-private. However, this is consistent with recent precedent: `590cfa617` exposed `mention_syntax` for TUI consumption, `752402c4f` exposed plugin APIs (`PluginsManager`), and `14fcb6645`/`edacbf7b6` widened internal core APIs for other crates. These were all narrow, intentional exposures of specific APIs, not broad "make internals public" moves. `1af2a37ad` even went the other direction, reducing broad re-exports to tighten boundaries. This change follows the same pattern: a small, deliberate API surface (3 functions) rather than a wholesale visibility change. |
| Intercept `AddToHistory` / `GetHistoryEntryRequest` in `App` before RPC fallback | Keeps history ops out of the "unsupported op" error path without changing app-server protocol. | This now routes through a single `submit_thread_op` entry point, which is safer than the original duplicated dispatch. The remaining risk is organizational: future thread-op submission paths need to keep using that shared entry point. |
| `session_configured_from_thread_response` is now `async` | Needs `await` on `history_metadata()` to populate real `history_log_id` / `history_entry_count`. | Adds an async file-stat + full-file newline scan to the session bootstrap path. The scan is bounded by `history.max_bytes` and matches the legacy TUI's cost profile, but startup latency still scales with file size. |
## Architecture
```
  User presses Up                        User submits a prompt
       │                                        │
       ▼                                        ▼
  ChatComposerHistory                    ChatWidget::do_submit_turn
    navigate_up()                          encode_history_mentions()
       │                                        │
       ▼                                        ▼
  AppEvent::CodexOp                      Op::AddToHistory { text }
    (GetHistoryEntryRequest)
       │                                        │
       ▼                                        ▼
  App::try_handle_local_history_op       App::try_handle_local_history_op
                                           message_history::append_entry()
    spawn_blocking {                            │
      message_history::lookup()                 ▼
    }                                    $CODEX_HOME/history.jsonl
       │
       ▼
  AppEvent::ThreadEvent
    (GetHistoryEntryResponse)
       │
       ▼
  ChatComposerHistory::on_entry_response()
```
## Observability
- `tracing::warn` on `append_entry` failure (includes thread ID).
- `tracing::warn` on `spawn_blocking` lookup join error.
- `tracing::warn` from `message_history` internals on file-open, lock, or parse failures.
## Tests
- `chat_composer_history::tests::navigation_with_async_fetch`: verifies that Up emits `Op::GetHistoryEntryRequest` (was: checked for stub error cell).
- `app::tests::history_lookup_response_is_routed_to_requesting_thread`: verifies multi-thread composer recall routes the lookup result back to the originating thread.
- `app_server_session::tests::resume_response_relies_on_snapshot_replay_not_initial_messages`: verifies app-server session restore still uses the upstream thread-snapshot path.
- `app_server_session::tests::session_configured_populates_history_metadata`: verifies bootstrap sets nonzero `history_log_id` / `history_entry_count` from the shared local history file.

Fix the CI job by updating it to use artifacts from a more recent release (`0.115.0`) instead of the existing one (`0.74.0`). This step in our CI job on PRs started failing today: https://github.com/openai/codex/blob/334164a6f714c171bb9f6440c7d3cd04ec04d295/.github/workflows/ci.yml#L33-L47 I believe it's because this test verifies that the "package npm" script works, but we want it to be fast and not wait for binaries to be built, so it uses a GitHub workflow that's already done. Because it was using a GitHub workflow associated with `0.74.0`, it seems likely that workflow's history has been reaped, so we need to use a newer one.

Clean up the error flow to push `FunctionCallError` all the way up to the dispatcher, allowing code mode to surface it as an exception.

Clean up image semantics in code mode:
- `view_image` now returns `{image_url: string, details?: string}`
- `image()` now accepts either a string parameter or `{image_url: string, details?: string}`
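One way to picture the dual-input contract for `image()` is a small normalizer that maps both accepted shapes onto the object form. This is an illustrative Python sketch; the real helper lives in the code-mode JavaScript runtime, and `normalize_image_arg` is a hypothetical name.

```python
from typing import Any, Dict


def normalize_image_arg(arg: Any) -> Dict[str, str]:
    """Accept a bare URL string or an {image_url, details?} mapping."""
    if isinstance(arg, str):
        return {"image_url": arg}
    if isinstance(arg, dict) and isinstance(arg.get("image_url"), str):
        normalized = {"image_url": arg["image_url"]}
        details = arg.get("details")
        if details is not None:
            if not isinstance(details, str):
                raise TypeError("details must be a string when provided")
            normalized["details"] = details
        return normalized
    raise TypeError("expected a string or an object with a string image_url")
```

Under this reading, the object returned by `view_image` can be passed straight back into `image()` without unpacking.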

## Summary - detect custom prompts in `$CODEX_HOME/prompts` during TUI startup - show a deprecation notice only when prompts are present, with guidance to use `$skill-creator` - add TUI tests and snapshot coverage for present, missing, and empty prompts directories ## Testing - Manually tested

Reverts #15020 I messed up the commit in my PR and accidentally merged changes that were still under review.

- prefix realtime handoff output with the agent final message label for both realtime v1 and v2 - update realtime websocket and core expectations to match

## Summary
- store a pre-rendered `feedback_log_body` in SQLite so `/feedback` exports keep span prefixes and structured event fields
- render SQLite feedback exports with timestamps and level prefixes to match the old in-memory feedback formatter, while preserving existing trailing newlines
- count `feedback_log_body` in the SQLite retention budget so structured or span-prefixed rows still prune correctly
- bound `/feedback` row loading in SQL with the retention estimate, then apply exact whole-line truncation in Rust so uploads stay capped without splitting lines
## Details
- add a `feedback_log_body` column to `logs` and backfill it from `message` for existing rows
- capture span names plus formatted span and event fields at write time, since SQLite does not retain enough structure to reconstruct the old formatter later
- keep SQLite feedback queries scoped to the requested thread plus same-process threadless rows
- restore a SQL-side cumulative `estimated_bytes` cap for feedback export queries so over-retained partitions do not load every matching row before truncation
- add focused formatting coverage for exported feedback lines and parity coverage against `tracing_subscriber`
## Testing
- cargo test -p codex-state
- just fix -p codex-state
- just fmt
codex author: `codex resume 019ca1b0-0ecc-78b1-85eb-6befdd7e4f1f`
---------
Co-authored-by: Codex <noreply@openai.com>
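The "exact whole-line truncation" step can be sketched as follows. The PR does not say which end of the log is kept, so this example assumes the most recent lines are retained; `truncate_to_whole_lines` is a hypothetical name and the real implementation is in Rust.

```python
from typing import List


def truncate_to_whole_lines(lines: List[str], max_bytes: int) -> List[str]:
    """Keep the newest lines whose combined UTF-8 size fits in max_bytes.

    Assumption: newer lines matter more, so the scan runs from the end.
    A line is kept whole or dropped whole; it is never split.
    """
    kept: List[str] = []
    total = 0
    for line in reversed(lines):
        size = len(line.encode("utf-8"))
        if total + size > max_bytes:
            break
        kept.append(line)
        total += size
    kept.reverse()
    return kept
```

The SQL-side estimate keeps the candidate set small, and a pass like this enforces the exact cap afterward.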

…4888)
## Summary
This PR makes `thread/resume` reuse persisted thread model metadata when the caller does not explicitly override it.
Changes:
- read persisted thread metadata from SQLite during `thread/resume`
- reuse persisted `model` and `model_reasoning_effort` as resume-time defaults
- fetch persisted metadata once and reuse it later in the resume response path
- keep thread summary loading on the existing rollout path, while reusing persisted metadata when available
- document the resume fallback behavior in the app-server README
## Why
Before this change, resuming a thread without explicit overrides derived `model` and `model_reasoning_effort` from current config, which could drift from the thread’s last persisted values. That meant a resumed thread could report and run with different model settings than the ones it previously used.
## Behavior
Precedence on `thread/resume` is now:
1. explicit resume overrides
2. persisted SQLite metadata for the thread
3. normal config resolution for the resumed cwd
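The three-level precedence above amounts to a first-non-empty lookup. The sketch below is a hypothetical illustration; `resolve_resume_setting` and its parameters are invented for the example and do not mirror the app-server's Rust signatures.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def resolve_resume_setting(
    override: Optional[T],
    persisted: Optional[T],
    config_default: T,
) -> T:
    """Return the explicit override, else the persisted value, else the config value."""
    for candidate in (override, persisted):
        if candidate is not None:
            return candidate
    return config_default
```

Applied once each to `model` and `model_reasoning_effort`, a resumed thread keeps its last persisted settings unless the caller overrides them.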

# External (non-OpenAI) Pull Request Requirements Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed: https://github.com/openai/codex/blob/main/docs/contributing.md If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes. Include a link to a bug report or enhancement request.

It's empty!

It's empty !

Resubmit #15020 with correct content. 1. Use requirement-resolved config.features as the plugin gate. 2. Guard plugin/list, plugin/read, and related flows behind that gate. 3. Skip bad marketplace.json files instead of failing the whole list. 4. Simplify plugin state and caching.

etraut-openai and others added 30 commits March 11, 2026 12:33

Reuse McpToolOutput in McpHandler (#14229)

c4d3508

We already have a type to represent the MCP tool output, reuse it instead of the custom McpHandlerOutput

Add OpenAI Docs skill (#13596)

b7f8e91

## Summary - add the OpenAI Docs skill under codex-rs/skills/src/assets/samples/openai-docs - include the skill metadata, assets, and GPT-5.4 upgrade reference files - exclude the test harness and test fixtures ## Testing - not run (skill-only asset copy)

feat: Allow sync with remote plugin status. (#14176)

d751e68

Add `forceRemoteSync` to `plugin/list`. When it is set to true, the local plugin status is synced with the remote one (`backend-api/plugins/list`).

Add granular metrics for cloud requirements load (#14108)

3d4628c

unifying all image saves to /tmp to bug-proof (#14149)

722e8f0

The image-gen feature now has the model save images to `/tmp` by default and at all times.

Load agent metadata from role files (#14177)

a67660d

Increase sdk workflow timeout to 15 minutes (#14252)

b1dddcb

- raise the sdk workflow job timeout from 10 to 15 minutes to reduce false cancellations near the current limit --------- Co-authored-by: Codex <noreply@openai.com>

Clarify close_agent tool description (#14269)

ce1d9ab

- clarify the `close_agent` tool description so it nudges models to close agents they no longer need - keep the change scoped to the tool spec text only Co-authored-by: Codex <noreply@openai.com>

feat: Add additional macOS Sandbox Permissions for Launch Services, Contacts, Reminders (#14155)

889b479

Add additional macOS Sandbox Permissions levers for the following:
- Launch Services
- Contacts
- Reminders

Pass more params to compaction (#14247)

2621ba1

Pass more params to /compact. This should give us parity with the /responses endpoint to improve caching. I'm torn about the MCP await. Blocking will give us parity but it seems like we explicitly don't block on MCPs. Happy either way

Add store/load support for code mode (#14259)

83b22bb

adds support for transferring state across code mode invocations.

prompt changes to guardian (#14263)

e77b2fd

## Summary
- update the guardian prompting
- clarify the guardian rejection message so an action may still proceed if the user explicitly approves it after being informed of the risk

## Testing
- cargo run on selected examples

Show spawned agent model and effort in TUI (#14273)

285b3a5

- include the requested sub-agent model and reasoning effort in the spawn begin event
- render that metadata next to the spawned agent name and role in the TUI transcript

Co-authored-by: Codex <noreply@openai.com>

pakrym-oai and others added 30 commits March 17, 2026 17:36

Prefer websockets when providers support them (#13592)

7706164

Remove all flags and model settings.

Co-authored-by: Codex <noreply@openai.com>

[plugins] Support configuration tool suggest allowlist. (#15022)

40a7d1d

- [x] Support configuration tool suggest allowlist. Supports both plugins and connectors.

feat: adapt artifacts to new packaging and 2.5.6 (#14947)

0f9484d

feat: add memory citation to agent message (#14821)

a265d60

Client side to come

nit: disable live memory edition (#15058)

58ac2a8

Removed remaining core events from tui_app_server (#14942)

347c6b1

chore: disable memory read path for morpheus (#15059)

7ae9957

Because we don't want prompt collisions.

Add notify to code-mode (#14842)

606d850

Allows the model to send an out-of-band notification. The notification is injected as another tool call output for the same call_id.
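To make the "another tool call output for the same call_id" idea concrete, here is a minimal sketch in Rust. It is not taken from the codex code base: `ToolCallOutput`, `Transcript`, `notify`, and `finish` are hypothetical names, and the real implementation may order or tag items differently.

```rust
/// Hypothetical transcript item: one output attached to a tool call.
#[derive(Debug, Clone, PartialEq)]
struct ToolCallOutput {
    call_id: String,
    content: String,
}

/// Hypothetical transcript that collects outputs in arrival order.
#[derive(Default)]
struct Transcript {
    items: Vec<ToolCallOutput>,
}

impl Transcript {
    /// Out-of-band notification: recorded as an extra output for `call_id`.
    fn notify(&mut self, call_id: &str, message: &str) {
        self.items.push(ToolCallOutput {
            call_id: call_id.to_string(),
            content: message.to_string(),
        });
    }

    /// Final result of the call: also an output for the same `call_id`.
    fn finish(&mut self, call_id: &str, result: &str) {
        self.items.push(ToolCallOutput {
            call_id: call_id.to_string(),
            content: result.to_string(),
        });
    }

    /// All output contents recorded for one call, in order.
    fn outputs_for(&self, call_id: &str) -> Vec<&str> {
        self.items
            .iter()
            .filter(|item| item.call_id == call_id)
            .map(|item| item.content.as_str())
            .collect()
    }
}

fn main() {
    let mut transcript = Transcript::default();
    transcript.notify("call_1", "halfway done");
    transcript.finish("call_1", "final result");
    println!("{:?}", transcript.outputs_for("call_1"));
}
```

Under these assumptions, a consumer that groups outputs by call_id sees the notification and the final result as two entries of the same call.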

fix: harden plugin feature gating (#15020)

580f32a

1. Use requirement-resolved config.features as the plugin gate.
2. Guard plugin/list, plugin/read, and related flows behind that gate.
3. Skip bad marketplace.json files instead of failing the whole list.
4. Simplify plugin state and caching.
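Item 3 describes a common pattern: collect the entries that parse and record the ones that do not, rather than returning on the first error. The sketch below illustrates that pattern only. `Marketplace`, `parse_marketplace`, and `list_marketplaces` are hypothetical names, and the toy `name=<value>` format stands in for real JSON parsing so the example needs nothing beyond the standard library.

```rust
/// Hypothetical stand-in for a parsed marketplace.json.
#[derive(Debug, PartialEq)]
struct Marketplace {
    name: String,
}

/// Hypothetical parser. Real code would deserialize JSON; here a file is
/// considered valid when its content is `name=<non-empty value>`.
fn parse_marketplace(raw: &str) -> Result<Marketplace, String> {
    match raw.trim().strip_prefix("name=") {
        Some(name) if !name.is_empty() => Ok(Marketplace {
            name: name.to_string(),
        }),
        _ => Err(format!("malformed marketplace file: {raw:?}")),
    }
}

/// Returns every marketplace that parses, plus (path, error) pairs for the
/// files that were skipped. One bad file does not fail the whole list.
fn list_marketplaces(files: &[(&str, &str)]) -> (Vec<Marketplace>, Vec<(String, String)>) {
    let mut loaded = Vec::new();
    let mut skipped = Vec::new();
    for (path, raw) in files {
        match parse_marketplace(raw) {
            Ok(marketplace) => loaded.push(marketplace),
            Err(err) => skipped.push((path.to_string(), err)),
        }
    }
    (loaded, skipped)
}

fn main() {
    let files = [
        ("a/marketplace.json", "name=alpha"),
        ("b/marketplace.json", "{broken"),
        ("c/marketplace.json", "name=gamma"),
    ];
    let (loaded, skipped) = list_marketplaces(&files);
    println!("loaded: {loaded:?}");
    println!("skipped: {skipped:?}");
}
```

Returning the skipped entries alongside the loaded ones keeps the failure visible to the caller (for logging, say) without aborting the listing.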

Propagate tool errors to code mode (#15075)

88e5382

Clean up the error flow to push the FunctionCallError all the way up to the dispatcher and allow code mode to surface it as an exception.

Return image URL from view_image tool (#15072)

5cada46

Clean up image semantics in code mode. `view_image` now returns `{image_url:string, details?: string}`. `image()` now accepts either a string parameter or `{image_url:string, details?: string}`.
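The two accepted shapes for `image()` can be modelled as a small sum type that normalizes to the object form. The following is an illustrative Rust sketch only; the commit does not show the actual types, so `ImageRef`, `ImageArg`, and this `image` function are assumptions that simply mirror the field names quoted above.

```rust
/// Mirrors the quoted shape `{image_url: string, details?: string}`.
#[derive(Debug, Clone, PartialEq)]
struct ImageRef {
    image_url: String,
    details: Option<String>,
}

/// Hypothetical argument type: either a bare URL string or the full object.
#[derive(Debug, Clone, PartialEq)]
enum ImageArg {
    Url(String),
    Ref(ImageRef),
}

impl From<&str> for ImageArg {
    fn from(url: &str) -> Self {
        ImageArg::Url(url.to_string())
    }
}

impl From<ImageRef> for ImageArg {
    fn from(image: ImageRef) -> Self {
        ImageArg::Ref(image)
    }
}

/// Normalizes either input shape to the object form.
fn image(arg: impl Into<ImageArg>) -> ImageRef {
    match arg.into() {
        ImageArg::Url(url) => ImageRef {
            image_url: url,
            details: None,
        },
        ImageArg::Ref(image) => image,
    }
}

fn main() {
    // A bare string is accepted and gets no `details`.
    let from_string = image("https://example.com/cat.png");
    // The object form, for example the value returned by `view_image`,
    // passes through unchanged.
    let from_object = image(ImageRef {
        image_url: "https://example.com/dog.png".to_string(),
        details: Some("high".to_string()),
    });
    println!("{from_string:?}\n{from_object:?}");
}
```

With this shape, the result of `view_image` can be handed straight to `image()` without unpacking it first, which appears to be the point of aligning the two.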

Revert "fix: harden plugin feature gating" (#15102)

86982ca

Reverts #15020. I messed up the commit in my PR and accidentally merged changes that were still under review.

Add final message prefix to realtime handoff output (#15077)

7b37a03

- prefix realtime handoff output with the agent final message label for both realtime v1 and v2
- update realtime websocket and core expectations to match

Add update_plan code mode result (#15103)

3590e18

Add apply_patch code mode result (#15100)

56d0c6b

pull[bot] wants to merge 247 commits into kontext-dev:main from openai:main

20 participants
