Skip to content

feat: add --experimental to generate-ts#10402

Merged
jif-oai merged 1 commit intomainfrom
jif/bind-experimental
Feb 2, 2026
Merged

feat: add --experimental to generate-ts#10402
jif-oai merged 1 commit intomainfrom
jif/bind-experimental

Conversation

@jif-oai
Copy link
Collaborator

@jif-oai jif-oai commented Feb 2, 2026

Adding a --experimental flag to the generate-ts fct in the app-sever.

It can be called through one of those 2 command

just write-app-server-schema --experimental
codex app-server generate-ts --experimental

@jif-oai
Copy link
Collaborator Author

jif-oai commented Feb 2, 2026

@codex review

@chatgpt-codex-connector
Copy link
Contributor

Codex Review: Didn't find any major issues. Hooray!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@jif-oai jif-oai merged commit 059d386 into main Feb 2, 2026
60 of 65 checks passed
@jif-oai jif-oai deleted the jif/bind-experimental branch February 2, 2026 20:30
pakrym-oai added a commit that referenced this pull request Feb 4, 2026
- Added `codex app <path>` on macOS to launch Codex Desktop from the CLI, with automatic DMG download if it is missing. (#10418)
- Added personal skill loading from `~/.agents/skills` (with `~/.codex/skills` compatibility), plus app-server APIs/events to list and download public remote skills. (#10437, #10448)
- `/plan` now accepts inline prompt arguments and pasted images, and slash-command editing/highlighting in the TUI is more polished. (#10269)
- Shell-related tools can now run in parallel, improving multi-command execution throughput. (#10505)
- Shell executions now receive `CODEX_THREAD_ID`, so scripts and skills can detect the active thread/session. (#10096)
- Added vendored Bubblewrap + FFI wiring in the Linux sandbox as groundwork for upcoming runtime integration. (#10413)

## Bug Fixes
- Hardened Git command safety so destructive or write-capable invocations no longer bypass approval checks. (#10258)
- Improved resume/thread browsing reliability by correctly showing saved thread names and fixing thread listing behavior. (#10340, #10383)
- Fixed first-run trust-mode handling so sandbox mode is reported consistently, and made `$PWD/.agents` read-only like `$PWD/.codex`. (#10415, #10524)
- Fixed `codex exec` hanging after interrupt in websocket/streaming flows; interrupted turns now shut down cleanly. (#10519)
- Fixed review-mode approval event wiring so `requestApproval` IDs align with the corresponding command execution items. (#10416)
- Improved 401 error diagnostics by including server message/body details plus `cf-ray` and `requestId`. (#10508)

## Documentation
- Expanded TUI chat composer docs to cover slash-command arguments and attachment handling in plan/review flows. (#10269)
- Refreshed issue templates and labeler prompts to better separate CLI/app bug reporting and feature requests. (#10411, #10453, #10548, #10552)

## Chores
- Completed migration off the deprecated `mcp-types` crate to `rmcp`-based protocol types/adapters, then removed the legacy crate. (#10356, #10349, #10357)
- Updated the `bytes` dependency for a security advisory and cleaned up resolved advisory configuration. (#10525)

## Changelog

Full Changelog: rust-v0.94.0...rust-v0.95.0

- #10340 Session picker shows thread_name if set @pap-openai
- #10381 chore: collab experimental @jif-oai
- #10231 feat: experimental flags @jif-oai
- #10382 nit: shell snapshot retention to 3 days @jif-oai
- #10383 fix: thread listing @jif-oai
- #10386 fix: Rfc3339 casting @jif-oai
- #10356 feat: add MCP protocol types and rmcp adapters @bolinfest
- #10269 Nicer highlighting of slash commands, /plan accepts prompt args and pasted images @charley-oai
- #10274 Add credits tooltip @pakrym-oai
- #10394 chore: ignore synthetic messages @jif-oai
- #10398 feat: drop sqlx logging @jif-oai
- #10281 Select experimental features with space @pakrym-oai
- #10402 feat: add `--experimental` to `generate-ts` @jif-oai
- #10258 fix: unsafe auto-approval of git commands @viyatb-oai
- #10411 Updated labeler workflow prompt to include "app" label @etraut-openai
- #10399 emit a separate metric when the user cancels UAT during elevated setup @iceweasel-oai
- #10377 chore(tui) /personalities tip @dylan-hurd-oai
- #10252 [feat] persist thread_dynamic_tools in db @celia-oai
- #10437 feat: Read personal skills from .agents/skills @gverma-openai
- #10145 make codex better at git @pash-openai
- #10418 Add `codex app` macOS launcher @aibrahim-oai
- #10447 Fix plan implementation prompt reappearing after /agent thread switch @charley-oai
- #10064 TUI: Render request_user_input results in history and simplify interrupt handling @charley-oai
- #10349 feat: replace custom mcp-types crate with equivalents from rmcp @bolinfest
- #10415 Fixed sandbox mode inconsistency if untrusted is selected @etraut-openai
- #10452 Hide short worked-for label in final separator @aibrahim-oai
- #10357 chore: remove deprecated mcp-types crate @bolinfest
- #10454 app tool tip @aibrahim-oai
- #10455 chore: add phase to message responseitem @sayan-oai
- #10414 Require models refresh on cli version mismatch @aibrahim-oai
- #10271 [Codex][CLI] Gate image inputs by model modalities @ccy-oai
- #10374 Trim compaction input @pakrym-oai
- #10453 Updated bug and feature templates @etraut-openai
- #10465 Restore status after preamble @pakrym-oai
- #10406 fix: clarify deprecation message for features.web_search @sayan-oai
- #10474 Ignore remote_compact_trims_function_call_history_to_fit_context_window on windows @pakrym-oai
- #10413 feat(linux-sandbox): vendor bubblewrap and wire it with FFI @viyatb-oai
- #10142 feat(secrets): add codex-secrets crate @viyatb-oai
- #10157 chore: nuke chat/completions API @jif-oai
- #10498 feat: drop wire_api from clients @jif-oai
- #10501 feat: clean codex-api part 1 @jif-oai
- #10508 Add more detail to 401 error @gt-oai
- #10521 Avoid redundant transactional check before inserting dynamic tools @jif-oai
- #10525 chore: update bytes crate in response to security advisory @bolinfest
- #10408 fix WebSearchAction type clash between v1 and v2 @sayan-oai
- #10404 Cleanup collaboration mode variants @charley-oai
- #10505 Enable parallel shell tools @jif-oai
- #10532 feat: `find_thread_path_by_id_str_in_subdir` from DB @jif-oai
- #10524 fix: make $PWD/.agents read-only like $PWD/.codex @bolinfest
- #10096 Inject CODEX_THREAD_ID into the terminal environment @maxj-oai
- #10536 Revert "Load untrusted rules" @viyatb-oai
- #10412 fix(app-server): fix TS annotations for optional fields on requests @owenlin0
- #10416 fix(app-server): fix approval events in review mode @owenlin0
- #10545 Improve Default mode prompt (less confusion with Plan mode) @charley-oai
- #10289 [apps] Gateway MCP should be blocking. @mzeng-openai
- #10189 implement per-workspace capability SIDs for workspace specific ACLs @iceweasel-oai
- #10548 Updated bug templates and added a new one for app @etraut-openai
- #10531 [codex] Default values from requirements if unset @gt-oai
- #10552 Fixed icon for CLI bug template @etraut-openai
- #10039 chore(arg0): advisory-lock janitor for codex tmp paths @viyatb-oai
- #10448 feat: add APIs to list and download public remote skills @xl-openai
- #10519 Handle exec shutdown on Interrupt (fixes immortal `codex exec` with websockets) @rasmusrygaard
- #10556 Feat: add upgrade to app server modelList @shijie-oai
- #10461 feat(tui): pace catch-up stream chunking with hysteresis @joshka-oai
- #10367 chore: add `codex debug app-server` tooling @celia-oai
michiosw added a commit to kontext-dev/codex that referenced this pull request Feb 7, 2026
* chore: unify metric (#10220)

* Conversation naming (#8991)

Session renaming:
- `/rename my_session`
- `/rename` without arg and passing an argument in `customViewPrompt`
- AppExitInfo shows resume hint using the session name if set instead of
uuid, defaults to uuid if not set
- Names are stored in `CODEX_HOME/sessions.jsonl`

Session resuming:
- codex resume <name> lookup for `CODEX_HOME/sessions.jsonl` first entry
matching the name and resumes the session

---------

Co-authored-by: jif-oai <jif@openai.com>

* Fetch Requirements from cloud (#10167)

Load requirements from Codex Backend. It only does this for enterprise
customers signed in with ChatGPT.

Todo in follow-up PRs:
* Add to app-server and exec too
* Switch from fail-open to fail-closed on failure

* explorer prompt (#10225)

* fix: make sure the shell exists (#10222)

* chore: do not clean the DB anymore (#10232)

* feat: improve logs client (#10229)

* feat: heuristic coloring of logs (#10228)

* nit: fix db with multiple metadata lines (#10237)

* feat: refactor CodexAuth so invalid state cannot be represented (#10208)

Previously, `CodexAuth` was defined as follows:


https://github.com/openai/codex/blob/d550fbf41afc09d7d7b5ac813aea38de07b2a73f/codex-rs/core/src/auth.rs#L39-L46

But if you looked at its constructors, we had creation for
`AuthMode::ApiKey` where `storage` was built using a nonsensical path
(`PathBuf::new()`) and `auth_dot_json` was `None`:


https://github.com/openai/codex/blob/d550fbf41afc09d7d7b5ac813aea38de07b2a73f/codex-rs/core/src/auth.rs#L212-L220

By comparison, when `AuthMode::ChatGPT` was used, `api_key` was always
`None`:


https://github.com/openai/codex/blob/d550fbf41afc09d7d7b5ac813aea38de07b2a73f/codex-rs/core/src/auth.rs#L665-L671

https://github.com/openai/codex/pull/10012 took things further because
it introduced a new `ChatgptAuthTokens` variant to `AuthMode`, which is
important in when invoking `account/login/start` via the app server, but
most logic _internal_ to the app server should just reason about two
`AuthMode` variants: `ApiKey` and `ChatGPT`.

This PR tries to clean things up as follows:

- `LoginAccountParams` and `AuthMode` in `codex-rs/app-server-protocol/`
both continue to have the `ChatgptAuthTokens` variant, though it is used
exclusively for the on-the-wire messaging.
- `codex-rs/core/src/auth.rs` now has its own `AuthMode` enum, which
only has two variants: `ApiKey` and `ChatGPT`.
- `CodexAuth` has been changed from a struct to an enum. It is a
disjoint union where each variant (`ApiKey`, `ChatGpt`, and
`ChatGptAuthTokens`) have only the associated fields that make sense for
that variant.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10208).
* #10224
* __->__ #10208

* chore(feature) Experimental: Personality (#10212)

## Summary
Let users start opting in to trying out personalities

## Testing
- [x] existing tests pass

* chore(feature) Experimental: Smart Approvals (#10211)

## Summary
Let's start getting feedback on this feature 😅 

## Testing
- [x] existing tests pass

* Load exec policy rules from requirements (#10190)

`requirements.toml` should be able to specify rules which always run. 

My intention here was that these rules could only ever be restrictive,
which means the decision can be "prompt" or "forbidden" but never
"allow". A requirement of "you must always allow this command" didn't
make sense to me, but happy to be gaveled otherwise.

Rules already applies the most restrictive decision, so we can safely
merge these with rules found in other config folders.

* plan mode: add TL;DR checkpoint and client behavior note (#10195)

## Summary
- Tightens Plan Mode to encourage exploration-first behavior and more
back-and-forth alignment.
- Adds a required TL;DR checkpoint before drafting the full plan.
- Clarifies client behavior that can cause premature “Implement this
plan?” prompts.

## What changed
- Require at least one targeted non-mutating exploration pass before the
first user question.
- Insert a TL;DR checkpoint between Phase 2 (intent) and Phase 3
(implementation).
- TL;DR checkpoint guidance:
  - Label: “Proposed Plan (TL;DR)”
  - Format: 3–5 bullets using `- `
  - Options: exactly one option, “Approve”
- `isOther: true`, with explicit guidance that “None of the above” is
the edit path in the current UI.
- Require the final plan to include a TL;DR consistent with the approved
checkpoint.

## Why
- In Plan Mode, any normal assistant message at turn completion is
treated as plan content by the client. This can trigger premature
“Implement this plan?” prompts.
- The TL;DR checkpoint aligns on direction before Codex drafts a long,
decision-complete plan.

## Testing
- Manual: built the local CLI and verified the flow now explores first,
presents a TL;DR checkpoint, and only drafts the full plan after
approval.

---------

Co-authored-by: Nick Baumann <@openai.com>

* chore: fix the build breakage that came from a merge race (#10239)

I think I needed to rebase on top of
https://github.com/openai/codex/pull/10167 before merging
https://github.com/openai/codex/pull/10208.

* Plan mode: stream proposed plans, emit plan items, and render in TUI (#9786)

## Summary
- Stream proposed plans in Plan Mode using `<proposed_plan>` tags parsed
in core, emitting plan deltas plus a plan `ThreadItem`, while stripping
tags from normal assistant output.
- Persist plan items and rebuild them on resume so proposed plans show
in thread history.
- Wire plan items/deltas through app-server protocol v2 and render a
dedicated proposed-plan view in the TUI, including the “Implement this
plan?” prompt only when a plan item is present.

## Changes

### Core (`codex-rs/core`)
- Added a generic, line-based tag parser that buffers each line until it
can disprove a tag prefix; implements auto-close on `finish()` for
unterminated tags. `codex-rs/core/src/tagged_block_parser.rs`
- Refactored proposed plan parsing to wrap the generic parser.
`codex-rs/core/src/proposed_plan_parser.rs`
- In plan mode, stream assistant deltas as:
  - **Normal text** → `AgentMessageContentDelta`
  - **Plan text** → `PlanDelta` + `TurnItem::Plan` start/completion  
  (`codex-rs/core/src/codex.rs`)
- Final plan item content is derived from the completed assistant
message (authoritative), not necessarily the concatenated deltas.
- Strips `<proposed_plan>` blocks from assistant text in plan mode so
tags don’t appear in normal messages.
(`codex-rs/core/src/stream_events_utils.rs`)
- Persist `ItemCompleted` events only for plan items for rollout replay.
(`codex-rs/core/src/rollout/policy.rs`)
- Guard `update_plan` tool in Plan Mode with a clear error message.
(`codex-rs/core/src/tools/handlers/plan.rs`)
- Updated Plan Mode prompt to:  
  - keep `<proposed_plan>` out of non-final reasoning/preambles  
  - require exact tag formatting  
  - allow only one `<proposed_plan>` block per turn  
  (`codex-rs/core/templates/collaboration_mode/plan.md`)

### Protocol / App-server protocol
- Added `TurnItem::Plan` and `PlanDeltaEvent` to core protocol items.
(`codex-rs/protocol/src/items.rs`, `codex-rs/protocol/src/protocol.rs`)
- Added v2 `ThreadItem::Plan` and `PlanDeltaNotification` with
EXPERIMENTAL markers and note that deltas may not match the final plan
item. (`codex-rs/app-server-protocol/src/protocol/v2.rs`)
- Added plan delta route in app-server protocol common mapping.
(`codex-rs/app-server-protocol/src/protocol/common.rs`)
- Rebuild plan items from persisted `ItemCompleted` events on resume.
(`codex-rs/app-server-protocol/src/protocol/thread_history.rs`)

### App-server
- Forward plan deltas to v2 clients and map core plan items to v2 plan
items. (`codex-rs/app-server/src/bespoke_event_handling.rs`,
`codex-rs/app-server/src/codex_message_processor.rs`)
- Added v2 plan item tests.
(`codex-rs/app-server/tests/suite/v2/plan_item.rs`)

### TUI
- Added a dedicated proposed plan history cell with special background
and padding, and moved “• Proposed Plan” outside the highlighted block.
(`codex-rs/tui/src/history_cell.rs`, `codex-rs/tui/src/style.rs`)
- Only show “Implement this plan?” when a plan item exists.
(`codex-rs/tui/src/chatwidget.rs`,
`codex-rs/tui/src/chatwidget/tests.rs`)

<img width="831" height="847" alt="Screenshot 2026-01-29 at 7 06 24 PM"
src="https://github.com/user-attachments/assets/69794c8c-f96b-4d36-92ef-c1f5c3a8f286"
/>

### Docs / Misc
- Updated protocol docs to mention plan deltas.
(`codex-rs/docs/protocol_v1.md`)
- Minor plumbing updates in exec/debug clients to tolerate plan deltas.
(`codex-rs/debug-client/src/reader.rs`, `codex-rs/exec/...`)

## Tests
- Added core integration tests:
  - Plan mode strips plan from agent messages.
  - Missing `</proposed_plan>` closes at end-of-message.  
  (`codex-rs/core/tests/suite/items.rs`)
- Added unit tests for generic tag parser (prefix buffering, non-tag
lines, auto-close). (`codex-rs/core/src/tagged_block_parser.rs`)
- Existing app-server plan item tests in v2.
(`codex-rs/app-server/tests/suite/v2/plan_item.rs`)

## Notes / Behavior
- Plan output no longer appears in standard assistant text in Plan Mode;
it streams via `PlanDelta` and completes as a `TurnItem::Plan`.
- The final plan item content is authoritative and may diverge from
streamed deltas (documented as experimental).
- Reasoning summaries are not filtered; prompt instructs the model not
to include `<proposed_plan>` outside the final plan message.

## Codex Author
`codex fork 019bec2d-b09d-7450-b292-d7bcdddcdbfb`

* Tui: hide Code mode footer label (#10063)

Title
Hide Code mode footer label/cycle hint; add Plan footer-collapse
snapshots

Summary
- Keep Code mode internal naming but suppress the footer mode label +
cycle hint when Code is active.
- Only show the cycle hint when a non‑Code mode indicator is present.
- Add Plan-mode footer collapse snapshot coverage (empty + queued,
across widths) and update existing footer collapse snapshots for the new
Code behavior.

Notes
- The test run currently fails in codex-cloud-requirements on
origin/main due to a stale auth.mode field; no fix is included in this
PR to keep the diff minimal.

Codex author
`codex resume 019c0296-cfd4-7193-9b0a-6949048e4546`

* chore: rename ChatGpt -> Chatgpt in type names (#10244)

When using ChatGPT in names of types, we should be consistent, so this
renames some types with `ChatGpt` in the name to `Chatgpt`. From
https://rust-lang.github.io/api-guidelines/naming.html:

> In `UpperCamelCase`, acronyms and contractions of compound words count
as one word: use `Uuid` rather than `UUID`, `Usize` rather than `USize`
or `Stdin` rather than `StdIn`. In `snake_case`, acronyms and
contractions are lower-cased: `is_xid_start`.

This PR updates existing uses of `ChatGpt` and changes them to
`Chatgpt`. Though in all cases where it could affect the wire format, I
visually inspected that we don't change anything there. That said, this
_will_ change the codegen because it will affect the spelling of type
names.

For example, this renames `AuthMode::ChatGPT` to `AuthMode::Chatgpt` in
`app-server-protocol`, but the wire format is still `"chatgpt"`.

This PR also updates a number of types in `codex-rs/core/src/auth.rs`.

* Plan mode prompt (#10238)

# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.

* Fix deploy (#10251)

Fix
https://github.com/openai/codex/actions/runs/21527697445/job/62035898666

* plan prompt (#10255)

# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.

* Make plan highlight use popup grey background (#10253)

## Summary
- align proposed plan background with popup surface color by reusing
`user_message_bg`
- remove the custom blue-tinted plan background

<img width="1572" height="1568" alt="image"
src="https://github.com/user-attachments/assets/63a5341e-4342-4c07-b6b0-c4350c3b2639"
/>

* Skip loading codex home as project layer (#10207)

Summary:
- Fixes issue #9932: https://github.com/openai/codex/issues/9932
- Prevents `$CODEX_HOME` (typically `~/.codex`) from being discovered as
a project `.codex` layer by skipping it during project layer traversal.
We compare both normalized absolute paths and best-effort canonicalized
paths to handle symlinks.
- Adds regression tests for home-directory invocation and for the case
where `CODEX_HOME` points to a project `.codex` directory (e.g.,
worktrees/editor integrations).

Testing:
- `cargo build -p codex-cli --bin codex`
- `cargo build -p codex-rmcp-client --bin test_stdio_server`
- `cargo test -p codex-core`
- `cargo test --all-features`
- Manual: ran `target/debug/codex` from `~` and confirmed the
disabled-folder warning and trust prompt no longer appear.

* Update copy (#10256)

<img width="839" height="62" alt="image"
src="https://github.com/user-attachments/assets/ca987cdb-9e8c-403e-8856-a9b37baa7673"
/>

* core: prevent shell_snapshot from inheriting stdin (#9735)

Fixes #9559.

When `shell_snapshot` runs, it may execute user startup files (e.g.
`.bashrc`). If those files read from stdin (or if stdin is an
interactive TTY under job control), the snapshot subprocess can block or
receive `SIGTTIN` (as reported over SSH).

This change explicitly sets `stdin` to `Stdio::null()` for the snapshot
subprocess, so it can't read from the terminal.

Regression test added that would hang/timeout without this change.
Tests: `ulimit -n 4096 && cargo test -p codex-core`.

cc @dongdongbh @etraut-openai

---------

Co-authored-by: Skylar Graika <sgraika127@gmail.com>

* Fix main (#10262)

* Hide /approvals from the slash-command list (#10265)

`/permissions` is the replacement. `/approvals` still available when
typing.

* Update announcement_tip.toml (#10267)

Extend the test for dev version

* file-search: multi-root walk (#10240)

Instead of a separate walker for each root in a multi-root walk, use a
single walker.

* fix: dont auto-enable web_search for azure (#10266)

seeing issues with azure after default-enabling web search: #10071,
#10257.

need to work with azure to fix api-side, for now turning off
default-enable of web_search for azure.

diff is big because i moved logic to reuse

* fix: update file search directory when session CWD changes (#9279)

## Summary

Fixes #9041

- Adds update_search_dir() method to FileSearchManager to allow updating
the search directory after initialization
- Calls this method when the session CWD changes: new session, resume,
or fork

## Problem

The FileSearchManager was created once with the initial search_dir and
never updated. When a user:

1. Starts Codex in a non-git directory (e.g., /tmp/random)
2. Resumes or forks a session from a different workspace
3. The @filename lookup still searched the original directory

This caused no matches to be returned even when files existed in the
current workspace.

## Solution

Update FileSearchManager.search_dir whenever the session working
directory changes:
- AppEvent::NewSession: Use current config CWD
- SessionSelection::Resume: Use resumed session CWD
- SessionSelection::Fork: Use forked session CWD

## Test plan

- [ ] Start Codex in /tmp/test-dir (non-git)
- [ ] Resume a session from a project with actual files
- [ ] Verify @filename returns matches from the resumed session
directory

---------

Co-authored-by: Eric Traut <etraut@openai.com>

* Validate CODEX_HOME before resolving (#10249)

Summary
- require `CODEX_HOME` to point to an existing directory before
canonicalizing and surface clear errors otherwise
- share the same helper logic in both `core` and `rmcp-client` and add
unit tests that cover missing, non-directory, valid, and default paths

This addresses #9222

* chore: implement Mul for TruncationPolicy (#10272)

Codex thought this was a good idea while working on
https://github.com/openai/codex/pull/10192.

* Wire up cloud reqs in exec, app-server (#10241)

We're fetching cloud requirements in TUI in
https://github.com/openai/codex/pull/10167.

This adds the same fetching in exec and app-server binaries also.

* Add enforce_residency to requirements (#10263)

Add `enforce_residency` to requirements.toml and thread it through to a
header on `default_client`.

* add missing fields to WebSearchAction and update app-server types (#10276)

- add `WebSearchAction` to app-server v2 types
- add `queries` to `WebSearchAction::Search` type

Updated tests.

* Turn on cloud requirements for business too (#10283)

Need to check "enterprise" and "business"

* Fix minor typos in comments and documentation (#10287)

## Summary

I have read the contribution guidelines.  
All changes in this PR are limited to text corrections and do not modify
any business logic, runtime behavior, or user-facing functionality.

## Details

This PR fixes several minor typos, including:

- `create` -> `crate`
- `analagous` -> `analogous`
- `apply-patch` -> `apply_patch`
- `codecs` -> `codex`
- ` '/" ` -> ` '/' `
- `Respesent` -> `Represent`

* feat(core) Smart approvals on (#10286)

## Summary
Turn on Smart Approvals by default

## Testing
 - [x] Updated unit tests

* feat: show runtime metrics in console (#10278)

Summary of changes:

- Adds a new feature flag: runtime_metrics
  - Declared in core/src/features.rs
  - Added to core/config.schema.json
  - Wired into OTEL init in core/src/otel_init.rs

- Enables on-demand runtime metric snapshots in OTEL
  - Adds runtime_metrics: bool to otel/src/config.rs
  - Enables experimental custom reader features in otel/Cargo.toml
  - Adds snapshot/reset/summary APIs in:
    - otel/src/lib.rs
    - otel/src/metrics/client.rs
    - otel/src/metrics/config.rs
    - otel/src/metrics/error.rs

- Defines metric names and a runtime summary builder
  - New files:
    - otel/src/metrics/names.rs
    - otel/src/metrics/runtime_metrics.rs
  - Summarizes totals for:
    - Tool calls
    - API requests
    - SSE/streaming events

- Instruments metrics collection in OTEL manager
  - otel/src/traces/otel_manager.rs now records:
    - API call counts + durations
    - SSE event counts + durations (success/failure)
    - Tool call metrics now use shared constants

- Surfaces runtime metrics in the TUI
  - Resets runtime metrics at turn start in tui/src/chatwidget.rs
- Displays metrics in the final separator line in
tui/src/history_cell.rs

- Adds tests
  - New OTEL tests:
    - otel/tests/suite/snapshot.rs
    - otel/tests/suite/runtime_summary.rs
  - New TUI test:
- final_message_separator_includes_runtime_metrics in
tui/src/history_cell.rs

Scope:
- 19 files changed
- ~652 insertions, 38 deletions


<img width="922" height="169" alt="Screenshot 2026-01-30 at 4 11 34 PM"
src="https://github.com/user-attachments/assets/1efd754d-a16d-4564-83a5-f4442fd2f998"
/>

* display promo message in usage error (#10285)

If a promo message is attached to a rate limit response, then display it
in the error message.

* fix(nix): update flake for newer Rust toolchain requirements (#10302)

## Summary

- Add rust-overlay input to provide newer Rust versions (rama crates
require rustc 1.91.0+)
- Add devShells output with complete development environment
- Add missing git dependency hashes to codex-rs/default.nix

## Changes

**flake.nix:**
- Added `rust-overlay` input to get newer Rust toolchains
- Updated `packages` output to use `rust-bin.stable.latest.minimal` for
builds
- Added `devShells` output with:
  - Rust with `rust-src` and `rust-analyzer` extensions for IDE support
- Required build dependencies: `pkg-config`, `openssl`, `cmake`,
`libclang`
  - Environment variables: `PKG_CONFIG_PATH`, `LIBCLANG_PATH`

**codex-rs/default.nix:**
- Added missing `outputHashes` for git dependencies:
  - `nucleo-0.5.0`, `nucleo-matcher-0.3.1`
  - `runfiles-0.1.0`
  - `tokio-tungstenite-0.28.0`, `tungstenite-0.28.0`

## Test Plan

- [x] `nix develop` enters shell successfully
- [x] `nix develop -c rustc --version` shows 1.93.0
- [x] `nix develop -c cargo build` completes successfully

* chore(features) remove Experimental tag from UTF8 (#10296)

## Summary
This has been default on for some time, it should now be the default.

## Testing
- [x] Existing tests pass

* Fix npm README image link (#10303)

### Motivation
- The image referenced in the package README was 404ing on the npm
package page because it used a relative file path that doesn't resolve
on npm, so the splash image needs a GitHub-hosted URL to render
correctly.

### Description
- Update `README.md` to replace the relative image path
`./.github/codex-cli-splash.png` with the GitHub-hosted URL
`https://github.com/openai/codex/blob/main/.github/codex-cli-splash.png`.

### Testing
- No automated tests were run because this is a docs-only change and
does not affect code or test behavior.

------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_697e58dbce34832d87c7847779e8f4a5)

* chore(app-server) add personality update test (#10306)

## Summary
Add some additional validation to ensure app-server handles Personality
changes

## Testing
- [x] These are tests

* plan mode prompt (#10308)

# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.

* chore(core) Default to friendly personality (#10305)

## Summary
Update default personality to friendly

## Testing
- [x] Unit tests pass

* feat(core,tui,app-server) personality migration (#10307)

## Summary
Keep existing users on Pragmatic, to preserve behavior while new users
default to Friendly

## Testing
- [x] Tested locally
- [x] add integration tests

* enable plan mode (#10313)

# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.

* feat: fire tracking events for skill invocation (#10120)

* feat: Support loading skills from .agents/skills (#10317)

This PR adds support for loading
[skills](https://developers.openai.com/codex/skills) from
`.agents/skills/`.
- Issue: https://github.com/agentskills/agentskills/issues/15
- Motivation: When skills live on the filesystem, sharing them across
agents is awkward and often ends up requiring symlinks/duplication. A
single location under `.agents/` makes it easier to share skills.
- Loading from `.codex/skills/` will remain but will be deprecated soon.
The change only applies to the [REPO
scope](https://developers.openai.com/codex/skills#where-to-save-skills).
- Documentation will be updated before this change is live.

Testing with skills in two locations of this repo:
<img width="960" height="152" alt="image"
src="https://github.com/user-attachments/assets/28975ff9-7363-46dd-ad40-f4c7bfdb8234"
/>

When starting Codex with CWD in `$repo_root` (should only pick up at
root):
<img width="513" height="143" alt="image"
src="https://github.com/user-attachments/assets/389e1ea7-020c-481e-bda0-ce58562db59f"
/>

When starting Codex with CWD in `$repo_root/codex-rs` (should pick up at
cwd and crawl up to root):
<img width="552" height="177" alt="image"
src="https://github.com/user-attachments/assets/a5beb8de-11b4-45ed-8660-80707c77006a"
/>

* Make skills prompt explicit about relative-path lookup (#10282)

Fix cases where the model tries to locate skill scripts from the cwd and
fails.

* Add websocket telemetry metrics and labels (#10316)

Summary
- expose websocket telemetry hooks through the responses client so
request durations and event processing can be reported
- record websocket request/event metrics and emit runtime telemetry
events that the history UI now surfaces
- improve tests to cover websocket telemetry reporting and guard runtime
summary updates


<img width="824" height="79" alt="Screenshot 2026-01-31 at 5 28 12 PM"
src="https://github.com/user-attachments/assets/ea9a7965-d8b4-4e3c-a984-ef4fdc44c81d"
/>

* chore(config) Rename config setting to personality (#10314)

## Summary
Let's make the setting name consistent with the SlashCommand!

## Testing
- [x] Updated tests

* chore(features) Personality => Stable (#10310)

## Summary
Bump `/personality` to stable

## Testing
 - [x] unit tests pass

* Sync system skills from public repo (#10320)

Syncs the system skills included in Codex with the updates in
https://github.com/openai/skills/tree/main/skills/.system

* Sync system skills from public repo for openai yaml changes (#10322)

Follow-up to https://github.com/openai/codex/pull/10320

Syncing additional changes from
https://github.com/openai/skills/tree/main/skills/.system

* fix(config) config schema newline (#10323)

## Summary
Looks like we may have introduced a formatting issue in recent PRs.

## Testing
- [x] ran `just write-config-schema`

* Improve plan mode interaction rules (#10329)

## Summary
- Replace the “Hard interaction rule” with a clearer “Response
constraints” section that enumerates the allowed exceptions for Plan
Mode replies.
- Remove the stray Phase 1 exception line about simple questions.
- Update plan content requirements to ask for a brief summary section
and generalize API/type wording.

* Bump thread updated_at on unarchive to refresh sidebar ordering (#10280)

## Summary
- Touch restored rollout files on `thread/unarchive` so `updatedAt`
reflects the unarchive time.
- Add a regression test to ensure unarchiving bumps `updated_at` from an
old mtime.

## Notes
This fixes the UX issue where unarchived old threads don’t reappear near
the top of recent threads.

* fix: System skills marker includes nested folders recursively (#10350)

Updated system skills bundled with Codex were not correctly replacing
the user's skills in their .system folder.

- Fix `.codex-system-skills.marker` not updating by hashing embedded
system skills recursively (nested dirs + file contents), so updates
trigger a reinstall.
- Added a build Cargo hook to rerun if there are changes in
`src/skills/assets/samples/*`, ensuring embedded skill updates rebuild
correctly under caching.
- Add a small unit test to ensure nested entries are included in the
fingerprint.

* fix(rules) Limit rules listed in conversation (#10351)

## Summary
We should probably warn users that they have a million rules, and help
clean them up. But for now, we should handle this unbounded case.

Limit rules listed in conversations, with shortest / broadest rules
first.

## Testing
- [x] Updated unit tests

* Do not append items on override turn context (#10354)

* fix(core) Deduplicate prefix_rules before appending (#10309)

## Summary
We ideally shouldn't make it to this point in the first place, but if we
do try to append a rule that already exists, we shouldn't append the
same rule twice.

## Testing
- [x] Added unit test for this case

* chore(core) gpt-5.2-codex personality template (#10373)

## Summary
Consolidate prompts

## Testing
- [x] Existing tests pass

* fix(personality) prompt patch (#10375)

## Summary
We had 2 typos in #10373

## Testing
- [x] unit tests pass

* feat: vendor app-server protocol schema fixtures (#10371)

Similar to what @sayan-oai did in openai/codex#8956 for
`config.schema.json`, this PR updates the repo so that it includes the
output of `codex app-server generate-json-schema` and `codex app-server
generate-ts` and adds a test to verify it is in sync with the current
code.

Motivation:
- This makes any schema changes introduced by a PR transparent during
code review.
- In particular, this should help us catch PRs that would introduce a
non-backwards-compatible change to the app schema (eventually, this
should also be enforced by tooling).
- Once https://github.com/openai/codex/pull/10231 is in to formalize the
notion of "experimental" fields, we can work on ensuring the
non-experimental bits are backwards-compatible.

`codex-rs/app-server-protocol/tests/schema_fixtures.rs` was added as the
test and `just write-app-server-schema` can be use to generate the
vendored schema files.

Incidentally, when I run:

```
rg _ codex-rs/app-server-protocol/schema/typescript/v2
```

I see a number of `snake_case` names that should be `camelCase`.

* Session picker shows thread_name if set (#10340)

- shows names of threads in the ResumePicker used by `/resume` and
`codex resume` if set, default to preview (previous behaviour) if none
- adds a `find_thread_names_by_ids` that maps names to IDs in
`codex-rs/core/src/rollout/session_index.rs`. It reads sequentially in
normal (instead of reverse order in `codex resume <name>`) the index
mapping file. This function is called from a list of session (default
page is 25, pages loaded depends of height of terminal), for which most
of them will always have at least one session unnamed and require the
whole file to be read therefore. Could be better and sqlite integration
will make this better
- those reads won't be needed when leveraging sqlite
 

Opened questions:
- We could rename the TUI "Conversation" column to "Name" or "Thread"
that would feel more accurate. Could be a fast-follow if we implement
auto-naming as it'll always be a name instead?

* chore: collab experimental (#10381)

* feat: experimental flags (#10231)

## Problem being solved
- We need a single, reliable way to mark app-server API surface as
experimental so that:
  1. the runtime can reject experimental usage unless the client opts in
2. generated TS/JSON schemas can exclude experimental methods/fields for
stable clients.

Right now that’s easy to drift or miss when done ad-hoc.

## How to declare experimental methods and fields
- **Experimental method**: add `#[experimental("method/name")]` to the
`ClientRequest` variant in `client_request_definitions!`.
- **Experimental field**: on the params struct, derive `ExperimentalApi`
and annotate the field with `#[experimental("method/name.field")]` + set
`inspect_params: true` for the method variant so
`ClientRequest::experimental_reason()` inspects params for experimental
fields.

## How the macro solves it
- The new derive macro lives in
`codex-rs/codex-experimental-api-macros/src/lib.rs` and is used via
`#[derive(ExperimentalApi)]` plus `#[experimental("reason")]`
attributes.
- **Structs**:
- Generates `ExperimentalApi::experimental_reason(&self)` that checks
only annotated fields.
  - The “presence” check is type-aware:
    - `Option<T>`: `is_some_and(...)` recursively checks inner.
    - `Vec`/`HashMap`/`BTreeMap`: must be non-empty.
    - `bool`: must be `true`.
    - Other types: considered present (returns `true`).
- Registers each experimental field in an `inventory` with `(type_name,
serialized field name, reason)` and exposes `EXPERIMENTAL_FIELDS` for
that type. Field names are converted from `snake_case` to `camelCase`
for schema/TS filtering.
- **Enums**:
- Generates an exhaustive `match` returning `Some(reason)` for annotated
variants and `None` otherwise (no wildcard arm).
- **Wiring**:
- Runtime gating uses `ExperimentalApi::experimental_reason()` in
`codex-rs/app-server/src/message_processor.rs` to reject requests unless
`InitializeParams.capabilities.experimental_api == true`.
- Schema/TS export filters use the inventory list and
`EXPERIMENTAL_CLIENT_METHODS` from `client_request_definitions!` to
strip experimental methods/fields when `experimental_api` is false.

* nit: shell snapshot retention to 3 days (#10382)

* fix: thread listing (#10383)

* fix: Rfc3339 casting (#10386)

* feat: add MCP protocol types and rmcp adapters (#10356)

Currently, types from our custom `mcp-types` crate are part of some of
our APIs:


https://github.com/openai/codex/blob/03fcd12e77fedf4fa327af27e2e476e1ebc5f651/codex-rs/app-server-protocol/src/protocol/v2.rs#L43-L46

To eliminate this crate in #10349 by switching to `rmcp`, we need our
own wrappers for the `rmcp` types that we can use in our API, which is
what this PR does.

Note this PR introduces the new API types, but we do not make use of
them until #10349.





---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10356).
* #10357
* #10349
* __->__ #10356

* Nicer highlighting of slash commands, /plan accepts prompt args and pasted images (#10269)

## Summary
- Make typed slash commands become text elements when the user hits
space, including paste‑burst spaces.
- Enable `/plan` to accept inline args and submit them in plan mode,
mirroring `/review` behavior and blocking submission while a task is
running.
- Preserve text elements/attachments for slash commands that take args.

<img width="1510" height="500" alt="image"
src="https://github.com/user-attachments/assets/446024df-b69a-4249-85db-1a85110e07f1"
/>

## Changes
- Add safe helper to insert element ranges in the textarea.
- Extend command‑with‑args pipeline to carry text elements and reuse
submission prep.
- Update `/plan` dispatch to switch to plan mode then submit prompt +
elements.
- Document new composer behavior and add tests.

## Notes
- `/plan` is blocked during active tasks (same as `/review`).
- Slash‑command elementization recognizes built‑ins and `/prompts:`
custom commands only.

## Codex author
`codex fork 019c16d3-4520-7bb0-9b9d-48720d40a8ab`

* Add credits tooltip (#10274)

* chore: ignore synthetic messages (#10394)

This will be fixed once this is settled:
https://www.notion.so/openai/Artificial-context-management-2fb8e50b62b080db8b8ed93b3b19d1a2#2fb8e50b62b080d2bffce2dd1e60972b

* feat: drop sqlx logging (#10398)

* Select experimental features with space (#10281)

* feat: add `--experimental` to `generate-ts` (#10402)

Adding a `--experimental` flag to the `generate-ts` fct in the
app-sever.

It can be called through one of those 2 command
```
just write-app-server-schema --experimental
codex app-server generate-ts --experimental
```

* fix: unsafe auto-approval of git commands (#10258)

fixes https://github.com/openai/codex/issues/10160 and some more.

## Description

Hardens Git command safety to prevent approval bypasses for destructive
or write-capable invocations (branch delete, risky push forms,
output/config-override flags), so these commands no longer auto-run as
“safe.”

- `git branch -d` variants (especially in worktrees / with global
options like -C / -c)
- `git show|diff|log --output` ... style file-write flags
- risky Git config override flags (-c, --config-env) that can trigger
external execution
- dangerous push forms that weren’t fully caught (`--force*`,
`--delete`, `+refspec`, `:refspec`)
- grouped short-flag delete forms (e.g. stacked branch flags containing
`d/D`)

will fast follow with a common git policy to bring windows to parity.

---------

Co-authored-by: Eric Traut <etraut@openai.com>

* Updated labeler workflow prompt to include "app" label (#10411)

Support for desktop app issues

* emit a separate metric when the user cancels UAT during elevated setup (#10399)

Currently this shows up as elevated setup failure, which isn't quite
accurate.

* chore(tui) /personalities tip (#10377)

## Summary
We have /personality now.

## Testing
- [x] tested locally

* [feat] persist thread_dynamic_tools in db (#10252)

Persist thread_dynamic_tools in sqlite and read first from it. Fall back
to rollout files if it's not found. Persist dynamic tools to both sqlite
and rollout files.

Saw that new sessions get populated to db correctly & old sessions get
backfilled correctly at startup:
```
celia@com-92114 codex-rs % sqlite3 ~/.codex/state.sqlite \      "select thread_id, position,name,description,input_schema from thread_dynamic_tools;"
019c0cad-ec0d-74b2-a787-e8b33a349117|0|geo_lookup|lookup a city|{"properties":{"city":{"type":"string"}},"required":["city"],"type":"object"}
....
019c10ca-aa4b-7620-ae40-c0919fbd7ea7|0|geo_lookup|lookup a city|{"properties":{"city":{"type":"string"}},"required":["city"],"type":"object"}
```

* feat: Read personal skills from .agents/skills (#10437)

- Issue: https://github.com/agentskills/agentskills/issues/15
- Follow-up to https://github.com/openai/codex/pull/10317 (for team/repo
skills)
- This change now also loads personal/user skills from
`$HOME/.agents/skills` (or `~/.agents/skills`) in addition to loading
from `.agents/skills` inside of git repos.
- The location of `.system` skills remains unchanged.
- Keeping backwards compatibility with `~/.codex/skills` for now until
we fully deprecate.

With skills in both personal folders:
<img width="831" height="421" alt="image"
src="https://github.com/user-attachments/assets/ad8ac918-bfe6-4a2d-8a8e-d608c9d3d701"
/>

We load from both places:
<img width="607" height="236" alt="image"
src="https://github.com/user-attachments/assets/480f4db0-ae64-4dc1-bdf5-c5de98c16f5c"
/>

* make codex better at git (#10145)

adds basic git context to the session prefix so the model can anchor git
actions and be a bit more version-aware. structured it in a
multiroot-friendly shape even though we only have one root today

* Add `codex app` macOS launcher (#10418)

- Add `codex app <path>` to launch the Codex Desktop app.
- On macOS, auto-downloads the DMG if missing; non-macOS prints a link
to chatgpt.com/codex.

* Fix plan implementation prompt reappearing after /agent thread switch (#10447)

## Summary

This fixes a UX bug (https://github.com/openai/codex/issues/10442) where
the **"Implement this plan?"** prompt could reappear after switching
agents with `/agent` and then switching back to the original agent
during plan execution.

## Root Cause

On thread switch, the TUI rebuilds `ChatWidget`, replays buffered thread
events, then drains any queued live events.

In this flow, a `TurnComplete` can be handled twice for the same logical
turn:
1. replayed (`from_replay = true`)
2. then live (`from_replay = false`)

`ChatWidget` used `saw_plan_item_this_turn` to decide whether to show
the plan implementation prompt, but that flag was only reset on
`TurnStarted`.
If duplicate completion events occurred, stale `saw_plan_item_this_turn
= true` could cause the prompt to re-trigger unexpectedly.

## Fix

- Clear `saw_plan_item_this_turn` at the end of `on_task_complete`,
after prompt gating runs.
- This keeps the flag truly turn-scoped and prevents duplicate
`TurnComplete` handling from reopening the prompt.

* TUI: Render request_user_input results in history and simplify interrupt handling (#10064)

## Summary
This PR improves the TUI experience for `request_user_input` by
rendering submitted question/answer sets directly in conversation
history with clear, structured formatting.

It also intentionally simplifies interrupt behavior for now: on `Esc` /
`Ctrl+C`, the questions overlay interrupts the turn without attempting
to submit partial answers.

<img width="1344" height="573" alt="Screenshot 2026-02-02 at 4 51 40 PM"
src="https://github.com/user-attachments/assets/ff752131-7060-44c1-9ded-af061969a533"
/>

## Scope
- TUI-only changes.
- No core/protocol/app-server behavior changes in this PR.
- Resume reconstruction of interrupted question sets is out of scope for
this PR.

## What Changed
- Added a new history cell: `RequestUserInputResultCell` in
`codex-rs/tui/src/history_cell.rs`.
- On normal `request_user_input` submission, TUI now inserts that
history cell immediately after sending `Op::UserInputAnswer`.
- Rendering includes a `Questions` header with `answered/total` count.
- Rendering shows each question as a bullet item.
- Rendering styles submitted answer lines in cyan.
- Rendering styles notes (for option questions) as `note:` lines in
cyan.
- Rendering styles freeform text (for no-option questions) as `answer:`
lines in cyan.
- Rendering dims only the `(unanswered)` suffix.
- Rendering can include an interrupted suffix and summary text when the
cell is marked interrupted.
- Rendering redacts secret questions as `••••••` instead of showing raw
values.
- Added `wrap_with_prefix(...)` in `history_cell.rs` for wrapped
prefixed lines.
- Added `split_request_user_input_answer(...)` in `history_cell.rs` for
decoding `"user_note: ..."` entries.

## Interrupt Behavior (Intentional for this PR)
- `Esc` / `Ctrl+C` in the questions overlay now performs `Op::Interrupt`
and exits the overlay.
- It does **not** submit partial/committed answers on interrupt.
- Added TODO comments in `request_user_input` overlay interrupt paths
indicating where interrupted partial result emission should be
reintroduced once core support is finalized.
- Queued `request_user_input` overlays are discarded on interrupt in the
current behavior.

## Tests Updated
- Updated/added overlay tests in
`codex-rs/tui/src/bottom_pane/request_user_input/mod.rs` to reflect
interrupt-only behavior.
- Added helper assertion for interrupt-only event expectation.
- Existing submission-path tests now validate history insertion behavior
and expected answer maps.

## Behavior Notes
- Completed question flows now produce a readable `Questions` block in
transcript history.
- Interrupted flows currently do not persist partial answers to
model-visible tool output.

## Follow-ups
- Reintroduce partial-answer-on-interrupt semantics once core can
persist/sequence interrupted `request_user_input` outputs safely.
- Optionally add replay/resume rendering for interrupted question sets
as a separate PR.

## Codex author
`codex fork 019bfb8d-2a65-7313-9be2-ea7100d19a61`

* feat: replace custom mcp-types crate with equivalents from rmcp (#10349)

We started working with MCP in Codex before
https://crates.io/crates/rmcp was mature, so we had our own crate for
MCP types that was generated from the MCP schema:


https://github.com/openai/codex/blob/8b95d3e082376f4cb23e92641705a22afb28a9da/codex-rs/mcp-types/README.md

Now that `rmcp` is more mature, it makes more sense to use their MCP
types in Rust, as they handle details (like the `_meta` field) that our
custom version ignored. Though one advantage that our custom types had
is that our generated types implemented `JsonSchema` and `ts_rs::TS`,
whereas the types in `rmcp` do not. As such, part of the work of this PR
is leveraging the adapters between `rmcp` types and the serializable
types that are API for us (app server and MCP) introduced in #10356.

Note this PR results in a number of changes to
`codex-rs/app-server-protocol/schema`, which merit special attention
during review. We must ensure that these changes are still
backwards-compatible, which is possible because we have:

```diff
- export type CallToolResult = { content: Array<ContentBlock>, isError?: boolean, structuredContent?: JsonValue, };
+ export type CallToolResult = { content: Array<JsonValue>, structuredContent?: JsonValue, isError?: boolean, _meta?: JsonValue, };
```

so `ContentBlock` has been replaced with the more general `JsonValue`.
Note that `ContentBlock` was defined as:

```typescript
export type ContentBlock = TextContent | ImageContent | AudioContent | ResourceLink | EmbeddedResource;
```

so the deletion of those individual variants should not be a cause of
great concern.

Similarly, we have the following change in
`codex-rs/app-server-protocol/schema/typescript/Tool.ts`:

```
- export type Tool = { annotations?: ToolAnnotations, description?: string, inputSchema: ToolInputSchema, name: string, outputSchema?: ToolOutputSchema, title?: string, };
+ export type Tool = { name: string, title?: string, description?: string, inputSchema: JsonValue, outputSchema?: JsonValue, annotations?: JsonValue, icons?: Array<JsonValue>, _meta?: JsonValue, };
```

so:

- `annotations?: ToolAnnotations` ➡️ `JsonValue`
- `inputSchema: ToolInputSchema` ➡️ `JsonValue`
- `outputSchema?: ToolOutputSchema` ➡️ `JsonValue`

and two new fields: `icons?: Array<JsonValue>, _meta?: JsonValue`

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10349).
* #10357
* __->__ #10349
* #10356

* Fixed sandbox mode inconsistency if untrusted is selected (#10415)

This PR addresses #10395

When a user is asked to pick the trust level of a project, the code
currently reloads the config if they select "trusted". It doesn't reload
the config in the "untrusted" case but should. This causes the sandbox
mode to be reported incorrectly in `/status` during the first run (it's
displayed as `read-only` even though it acts as though it's
`workspace-write`).

* Hide short worked-for label in final separator (#10452)

- Hide the "Worked for" label in the final message separator unless
elapsed time is over one minute.\n- Update/add tests to cover both
hidden (<60s) and shown (>=61s) behavior.

* chore: remove deprecated mcp-types crate (#10357)

https://github.com/openai/codex/pull/10349 migrated us off of
`mcp-types`, so this PR deletes the code.

---
[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed
with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10357).
* __->__ #10357
* #10349
* #10356

* app tool tip (#10454)

# External (non-OpenAI) Pull Request Requirements

Before opening this Pull Request, please read the dedicated
"Contributing" markdown file or your PR may be closed:
https://github.com/openai/codex/blob/main/docs/contributing.md

If your PR conforms to our contribution guidelines, replace this text
with a detailed and high quality description of your changes.

Include a link to a bug report or enhancement request.

* chore: add phase to message responseitem (#10455)

### What

add wiring for `phase` field on `ResponseItem::Message` to lay
groundwork for differentiating model preambles and final messages.
currently optional.

follows pattern in #9698.

updated schemas with `just write-app-server-schema` so we can see type
changes.

### Tests
Updated existing tests for SSE parsing and hydrating from history

* Require models refresh on cli version mismatch (#10414)

* [Codex][CLI] Gate image inputs by model modalities (#10271)

###### Summary

- Add input_modalities to model metadata so clients can determine
supported input types.
- Gate image paste/attach in TUI when the selected model does not
support images.
- Block submits that include images for unsupported models and show a
clear warning.
- Propagate modality metadata through app-server protocol/model-list
responses.
  - Update related tests/fixtures.

  ###### Rationale

  - Models support different input modalities.
- Clients need an explicit capability signal to prevent unsupported
requests.
- Backward-compatible defaults preserve existing behavior when modality
metadata is absent.

  ###### Scope

  - codex-rs/protocol, codex-rs/core, codex-rs/tui
  - codex-rs/app-server-protocol, codex-rs/app-server
  - Generated app-server types / schema fixtures

  ###### Trade-offs

- Default behavior assumes text + image when field is absent for
compatibility.
  - Server-side validation remains the source of truth.

  ###### Follow-up

- Non-TUI clients should consume input_modalities to disable unsupported
attachments.
- Model catalogs should explicitly set input_modalities for text-only
models.

  ###### Testing

  - cargo fmt --all
  - cargo test -p codex-tui
  - env -u GITHUB_APP_KEY cargo test -p codex-core --lib
  - just write-app-server-schema
- cargo run -p codex-cli --bin codex -- app-server generate-ts --out
app-server-types
  - test against local backend
  
<img width="695" height="199" alt="image"
src="https://github.com/user-attachments/assets/d22dd04f-5eba-4db9-a7c5-a2506f60ec44"
/>

---------

Co-authored-by: Josh McKinney <joshka@openai.com>

* Trim compaction input (#10374)

Two fixes:

1. Include trailing tool output in the total context size calculation.
Otherwise when checking whether compaction should run we ignore newly
added outputs.
2. Trim trailing tool output/tool calls until we can fit the request
into the model context size. Otherwise the compaction endpoint will fail
to compact. We only trim items that can be reproduced again by the model
(tool calls, tool call outputs).

* Updated bug and feature templates (#10453)

The current bug template uses CLI-specific instructions for getting the
version.

The current feature template doesn't ask the user to provide the Codex
variant (surface) they are using.

This PR addresses these problems.

* Restore status after preamble (#10465)

* fix: clarify deprecation message for features.web_search (#10406)

clarify that the new `web_search` is not a feature flag under
`[features]` in the deprecation CTA

* Ignore remote_compact_trims_function_call_history_to_fit_context_window on windows (#10474)

* feat(linux-sandbox): vendor bubblewrap and wire it with FFI (#10413)

## Summary

Vendor Bubblewrap into the repo and add minimal build plumbing in
`codex-linux-sandbox` to compile/link it.

## Why

We want to move Linux sandboxing toward Bubblewrap, but in a safe
two-step rollout:
1) vendoring/build setup (this PR),  
2) runtime integration (follow-up PR).

## Included

- Add `codex-rs/vendor/bubblewrap` sources.
- Add build-time FFI path in `codex-rs/linux-sandbox`.
- Update `build.rs` rerun tracking for vendored files.
- Small vendored compile warning fix (`sockaddr_nl` full init).

follow up in https://github.com/openai/codex/pull/9938

* feat(secrets): add codex-secrets crate (#10142)

## Summary
This introduces the first working foundation for Codex managed secrets:
a small Rust crate that can securely store and retrieve secrets locally.

Concretely, it adds a `codex-secrets` crate that:
- encrypts a local secrets file using `age`
- generates a high-entropy encryption key
- stores that key in the OS keyring

## What this enables
- A secure local persistence model for secrets
- A clean, isolated place for future provider backends
- A clear boundary: Codex can become a credential broker without putting
plaintext secrets in config files

## Implementation details
- New crate: `codex-rs/secrets/`
- Encryption: `age` with scrypt recipient/identity
- Key generation: `OsRng` (32 random bytes)
- Key storage: OS keyring via `codex-keyring-store`

## Testing
- `cd codex-rs && just fmt`
- `cd codex-rs && cargo test -p codex-secrets`

* chore: nuke chat/completions API (#10157)

* feat: drop wire_api from clients (#10498)

* feat: clean codex-api part 1 (#10501)

* Add more detail to 401 error (#10508)

Add the error.message if it exists, the body otherwise. Truncate body to
1k characters. Print the cf-ray and the requestId.

**Before:**
<img width="860" height="305" alt="Screenshot 2026-02-03 at 13 15 28"
src="https://github.com/user-attachments/assets/949d5a4d-2b51-488c-a723-c6deffde0353"
/>

**After:**
<img width="1523" height="373" alt="Screenshot 2026-02-03 at 13 15 38"
src="https://github.com/user-attachments/assets/f96a747e-e596-4a7a-aae9-64210d805b26"
/>

* Avoid redundant transactional check before inserting dynamic tools (#10521)

Summary
- remove the extra transaction guard that checked for existing dynamic
tools per thread before inserting new ones
- insert each tool record with `ON CONFLICT(thread_id, position) DO
NOTHING` to ignore duplicates instead of pre-querying
- simplify execution to use the shared pool directly and avoid unneeded
commits

Testing
- Not run (not requested)

* chore: update bytes crate in response to security advisory (#10525)

While here, remove one advisory from `deny.toml` that has been addressed
(it was showing up as a warning).

* fix WebSearchAction type clash between v1 and v2 (#10408)

type clash; app-server generated types were still using the v1
snake_case `WebSearchAction`, so there was a mismatch between the
camelCase emitted types and the snake_case types we were trying to
parse.

Updated v2 `WebSearchAction` to export into the `v2/` type set and
updated `ThreadItem` to use that.

### Tests
Ran new `just write-app-server-schema` to surface changes to schema, the
import looks correct now.

* Cleanup collaboration mode variants (#10404)

## Summary

This PR simplifies collaboration modes to the visible set `default |
plan`, while preserving backward compatibility for older partners that
may still send legacy mode
names.

Specifically:
- Renames the old Code behavior to **Default**.
- Keeps **Plan** as-is.
- Removes **Custom** mode behavior (fallbacks now resolve to Default).
- Keeps `PairProgramming` and `Execute` internally for compatibility
plumbing, while removing them from schema/API and UI visibility.
- Adds legacy input aliasing so older clients can still send old mode
names.

## What Changed

1. Mode enum and compatibility
- `ModeKind` now uses `Plan` + `Default` as active/public modes.
- `ModeKind::Default` deserialization accepts legacy values:
  - `code`
  - `pair_programming`
  - `execute`
  - `custom`
- `PairProgramming` and `Execute` variants remain in code but are hidden
from protocol/schema generation.
- `Custom` variant is removed; previous custom fallbacks now map to
`Default`.

2. Collaboration presets and templates
- Built-in presets now return only:
  - `Plan`
  - `Default`
- Template rename:
  - `core/templates/collaboration_mode/code.md` -> `default.md`
- `execute.md` and `pair_programming.md` remain on disk but are not
surfaced in visible preset lists.

3. TUI updates
- Updated user-facing naming and prompts from “Code” to “Default”.
- Updated mode-cycle and indicator behavior to reflect only visible
`Plan` and `Default`.
- Updated corresponding tests and snapshots.

4. request_user_input behavior
- `request_user_input` remains allowed only in `Plan` mode.
- Rejection messaging now consistently treats non-plan modes as
`Default`.

5. Schemas
- Regenerated config and app-server schemas.
- Public schema types now advertise mode values as:
  - `plan`
  - `default`

## Backward Compatibility Notes

- Incoming legacy mode names (`code`, `pair_programming`, `execute`,
`custom`) are accepted and coerced to `default`.
- Outgoing/public schema surfaces intentionally expose only `plan |
default`.
- This allows tolerant ingestion of older partner payloads while
standardizing new integrations on the reduced mode set.

## Codex author
`codex fork 019c1fae-693b-7840-b16e-9ad38ea0bd00`

* Enable parallel shell tools (#10505)

Summary
- mark the shell-related tools as supporting parallel tool calls so
exec_command, shell_command, etc. can run concurrently
- update expectations in tool parallelism tests to reflect the new
parallel behavior
- drop the unused serial duration helper from the suite

Testing
- Not run (not requested)

* feat: `find_thread_path_by_id_str_in_subdir` from DB (#10532)

* fix: make $PWD/.agents read-only like $PWD/.codex (#10524)

In light of https://github.com/openai/codex/pull/10317, because
`.agents` can include resources that Codex can run in a privileged way,
it should be read-only by default just as `.codex` is.

* Inject CODEX_THREAD_ID into the terminal environment (#10096)

Inject CODEX_THREAD_ID (when applicable) into the terminal environment
so that the agent (and skills) can refer to the current thread / session
ID.

Discussion:
https://openai.slack.com/archives/C095U48JNL9/p1769542492067109

* Revert "Load untrusted rules" (#10536)

Reverts openai/codex#9791

* fix(app-server): fix TS annotations for optional fields on requests (#10412)

This updates our generated TypeScript types to be more correct with how
the server actually behaves, **specifically for JSON-RPC requests**.

Before this PR, we'd generate `field: T | null`. After this PR, we will
have `field?: T | null`. The latter matches how the server actually
works, in that if an optional field is omitted, the server will treat it
as null. This also makes it less annoying in theory for clients to
upgrade to newer versions of Codex, since adding a new optional field to
a JSON-RPC request should not require a client change.

NOTE: This only applies to JSON-RPC requests. All other payloads (i.e.
responses, notifications) will return `field: T | null` as usual.

* fix(app-server): fix approval events in review mode (#10416)

One of our partners flagged that they were seeing the wrong order of
events when running `review/start` with command exec approvals:
```
{"method":"item/commandExecution/requestApproval","id":0,"params":{"threadId":"019c0b6b-6a42-7c02-99c4-98c80e88ac27","turnId":"0","itemId":"0","reason":"`/bin/zsh -lc 'git show b7a92b4eacf262c575f26b1e1ed621a357642e55 --stat'` requires approval: Xcode-required approval: Require explicit user confirmation for all commands.","proposedExecpolicyAmendment":null}}

{"method":"item/started","params":{"item":{"type":"commandExecution","id":"call_AEjlbHqLYNM7kbU3N6uw1CNi","command":"/bin/zsh -lc 'git show b7a92b4eacf262c575f26b1e1ed621a357642e55 --stat'","cwd":"/Users/devingreen/Desktop/SampleProject","processId":null,"status":"inProgress","commandActions":[{"type":"unknown","command":"git show b7a92b4eacf262c575f26b1e1ed621a357642e55 --stat"}],"aggregatedOutput":null,"exitCode":null,"durationMs":null},"threadId":"019c0b6b-6a42-7c02-99c4-98c80e88ac27","turnId":"0"}}
```

**Key fix**: In the review sub‑agent delegate we were forwarding exec
(and patch) approvals using the parent turn id (`parent_ctx.sub_id`) as
the approval call_id. That made
`item/commandExecution/requestApproval.itemId` differ from the actual
`item/started` id. We now forward the sub‑agent’s `call_id` from the
approval event instead, so the approval item id matches the
commandExecution item id in review flows.

Here’s the expected event order for an inline `review/start` that
triggers an exec approval after this fix:
1. Response to review/start (JSON‑RPC response)
- Includes `turn` (status inProgress) and `review_thread_id` (same as
parent thread for inline).
2. `turn/started` notification
  - turnId is the review turn id (e.g., "0").
3. `item/started` → EnteredReviewMode
  - item.id == turnId, marks entry into review mode.
4. `item/started` → commandExecution
  - item.id == <call_id> (e.g., "review-call-1"), status: inProgress.
5. `item/commandExecution/requestApproval` request
  - JSON‑RPC request (not a notification).
  - params.itemId == <call_id> and params.turnId == turnId.
6. Client replies to approval request (Approved / Declined / etc).
7. If approved:
  - Optional `item/commandExecution/outputDelta` notifications.
  - `item/completed` → commandExecution with status and exitCode.
8. Review finishes:
  - `item/started` → ExitedReviewMode
  - `item/completed` → ExitedReviewMode
  - (Agent message items may also appear, depending on review output.)
9. `turn/completed` notification

The key being #4 and #5 are now in the proper order with the correct
item id.

* Improve Default mode prompt (less confusion with Plan mode) (#10545)

## Summary

This PR updates `request_user_input` behavior and Default-mode guidance
to match current collaboration-mode semantics and reduce model
confusion.

## Why

- `request_user_input` should be explicitly documented as **Plan-only**.
- Tool description and runtime availability checks should be driven by
the **same centralized mode policy**.
- Default mode prompt needed stronger execution guidance and explicit
instruction that `request_user_input` is unavailable.
- Error messages should report the **actual mode name** (not aliases
that can read as misleading).

## What changed

- Centralized `request_user_input` mode policy in `core` handler logic:
  - Added a single allowed-modes config (`Plan` only).
  - Reused that policy for:
    - runtime rejection messaging
    - tool description text
- Updated tool description to include availability constraint:
  - `"This tool is only available in Plan mode."`
- Updated runtime rejection behavior:
  - `Default` -> `"request_user_input is unavailable in Default mode"`
  - `Execute` -> `"request_user_input is unavailable in Execute mode"`
- `PairProgramming` -> `"request_user_input is unavailable in Pair
Programming mode"`
- Strengthened Default collaboration prompt:
  - Added explicit execution-first behavior
  - Added assumptions-first guidance
  - Added explicit `request_user_input` unavailability instruction
  - Added concise progress-reporting expectations
- Simplified formatting implementation:
  - Inlined allowed-mode name collection into `format_allowed_modes()`
- Kept `format_allowed_modes()` output for 3+ modes as CSV style
(`modes: a,b,c`)

* [apps] Gateway MCP should be blocking. (#10289)

Make Apps Gateway MCP blocking since otherwise app mentions may not work
when apps are not loaded. Messages sent before apps become available
will be queued.

This only affects when `apps` feature is enabled.

* implement per-workspace capability SIDs for workspace specific ACLs (#10189)

Today, there is a single capability SID that allows the sandbox to write
to
* workspace (cwd)
* tmp directories if enabled
* additional writable roots

This change splits those up, so that each workspace has its own
capability SID, while tmp and additional roots, which are
installation-wide, are still governed by the "generic" capability SID

This isolates workspaces from each other in terms of sandbox write
access.
Also allows us to protect <cwd>/.codex when codex runs in a specific
<cwd>

* Updated bug templates and added a new one for app (#10548)

* [codex] Default values from requirements if unset (#10531)

If we don't set any explicit values for sandbox or approval policy,
let's try to use a requirements-satisfying value.

* Fixed icon for CLI bug template (#10552)

* chore(arg0): advisory-lock janitor for codex tmp paths (#10039)

## Description

### What changed
- Switch the arg0 helper root from `~/.codex/tmp/path` to
`~/.codex/tmp/path2`
- Add `Arg0PathEntryGuard` to keep both the `TempDir` and an exclusive
`.lock` file alive for the process lifetime
- Add a startup janitor that scans `path2` and deletes only directories
whose lock can be acquired

### Tests
- `cargo clippy -p codex-arg0`
- `cargo clippy -p codex-core`
- `cargo test -p codex-arg0`
- `cargo test -p codex-core`

* feat: add APIs to list and download public remote skills (#10448)

Add API to list / download from remote public skills

* Handle exec shutdown on Interrupt (fixes immortal `codex exec` with websockets) (#10519)

### Motivation
- Ensure `codex exec` exits when a running turn is interrupted (e.g.,
Ctrl-C) so the CLI is not "immortal" when websockets/streaming are used.

### Description
- Return `CodexStatus::InitiateShutdown` when handling
`EventMsg::TurnAborted` in
`exec/src/event_processor_with_human_output.rs` so human-output exec
mode shuts down after an interrupt.
- Treat `protocol::EventMsg::TurnAborted` as
`CodexStatus::InitiateShutdown` in
`exec/src/event_processor_with_jsonl_output.rs` so JSONL output mode
behaves the same.
- Applied formatting with `just fmt`.

### Testing
- Ran `just fmt` successfully.
- Ran `cargo test -p codex-exec`; many unit tests ran and the test
command completed, but the full test run in this environment produced
`35 passed, 11 failed` where the failures are due to Landlock sandbox
panics and 403 responses in the test harness (environmental/integration
issues) and are not caused by the interrupt/shutdown changes.

------
[Codex
Task](https://chatgpt.com/codex/tasks/task_i_698165cec4e083258d17702bd29014c1)

* Feat: add upgrade to app server modelList (#10556)

### Summary
* Add model upgrade to listModel app server endpoint to support
dynamically show model upgrade banner.

* feat(tui): pace catch-up stream chunking with hysteresis (#10461)

## Summary
- preserve baseline streaming behavior (smooth mode still commits one
line per 50ms tick)
- extract adaptive chunking policy and commit-tick orchestration from
ChatWidget into `streaming/chunking.rs` and `streaming/commit_tick.rs`
- add hysteresis-based catch-up behavior with bounded batch draining to
reduce queue lag without bursty single-frame jumps
- document policy behavior, tuning guidance, and debug flow in rustdoc +
docs

## Testing
- just fmt
- cargo test -p codex-tui

* chore: add `codex debug app-server` tooling (#10367)

codex debug app-server <user message> forwards the message through
codex-app-server-test-client’s send_message_v2 library entry point,
using std::env::current_exe() to resolve the codex binary.

for how it looks like, see:

``…
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants