feat: add `--experimental` to `generate-ts` by jif-oai · Pull Request #10402 · openai/codex

jif-oai · 2026-02-02T19:09:50Z

Adds an `--experimental` flag to the `generate-ts` command of the app-server.

It can be invoked with either of these two commands:

just write-app-server-schema --experimental
codex app-server generate-ts --experimental
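
The description does not spell out how the flag changes the generated output. As a minimal sketch, assuming it toggles whether items marked experimental are emitted, the gating could look like the following. `FieldSpec` and `render_interface` are hypothetical names for illustration, not the app-server's actual code.

```rust
/// Hypothetical description of one field of a generated TypeScript type.
struct FieldSpec {
    name: &'static str,
    ts_type: &'static str,
    /// Assumption: experimental items carry a marker the generator can read.
    experimental: bool,
}

/// Renders a TypeScript interface. Experimental fields are skipped unless
/// `include_experimental` is true, which is what a flag such as
/// `--experimental` might set.
fn render_interface(name: &str, fields: &[FieldSpec], include_experimental: bool) -> String {
    let mut out = format!("export interface {name} {{\n");
    for field in fields
        .iter()
        .filter(|f| include_experimental || !f.experimental)
    {
        out.push_str(&format!("  {}: {};\n", field.name, field.ts_type));
    }
    out.push_str("}\n");
    out
}
```

Keeping the filter in a single place means the stable and experimental outputs cannot drift apart in anything other than the marked fields.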

jif-oai · 2026-02-02T19:09:57Z

@codex review

chatgpt-codex-connector · 2026-02-02T19:12:29Z

Codex Review: Didn't find any major issues. Hooray!

@pap-openai

- Added `codex app <path>` on macOS to launch Codex Desktop from the CLI, with automatic DMG download if it is missing. (#10418)
- Added personal skill loading from `~/.agents/skills` (with `~/.codex/skills` compatibility), plus app-server APIs/events to list and download public remote skills. (#10437, #10448)
- `/plan` now accepts inline prompt arguments and pasted images, and slash-command editing/highlighting in the TUI is more polished. (#10269)
- Shell-related tools can now run in parallel, improving multi-command execution throughput. (#10505)
- Shell executions now receive `CODEX_THREAD_ID`, so scripts and skills can detect the active thread/session. (#10096)
- Added vendored Bubblewrap + FFI wiring in the Linux sandbox as groundwork for upcoming runtime integration. (#10413)

## Bug Fixes

- Hardened Git command safety so destructive or write-capable invocations no longer bypass approval checks. (#10258)
- Improved resume/thread browsing reliability by correctly showing saved thread names and fixing thread listing behavior. (#10340, #10383)
- Fixed first-run trust-mode handling so sandbox mode is reported consistently, and made `$PWD/.agents` read-only like `$PWD/.codex`. (#10415, #10524)
- Fixed `codex exec` hanging after interrupt in websocket/streaming flows; interrupted turns now shut down cleanly. (#10519)
- Fixed review-mode approval event wiring so `requestApproval` IDs align with the corresponding command execution items. (#10416)
- Improved 401 error diagnostics by including server message/body details plus `cf-ray` and `requestId`. (#10508)

## Documentation

- Expanded TUI chat composer docs to cover slash-command arguments and attachment handling in plan/review flows. (#10269)
- Refreshed issue templates and labeler prompts to better separate CLI/app bug reporting and feature requests. (#10411, #10453, #10548, #10552)

## Chores

- Completed migration off the deprecated `mcp-types` crate to `rmcp`-based protocol types/adapters, then removed the legacy crate. (#10356, #10349, #10357)
- Updated the `bytes` dependency for a security advisory and cleaned up resolved advisory configuration. (#10525)

## Changelog

Full Changelog: rust-v0.94.0...rust-v0.95.0

- #10340 Session picker shows thread_name if set @pap-openai
- #10381 chore: collab experimental @jif-oai
- #10231 feat: experimental flags @jif-oai
- #10382 nit: shell snapshot retention to 3 days @jif-oai
- #10383 fix: thread listing @jif-oai
- #10386 fix: Rfc3339 casting @jif-oai
- #10356 feat: add MCP protocol types and rmcp adapters @bolinfest
- #10269 Nicer highlighting of slash commands, /plan accepts prompt args and pasted images @charley-oai
- #10274 Add credits tooltip @pakrym-oai
- #10394 chore: ignore synthetic messages @jif-oai
- #10398 feat: drop sqlx logging @jif-oai
- #10281 Select experimental features with space @pakrym-oai
- #10402 feat: add `--experimental` to `generate-ts` @jif-oai
- #10258 fix: unsafe auto-approval of git commands @viyatb-oai
- #10411 Updated labeler workflow prompt to include "app" label @etraut-openai
- #10399 emit a separate metric when the user cancels UAT during elevated setup @iceweasel-oai
- #10377 chore(tui) /personalities tip @dylan-hurd-oai
- #10252 [feat] persist thread_dynamic_tools in db @celia-oai
- #10437 feat: Read personal skills from .agents/skills @gverma-openai
- #10145 make codex better at git @pash-openai
- #10418 Add `codex app` macOS launcher @aibrahim-oai
- #10447 Fix plan implementation prompt reappearing after /agent thread switch @charley-oai
- #10064 TUI: Render request_user_input results in history and simplify interrupt handling @charley-oai
- #10349 feat: replace custom mcp-types crate with equivalents from rmcp @bolinfest
- #10415 Fixed sandbox mode inconsistency if untrusted is selected @etraut-openai
- #10452 Hide short worked-for label in final separator @aibrahim-oai
- #10357 chore: remove deprecated mcp-types crate @bolinfest
- #10454 app tool tip @aibrahim-oai
- #10455 chore: add phase to message responseitem @sayan-oai
- #10414 Require models refresh on cli version mismatch @aibrahim-oai
- #10271 [Codex][CLI] Gate image inputs by model modalities @ccy-oai
- #10374 Trim compaction input @pakrym-oai
- #10453 Updated bug and feature templates @etraut-openai
- #10465 Restore status after preamble @pakrym-oai
- #10406 fix: clarify deprecation message for features.web_search @sayan-oai
- #10474 Ignore remote_compact_trims_function_call_history_to_fit_context_window on windows @pakrym-oai
- #10413 feat(linux-sandbox): vendor bubblewrap and wire it with FFI @viyatb-oai
- #10142 feat(secrets): add codex-secrets crate @viyatb-oai
- #10157 chore: nuke chat/completions API @jif-oai
- #10498 feat: drop wire_api from clients @jif-oai
- #10501 feat: clean codex-api part 1 @jif-oai
- #10508 Add more detail to 401 error @gt-oai
- #10521 Avoid redundant transactional check before inserting dynamic tools @jif-oai
- #10525 chore: update bytes crate in response to security advisory @bolinfest
- #10408 fix WebSearchAction type clash between v1 and v2 @sayan-oai
- #10404 Cleanup collaboration mode variants @charley-oai
- #10505 Enable parallel shell tools @jif-oai
- #10532 feat: `find_thread_path_by_id_str_in_subdir` from DB @jif-oai
- #10524 fix: make $PWD/.agents read-only like $PWD/.codex @bolinfest
- #10096 Inject CODEX_THREAD_ID into the terminal environment @maxj-oai
- #10536 Revert "Load untrusted rules" @viyatb-oai
- #10412 fix(app-server): fix TS annotations for optional fields on requests @owenlin0
- #10416 fix(app-server): fix approval events in review mode @owenlin0
- #10545 Improve Default mode prompt (less confusion with Plan mode) @charley-oai
- #10289 [apps] Gateway MCP should be blocking. @mzeng-openai
- #10189 implement per-workspace capability SIDs for workspace specific ACLs @iceweasel-oai
- #10548 Updated bug templates and added a new one for app @etraut-openai
- #10531 [codex] Default values from requirements if unset @gt-oai
- #10552 Fixed icon for CLI bug template @etraut-openai
- #10039 chore(arg0): advisory-lock janitor for codex tmp paths @viyatb-oai
- #10448 feat: add APIs to list and download public remote skills @xl-openai
- #10519 Handle exec shutdown on Interrupt (fixes immortal `codex exec` with websockets) @rasmusrygaard
- #10556 Feat: add upgrade to app server modelList @shijie-oai
- #10461 feat(tui): pace catch-up stream chunking with hysteresis @joshka-oai
- #10367 chore: add `codex debug app-server` tooling @celia-oai
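
The release notes above mention that shell executions now receive `CODEX_THREAD_ID`. A minimal sketch of how a helper launched from such a shell might read it follows. Only the variable name comes from the notes; the function names, the trimming, and the empty-value handling are assumptions made for illustration, and nothing here relies on the format of the id.

```rust
use std::env;

/// Normalizes a raw environment value: trims whitespace and treats an
/// empty string the same as an unset variable (an assumption of this sketch).
fn thread_id_from(value: Option<String>) -> Option<String> {
    value
        .map(|v| v.trim().to_string())
        .filter(|v| !v.is_empty())
}

/// Returns the active thread id when running inside a shell that was given
/// `CODEX_THREAD_ID`, or `None` otherwise.
fn current_thread_id() -> Option<String> {
    thread_id_from(env::var("CODEX_THREAD_ID").ok())
}
```

Splitting the parsing from the environment lookup keeps the logic testable without mutating process-wide state.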

* chore: unify metric (#10220)
* Conversation naming (#8991)

  Session renaming:
  - `/rename my_session`
  - `/rename` without arg and passing an argument in `customViewPrompt`
  - AppExitInfo shows resume hint using the session name if set instead of uuid, defaults to uuid if not set
  - Names are stored in `CODEX_HOME/sessions.jsonl`

  Session resuming:
  - `codex resume <name>` looks up the first entry in `CODEX_HOME/sessions.jsonl` matching the name and resumes that session

  ---------

  Co-authored-by: jif-oai <jif@openai.com>

* Fetch Requirements from cloud (#10167)

  Load requirements from Codex Backend. It only does this for enterprise customers signed in with ChatGPT.

  Todo in follow-up PRs:
  * Add to app-server and exec too
  * Switch from fail-open to fail-closed on failure

* explorer prompt (#10225)
* fix: make sure the shell exists (#10222)
* chore: do not clean the DB anymore (#10232)
* feat: improve logs client (#10229)
* feat: heuristic coloring of logs (#10228)
* nit: fix db with multiple metadata lines (#10237)
* feat: refactor CodexAuth so invalid state cannot be represented (#10208)

  Previously, `CodexAuth` was defined as follows:

  https://github.com/openai/codex/blob/d550fbf41afc09d7d7b5ac813aea38de07b2a73f/codex-rs/core/src/auth.rs#L39-L46

  But if you looked at its constructors, we had creation for `AuthMode::ApiKey` where `storage` was built using a nonsensical path (`PathBuf::new()`) and `auth_dot_json` was `None`:

  https://github.com/openai/codex/blob/d550fbf41afc09d7d7b5ac813aea38de07b2a73f/codex-rs/core/src/auth.rs#L212-L220

  By comparison, when `AuthMode::ChatGPT` was used, `api_key` was always `None`:

  https://github.com/openai/codex/blob/d550fbf41afc09d7d7b5ac813aea38de07b2a73f/codex-rs/core/src/auth.rs#L665-L671

  https://github.com/openai/codex/pull/10012 took things further because it introduced a new `ChatgptAuthTokens` variant to `AuthMode`, which is important when invoking `account/login/start` via the app server, but most logic _internal_ to the app server should just reason about two `AuthMode` variants: `ApiKey` and `ChatGPT`.

  This PR tries to clean things up as follows:

  - `LoginAccountParams` and `AuthMode` in `codex-rs/app-server-protocol/` both continue to have the `ChatgptAuthTokens` variant, though it is used exclusively for the on-the-wire messaging.
  - `codex-rs/core/src/auth.rs` now has its own `AuthMode` enum, which only has two variants: `ApiKey` and `ChatGPT`.
  - `CodexAuth` has been changed from a struct to an enum. It is a disjoint union where each variant (`ApiKey`, `ChatGpt`, and `ChatGptAuthTokens`) has only the associated fields that make sense for that variant.

  ---

  [//]: # (BEGIN SAPLING FOOTER)
  Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10208).
  * #10224
  * __->__ #10208

* chore(feature) Experimental: Personality (#10212)

  ## Summary
  Let users start opting in to trying out personalities

  ## Testing
  - [x] existing tests pass

* chore(feature) Experimental: Smart Approvals (#10211)

  ## Summary
  Let's start getting feedback on this feature 😅

  ## Testing
  - [x] existing tests pass

* Load exec policy rules from requirements (#10190)

  `requirements.toml` should be able to specify rules which always run.

  My intention here was that these rules could only ever be restrictive, which means the decision can be "prompt" or "forbidden" but never "allow". A requirement of "you must always allow this command" didn't make sense to me, but happy to be gaveled otherwise.

  Rules already apply the most restrictive decision, so we can safely merge these with rules found in other config folders.

* plan mode: add TL;DR checkpoint and client behavior note (#10195)

  ## Summary
  - Tightens Plan Mode to encourage exploration-first behavior and more back-and-forth alignment.
  - Adds a required TL;DR checkpoint before drafting the full plan.
  - Clarifies client behavior that can cause premature “Implement this plan?” prompts.
  ## What changed
  - Require at least one targeted non-mutating exploration pass before the first user question.
  - Insert a TL;DR checkpoint between Phase 2 (intent) and Phase 3 (implementation).
  - TL;DR checkpoint guidance:
    - Label: “Proposed Plan (TL;DR)”
    - Format: 3–5 bullets using `- `
    - Options: exactly one option, “Approve”
    - `isOther: true`, with explicit guidance that “None of the above” is the edit path in the current UI.
  - Require the final plan to include a TL;DR consistent with the approved checkpoint.

  ## Why
  - In Plan Mode, any normal assistant message at turn completion is treated as plan content by the client. This can trigger premature “Implement this plan?” prompts.
  - The TL;DR checkpoint aligns on direction before Codex drafts a long, decision-complete plan.

  ## Testing
  - Manual: built the local CLI and verified the flow now explores first, presents a TL;DR checkpoint, and only drafts the full plan after approval.

  ---------

  Co-authored-by: Nick Baumann <@openai.com>

* chore: fix the build breakage that came from a merge race (#10239)

  I think I needed to rebase on top of https://github.com/openai/codex/pull/10167 before merging https://github.com/openai/codex/pull/10208.

* Plan mode: stream proposed plans, emit plan items, and render in TUI (#9786)

  ## Summary
  - Stream proposed plans in Plan Mode using `<proposed_plan>` tags parsed in core, emitting plan deltas plus a plan `ThreadItem`, while stripping tags from normal assistant output.
  - Persist plan items and rebuild them on resume so proposed plans show in thread history.
  - Wire plan items/deltas through app-server protocol v2 and render a dedicated proposed-plan view in the TUI, including the “Implement this plan?” prompt only when a plan item is present.

  ## Changes

  ### Core (`codex-rs/core`)
  - Added a generic, line-based tag parser that buffers each line until it can disprove a tag prefix; implements auto-close on `finish()` for unterminated tags. `codex-rs/core/src/tagged_block_parser.rs`
  - Refactored proposed plan parsing to wrap the generic parser. `codex-rs/core/src/proposed_plan_parser.rs`
  - In plan mode, stream assistant deltas as:
    - **Normal text** → `AgentMessageContentDelta`
    - **Plan text** → `PlanDelta` + `TurnItem::Plan` start/completion (`codex-rs/core/src/codex.rs`)
  - Final plan item content is derived from the completed assistant message (authoritative), not necessarily the concatenated deltas.
  - Strips `<proposed_plan>` blocks from assistant text in plan mode so tags don’t appear in normal messages. (`codex-rs/core/src/stream_events_utils.rs`)
  - Persist `ItemCompleted` events only for plan items for rollout replay. (`codex-rs/core/src/rollout/policy.rs`)
  - Guard `update_plan` tool in Plan Mode with a clear error message. (`codex-rs/core/src/tools/handlers/plan.rs`)
  - Updated Plan Mode prompt to:
    - keep `<proposed_plan>` out of non-final reasoning/preambles
    - require exact tag formatting
    - allow only one `<proposed_plan>` block per turn (`codex-rs/core/templates/collaboration_mode/plan.md`)

  ### Protocol / App-server protocol
  - Added `TurnItem::Plan` and `PlanDeltaEvent` to core protocol items. (`codex-rs/protocol/src/items.rs`, `codex-rs/protocol/src/protocol.rs`)
  - Added v2 `ThreadItem::Plan` and `PlanDeltaNotification` with EXPERIMENTAL markers and note that deltas may not match the final plan item. (`codex-rs/app-server-protocol/src/protocol/v2.rs`)
  - Added plan delta route in app-server protocol common mapping. (`codex-rs/app-server-protocol/src/protocol/common.rs`)
  - Rebuild plan items from persisted `ItemCompleted` events on resume. (`codex-rs/app-server-protocol/src/protocol/thread_history.rs`)

  ### App-server
  - Forward plan deltas to v2 clients and map core plan items to v2 plan items. (`codex-rs/app-server/src/bespoke_event_handling.rs`, `codex-rs/app-server/src/codex_message_processor.rs`)
  - Added v2 plan item tests. (`codex-rs/app-server/tests/suite/v2/plan_item.rs`)

  ### TUI
  - Added a dedicated proposed plan history cell with special background and padding, and moved “• Proposed Plan” outside the highlighted block. (`codex-rs/tui/src/history_cell.rs`, `codex-rs/tui/src/style.rs`)
  - Only show “Implement this plan?” when a plan item exists. (`codex-rs/tui/src/chatwidget.rs`, `codex-rs/tui/src/chatwidget/tests.rs`)

  <img width="831" height="847" alt="Screenshot 2026-01-29 at 7 06 24 PM" src="https://github.com/user-attachments/assets/69794c8c-f96b-4d36-92ef-c1f5c3a8f286" />

  ### Docs / Misc
  - Updated protocol docs to mention plan deltas. (`codex-rs/docs/protocol_v1.md`)
  - Minor plumbing updates in exec/debug clients to tolerate plan deltas. (`codex-rs/debug-client/src/reader.rs`, `codex-rs/exec/...`)

  ## Tests
  - Added core integration tests:
    - Plan mode strips plan from agent messages.
    - Missing `</proposed_plan>` closes at end-of-message. (`codex-rs/core/tests/suite/items.rs`)
  - Added unit tests for generic tag parser (prefix buffering, non-tag lines, auto-close). (`codex-rs/core/src/tagged_block_parser.rs`)
  - Existing app-server plan item tests in v2. (`codex-rs/app-server/tests/suite/v2/plan_item.rs`)

  ## Notes / Behavior
  - Plan output no longer appears in standard assistant text in Plan Mode; it streams via `PlanDelta` and completes as a `TurnItem::Plan`.
  - The final plan item content is authoritative and may diverge from streamed deltas (documented as experimental).
  - Reasoning summaries are not filtered; prompt instructs the model not to include `<proposed_plan>` outside the final plan message.

  ## Codex Author
  `codex fork 019bec2d-b09d-7450-b292-d7bcdddcdbfb`

* Tui: hide Code mode footer label (#10063)

  Title
  Hide Code mode footer label/cycle hint; add Plan footer-collapse snapshots

  Summary
  - Keep Code mode internal naming but suppress the footer mode label + cycle hint when Code is active.
  - Only show the cycle hint when a non‑Code mode indicator is present.
  - Add Plan-mode footer collapse snapshot coverage (empty + queued, across widths) and update existing footer collapse snapshots for the new Code behavior.

  Notes
  - The test run currently fails in codex-cloud-requirements on origin/main due to a stale auth.mode field; no fix is included in this PR to keep the diff minimal.

  Codex author
  `codex resume 019c0296-cfd4-7193-9b0a-6949048e4546`

* chore: rename ChatGpt -> Chatgpt in type names (#10244)

  When using ChatGPT in names of types, we should be consistent, so this renames some types with `ChatGpt` in the name to `Chatgpt`.

  From https://rust-lang.github.io/api-guidelines/naming.html:

  > In `UpperCamelCase`, acronyms and contractions of compound words count as one word: use `Uuid` rather than `UUID`, `Usize` rather than `USize` or `Stdin` rather than `StdIn`. In `snake_case`, acronyms and contractions are lower-cased: `is_xid_start`.

  This PR updates existing uses of `ChatGpt` and changes them to `Chatgpt`. Though in all cases where it could affect the wire format, I visually inspected that we don't change anything there. That said, this _will_ change the codegen because it will affect the spelling of type names.

  For example, this renames `AuthMode::ChatGPT` to `AuthMode::Chatgpt` in `app-server-protocol`, but the wire format is still `"chatgpt"`.

  This PR also updates a number of types in `codex-rs/core/src/auth.rs`.

* Plan mode prompt (#10238)

  # External (non-OpenAI) Pull Request Requirements

  Before opening this Pull Request, please read the dedicated "Contributing" markdown file or your PR may be closed: https://github.com/openai/codex/blob/main/docs/contributing.md

  If your PR conforms to our contribution guidelines, replace this text with a detailed and high quality description of your changes.

  Include a link to a bug report or enhancement request.
* Fix deploy (#10251)

  Fix https://github.com/openai/codex/actions/runs/21527697445/job/62035898666

* plan prompt (#10255)
* Make plan highlight use popup grey background (#10253)

  ## Summary
  - align proposed plan background with popup surface color by reusing `user_message_bg`
  - remove the custom blue-tinted plan background

  <img width="1572" height="1568" alt="image" src="https://github.com/user-attachments/assets/63a5341e-4342-4c07-b6b0-c4350c3b2639" />

* Skip loading codex home as project layer (#10207)

  Summary:
  - Fixes issue #9932: https://github.com/openai/codex/issues/9932
  - Prevents `$CODEX_HOME` (typically `~/.codex`) from being discovered as a project `.codex` layer by skipping it during project layer traversal. We compare both normalized absolute paths and best-effort canonicalized paths to handle symlinks.
  - Adds regression tests for home-directory invocation and for the case where `CODEX_HOME` points to a project `.codex` directory (e.g., worktrees/editor integrations).

  Testing:
  - `cargo build -p codex-cli --bin codex`
  - `cargo build -p codex-rmcp-client --bin test_stdio_server`
  - `cargo test -p codex-core`
  - `cargo test --all-features`
  - Manual: ran `target/debug/codex` from `~` and confirmed the disabled-folder warning and trust prompt no longer appear.

* Update copy (#10256)

  <img width="839" height="62" alt="image" src="https://github.com/user-attachments/assets/ca987cdb-9e8c-403e-8856-a9b37baa7673" />

* core: prevent shell_snapshot from inheriting stdin (#9735)

  Fixes #9559.

  When `shell_snapshot` runs, it may execute user startup files (e.g. `.bashrc`). If those files read from stdin (or if stdin is an interactive TTY under job control), the snapshot subprocess can block or receive `SIGTTIN` (as reported over SSH).

  This change explicitly sets `stdin` to `Stdio::null()` for the snapshot subprocess, so it can't read from the terminal. Regression test added that would hang/timeout without this change.

  Tests: `ulimit -n 4096 && cargo test -p codex-core`.

  cc @dongdongbh @etraut-openai

  ---------

  Co-authored-by: Skylar Graika <sgraika127@gmail.com>

* Fix main (#10262)
* Hide /approvals from the slash-command list (#10265)

  `/permissions` is the replacement. `/approvals` still available when typing.

* Update announcement_tip.toml (#10267)

  Extend the test for dev version

* file-search: multi-root walk (#10240)

  Instead of a separate walker for each root in a multi-root walk, use a single walker.

* fix: dont auto-enable web_search for azure (#10266)

  seeing issues with azure after default-enabling web search: #10071, #10257.

  need to work with azure to fix api-side, for now turning off default-enable of web_search for azure.

  diff is big because i moved logic to reuse

* fix: update file search directory when session CWD changes (#9279)

  ## Summary
  Fixes #9041
  - Adds update_search_dir() method to FileSearchManager to allow updating the search directory after initialization
  - Calls this method when the session CWD changes: new session, resume, or fork

  ## Problem
  The FileSearchManager was created once with the initial search_dir and never updated. When a user:
  1. Starts Codex in a non-git directory (e.g., /tmp/random)
  2. Resumes or forks a session from a different workspace
  3. The @filename lookup still searched the original directory

  This caused no matches to be returned even when files existed in the current workspace.
  ## Solution
  Update FileSearchManager.search_dir whenever the session working directory changes:
  - AppEvent::NewSession: Use current config CWD
  - SessionSelection::Resume: Use resumed session CWD
  - SessionSelection::Fork: Use forked session CWD

  ## Test plan
  - [ ] Start Codex in /tmp/test-dir (non-git)
  - [ ] Resume a session from a project with actual files
  - [ ] Verify @filename returns matches from the resumed session directory

  ---------

  Co-authored-by: Eric Traut <etraut@openai.com>

* Validate CODEX_HOME before resolving (#10249)

  Summary
  - require `CODEX_HOME` to point to an existing directory before canonicalizing and surface clear errors otherwise
  - share the same helper logic in both `core` and `rmcp-client` and add unit tests that cover missing, non-directory, valid, and default paths

  This addresses #9222

* chore: implement Mul for TruncationPolicy (#10272)

  Codex thought this was a good idea while working on https://github.com/openai/codex/pull/10192.

* Wire up cloud reqs in exec, app-server (#10241)

  We're fetching cloud requirements in TUI in https://github.com/openai/codex/pull/10167. This adds the same fetching in exec and app-server binaries also.

* Add enforce_residency to requirements (#10263)

  Add `enforce_residency` to requirements.toml and thread it through to a header on `default_client`.

* add missing fields to WebSearchAction and update app-server types (#10276)

  - add `WebSearchAction` to app-server v2 types
  - add `queries` to `WebSearchAction::Search` type

  Updated tests.

* Turn on cloud requirements for business too (#10283)

  Need to check "enterprise" and "business"

* Fix minor typos in comments and documentation (#10287)

  ## Summary
  I have read the contribution guidelines. All changes in this PR are limited to text corrections and do not modify any business logic, runtime behavior, or user-facing functionality.
  ## Details
  This PR fixes several minor typos, including:
  - `create` -> `crate`
  - `analagous` -> `analogous`
  - `apply-patch` -> `apply_patch`
  - `codecs` -> `codex`
  - ` '/" ` -> ` '/' `
  - `Respesent` -> `Represent`

* feat(core) Smart approvals on (#10286)

  ## Summary
  Turn on Smart Approvals by default

  ## Testing
  - [x] Updated unit tests

* feat: show runtime metrics in console (#10278)

  Summary of changes:
  - Adds a new feature flag: runtime_metrics
    - Declared in core/src/features.rs
    - Added to core/config.schema.json
    - Wired into OTEL init in core/src/otel_init.rs
  - Enables on-demand runtime metric snapshots in OTEL
    - Adds runtime_metrics: bool to otel/src/config.rs
    - Enables experimental custom reader features in otel/Cargo.toml
    - Adds snapshot/reset/summary APIs in:
      - otel/src/lib.rs
      - otel/src/metrics/client.rs
      - otel/src/metrics/config.rs
      - otel/src/metrics/error.rs
  - Defines metric names and a runtime summary builder
    - New files:
      - otel/src/metrics/names.rs
      - otel/src/metrics/runtime_metrics.rs
    - Summarizes totals for:
      - Tool calls
      - API requests
      - SSE/streaming events
  - Instruments metrics collection in OTEL manager
    - otel/src/traces/otel_manager.rs now records:
      - API call counts + durations
      - SSE event counts + durations (success/failure)
    - Tool call metrics now use shared constants
  - Surfaces runtime metrics in the TUI
    - Resets runtime metrics at turn start in tui/src/chatwidget.rs
    - Displays metrics in the final separator line in tui/src/history_cell.rs
  - Adds tests
    - New OTEL tests:
      - otel/tests/suite/snapshot.rs
      - otel/tests/suite/runtime_summary.rs
    - New TUI test:
      - final_message_separator_includes_runtime_metrics in tui/src/history_cell.rs

  Scope:
  - 19 files changed
  - ~652 insertions, 38 deletions

  <img width="922" height="169" alt="Screenshot 2026-01-30 at 4 11 34 PM" src="https://github.com/user-attachments/assets/1efd754d-a16d-4564-83a5-f4442fd2f998" />

* display promo message in usage error (#10285)

  If a promo message is attached to a rate limit response, then display it in the error message.

* fix(nix): update flake for newer Rust toolchain requirements (#10302)

  ## Summary
  - Add rust-overlay input to provide newer Rust versions (rama crates require rustc 1.91.0+)
  - Add devShells output with complete development environment
  - Add missing git dependency hashes to codex-rs/default.nix

  ## Changes

  **flake.nix:**
  - Added `rust-overlay` input to get newer Rust toolchains
  - Updated `packages` output to use `rust-bin.stable.latest.minimal` for builds
  - Added `devShells` output with:
    - Rust with `rust-src` and `rust-analyzer` extensions for IDE support
    - Required build dependencies: `pkg-config`, `openssl`, `cmake`, `libclang`
    - Environment variables: `PKG_CONFIG_PATH`, `LIBCLANG_PATH`

  **codex-rs/default.nix:**
  - Added missing `outputHashes` for git dependencies:
    - `nucleo-0.5.0`, `nucleo-matcher-0.3.1`
    - `runfiles-0.1.0`
    - `tokio-tungstenite-0.28.0`, `tungstenite-0.28.0`

  ## Test Plan
  - [x] `nix develop` enters shell successfully
  - [x] `nix develop -c rustc --version` shows 1.93.0
  - [x] `nix develop -c cargo build` completes successfully

* chore(features) remove Experimental tag from UTF8 (#10296)

  ## Summary
  This has been default on for some time, it should now be the default.

  ## Testing
  - [x] Existing tests pass

* Fix npm README image link (#10303)

  ### Motivation
  - The image referenced in the package README was 404ing on the npm package page because it used a relative file path that doesn't resolve on npm, so the splash image needs a GitHub-hosted URL to render correctly.

  ### Description
  - Update `README.md` to replace the relative image path `./.github/codex-cli-splash.png` with the GitHub-hosted URL `https://github.com/openai/codex/blob/main/.github/codex-cli-splash.png`.

  ### Testing
  - No automated tests were run because this is a docs-only change and does not affect code or test behavior.
  ------
  [Codex Task](https://chatgpt.com/codex/tasks/task_i_697e58dbce34832d87c7847779e8f4a5)

* chore(app-server) add personality update test (#10306)

  ## Summary
  Add some additional validation to ensure app-server handles Personality changes

  ## Testing
  - [x] These are tests

* plan mode prompt (#10308)
* chore(core) Default to friendly personality (#10305)

  ## Summary
  Update default personality to friendly

  ## Testing
  - [x] Unit tests pass

* feat(core,tui,app-server) personality migration (#10307)

  ## Summary
  Keep existing users on Pragmatic, to preserve behavior while new users default to Friendly

  ## Testing
  - [x] Tested locally
  - [x] add integration tests

* enable plan mode (#10313)
* feat: fire tracking events for skill invocation (#10120)
* feat: Support loading skills from .agents/skills (#10317)

  This PR adds support for loading [skills](https://developers.openai.com/codex/skills) from `.agents/skills/`.
  - Issue: https://github.com/agentskills/agentskills/issues/15
  - Motivation: When skills live on the filesystem, sharing them across agents is awkward and often ends up requiring symlinks/duplication. A single location under `.agents/` makes it easier to share skills.
  - Loading from `.codex/skills/` will remain but will be deprecated soon. The change only applies to the [REPO scope](https://developers.openai.com/codex/skills#where-to-save-skills).
  - Documentation will be updated before this change is live.

  Testing with skills in two locations of this repo:

  <img width="960" height="152" alt="image" src="https://github.com/user-attachments/assets/28975ff9-7363-46dd-ad40-f4c7bfdb8234" />

  When starting Codex with CWD in `$repo_root` (should only pick up at root):

  <img width="513" height="143" alt="image" src="https://github.com/user-attachments/assets/389e1ea7-020c-481e-bda0-ce58562db59f" />

  When starting Codex with CWD in `$repo_root/codex-rs` (should pick up at cwd and crawl up to root):

  <img width="552" height="177" alt="image" src="https://github.com/user-attachments/assets/a5beb8de-11b4-45ed-8660-80707c77006a" />

* Make skills prompt explicit about relative-path lookup (#10282)

  Fix cases where the model tries to locate skill scripts from the cwd and fails.

* Add websocket telemetry metrics and labels (#10316)

  Summary
  - expose websocket telemetry hooks through the responses client so request durations and event processing can be reported
  - record websocket request/event metrics and emit runtime telemetry events that the history UI now surfaces
  - improve tests to cover websocket telemetry reporting and guard runtime summary updates

  <img width="824" height="79" alt="Screenshot 2026-01-31 at 5 28 12 PM" src="https://github.com/user-attachments/assets/ea9a7965-d8b4-4e3c-a984-ef4fdc44c81d" />

* chore(config) Rename config setting to personality (#10314)

  ## Summary
  Let's make the setting name consistent with the SlashCommand!
## Testing - [x] Updated tests * chore(features) Personality => Stable (#10310) ## Summary Bump `/personality` to stable ## Testing - [x] unit tests pass * Sync system skills from public repo (#10320) Syncs the system skills included in Codex with the updates in https://github.com/openai/skills/tree/main/skills/.system * Sync system skills from public repo for openai yaml changes (#10322) Follow-up to https://github.com/openai/codex/pull/10320 Syncing additional changes from https://github.com/openai/skills/tree/main/skills/.system * fix(config) config schema newline (#10323) ## Summary Looks like we may have introduced a formatting issue in recent PRs. ## Testing - [x] ran `just write-config-schema` * Improve plan mode interaction rules (#10329) ## Summary - Replace the “Hard interaction rule” with a clearer “Response constraints” section that enumerates the allowed exceptions for Plan Mode replies. - Remove the stray Phase 1 exception line about simple questions. - Update plan content requirements to ask for a brief summary section and generalize API/type wording. * Bump thread updated_at on unarchive to refresh sidebar ordering (#10280) ## Summary - Touch restored rollout files on `thread/unarchive` so `updatedAt` reflects the unarchive time. - Add a regression test to ensure unarchiving bumps `updated_at` from an old mtime. ## Notes This fixes the UX issue where unarchived old threads don’t reappear near the top of recent threads. * fix: System skills marker includes nested folders recursively (#10350) Updated system skills bundled with Codex were not correctly replacing the user's skills in their .system folder. - Fix `.codex-system-skills.marker` not updating by hashing embedded system skills recursively (nested dirs + file contents), so updates trigger a reinstall. - Added a build Cargo hook to rerun if there are changes in `src/skills/assets/samples/*`, ensuring embedded skill updates rebuild correctly under caching. 
- Add a small unit test to ensure nested entries are included in the fingerprint.

* fix(rules) Limit rules listed in conversation (#10351)

## Summary
We should probably warn users that they have a million rules, and help clean them up. But for now, we should handle this unbounded case. Limit rules listed in conversations, with shortest / broadest rules first.

## Testing
- [x] Updated unit tests

* Do not append items on override turn context (#10354)

* fix(core) Deduplicate prefix_rules before appending (#10309)

## Summary
We ideally shouldn't make it to this point in the first place, but if we do try to append a rule that already exists, we shouldn't append the same rule twice.

## Testing
- [x] Added unit test for this case

* chore(core) gpt-5.2-codex personality template (#10373)

## Summary
Consolidate prompts

## Testing
- [x] Existing tests pass

* fix(personality) prompt patch (#10375)

## Summary
We had 2 typos in #10373

## Testing
- [x] unit tests pass

* feat: vendor app-server protocol schema fixtures (#10371)

Similar to what @sayan-oai did in openai/codex#8956 for `config.schema.json`, this PR updates the repo so that it includes the output of `codex app-server generate-json-schema` and `codex app-server generate-ts` and adds a test to verify it is in sync with the current code.

Motivation:
- This makes any schema changes introduced by a PR transparent during code review.
- In particular, this should help us catch PRs that would introduce a non-backwards-compatible change to the app schema (eventually, this should also be enforced by tooling).
- Once https://github.com/openai/codex/pull/10231 is in to formalize the notion of "experimental" fields, we can work on ensuring the non-experimental bits are backwards-compatible.

`codex-rs/app-server-protocol/tests/schema_fixtures.rs` was added as the test and `just write-app-server-schema` can be used to generate the vendored schema files.
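
As a rough illustration of the in-sync check described above, here is a minimal sketch. It assumes the generated and vendored schemas can each be read into a map from relative path to file contents. The helper name `schema_drift` and the in-memory maps are hypothetical; the actual test in `codex-rs/app-server-protocol/tests/schema_fixtures.rs` may be structured quite differently.

```rust
use std::collections::BTreeMap;

/// Hypothetical helper: compares freshly generated schema files against the
/// vendored copies and returns human-readable differences.
/// Keys are relative paths, values are file contents.
pub fn schema_drift(
    generated: &BTreeMap<String, String>,
    vendored: &BTreeMap<String, String>,
) -> Vec<String> {
    let mut drift = Vec::new();
    // Files the generator produces that are absent or different on disk.
    for (path, contents) in generated {
        match vendored.get(path) {
            None => drift.push(format!("missing from fixtures: {path}")),
            Some(old) if old != contents => drift.push(format!("out of date: {path}")),
            Some(_) => {}
        }
    }
    // Files on disk that the generator no longer produces.
    for path in vendored.keys() {
        if !generated.contains_key(path) {
            drift.push(format!("stale fixture: {path}"));
        }
    }
    drift
}
```

A test built on a helper like this would fail whenever the returned list is non-empty, which is what makes schema changes visible in review.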
Incidentally, when I run:

```
rg _ codex-rs/app-server-protocol/schema/typescript/v2
```

I see a number of `snake_case` names that should be `camelCase`.

* Session picker shows thread_name if set (#10340)

- Shows thread names in the ResumePicker used by `/resume` and `codex resume` when set, defaulting to the preview (previous behaviour) when none is set.
- Adds a `find_thread_names_by_ids` that maps names to IDs in `codex-rs/core/src/rollout/session_index.rs`. It reads the index mapping file sequentially in normal order (rather than in reverse order, as `codex resume <name>` does). This function is called for a list of sessions (the default page size is 25; the number of pages loaded depends on the height of the terminal), and most pages will contain at least one unnamed session and therefore require the whole file to be read. This could be better, and the sqlite integration will improve it: those reads won't be needed when leveraging sqlite.

Open questions:
- We could rename the TUI "Conversation" column to "Name" or "Thread", which would feel more accurate. Could this be a fast-follow if we implement auto-naming, since it would then always be a name?

* chore: collab experimental (#10381)

* feat: experimental flags (#10231)

## Problem being solved
- We need a single, reliable way to mark app-server API surface as experimental so that:
  1. the runtime can reject experimental usage unless the client opts in
  2. generated TS/JSON schemas can exclude experimental methods/fields for stable clients.

  Right now that’s easy to drift or miss when done ad-hoc.

## How to declare experimental methods and fields
- **Experimental method**: add `#[experimental("method/name")]` to the `ClientRequest` variant in `client_request_definitions!`.
- **Experimental field**: on the params struct, derive `ExperimentalApi` and annotate the field with `#[experimental("method/name.field")]` + set `inspect_params: true` for the method variant so `ClientRequest::experimental_reason()` inspects params for experimental fields.
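
To make the declaration rules above concrete, here is a hand-written sketch of what a params struct with one experimental field, plus the opt-in gate, might look like. It uses only the standard library and does not use the real derive macro: the trait body is written by hand, and `ThreadStartParams`, `mock_field`, `gate`, and the error text are hypothetical names chosen for illustration.

```rust
/// Hand-written stand-in for the trait described above; in the PR this
/// implementation is generated by `#[derive(ExperimentalApi)]`.
pub trait ExperimentalApi {
    /// Returns the reason string if the value uses experimental API surface.
    fn experimental_reason(&self) -> Option<&'static str>;
}

/// Hypothetical params struct with one stable and one experimental field.
pub struct ThreadStartParams {
    pub cwd: Option<String>,
    /// With the derive, this field would carry
    /// `#[experimental("thread/start.mockField")]`.
    pub mock_field: Option<String>,
}

impl ExperimentalApi for ThreadStartParams {
    fn experimental_reason(&self) -> Option<&'static str> {
        // Only the annotated field is inspected; an absent Option is not "present".
        if self.mock_field.is_some() {
            Some("thread/start.mockField")
        } else {
            None
        }
    }
}

/// Rejects experimental usage unless the client opted in (assumed to mirror
/// the `experimental_api` capability sent at initialize time).
pub fn gate<T: ExperimentalApi>(params: &T, experimental_api: bool) -> Result<(), String> {
    match params.experimental_reason() {
        Some(reason) if !experimental_api => {
            Err(format!("{reason} requires the experimental API capability"))
        }
        _ => Ok(()),
    }
}
```

With this shape, a stable client that never sets `mock_field` is unaffected, while a request that does set it is rejected until the client opts in.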
## How the macro solves it - The new derive macro lives in `codex-rs/codex-experimental-api-macros/src/lib.rs` and is used via `#[derive(ExperimentalApi)]` plus `#[experimental("reason")]` attributes. - **Structs**: - Generates `ExperimentalApi::experimental_reason(&self)` that checks only annotated fields. - The “presence” check is type-aware: - `Option<T>`: `is_some_and(...)` recursively checks inner. - `Vec`/`HashMap`/`BTreeMap`: must be non-empty. - `bool`: must be `true`. - Other types: considered present (returns `true`). - Registers each experimental field in an `inventory` with `(type_name, serialized field name, reason)` and exposes `EXPERIMENTAL_FIELDS` for that type. Field names are converted from `snake_case` to `camelCase` for schema/TS filtering. - **Enums**: - Generates an exhaustive `match` returning `Some(reason)` for annotated variants and `None` otherwise (no wildcard arm). - **Wiring**: - Runtime gating uses `ExperimentalApi::experimental_reason()` in `codex-rs/app-server/src/message_processor.rs` to reject requests unless `InitializeParams.capabilities.experimental_api == true`. - Schema/TS export filters use the inventory list and `EXPERIMENTAL_CLIENT_METHODS` from `client_request_definitions!` to strip experimental methods/fields when `experimental_api` is false. * nit: shell snapshot retention to 3 days (#10382) * fix: thread listing (#10383) * fix: Rfc3339 casting (#10386) * feat: add MCP protocol types and rmcp adapters (#10356) Currently, types from our custom `mcp-types` crate are part of some of our APIs: https://github.com/openai/codex/blob/03fcd12e77fedf4fa327af27e2e476e1ebc5f651/codex-rs/app-server-protocol/src/protocol/v2.rs#L43-L46 To eliminate this crate in #10349 by switching to `rmcp`, we need our own wrappers for the `rmcp` types that we can use in our API, which is what this PR does. Note this PR introduces the new API types, but we do not make use of them until #10349. 

---

[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10356).
* #10357
* #10349
* __->__ #10356

* Nicer highlighting of slash commands, /plan accepts prompt args and pasted images (#10269)

## Summary
- Make typed slash commands become text elements when the user hits space, including paste‑burst spaces.
- Enable `/plan` to accept inline args and submit them in plan mode, mirroring `/review` behavior and blocking submission while a task is running.
- Preserve text elements/attachments for slash commands that take args.

<img width="1510" height="500" alt="image" src="https://github.com/user-attachments/assets/446024df-b69a-4249-85db-1a85110e07f1" />

## Changes
- Add safe helper to insert element ranges in the textarea.
- Extend command‑with‑args pipeline to carry text elements and reuse submission prep.
- Update `/plan` dispatch to switch to plan mode then submit prompt + elements.
- Document new composer behavior and add tests.

## Notes
- `/plan` is blocked during active tasks (same as `/review`).
- Slash‑command elementization recognizes built‑ins and `/prompts:` custom commands only.

## Codex author
`codex fork 019c16d3-4520-7bb0-9b9d-48720d40a8ab`

* Add credits tooltip (#10274)

* chore: ignore synthetic messages (#10394)

This will be fixed once this is settled: https://www.notion.so/openai/Artificial-context-management-2fb8e50b62b080db8b8ed93b3b19d1a2#2fb8e50b62b080d2bffce2dd1e60972b

* feat: drop sqlx logging (#10398)

* Select experimental features with space (#10281)

* feat: add `--experimental` to `generate-ts` (#10402)

Adds a `--experimental` flag to the `generate-ts` function in the app-server. It can be invoked with either of these two commands:

```
just write-app-server-schema --experimental
codex app-server generate-ts --experimental
```

* fix: unsafe auto-approval of git commands (#10258)

fixes https://github.com/openai/codex/issues/10160 and some more.
## Description Hardens Git command safety to prevent approval bypasses for destructive or write-capable invocations (branch delete, risky push forms, output/config-override flags), so these commands no longer auto-run as “safe.” - `git branch -d` variants (especially in worktrees / with global options like -C / -c) - `git show|diff|log --output` ... style file-write flags - risky Git config override flags (-c, --config-env) that can trigger external execution - dangerous push forms that weren’t fully caught (`--force*`, `--delete`, `+refspec`, `:refspec`) - grouped short-flag delete forms (e.g. stacked branch flags containing `d/D`) will fast follow with a common git policy to bring windows to parity. --------- Co-authored-by: Eric Traut <etraut@openai.com> * Updated labeler workflow prompt to include "app" label (#10411) Support for desktop app issues * emit a separate metric when the user cancels UAT during elevated setup (#10399) Currently this shows up as elevated setup failure, which isn't quite accurate. * chore(tui) /personalities tip (#10377) ## Summary We have /personality now. ## Testing - [x] tested locally * [feat] persist thread_dynamic_tools in db (#10252) Persist thread_dynamic_tools in sqlite and read first from it. Fall back to rollout files if it's not found. Persist dynamic tools to both sqlite and rollout files. Saw that new sessions get populated to db correctly & old sessions get backfilled correctly at startup: ``` celia@com-92114 codex-rs % sqlite3 ~/.codex/state.sqlite \ "select thread_id, position,name,description,input_schema from thread_dynamic_tools;" 019c0cad-ec0d-74b2-a787-e8b33a349117|0|geo_lookup|lookup a city|{"properties":{"city":{"type":"string"}},"required":["city"],"type":"object"} .... 
019c10ca-aa4b-7620-ae40-c0919fbd7ea7|0|geo_lookup|lookup a city|{"properties":{"city":{"type":"string"}},"required":["city"],"type":"object"} ``` * feat: Read personal skills from .agents/skills (#10437) - Issue: https://github.com/agentskills/agentskills/issues/15 - Follow-up to https://github.com/openai/codex/pull/10317 (for team/repo skills) - This change now also loads personal/user skills from `$HOME/.agents/skills` (or `~/.agents/skills`) in addition to loading from `.agents/skills` inside of git repos. - The location of `.system` skills remains unchanged. - Keeping backwards compatibility with `~/.codex/skills` for now until we fully deprecate. With skills in both personal folders: <img width="831" height="421" alt="image" src="https://github.com/user-attachments/assets/ad8ac918-bfe6-4a2d-8a8e-d608c9d3d701" /> We load from both places: <img width="607" height="236" alt="image" src="https://github.com/user-attachments/assets/480f4db0-ae64-4dc1-bdf5-c5de98c16f5c" /> * make codex better at git (#10145) adds basic git context to the session prefix so the model can anchor git actions and be a bit more version-aware. structured it in a multiroot-friendly shape even though we only have one root today * Add `codex app` macOS launcher (#10418) - Add `codex app <path>` to launch the Codex Desktop app. - On macOS, auto-downloads the DMG if missing; non-macOS prints a link to chatgpt.com/codex. * Fix plan implementation prompt reappearing after /agent thread switch (#10447) ## Summary This fixes a UX bug (https://github.com/openai/codex/issues/10442) where the **"Implement this plan?"** prompt could reappear after switching agents with `/agent` and then switching back to the original agent during plan execution. ## Root Cause On thread switch, the TUI rebuilds `ChatWidget`, replays buffered thread events, then drains any queued live events. In this flow, a `TurnComplete` can be handled twice for the same logical turn: 1. replayed (`from_replay = true`) 2. 
then live (`from_replay = false`) `ChatWidget` used `saw_plan_item_this_turn` to decide whether to show the plan implementation prompt, but that flag was only reset on `TurnStarted`. If duplicate completion events occurred, stale `saw_plan_item_this_turn = true` could cause the prompt to re-trigger unexpectedly. ## Fix - Clear `saw_plan_item_this_turn` at the end of `on_task_complete`, after prompt gating runs. - This keeps the flag truly turn-scoped and prevents duplicate `TurnComplete` handling from reopening the prompt. * TUI: Render request_user_input results in history and simplify interrupt handling (#10064) ## Summary This PR improves the TUI experience for `request_user_input` by rendering submitted question/answer sets directly in conversation history with clear, structured formatting. It also intentionally simplifies interrupt behavior for now: on `Esc` / `Ctrl+C`, the questions overlay interrupts the turn without attempting to submit partial answers. <img width="1344" height="573" alt="Screenshot 2026-02-02 at 4 51 40 PM" src="https://github.com/user-attachments/assets/ff752131-7060-44c1-9ded-af061969a533" /> ## Scope - TUI-only changes. - No core/protocol/app-server behavior changes in this PR. - Resume reconstruction of interrupted question sets is out of scope for this PR. ## What Changed - Added a new history cell: `RequestUserInputResultCell` in `codex-rs/tui/src/history_cell.rs`. - On normal `request_user_input` submission, TUI now inserts that history cell immediately after sending `Op::UserInputAnswer`. - Rendering includes a `Questions` header with `answered/total` count. - Rendering shows each question as a bullet item. - Rendering styles submitted answer lines in cyan. - Rendering styles notes (for option questions) as `note:` lines in cyan. - Rendering styles freeform text (for no-option questions) as `answer:` lines in cyan. - Rendering dims only the `(unanswered)` suffix. 
- Rendering can include an interrupted suffix and summary text when the cell is marked interrupted. - Rendering redacts secret questions as `••••••` instead of showing raw values. - Added `wrap_with_prefix(...)` in `history_cell.rs` for wrapped prefixed lines. - Added `split_request_user_input_answer(...)` in `history_cell.rs` for decoding `"user_note: ..."` entries. ## Interrupt Behavior (Intentional for this PR) - `Esc` / `Ctrl+C` in the questions overlay now performs `Op::Interrupt` and exits the overlay. - It does **not** submit partial/committed answers on interrupt. - Added TODO comments in `request_user_input` overlay interrupt paths indicating where interrupted partial result emission should be reintroduced once core support is finalized. - Queued `request_user_input` overlays are discarded on interrupt in the current behavior. ## Tests Updated - Updated/added overlay tests in `codex-rs/tui/src/bottom_pane/request_user_input/mod.rs` to reflect interrupt-only behavior. - Added helper assertion for interrupt-only event expectation. - Existing submission-path tests now validate history insertion behavior and expected answer maps. ## Behavior Notes - Completed question flows now produce a readable `Questions` block in transcript history. - Interrupted flows currently do not persist partial answers to model-visible tool output. ## Follow-ups - Reintroduce partial-answer-on-interrupt semantics once core can persist/sequence interrupted `request_user_input` outputs safely. - Optionally add replay/resume rendering for interrupted question sets as a separate PR. 
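
The note decoding and secret redaction described in "What Changed" can be sketched roughly as follows. This is not the code from `history_cell.rs`: the names `AnswerEntry`, `split_answer_entry`, and `render_entry` are hypothetical, and the exact `"user_note: "` prefix (including the trailing space) is an assumption based on the description above.

```rust
/// Hypothetical classification of one submitted answer entry.
#[derive(Debug, PartialEq)]
pub enum AnswerEntry<'a> {
    /// A selected option or freeform answer.
    Answer(&'a str),
    /// A note attached to an option question, assumed to be encoded as
    /// "user_note: <text>".
    Note(&'a str),
}

/// Splits an entry into a note or a plain answer based on the assumed prefix.
pub fn split_answer_entry(entry: &str) -> AnswerEntry<'_> {
    match entry.strip_prefix("user_note: ") {
        Some(note) => AnswerEntry::Note(note),
        None => AnswerEntry::Answer(entry),
    }
}

/// Renders one entry as a plain string, redacting answers to secret questions.
/// Styling (cyan, dimming) is omitted from this sketch.
pub fn render_entry(entry: &str, is_secret: bool) -> String {
    if is_secret {
        return "••••••".to_string();
    }
    match split_answer_entry(entry) {
        AnswerEntry::Note(note) => format!("note: {note}"),
        AnswerEntry::Answer(answer) => answer.to_string(),
    }
}
```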
## Codex author `codex fork 019bfb8d-2a65-7313-9be2-ea7100d19a61` * feat: replace custom mcp-types crate with equivalents from rmcp (#10349) We started working with MCP in Codex before https://crates.io/crates/rmcp was mature, so we had our own crate for MCP types that was generated from the MCP schema: https://github.com/openai/codex/blob/8b95d3e082376f4cb23e92641705a22afb28a9da/codex-rs/mcp-types/README.md Now that `rmcp` is more mature, it makes more sense to use their MCP types in Rust, as they handle details (like the `_meta` field) that our custom version ignored. Though one advantage that our custom types had is that our generated types implemented `JsonSchema` and `ts_rs::TS`, whereas the types in `rmcp` do not. As such, part of the work of this PR is leveraging the adapters between `rmcp` types and the serializable types that are API for us (app server and MCP) introduced in #10356. Note this PR results in a number of changes to `codex-rs/app-server-protocol/schema`, which merit special attention during review. We must ensure that these changes are still backwards-compatible, which is possible because we have: ```diff - export type CallToolResult = { content: Array<ContentBlock>, isError?: boolean, structuredContent?: JsonValue, }; + export type CallToolResult = { content: Array<JsonValue>, structuredContent?: JsonValue, isError?: boolean, _meta?: JsonValue, }; ``` so `ContentBlock` has been replaced with the more general `JsonValue`. Note that `ContentBlock` was defined as: ```typescript export type ContentBlock = TextContent | ImageContent | AudioContent | ResourceLink | EmbeddedResource; ``` so the deletion of those individual variants should not be a cause of great concern. 
Similarly, we have the following change in `codex-rs/app-server-protocol/schema/typescript/Tool.ts`:

```diff
- export type Tool = { annotations?: ToolAnnotations, description?: string, inputSchema: ToolInputSchema, name: string, outputSchema?: ToolOutputSchema, title?: string, };
+ export type Tool = { name: string, title?: string, description?: string, inputSchema: JsonValue, outputSchema?: JsonValue, annotations?: JsonValue, icons?: Array<JsonValue>, _meta?: JsonValue, };
```

so:

- `annotations?: ToolAnnotations` ➡️ `JsonValue`
- `inputSchema: ToolInputSchema` ➡️ `JsonValue`
- `outputSchema?: ToolOutputSchema` ➡️ `JsonValue`

and two new fields: `icons?: Array<JsonValue>, _meta?: JsonValue`

---

[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10349).
* #10357
* __->__ #10349
* #10356

* Fixed sandbox mode inconsistency if untrusted is selected (#10415)

This PR addresses #10395

When a user is asked to pick the trust level of a project, the code currently reloads the config if they select "trusted". It doesn't reload the config in the "untrusted" case but should. This causes the sandbox mode to be reported incorrectly in `/status` during the first run (it's displayed as `read-only` even though it acts as though it's `workspace-write`).

* Hide short worked-for label in final separator (#10452)

- Hide the "Worked for" label in the final message separator unless elapsed time is over one minute.
- Update/add tests to cover both hidden (<60s) and shown (>=61s) behavior.

* chore: remove deprecated mcp-types crate (#10357)

https://github.com/openai/codex/pull/10349 migrated us off of `mcp-types`, so this PR deletes the code.

---

[//]: # (BEGIN SAPLING FOOTER)
Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/openai/codex/pull/10357).
* __->__ #10357 * #10349 * #10356 * app tool tip (#10454) * chore: add phase to message responseitem (#10455) ### What add wiring for `phase` field on `ResponseItem::Message` to lay groundwork for differentiating model preambles and final messages. currently optional. follows pattern in #9698. updated schemas with `just write-app-server-schema` so we can see type changes. ### Tests Updated existing tests for SSE parsing and hydrating from history * Require models refresh on cli version mismatch (#10414) * [Codex][CLI] Gate image inputs by model modalities (#10271) ###### Summary - Add input_modalities to model metadata so clients can determine supported input types. - Gate image paste/attach in TUI when the selected model does not support images. - Block submits that include images for unsupported models and show a clear warning. - Propagate modality metadata through app-server protocol/model-list responses. - Update related tests/fixtures. ###### Rationale - Models support different input modalities. - Clients need an explicit capability signal to prevent unsupported requests. - Backward-compatible defaults preserve existing behavior when modality metadata is absent. ###### Scope - codex-rs/protocol, codex-rs/core, codex-rs/tui - codex-rs/app-server-protocol, codex-rs/app-server - Generated app-server types / schema fixtures ###### Trade-offs - Default behavior assumes text + image when field is absent for compatibility. - Server-side validation remains the source of truth.
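
The gating and the backward-compatible default described above can be sketched as below. This is an illustrative, standard-library-only model rather than the types from the PR: `InputModality`, `ModelInfo`, `supports`, `check_submit`, and the warning text are hypothetical names.

```rust
/// Hypothetical modality enum; the real metadata may list more variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InputModality {
    Text,
    Image,
}

/// Hypothetical subset of model metadata.
pub struct ModelInfo {
    /// `None` models a catalog entry that predates the field.
    pub input_modalities: Option<Vec<InputModality>>,
}

impl ModelInfo {
    pub fn supports(&self, modality: InputModality) -> bool {
        match &self.input_modalities {
            Some(list) => list.contains(&modality),
            // Field absent: assume text + image, preserving prior behavior.
            None => matches!(modality, InputModality::Text | InputModality::Image),
        }
    }
}

/// Client-side pre-check before submitting; the server remains the source of truth.
pub fn check_submit(model: &ModelInfo, image_count: usize) -> Result<(), String> {
    if image_count > 0 && !model.supports(InputModality::Image) {
        return Err("the selected model does not support image inputs".to_string());
    }
    Ok(())
}
```

Treating a missing field as "text + image" keeps older catalogs working, at the cost of letting an unsupported image through to server-side validation for text-only models that have not yet set the field.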
###### Follow-up - Non-TUI clients should consume input_modalities to disable unsupported attachments. - Model catalogs should explicitly set input_modalities for text-only models. ###### Testing - cargo fmt --all - cargo test -p codex-tui - env -u GITHUB_APP_KEY cargo test -p codex-core --lib - just write-app-server-schema - cargo run -p codex-cli --bin codex -- app-server generate-ts --out app-server-types - test against local backend <img width="695" height="199" alt="image" src="https://github.com/user-attachments/assets/d22dd04f-5eba-4db9-a7c5-a2506f60ec44" /> --------- Co-authored-by: Josh McKinney <joshka@openai.com> * Trim compaction input (#10374) Two fixes: 1. Include trailing tool output in the total context size calculation. Otherwise when checking whether compaction should run we ignore newly added outputs. 2. Trim trailing tool output/tool calls until we can fit the request into the model context size. Otherwise the compaction endpoint will fail to compact. We only trim items that can be reproduced again by the model (tool calls, tool call outputs). * Updated bug and feature templates (#10453) The current bug template uses CLI-specific instructions for getting the version. The current feature template doesn't ask the user to provide the Codex variant (surface) they are using. This PR addresses these problems. * Restore status after preamble (#10465) * fix: clarify deprecation message for features.web_search (#10406) clarify that the new `web_search` is not a feature flag under `[features]` in the deprecation CTA * Ignore remote_compact_trims_function_call_history_to_fit_context_window on windows (#10474) * feat(linux-sandbox): vendor bubblewrap and wire it with FFI (#10413) ## Summary Vendor Bubblewrap into the repo and add minimal build plumbing in `codex-linux-sandbox` to compile/link it. 
## Why We want to move Linux sandboxing toward Bubblewrap, but in a safe two-step rollout: 1) vendoring/build setup (this PR), 2) runtime integration (follow-up PR). ## Included - Add `codex-rs/vendor/bubblewrap` sources. - Add build-time FFI path in `codex-rs/linux-sandbox`. - Update `build.rs` rerun tracking for vendored files. - Small vendored compile warning fix (`sockaddr_nl` full init). follow up in https://github.com/openai/codex/pull/9938 * feat(secrets): add codex-secrets crate (#10142) ## Summary This introduces the first working foundation for Codex managed secrets: a small Rust crate that can securely store and retrieve secrets locally. Concretely, it adds a `codex-secrets` crate that: - encrypts a local secrets file using `age` - generates a high-entropy encryption key - stores that key in the OS keyring ## What this enables - A secure local persistence model for secrets - A clean, isolated place for future provider backends - A clear boundary: Codex can become a credential broker without putting plaintext secrets in config files ## Implementation details - New crate: `codex-rs/secrets/` - Encryption: `age` with scrypt recipient/identity - Key generation: `OsRng` (32 random bytes) - Key storage: OS keyring via `codex-keyring-store` ## Testing - `cd codex-rs && just fmt` - `cd codex-rs && cargo test -p codex-secrets` * chore: nuke chat/completions API (#10157) * feat: drop wire_api from clients (#10498) * feat: clean codex-api part 1 (#10501) * Add more detail to 401 error (#10508) Add the error.message if it exists, the body otherwise. Truncate body to 1k characters. Print the cf-ray and the requestId. 
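
A minimal sketch of the message-selection rule just described: prefer `error.message`, fall back to the truncated body, then append the tracing identifiers. The function name, parameter shapes, and output layout are hypothetical, and "1k characters" is interpreted here as 1000 characters, which is an assumption.

```rust
/// Hypothetical formatter for a 401 response.
pub fn describe_unauthorized(
    error_message: Option<&str>,
    body: &str,
    cf_ray: Option<&str>,
    request_id: Option<&str>,
) -> String {
    // Assumption: "1k" means 1000 characters (not bytes, not 1024).
    const MAX_BODY_CHARS: usize = 1000;

    let detail: String = match error_message {
        Some(message) => message.to_string(),
        // Truncate by characters so multi-byte text is never split mid-codepoint.
        None => body.chars().take(MAX_BODY_CHARS).collect(),
    };

    let mut out = format!("401 Unauthorized: {detail}");
    if let Some(ray) = cf_ray {
        out.push_str(&format!(", cf-ray: {ray}"));
    }
    if let Some(id) = request_id {
        out.push_str(&format!(", requestId: {id}"));
    }
    out
}
```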
**Before:** <img width="860" height="305" alt="Screenshot 2026-02-03 at 13 15 28" src="https://github.com/user-attachments/assets/949d5a4d-2b51-488c-a723-c6deffde0353" /> **After:** <img width="1523" height="373" alt="Screenshot 2026-02-03 at 13 15 38" src="https://github.com/user-attachments/assets/f96a747e-e596-4a7a-aae9-64210d805b26" /> * Avoid redundant transactional check before inserting dynamic tools (#10521) Summary - remove the extra transaction guard that checked for existing dynamic tools per thread before inserting new ones - insert each tool record with `ON CONFLICT(thread_id, position) DO NOTHING` to ignore duplicates instead of pre-querying - simplify execution to use the shared pool directly and avoid unneeded commits Testing - Not run (not requested) * chore: update bytes crate in response to security advisory (#10525) While here, remove one advisory from `deny.toml` that has been addressed (it was showing up as a warning). * fix WebSearchAction type clash between v1 and v2 (#10408) type clash; app-server generated types were still using the v1 snake_case `WebSearchAction`, so there was a mismatch between the camelCase emitted types and the snake_case types we were trying to parse. Updated v2 `WebSearchAction` to export into the `v2/` type set and updated `ThreadItem` to use that. ### Tests Ran new `just write-app-server-schema` to surface changes to schema, the import looks correct now. * Cleanup collaboration mode variants (#10404) ## Summary This PR simplifies collaboration modes to the visible set `default | plan`, while preserving backward compatibility for older partners that may still send legacy mode names. Specifically: - Renames the old Code behavior to **Default**. - Keeps **Plan** as-is. - Removes **Custom** mode behavior (fallbacks now resolve to Default). - Keeps `PairProgramming` and `Execute` internally for compatibility plumbing, while removing them from schema/API and UI visibility. 
- Adds legacy input aliasing so older clients can still send old mode names. ## What Changed 1. Mode enum and compatibility - `ModeKind` now uses `Plan` + `Default` as active/public modes. - `ModeKind::Default` deserialization accepts legacy values: - `code` - `pair_programming` - `execute` - `custom` - `PairProgramming` and `Execute` variants remain in code but are hidden from protocol/schema generation. - `Custom` variant is removed; previous custom fallbacks now map to `Default`. 2. Collaboration presets and templates - Built-in presets now return only: - `Plan` - `Default` - Template rename: - `core/templates/collaboration_mode/code.md` -> `default.md` - `execute.md` and `pair_programming.md` remain on disk but are not surfaced in visible preset lists. 3. TUI updates - Updated user-facing naming and prompts from “Code” to “Default”. - Updated mode-cycle and indicator behavior to reflect only visible `Plan` and `Default`. - Updated corresponding tests and snapshots. 4. request_user_input behavior - `request_user_input` remains allowed only in `Plan` mode. - Rejection messaging now consistently treats non-plan modes as `Default`. 5. Schemas - Regenerated config and app-server schemas. - Public schema types now advertise mode values as: - `plan` - `default` ## Backward Compatibility Notes - Incoming legacy mode names (`code`, `pair_programming`, `execute`, `custom`) are accepted and coerced to `default`. - Outgoing/public schema surfaces intentionally expose only `plan | default`. - This allows tolerant ingestion of older partner payloads while standardizing new integrations on the reduced mode set. ## Codex author `codex fork 019c1fae-693b-7840-b16e-9ad38ea0bd00` * Enable parallel shell tools (#10505) Summary - mark the shell-related tools as supporting parallel tool calls so exec_command, shell_command, etc. 
can run concurrently - update expectations in tool parallelism tests to reflect the new parallel behavior - drop the unused serial duration helper from the suite Testing - Not run (not requested) * feat: `find_thread_path_by_id_str_in_subdir` from DB (#10532) * fix: make $PWD/.agents read-only like $PWD/.codex (#10524) In light of https://github.com/openai/codex/pull/10317, because `.agents` can include resources that Codex can run in a privileged way, it should be read-only by default just as `.codex` is. * Inject CODEX_THREAD_ID into the terminal environment (#10096) Inject CODEX_THREAD_ID (when applicable) into the terminal environment so that the agent (and skills) can refer to the current thread / session ID. Discussion: https://openai.slack.com/archives/C095U48JNL9/p1769542492067109 * Revert "Load untrusted rules" (#10536) Reverts openai/codex#9791 * fix(app-server): fix TS annotations for optional fields on requests (#10412) This updates our generated TypeScript types to be more correct with how the server actually behaves, **specifically for JSON-RPC requests**. Before this PR, we'd generate `field: T | null`. After this PR, we will have `field?: T | null`. The latter matches how the server actually works, in that if an optional field is omitted, the server will treat it as null. This also makes it less annoying in theory for clients to upgrade to newer versions of Codex, since adding a new optional field to a JSON-RPC request should not require a client change. NOTE: This only applies to JSON-RPC requests. All other payloads (i.e. responses, notifications) will return `field: T | null` as usual. 
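
The request-side rule above, where an omitted optional field and an explicit `null` are treated the same, can be modelled with a small standard-library sketch. The `Params` alias is a deliberately simplified stand-in for a JSON object (string values only, with `None` meaning `null`), and the field names in the usage note are hypothetical.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a JSON-RPC params object: each present key maps to
/// either a string value or an explicit `null` (modelled as `None`).
pub type Params<'a> = HashMap<&'a str, Option<&'a str>>;

/// Reads an optional request field. An omitted key and an explicit `null`
/// both resolve to `None`, matching the `field?: T | null` typing.
pub fn optional_field<'a>(params: &Params<'a>, key: &str) -> Option<&'a str> {
    params.get(key).copied().flatten()
}
```

For example, a client that sends `{"cwd": "/repo", "model": null}` and one that sends only `{"cwd": "/repo"}` would both yield `None` for `model`, which is why adding a new optional request field should not force a client change.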
* fix(app-server): fix approval events in review mode (#10416)

One of our partners flagged that they were seeing the wrong order of events when running `review/start` with command exec approvals:

```
{"method":"item/commandExecution/requestApproval","id":0,"params":{"threadId":"019c0b6b-6a42-7c02-99c4-98c80e88ac27","turnId":"0","itemId":"0","reason":"`/bin/zsh -lc 'git show b7a92b4eacf262c575f26b1e1ed621a357642e55 --stat'` requires approval: Xcode-required approval: Require explicit user confirmation for all commands.","proposedExecpolicyAmendment":null}}
{"method":"item/started","params":{"item":{"type":"commandExecution","id":"call_AEjlbHqLYNM7kbU3N6uw1CNi","command":"/bin/zsh -lc 'git show b7a92b4eacf262c575f26b1e1ed621a357642e55 --stat'","cwd":"/Users/devingreen/Desktop/SampleProject","processId":null,"status":"inProgress","commandActions":[{"type":"unknown","command":"git show b7a92b4eacf262c575f26b1e1ed621a357642e55 --stat"}],"aggregatedOutput":null,"exitCode":null,"durationMs":null},"threadId":"019c0b6b-6a42-7c02-99c4-98c80e88ac27","turnId":"0"}}
```

**Key fix**: In the review sub‑agent delegate we were forwarding exec (and patch) approvals using the parent turn id (`parent_ctx.sub_id`) as the approval call_id. That made `item/commandExecution/requestApproval.itemId` differ from the actual `item/started` id. We now forward the sub‑agent’s `call_id` from the approval event instead, so the approval item id matches the commandExecution item id in review flows.

Here’s the expected event order for an inline `review/start` that triggers an exec approval after this fix:

1. Response to review/start (JSON‑RPC response)
   - Includes `turn` (status inProgress) and `review_thread_id` (same as parent thread for inline).
2. `turn/started` notification
   - turnId is the review turn id (e.g., "0").
3. `item/started` → EnteredReviewMode
   - item.id == turnId, marks entry into review mode.
4. `item/started` → commandExecution
   - item.id == `<call_id>` (e.g., "review-call-1"), status: inProgress.
5. `item/commandExecution/requestApproval` request
   - JSON‑RPC request (not a notification).
   - params.itemId == `<call_id>` and params.turnId == turnId.
6. Client replies to approval request (Approved / Declined / etc).
7. If approved:
   - Optional `item/commandExecution/outputDelta` notifications.
   - `item/completed` → commandExecution with status and exitCode.
8. Review finishes:
   - `item/started` → ExitedReviewMode
   - `item/completed` → ExitedReviewMode
   - (Agent message items may also appear, depending on review output.)
9. `turn/completed` notification

The key point is that #4 and #5 are now in the proper order with the correct item id.

* Improve Default mode prompt (less confusion with Plan mode) (#10545)

## Summary
This PR updates `request_user_input` behavior and Default-mode guidance to match current collaboration-mode semantics and reduce model confusion.

## Why
- `request_user_input` should be explicitly documented as **Plan-only**.
- Tool description and runtime availability checks should be driven by the **same centralized mode policy**.
- Default mode prompt needed stronger execution guidance and explicit instruction that `request_user_input` is unavailable.
- Error messages should report the **actual mode name** (not aliases that can read as misleading).

## What changed
- Centralized `request_user_input` mode policy in `core` handler logic:
  - Added a single allowed-modes config (`Plan` only).
  - Reused that policy for:
    - runtime rejection messaging
    - tool description text
- Updated tool description to include availability constraint:
  - `"This tool is only available in Plan mode."`
- Updated runtime rejection behavior:
  - `Default` -> `"request_user_input is unavailable in Default mode"`
  - `Execute` -> `"request_user_input is unavailable in Execute mode"`
  - `PairProgramming` -> `"request_user_input is unavailable in Pair Programming mode"`
- Strengthened Default collaboration prompt:
  - Added explicit execution-first behavior
  - Added assumptions-first guidance
  - Added explicit `request_user_input` unavailability instruction
  - Added concise progress-reporting expectations
- Simplified formatting implementation:
  - Inlined allowed-mode name collection into `format_allowed_modes()`
  - Kept `format_allowed_modes()` output for 3+ modes as CSV style (`modes: a,b,c`)

* [apps] Gateway MCP should be blocking. (#10289)

Make Apps Gateway MCP blocking since otherwise app mentions may not work when apps are not loaded. Messages sent before apps become available will be queued.

This only has an effect when the `apps` feature is enabled.

* implement per-workspace capability SIDs for workspace specific ACLs (#10189)

Today, there is a single capability SID that allows the sandbox to write to
  * workspace (cwd)
  * tmp directories if enabled
  * additional writable roots

This change splits those up, so that each workspace has its own capability SID, while tmp and additional roots, which are installation-wide, are still governed by the "generic" capability SID.

This isolates workspaces from each other in terms of sandbox write access. It also allows us to protect `<cwd>/.codex` when codex runs in a specific `<cwd>`.

* Updated bug templates and added a new one for app (#10548)

* [codex] Default values from requirements if unset (#10531)

If we don't set any explicit values for sandbox or approval policy, let's try to use a requirements-satisfying value.
* Fixed icon for CLI bug template (#10552)

* chore(arg0): advisory-lock janitor for codex tmp paths (#10039)

## Description

### What changed
- Switch the arg0 helper root from `~/.codex/tmp/path` to `~/.codex/tmp/path2`
- Add `Arg0PathEntryGuard` to keep both the `TempDir` and an exclusive `.lock` file alive for the process lifetime
- Add a startup janitor that scans `path2` and deletes only directories whose lock can be acquired

### Tests
- `cargo clippy -p codex-arg0`
- `cargo clippy -p codex-core`
- `cargo test -p codex-arg0`
- `cargo test -p codex-core`

* feat: add APIs to list and download public remote skills (#10448)

Add API to list / download from remote public skills

* Handle exec shutdown on Interrupt (fixes immortal `codex exec` with websockets) (#10519)

### Motivation
- Ensure `codex exec` exits when a running turn is interrupted (e.g., Ctrl-C) so the CLI is not "immortal" when websockets/streaming are used.

### Description
- Return `CodexStatus::InitiateShutdown` when handling `EventMsg::TurnAborted` in `exec/src/event_processor_with_human_output.rs` so human-output exec mode shuts down after an interrupt.
- Treat `protocol::EventMsg::TurnAborted` as `CodexStatus::InitiateShutdown` in `exec/src/event_processor_with_jsonl_output.rs` so JSONL output mode behaves the same.
- Applied formatting with `just fmt`.

### Testing
- Ran `just fmt` successfully.
- Ran `cargo test -p codex-exec`; many unit tests ran and the test command completed, but the full test run in this environment produced `35 passed, 11 failed` where the failures are due to Landlock sandbox panics and 403 responses in the test harness (environmental/integration issues) and are not caused by the interrupt/shutdown changes.

------
[Codex Task](https://chatgpt.com/codex/tasks/task_i_698165cec4e083258d17702bd29014c1)

* Feat: add upgrade to app server modelList (#10556)

### Summary
  * Add model upgrade to the listModel app server endpoint to support dynamically showing the model upgrade banner.
* feat(tui): pace catch-up stream chunking with hysteresis (#10461)

## Summary
- preserve baseline streaming behavior (smooth mode still commits one line per 50ms tick)
- extract adaptive chunking policy and commit-tick orchestration from ChatWidget into `streaming/chunking.rs` and `streaming/commit_tick.rs`
- add hysteresis-based catch-up behavior with bounded batch draining to reduce queue lag without bursty single-frame jumps
- document policy behavior, tuning guidance, and debug flow in rustdoc + docs

## Testing
- just fmt
- cargo test -p codex-tui

* chore: add `codex debug app-server` tooling (#10367)

`codex debug app-server <user message>` forwards the message through codex-app-server-test-client’s `send_message_v2` library entry point, using `std::env::current_exe()` to resolve the codex binary.

For what it looks like, see: ``…

feat: add --experimental to generate-ts

dbc93be

shijie-oai approved these changes Feb 2, 2026

jif-oai merged commit 059d386 into main Feb 2, 2026
60 of 65 checks passed

jif-oai deleted the jif/bind-experimental branch February 2, 2026 20:30

jif-oai merged 1 commit into main from jif/bind-experimental
