
feat: implement (multi) subagent orchestration system #1

Open
Grinsven wants to merge 28 commits into main from pr-3655

Conversation

@Grinsven
Owner

Port of upstream PR openai#3655 (originally by @cognitive-glitch).

- Add agent.rs module with AgentRegistry for managing custom agents
- Support loading agent configurations from ~/.codex/agents.toml
- Enable agent-specific system prompts with file-based loading
- Add comprehensive subagents documentation with usage examples
- Include example-agents.toml with 10 specialized agent templates
- Update README with multi-agent system documentation links
- Integrate agent system into core Codex architecture
- Support inheritance of tools/permissions from workspace context
- Prevent recursive agent spawning for system stability

The multi-agent system allows users to define specialized AI agents
with custom system prompts for tasks like code review, testing,
refactoring, security auditing, and performance analysis.
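As a purely illustrative sketch of what an entry in `~/.codex/agents.toml` might look like (the table and field names here are assumptions; see `example-agents.toml` in this PR for the actual schema):

```toml
# Hypothetical entries; consult example-agents.toml for the real format.
[agents.code-reviewer]
description = "Reviews diffs for correctness and style"
prompt_file = "code-reviewer.md"   # resolved relative to ~/.codex

[agents.security-auditor]
prompt = "You are a security auditor. Inspect the code for vulnerabilities."
```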
Added `validate_prompt_path` to `AgentRegistry` so that prompt file paths cannot escape allowed directories, preventing path traversal attacks: a path is accepted only if it resolves inside the user's home directory or the base directory. Agent loading now runs this validation during initialization.
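The idea behind that check can be sketched with std only. This is not the PR's actual implementation (which also consults the home directory via the `dirs` crate); it is a minimal lexical normalization that rejects any `..` escape from a single base directory:

```rust
use std::path::{Component, Path, PathBuf};

// Hypothetical sketch of a validate_prompt_path-style check: resolve the
// candidate against base_dir, collapse ".." components, and reject anything
// that ends up outside base_dir. A production version might additionally
// canonicalize to defeat symlink tricks.
fn validate_prompt_path(prompt_file: &str, base_dir: &Path) -> Result<PathBuf, String> {
    let candidate = base_dir.join(prompt_file);
    let mut normalized = PathBuf::new();
    for comp in candidate.components() {
        match comp {
            Component::ParentDir => {
                // ".." past the root means the path is malformed.
                if !normalized.pop() {
                    return Err("path escapes base directory".into());
                }
            }
            other => normalized.push(other),
        }
    }
    if normalized.starts_with(base_dir) {
        Ok(normalized)
    } else {
        Err("path escapes base directory".into())
    }
}
```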
Refined task extraction from the agent prompt so the user's task is correctly identified, with explicit error handling when no task is found and improved logging during agent execution. Comments were updated to match the new structure.
Agent ID generation now truncates the call ID safely, producing unique identifiers without risk of overflow. Introduced an `AgentContextGuard` struct that resets the agent-context flag after use, making agent execution tracking more reliable.
`AgentContextGuard` now logs when the agent context is marked and unmarked, improving traceability during agent operations.
`AgentBegin` and `AgentEnd` events are now sent only under the appropriate conditions, and agent stream errors are handled explicitly, making agent execution tracking more robust.
`AgentConfig` now requires either `prompt` or `prompt_file`, but not both. When `prompt_file` is given, the prompt is loaded from that file, and `get_prompt` caches the loaded prompt for efficiency. Tests cover the validation and retrieval scenarios.
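The "either `prompt` or `prompt_file`, but not both" rule is a simple XOR over two optional fields. A minimal sketch (struct fields mirror the description above; the method name and error type are assumptions, not the PR's exact code):

```rust
// Illustrative stand-in for the AgentConfig validation described above.
#[derive(Default)]
struct AgentConfig {
    prompt: Option<String>,
    prompt_file: Option<String>,
}

impl AgentConfig {
    // Exactly one of prompt / prompt_file must be set.
    fn validate(&self) -> Result<(), String> {
        match (&self.prompt, &self.prompt_file) {
            (Some(_), Some(_)) => Err("specify either prompt or prompt_file, not both".into()),
            (None, None) => Err("one of prompt or prompt_file is required".into()),
            _ => Ok(()),
        }
    }
}
```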
Introduced a structure for pending tool calls that categorizes incoming calls, so agent calls execute concurrently while non-agent tool calls keep their ordering and context management stays correct. Error handling and logging were improved for traceability.
Agents now run truly in parallel rather than sequentially. Thread-safe wrappers for `Session` and `TurnContext` enable safe concurrent access, and logging reflects the parallel execution status for performance tracking.
Replaced `lock_unchecked()` with `lock_or_recover()` throughout the codex module, allowing recovery from poisoned locks instead of panicking. This gives a cleaner API for accessing shared state under concurrency while preserving performance.
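A minimal sketch of what such a `lock_or_recover` extension might look like (the trait name follows the description above; the implementation is an assumption). A poisoned `std::sync::Mutex` still holds valid data, so recovery just takes the guard out of the `PoisonError`:

```rust
use std::sync::{Mutex, MutexGuard};

// Hypothetical extension trait: lock a mutex, recovering the guard if a
// previous holder panicked (poisoning) instead of propagating the panic.
trait LockOrRecover<T> {
    fn lock_or_recover(&self) -> MutexGuard<'_, T>;
}

impl<T> LockOrRecover<T> for Mutex<T> {
    fn lock_or_recover(&self) -> MutexGuard<'_, T> {
        // PoisonError::into_inner yields the guard; the data is intact,
        // though it may reflect a partially completed operation.
        self.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
    }
}
```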
Refactored the handling of tool calls in the codex module to implement a more efficient concurrency strategy. Agent calls are executed in parallel, while other tool calls are processed sequentially to respect dependencies.
Agent messages now use `InputText` instead of `OutputText`, so their content is correctly categorized and processed as input, aligning the message structure with input processing.
Streamlined the split between agent and non-agent tool calls: agent calls run truly in parallel while other tool calls remain sequential, respecting dependencies. Comments and code structure were cleaned up for maintainability.
Added a flag that marks when execution is inside an agent context, preventing recursion: an agent cannot spawn other agents. The flag is set and reset around each agent call, preserving the integrity of parallel execution.
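The flag-plus-guard pattern above can be sketched with an RAII type: set the flag on entry, clear it on drop (even on early return or panic), and refuse to enter twice. The static flag and all names here are illustrative, not the PR's actual code:

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical sketch of an AgentContextGuard-style recursion gate.
static IN_AGENT_CONTEXT: AtomicBool = AtomicBool::new(false);

struct AgentContextGuard;

impl AgentContextGuard {
    // Returns None if we are already inside an agent context, which is
    // exactly how "agents may not spawn agents" can be enforced.
    fn enter() -> Option<AgentContextGuard> {
        if IN_AGENT_CONTEXT.swap(true, Ordering::SeqCst) {
            None
        } else {
            Some(AgentContextGuard)
        }
    }
}

impl Drop for AgentContextGuard {
    fn drop(&mut self) {
        // Reset happens on every exit path, so the flag cannot leak.
        IN_AGENT_CONTEXT.store(false, Ordering::SeqCst);
    }
}
```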
@gemini-code-assist

Summary of Changes

Hello @Grinsven, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates a robust multi-agent orchestration system into Codex, enabling users to leverage specialized AI agents for various tasks. The system allows for flexible agent configuration via ~/.codex/agents.toml, supporting custom system prompts and tool/permission overrides. A key enhancement is the ability to execute multiple agent calls in parallel, optimizing performance for complex workflows. User interaction is streamlined through a new @agent mention syntax and a /agents command in the TUI, providing real-time feedback and automatic plan tracking. The changes span core logic, protocol definitions, UI components, and extensive documentation, making the multi-agent capabilities a central feature of Codex.

Highlights

  • Multi-Agent Orchestration System: Introduced a new system allowing the definition and invocation of specialized AI agents.
  • Customizable Agents: Agents can be configured with custom system prompts, either inline or loaded from files, and can optionally override tools and permissions.
  • Parallel Agent Execution: Multiple agent calls within a single turn can now execute concurrently, significantly improving performance for complex, multi-agent tasks.
  • Enhanced User Interface: New @agent mention syntax for natural agent invocation and a /agents command to list available agents have been added to the TUI.
  • Comprehensive Documentation: New and updated documentation files (README.md, docs/config.md, docs/getting-started.md, docs/subagents.md) detail the setup, usage, and best practices for the multi-agent system.


@gemini-code-assist (bot) left a comment


Code Review

This is an impressive pull request that introduces a powerful multi-agent orchestration system. The feature is well-designed, with custom agent configurations, parallel execution, and good user-facing features like @agent mentions and the /agents command. The addition of comprehensive documentation and examples is also excellent.

My review focuses on a few key areas:

  • Concurrency Safety: There is a critical issue regarding the use of unsafe to implement Sync for SessionWrapper. This is a potential source of data races and should be addressed with a safer pattern like Weak pointers.
  • Robustness: A couple of medium severity suggestions to improve cross-platform compatibility and robustness in path handling within the new agent module.
  • Maintainability: A medium severity suggestion to refactor a loop in codex.rs to reduce code duplication.

Overall, this is a fantastic contribution. Addressing the safety concern around concurrency is the highest priority.

Comment on lines +2858 to +2871
```rust
/// Thread-safe wrapper for Session to enable concurrent execution
struct SessionWrapper {
    // We use a raw pointer here because:
    // 1. We need to share &Session across threads but Session isn't Clone
    // 2. The Session is guaranteed to outlive this wrapper (created in same scope)
    // 3. Session has internal thread-safety via Mutex fields
    ptr: *const Session,
    _phantom: std::marker::PhantomData<Session>,
}

// SAFETY: Session has internal synchronization via Mutex fields
// The wrapper ensures the Session outlives all uses
unsafe impl Send for SessionWrapper {}
unsafe impl Sync for SessionWrapper {}
```

critical

The manual unsafe implementation of Send and Sync for SessionWrapper is extremely dangerous and can lead to data races. The Session struct is not Sync because it contains fields that are not Sync (due to a reference cycle with AgentTask holding an Arc<Session>). Bypassing the compiler's safety checks with unsafe is a major security and correctness risk.

A safer approach to break the reference cycle would be to use std::sync::Weak<Session> inside AgentTask instead of Arc<Session>. This is the idiomatic way to handle cycles in Rust.
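The reviewer's `Weak` suggestion can be sketched with stand-in types (these are not the real `Session`/`AgentTask` definitions, just the shape of the cycle-break): the task upgrades its `Weak` when it needs the session, but does not keep the session alive, so the `Session -> AgentTask -> Session` cycle never forms.

```rust
use std::sync::{Arc, Mutex, Weak};

// Stand-in types illustrating the suggested cycle-break, not the PR's code.
struct Session {
    tasks: Mutex<Vec<Arc<AgentTask>>>,
}

struct AgentTask {
    // Weak instead of Arc: upgradable while the session lives, but does
    // not contribute to its strong count, so no reference cycle or leak.
    session: Weak<Session>,
}

impl AgentTask {
    // None means the session was already dropped; callers must handle it.
    fn session(&self) -> Option<Arc<Session>> {
        self.session.upgrade()
    }
}
```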

Comment on lines +88 to +92
```rust
let path = if prompt_file.starts_with('/') {
    PathBuf::from(prompt_file)
} else {
    base_dir.join(prompt_file)
};
```

medium

The check prompt_file.starts_with('/') is not cross-platform for detecting absolute paths. On Windows, absolute paths can start with a drive letter (e.g., C:\). Using Path::is_absolute() would make this function more robust and portable.

Suggested change:

```rust
// before
let path = if prompt_file.starts_with('/') {
    PathBuf::from(prompt_file)
} else {
    base_dir.join(prompt_file)
};

// after
let path = PathBuf::from(prompt_file);
let path = if path.is_absolute() {
    path
} else {
    base_dir.join(path)
};
```

Comment on lines +207 to +212
```rust
fn get_agents_directory() -> Option<PathBuf> {
    std::env::var("HOME")
        .or_else(|_| std::env::var("USERPROFILE"))
        .ok()
        .map(|home| PathBuf::from(home).join(".codex"))
}
```

medium

For consistency and robustness, get_agents_directory should use dirs::home_dir() to locate the user's home directory, just like validate_prompt_path does. The current implementation using std::env::var("HOME").or_else(|_| std::env::var("USERPROFILE")) is less reliable across different platforms and configurations. The dirs crate is already a dependency.

    fn get_agents_directory() -> Option<PathBuf> {
        dirs::home_dir().map(|home| home.join(".codex"))
    }

Comment on lines +2279 to +2376
```rust
for item in collected_items {
    match &item {
        ResponseItem::FunctionCall {
            name,
            call_id,
            arguments,
            ..
        } if name == "agent" => {
            // Collect agent calls for parallel execution
            pending_agent_calls.push(PendingToolCall {
                item: item.clone(),
                call_id: call_id.clone(),
                name: name.clone(),
                arguments: Some(arguments.clone()),
            });
        }
        ResponseItem::FunctionCall { .. } => {
            // Process non-agent function calls immediately in order
            let response = handle_response_item(
                sess,
                turn_context,
                turn_diff_tracker,
                sub_id,
                item.clone(),
            )
            .await?;
            if let Some(resp) = response.clone() {
                output.push(resp);
            }
            processed_items.push(ProcessedResponseItem {
                item: item.clone(),
                response,
            });
        }
        ResponseItem::LocalShellCall {
            call_id: Some(id), ..
        } => {
            // Process shell calls immediately in order
            let response = handle_response_item(
                sess,
                turn_context,
                turn_diff_tracker,
                sub_id,
                item.clone(),
            )
            .await?;
            if let Some(resp) = response.clone() {
                output.push(resp);
            }
            processed_items.push(ProcessedResponseItem {
                item: item.clone(),
                response,
            });
        }
        ResponseItem::CustomToolCall { name, call_id, .. } if name == "agent" => {
            // Collect agent calls for parallel execution
            pending_agent_calls.push(PendingToolCall {
                item: item.clone(),
                call_id: call_id.clone(),
                name: name.clone(),
                arguments: None,
            });
        }
        ResponseItem::CustomToolCall { .. } => {
            // Process non-agent custom tool calls immediately in order
            let response = handle_response_item(
                sess,
                turn_context,
                turn_diff_tracker,
                sub_id,
                item.clone(),
            )
            .await?;
            if let Some(resp) = response.clone() {
                output.push(resp);
            }
            processed_items.push(ProcessedResponseItem {
                item: item.clone(),
                response,
            });
        }
        _ => {
            // Process non-tool items immediately in order
            let response = handle_response_item(
                sess,
                turn_context,
                turn_diff_tracker,
                sub_id,
                item.clone(),
            )
            .await?;
            if let Some(resp) = response.clone() {
                output.push(resp);
            }
            processed_items.push(ProcessedResponseItem { item, response });
        }
    }
}
```

medium

The loop for processing collected_items contains significant code duplication for handling different types of tool calls (FunctionCall, LocalShellCall, CustomToolCall). The logic for processing these items is nearly identical. This can be refactored to reduce repetition and improve maintainability. You can check if an item is an agent call first, and then have a common block for all other tool calls.

    for item in collected_items {
        match &item {
            ResponseItem::FunctionCall { name, call_id, arguments, .. } if name == "agent" => {
                // Collect agent calls for parallel execution
                pending_agent_calls.push(PendingToolCall {
                    item: item.clone(),
                    call_id: call_id.clone(),
                    name: name.clone(),
                    arguments: Some(arguments.clone()),
                });
            }
            ResponseItem::CustomToolCall { name, call_id, .. } if name == "agent" => {
                // Collect agent calls for parallel execution
                pending_agent_calls.push(PendingToolCall {
                    item: item.clone(),
                    call_id: call_id.clone(),
                    name: name.clone(),
                    arguments: None,
                });
            }
            _ => {
                // Process non-agent items immediately in order
                let response = handle_response_item(
                    sess,
                    turn_context,
                    turn_diff_tracker,
                    sub_id,
                    item.clone(),
                )
                .await?;
                if let Some(resp) = response.clone() {
                    output.push(resp);
                }
                processed_items.push(ProcessedResponseItem { item, response });
            }
        }
    }

@Grinsven
Owner Author

/gemini review


@gemini-code-assist (bot) left a comment


Code Review

This pull request introduces a significant and well-architected feature: a multi-agent orchestration system. The changes are extensive, touching core logic, the communication protocol, UI components, and documentation to support this new capability. The highlight is the implementation of parallel agent execution, which is a complex but powerful addition. My review focuses on the core implementation in codex-rs/core, where I've identified a potential security vulnerability, some opportunities for code simplification and cleanup, and a risky use of unsafe code that warrants careful consideration for future maintenance. Overall, this is a very impressive piece of work.

Comment on lines +53 to +75
```rust
pub fn get_prompt(&mut self, agents_dir: Option<&Path>) -> anyhow::Result<String> {
    if let Some(prompt) = &self.prompt {
        return Ok(prompt.clone());
    }

    if let Some(prompt_file) = &self.prompt_file {
        let full_path = if let Some(dir) = agents_dir {
            dir.join(prompt_file)
        } else {
            PathBuf::from(prompt_file)
        };

        let prompt_content = std::fs::read_to_string(&full_path).map_err(|e| {
            anyhow::anyhow!("Cannot read prompt file '{}': {}", full_path.display(), e)
        })?;

        // Cache the loaded prompt
        self.prompt = Some(prompt_content.clone());
        Ok(prompt_content)
    } else {
        Err(anyhow::anyhow!("No prompt or prompt_file specified"))
    }
}
```

high

This public function get_prompt is vulnerable to path traversal. It constructs a path by joining agents_dir and prompt_file without validating prompt_file. An attacker could craft a prompt_file like "../../../../etc/passwd" to read arbitrary files on the system. The path validation logic from AgentRegistry::validate_prompt_path should be used here to prevent this. The logic for loading prompts from files is duplicated between this function and AgentRegistry::new, which is also a maintainability issue. It would be better to consolidate this logic.

Comment on lines +207 to +212
```rust
fn get_agents_directory() -> Option<PathBuf> {
    std::env::var("HOME")
        .or_else(|_| std::env::var("USERPROFILE"))
        .ok()
        .map(|home| PathBuf::from(home).join(".codex"))
}
```

medium

For consistency and robustness, it's better to use dirs::home_dir() to get the home directory, as it's already used in validate_prompt_path. The dirs crate handles more edge cases across different operating systems than manually checking HOME and USERPROFILE environment variables.

    fn get_agents_directory() -> Option<PathBuf> {
        dirs::home_dir().map(|home| home.join(".codex"))
    }

Comment on lines +2013 to +2018
```rust
let agent_infos =
    if let Ok(Some(registry)) = sess.agent_registry.lock().map(|g| g.as_ref().cloned()) {
        Some(registry.list_agent_details())
    } else {
        None
    };
```

medium

This logic for getting agent_infos can be simplified by using the lock_or_recover extension method, which is consistent with other mutex locks in this file. This makes the code more concise and readable.

    let agent_infos = sess.agent_registry.lock_or_recover()
        .as_ref()
        .map(|registry| registry.list_agent_details());

Comment on lines +2802 to +2812
```rust
"agent" => {
    // Agent calls are now handled in parallel at the turn level
    // Return a placeholder response that will be replaced by parallel execution
    ResponseInputItem::FunctionCallOutput {
        call_id,
        output: FunctionCallOutputPayload {
            content: "Agent execution deferred for parallel processing".to_string(),
            success: Some(false),
        },
    }
}
```

medium

This "agent" match arm appears to be dead code. The try_run_turn function explicitly separates agent tool calls into pending_agent_calls and only calls handle_response_item (which in turn calls handle_function_call) for non-agent tool calls. This branch will therefore never be reached. It should be removed to avoid confusion and code rot.

Comment on lines +2858 to +2871
```rust
/// Thread-safe wrapper for Session to enable concurrent execution
struct SessionWrapper {
    // We use a raw pointer here because:
    // 1. We need to share &Session across threads but Session isn't Clone
    // 2. The Session is guaranteed to outlive this wrapper (created in same scope)
    // 3. Session has internal thread-safety via Mutex fields
    ptr: *const Session,
    _phantom: std::marker::PhantomData<Session>,
}

// SAFETY: Session has internal synchronization via Mutex fields
// The wrapper ensures the Session outlives all uses
unsafe impl Send for SessionWrapper {}
unsafe impl Sync for SessionWrapper {}
```

medium

The use of raw pointers and unsafe blocks to share &Session across threads is a high-risk pattern. While the comments justify its use and the current implementation seems to manage lifetimes correctly via join_all, this is a potential footgun. Any future changes to how agent futures are executed (e.g., moving from join_all to tokio::spawn) could easily lead to use-after-free bugs. Consider adding more prominent warnings or exploring alternative, safer patterns if possible, even if they are more verbose (e.g., refactoring Session to be Sync or passing owned data to threads).
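For the synchronous analogue of this pattern, one safe alternative is `std::thread::scope`, which proves to the compiler that every borrow outlives the spawned work, so a `&T` of a `Sync` type can be shared with no `unsafe` at all. The sketch below uses a stand-in `SharedState` with internal `Mutex` synchronization (the PR's async `join_all` setup would need a different shape, e.g. `Arc`, so this is illustrative only):

```rust
use std::sync::Mutex;

// Stand-in for a Session-like type whose fields are internally synchronized.
struct SharedState {
    counter: Mutex<u32>,
}

fn run_parallel(state: &SharedState, jobs: u32) {
    // The scope guarantees all spawned threads join before it returns,
    // so borrowing `state` across threads is statically safe.
    std::thread::scope(|s| {
        for _ in 0..jobs {
            s.spawn(|| {
                *state.counter.lock().unwrap() += 1;
            });
        }
    });
}
```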

Comment on lines +2888 to +2898
```rust
/// Thread-safe wrapper for TurnContext to enable concurrent execution
struct TurnContextWrapper {
    // We use a raw pointer here for the same reasons as SessionWrapper
    ptr: *const TurnContext,
    _phantom: std::marker::PhantomData<TurnContext>,
}

// SAFETY: TurnContext contains only thread-safe types
// The wrapper ensures the TurnContext outlives all uses
unsafe impl Send for TurnContextWrapper {}
unsafe impl Sync for TurnContextWrapper {}
```

medium

Similar to SessionWrapper, this use of raw pointers and unsafe to share &TurnContext is risky. While it appears safe in the current context, it makes the code fragile and prone to memory safety issues if the calling patterns change. It would be safer to refactor to avoid unsafe if possible.

Grinsven pushed a commit that referenced this pull request Feb 28, 2026
…ai#8950)

Agent wouldn't "see" attached images and would instead try to use the
view_file tool:
<img width="1516" height="504" alt="image"
src="https://github.com/user-attachments/assets/68a705bb-f962-4fc1-9087-e932a6859b12"
/>

In this PR, we wrap image content items in XML tags with the name of
each image (now just a numbered name like `[Image #1]`), so that the
model can understand inline image references (based on name). We also
put the image content items above the user message which the model seems
to prefer (maybe it's more used to definitions being before references).

We also tweak the view_file tool description which seemed to help a bit

Results on a simple eval set of images:

Before
<img width="980" height="310" alt="image"
src="https://github.com/user-attachments/assets/ba838651-2565-4684-a12e-81a36641bf86"
/>

After
<img width="918" height="322" alt="image"
src="https://github.com/user-attachments/assets/10a81951-7ee6-415e-a27e-e7a3fd0aee6f"
/>

```json
[
  {
    "id": "single_describe",
    "prompt": "Describe the attached image in one sentence.",
    "images": ["image_a.png"]
  },
  {
    "id": "single_color",
    "prompt": "What is the dominant color in the image? Answer with a single color word.",
    "images": ["image_b.png"]
  },
  {
    "id": "orientation_check",
    "prompt": "Is the image portrait or landscape? Answer in one sentence.",
    "images": ["image_c.png"]
  },
  {
    "id": "detail_request",
    "prompt": "Look closely at the image and call out any small details you notice.",
    "images": ["image_d.png"]
  },
  {
    "id": "two_images_compare",
    "prompt": "I attached two images. Are they the same or different? Briefly explain.",
    "images": ["image_a.png", "image_b.png"]
  },
  {
    "id": "two_images_captions",
    "prompt": "Provide a short caption for each image (Image 1, Image 2).",
    "images": ["image_c.png", "image_d.png"]
  },
  {
    "id": "multi_image_rank",
    "prompt": "Rank the attached images from most colorful to least colorful.",
    "images": ["image_a.png", "image_b.png", "image_c.png"]
  },
  {
    "id": "multi_image_choice",
    "prompt": "Which image looks more vibrant? Answer with 'Image 1' or 'Image 2'.",
    "images": ["image_b.png", "image_d.png"]
  }
]
```
Grinsven pushed a commit that referenced this pull request Feb 28, 2026
I've seen this test fail with:

```
 - Mock #1.
        	Expected range of matching incoming requests: == 2
        	Number of matched incoming requests: 1
```

This is because we pop the wrong task_complete events and then the test
exits. I think this is because the MCP events are now buffered after
openai#8874.

So:
1. clear the buffer before we do any user message sending
2. additionally listen for task start before task complete
3. use the ID from task start to find the correct task complete event.
Grinsven pushed a commit that referenced this pull request Feb 28, 2026
Fixes openai#9050

When a draft is stashed with Ctrl+C, we now persist the full draft state
(text elements, local image paths, and pending paste payloads) in local
history. Up/Down recall rehydrates placeholder elements and attachments
so styling remains correct and large pastes still expand on submit.
Persistent (cross‑session) history remains text‑only.

Backtrack prefills now reuse the selected user message’s text elements
and local image paths, so image placeholders/attachments rehydrate when
rolling back.

External editor replacements keep only attachments whose placeholders
remain and then normalize image placeholders to `[Image #1]..[Image #N]`
to keep the attachment mapping consistent.

Docs:
- docs/tui-chat-composer.md

Testing:
- just fix -p codex-tui
- cargo test -p codex-tui

Co-authored-by: Eric Traut <etraut@openai.com>
Grinsven pushed a commit that referenced this pull request Feb 28, 2026
…i#10590)

## Summary
This PR makes app-server-provided image URLs first-class attachments in
TUI, so they survive resume/backtrack/history recall and are resubmitted
correctly.

<img width="715" height="491" alt="Screenshot 2026-02-12 at 8 27 08 PM"
src="https://github.com/user-attachments/assets/226cbd35-8f0c-4e51-a13e-459ef5dd1927"
/>

Can delete the attached image upon backtracking:
<img width="716" height="301" alt="Screenshot 2026-02-12 at 8 27 31 PM"
src="https://github.com/user-attachments/assets/4558d230-f1bd-4eed-a093-8e1ab9c6db27"
/>

In both history and composer, remote images are rendered as normal
`[Image #N]` placeholders, with numbering unified with local images.

## What changed
- Plumb remote image URLs through TUI message state:
  - `UserHistoryCell`
  - `BacktrackSelection`
  - `ChatComposerHistory::HistoryEntry`
  - `ChatWidget::UserMessage`
- Show remote images as placeholder rows inside the composer box (above
textarea), and in history cells.
- Support keyboard selection/deletion for remote image rows in composer
(`Up`/`Down`, `Delete`/`Backspace`).
- Preserve remote-image-only turns in local composer history (Up/Down
recall), including restore after backtrack.
- Ensure submit/queue/backtrack resubmit include remote images in model
input (`UserInput::Image`), and keep request shape stable for
remote-image-only turns.
- Keep image numbering contiguous across remote + local images:
  - remote images occupy `[Image #1]..[Image #M]`
  - local images start at `[Image #M+1]`
  - deletion renumbers consistently.
- In protocol conversion, increment shared image index for remote images
too, so mixed remote/local image tags stay in a single sequence.
- Simplify restore logic to trust in-memory attachment order (no
placeholder-number parsing path).
- Backtrack/replay rollback handling now queues trims through
`AppEvent::ApplyThreadRollback` and syncs transcript overlay/deferred
lines after trims, so overlay/transcript state stays consistent.
- Trim trailing blank rendered lines from user history rendering to
avoid oversized blank padding.

## Docs + tests
- Updated: `docs/tui-chat-composer.md` (remote image flow,
selection/deletion, numbering offsets)
- Added/updated tests across `tui/src/chatwidget/tests.rs`,
`tui/src/app.rs`, `tui/src/app_backtrack.rs`, `tui/src/history_cell.rs`,
and `tui/src/bottom_pane/chat_composer.rs`
- Added snapshot coverage for remote image composer states, including
deleting the first of two remote images.

## Validation
- `just fmt`
- `cargo test -p codex-tui`

## Codex author
`codex fork 019c2636-1571-74a1-8471-15a3b1c3f49d`