From 7f99542e4c79d5bb7c78bd60badae5e3af23dd9a Mon Sep 17 00:00:00 2001 From: James Peter Date: Tue, 28 Oct 2025 13:47:40 +1000 Subject: [PATCH 1/3] docs: remove temporary issue report files --- REPORT_issues.md | 145 --------------------------------------------- issue-responses.md | 59 ------------------ 2 files changed, 204 deletions(-) delete mode 100644 REPORT_issues.md delete mode 100644 issue-responses.md diff --git a/REPORT_issues.md b/REPORT_issues.md deleted file mode 100644 index 7049bdd68a3..00000000000 --- a/REPORT_issues.md +++ /dev/null @@ -1,145 +0,0 @@ -# Issue Fix Report - -## Issue #207 — Remove Codex Requirement (Won’t Fix) - -### Problem Summary -Request aimed to drop the Codex requirement entirely so users without Codex access could run the CLI. - -### Root Cause -The product still depends on a valid OpenAI (or alternative provider) account, so changing the default model string alone cannot remove that requirement. - -### Resolution & Documentation Update -- Commit `e5c111b70` retained clarification-only work: defaults/docs now mention `gpt-5` alongside `gpt-5-codex` and highlight provider setup, but runtime behavior is unchanged. -- Issue is reclassified as “won’t fix” because an active provider account remains mandatory. - -### Validation -- No functional change; `./build-fast.sh` already passed in this branch. - -### Remaining Risks / Next Steps -None. Provider provisioning guidance is documented; no further engineering planned. - -### Quality Rating: n/a -Informational update only; no code fix to evaluate. - ---- - -## Issue #343 — `OPENAI_WIRE_API` Override Ignored - -### Problem Summary -Setting `OPENAI_WIRE_API=chat` (documented override) no longer forced the Chat Completions API, leaving users stuck on Responses. - -### Root Cause -`built_in_model_providers()` in `code-rs/core/src/model_provider_info.rs` hard-coded `WireApi::Responses`, ignoring the environment override. - -### Fix Description -- Added `wire_api_override_from_env` helper to parse `OPENAI_WIRE_API` (`code-rs/core/src/model_provider_info.rs`). -- OpenAI provider construction now respects the helper, falling back to Responses on invalid input. -- Added regression tests (`code-rs/core/tests/openai_wire_api_env.rs`) for defaults, chat/responses selection, invalid input, and case-insensitivity. - -### Validation -- New test suite covers five override scenarios. -- `./build-fast.sh` passes. - -### Remaining Risks / Next Steps -Minimal. Behavior still requires restart to pick up changed env vars—documented expectation. - -### Quality Rating: 5/5 -Tight implementation with exhaustive unit coverage and robust fallback handling. - ---- - -## Issue #289 — Legacy Custom Prompts Not Loaded - -### Problem Summary -Prompt discovery skipped `~/.codex/prompts`, breaking legacy installations. - -### Root Cause -`default_prompts_dir()` in `code-rs/core/src/custom_prompts.rs` joined `prompts` directly on the code-home path instead of using the legacy-aware resolver. - -### Fix Description -- Switched to `resolve_code_path_for_read` so legacy directories are discovered. -- Added comprehensive async tests in `code-rs/core/tests/custom_prompts_discovery.rs` covering CODE_HOME, legacy fallback, precedence when both exist, and filtering non-Markdown files. - -### Validation -- New tests exercise all discovery branches. -- `./build-fast.sh` passes. - -### Remaining Risks / Next Steps -Low. The resolver is shared with other config paths; document precedence if users keep both directories populated. - -### Quality Rating: 5/5 -One-line fix with strong regression tests and environmental isolation. - ---- - -## Issue #351 — Agents Toggle Didn’t Persist - -### Problem Summary -Toggling the “Enabled” checkbox in the Agents overlay didn’t immediately persist; users had to exit the overlay to see updates. - -### Root Cause -`AgentEditorView` updated the in-memory flag but only persisted configuration when the Save action fired. - -### Fix Description -- Introduced `persist_current_agent()` and invoked it after Left/Right/Space toggle actions and on Enter (`code-rs/tui/src/bottom_pane/agent_editor_view.rs`). -- Expanded smoke helpers to handle `UpdateAgentConfig` and `ShowAgentsOverview` events (`code-rs/tui/src/chatwidget/smoke_helpers.rs`). -- Added VT100 snapshot regression `agents_toggle_claude_opus_persists_via_slash_command` asserting UI and persisted state stay in sync (`code-rs/tui/tests/vt100_chatwidget_snapshot.rs`). - -### Validation -- Snapshot test covers toggle → persist → reopen flow. -- `./build-fast.sh` passes. - -### Remaining Risks / Next Steps -Low. Persistence still assumes config writes succeed; future work could surface errors in the UI. - -### Quality Rating: 5/5 -Fix is localized, regression-tested, and improves UX without regressions. - ---- - -## Issue #307 — Gemini CLI Flag Rejected - -### Problem Summary -Gemini agents invoked the CLI with `-m`, which newer releases rejected, causing launch failures. - -### Root Cause -`agent_defaults.rs` still used the short flag for Gemini specs. - -### Fix Description -- Gemini Pro/Flash specs now use `--model` (`code-rs/core/src/agent_defaults.rs`). -- Added unit test ensuring both specs keep the long flag (`code-rs/core/tests/gemini_model_args.rs`). - -### Validation -- New unit test passes under `./build-fast.sh`. - -### Remaining Risks / Next Steps -Minor. Could consider an integration test that spawns the CLI, but static spec coverage prevents regression in this layer. - -### Quality Rating: 4/5 -Targeted fix with guardrail; slight deduction for lack of end-to-end validation. - ---- - -## Issue #333 — Duplicate MCP Servers After `/new` - -### Problem Summary -Invoking `/new` created a fresh MCP session without shutting down the previous one, leaving duplicate stdio servers running and cluttering the Agents menu. - -### Root Cause -The submission loop aborted the old session but didn’t await cleanup of its `McpConnectionManager` clients; legacy stdio processes relied on drop semantics, and RMCP transports lacked an explicit shutdown path. - -### Fix Description -- Added `Session::shutdown_mcp_clients` to await `McpConnectionManager::shutdown_all` before constructing the next session (`code-rs/core/src/codex.rs:1123`, `code-rs/core/src/codex.rs:3438`). -- `McpConnectionManager` now stores clients in an `RwLock`, drains them, and calls `McpClientAdapter::into_shutdown` to dispose of each client deterministically (`code-rs/core/src/mcp_connection_manager.rs`). Legacy adapters drop and yield to propagate `kill_on_drop`; RMCP adapters call the new `shutdown` method. -- Added `RmcpClient::shutdown` to cancel the running service (`code-rs/rmcp-client/src/rmcp_client.rs`). -- Introduced async regression test `mcp_session_cleanup` that compiles a stub wrapper and asserts the first MCP process exits before the second manager starts (`code-rs/tui/tests/mcp_session_cleanup.rs`). - -### Validation -- Regression test exercises `/new` behavior and fails if the first process never exits. -- `./build-fast.sh` passes. - -### Remaining Risks / Next Steps -Moderate. Legacy stdio still relies on `kill_on_drop` plus a `yield_now`; consider adding explicit wait-on-child for extra certainty and covering HTTP transports in tests. - -### Quality Rating: 4/5 -Substantial improvement with automated coverage; small residual risk remains around stdio teardown timing. diff --git a/issue-responses.md b/issue-responses.md deleted file mode 100644 index 30f3ae8b020..00000000000 --- a/issue-responses.md +++ /dev/null @@ -1,59 +0,0 @@ -# Issue Responses - -## #207 — Remove Codex Requirement - -Thanks for the note! This isn’t a functional bug—we still require a valid OpenAI (or other configured) provider account to run. We kept the clarification changes that shipped earlier (docs now highlight provider setup, and defaults mention `gpt-5` alongside the Codex option), but there’s no runtime change here. We’ve reclassified the issue as “won’t fix” because provider access remains mandatory. - -## #343 — OPENAI_WIRE_API support was removed - -We’ve reinstated the documented `OPENAI_WIRE_API` override. Setting `OPENAI_WIRE_API=chat` now forces the built-in OpenAI provider to use the Chat Completions endpoint, while `responses` (or the default) stays on the Responses API. The new regression tests in `code-rs/core/tests/openai_wire_api_env.rs` cover chat/responses/default/invalid cases so this stays verified going forward. - -## #289 — Custom prompts not discovered - -Prompt discovery now uses the same legacy-aware resolver as other config files, so `~/.codex/prompts/*.md` is picked up when `~/.code/prompts` is absent. Fresh async tests in `code-rs/core/tests/custom_prompts_discovery.rs` lock environment variables and cover CODE_HOME override, legacy fallback, dual-directory preference, and ignoring non-Markdown files. - -## #351 — Agents toggle state not persisting - -Thanks for the detailed report! The Agents overlay now persists the “Enabled” checkbox immediately, so you no longer need to close and reopen the menu to see the change. We also added a VT100 snapshot regression test (`agents_toggle_claude_opus_persists_via_slash_command`) to keep the toggle flow locked. The fix is merged and will ride out in the next release. - -## #333 — Duplicate MCP servers after `/new` - -Great catch—new sessions were launching a fresh MCP stdio server without shutting down the previous one, so duplicate tool providers lingered in the Agents menu. `Session::shutdown_mcp_clients` now drains the connection manager before we rebuild the session, which kills any legacy stdio clients via `kill_on_drop` and awaits RMCP shutdowns. The regression test in `code-rs/tui/tests/mcp_session_cleanup.rs` builds a stub MCP server and fails if the first process is still alive when `/new` spins up the replacement. Look for this fix in the upcoming release. - -### Triage Summary -- #351 — fixed (pending release notes) -- #348 — needs-repro (sandbox write errors still under investigation) -- #343 — fixed (tests added for OPENAI_WIRE_API) -- #341 — backlog (consider prompt override UX after /auto refactors) -- #338 — packaging (nix flake vendoring gap) -- #333 — fixed (MCP session cleanup, new shutdown guard) -- #332 — backlog (consider dark-mode default once theme revamp lands) -- #331 — needs-repro (depends on org streaming entitlement) -- #329 — backlog (Windows apply_patch ergonomics) -- #328 — packaging (Homebrew asset mismatch) -- #327 — needs-repro (malloc crash data gathering) -- #311 — backlog (UI polish request) -- #307 — fixed (Gemini CLI flag switched to --model) -- #306 — backlog (resume UX improvements queued) -- #305 — needs-repro (MCP menu scroll regression) -- #304 — needs-repro (remote white overlay environment-specific) -- #299 — needs-repro (Claude agent handoff failure) -- #298 — backlog (external Chrome support investigation) -- #296 — needs-repro (IME spacing in PowerShell) -- #293 — backlog (planning mode providers) -- #292 — packaging (release tagging/version alignment) -- #290 — backlog (multi-branch command idea) -- #289 — fixed (legacy prompts resolved) -- #285 — backlog (resume screen roadmap) -- #284 — backlog (title bar UX) -- #279 — backlog (Zed ACP integration polish) -- #278 — backlog (WSL browser support) -- #277 — needs-repro (Chinese input handling) -- #263 — backlog (sub-agent context exploration) -- #255 — backlog (Ctrl-A shortcut conflict) -- #232 — backlog (background bash support) -- #207 — won’t-fix (provider access still required; docs clarified) - -## #307 — Gemini agent error - -Appreciate the heads-up! The Gemini sub-agents now invoke the CLI with the long `--model` flag, which unblocks the newer releases that rejected `-m`. A dedicated unit test guards the flag so we don’t regress, and the update will land in the next release. From 94c288bf119d2839dbb036cee7076fb2b673e1d5 Mon Sep 17 00:00:00 2001 From: James Peter Date: Tue, 28 Oct 2025 14:43:54 +1000 Subject: [PATCH 2/3] fix(core/tool-call): sanitize streamed arguments --- code-rs/core/src/chat_completions.rs | 92 ++++++++++++++++++- code-rs/core/src/lib.rs | 1 + .../tests/tool_call_arguments_sanitizer.rs | 58 ++++++++++++ 3 files changed, 150 insertions(+), 1 deletion(-) create mode 100644 code-rs/core/tests/tool_call_arguments_sanitizer.rs diff --git a/code-rs/core/src/chat_completions.rs b/code-rs/core/src/chat_completions.rs index 89c4a724c25..b60c959bf8e 100644 --- a/code-rs/core/src/chat_completions.rs +++ b/code-rs/core/src/chat_completions.rs @@ -17,6 +17,7 @@ use tokio::sync::mpsc; use tokio::time::timeout; use tracing::debug; use tracing::trace; +use tracing::warn; use crate::auth::AuthManager; use crate::ModelProviderInfo; @@ -37,6 +38,84 @@ use code_protocol::models::ContentItem; use code_protocol::models::ReasoningItemContent; use code_protocol::models::ResponseItem; +/// Sanitizes streamed tool-call arguments by stripping markdown fences and extracting +/// the first valid JSON object or array if possible. Falls back to the trimmed input +/// when no valid JSON can be located. +pub fn sanitize_tool_call_arguments(raw: &str) -> String { + let trimmed = raw.trim(); + + if trimmed.is_empty() { + return String::new(); + } + + if serde_json::from_str::(trimmed).is_ok() { + return trimmed.to_string(); + } + + let without_fences = if let Some(start) = trimmed.find("```") { + let after_start = &trimmed[start + 3..]; + let content_start = if let Some(newline_pos) = after_start.find('\n') { + start + 3 + newline_pos + 1 + } else { + start + 3 + }; + + if let Some(end) = trimmed[content_start..].find("```") { + &trimmed[content_start..content_start + end] + } else { + &trimmed[content_start..] + } + } else { + trimmed + }; + + let cleaned = without_fences.trim(); + + for (idx, ch) in cleaned.char_indices() { + if ch == '{' || ch == '[' { + let candidate = &cleaned[idx..]; + if let Ok(parsed) = serde_json::from_str::(candidate) { + if let Ok(serialized) = serde_json::to_string(&parsed) { + return serialized; + } + } + + let closing = if ch == '{' { '}' } else { ']' }; + let mut depth = 0; + let mut in_string = false; + let mut escape_next = false; + + for (end_idx, end_ch) in cleaned[idx..].char_indices() { + if escape_next { + escape_next = false; + continue; + } + + match end_ch { + '\\' if in_string => escape_next = true, + '"' => in_string = !in_string, + c if c == ch && !in_string => depth += 1, + c if c == closing && !in_string => { + depth -= 1; + if depth == 0 { + let candidate = &cleaned[idx..idx + end_idx + 1]; + if let Ok(parsed) = serde_json::from_str::(candidate) { + if let Ok(serialized) = serde_json::to_string(&parsed) { + return serialized; + } + } + break; + } + } + _ => {} + } + } + } + } + + cleaned.to_string() +} + /// Implementation for the classic Chat Completions API. pub(crate) async fn stream_chat_completions( prompt: &Prompt, @@ -790,11 +869,22 @@ async fn process_chat_sse( let _ = tx_event.send(Ok(ResponseEvent::OutputItemDone { item, sequence_number: None, output_index: None })).await; } + let raw_arguments = fn_call_state.arguments.clone(); + let sanitized_arguments = sanitize_tool_call_arguments(&raw_arguments); + if sanitized_arguments != raw_arguments { + warn!( + "Sanitized tool-call arguments for function '{}'. original_len={} sanitized_len={}", + fn_call_state.name.as_deref().unwrap_or(""), + raw_arguments.len(), + sanitized_arguments.len() + ); + } + // Then emit the FunctionCall response item. let item = ResponseItem::FunctionCall { id: current_item_id.clone(), name: fn_call_state.name.clone().unwrap_or_else(|| "".to_string()), - arguments: fn_call_state.arguments.clone(), + arguments: sanitized_arguments, call_id: fn_call_state.call_id.clone().unwrap_or_else(String::new), }; diff --git a/code-rs/core/src/lib.rs b/code-rs/core/src/lib.rs index 0d1b356fcc2..335ee06ad4e 100644 --- a/code-rs/core/src/lib.rs +++ b/code-rs/core/src/lib.rs @@ -100,6 +100,7 @@ mod user_notification; pub mod util; pub use apply_patch::CODEX_APPLY_PATCH_ARG1; +pub use chat_completions::sanitize_tool_call_arguments; pub use command_safety::is_safe_command; pub use safety::get_platform_sandbox; pub use housekeeping::run_housekeeping_if_due; diff --git a/code-rs/core/tests/tool_call_arguments_sanitizer.rs b/code-rs/core/tests/tool_call_arguments_sanitizer.rs new file mode 100644 index 00000000000..ee5b7879668 --- /dev/null +++ b/code-rs/core/tests/tool_call_arguments_sanitizer.rs @@ -0,0 +1,58 @@ +use code_core::sanitize_tool_call_arguments; +use serde_json::Value; + +#[test] +fn valid_object_left_intact() { + let input = r#"{"key": "value"}"#; + let sanitized = sanitize_tool_call_arguments(input); + assert_eq!(sanitized, input); +} + +#[test] +fn strips_markdown_fence_with_language() { + let input = "```json\n{\n \"answer\": 42\n}\n```"; + let sanitized = sanitize_tool_call_arguments(input); + let value: Value = serde_json::from_str(&sanitized).unwrap(); + assert_eq!(value["answer"], 42); +} + +#[test] +fn extracts_json_after_prose() { + let input = "Here you go:\n[{'oops': 'not json'}]{\"name\":\"ok\"}"; + // Replace single quotes to keep it invalid except for the object portion. + let input = input.replace("'", "\""); + let sanitized = sanitize_tool_call_arguments(&input); + let value: Value = serde_json::from_str(&sanitized).unwrap(); + assert_eq!(value["name"], "ok"); +} + +#[test] +fn handles_arrays() { + let input = "```\n[ {\"tool\": 1}, {\"tool\": 2} ]\n```"; + let sanitized = sanitize_tool_call_arguments(input); + let value: Value = serde_json::from_str(&sanitized).unwrap(); + assert!(value.is_array()); + assert_eq!(value.as_array().unwrap().len(), 2); +} + +#[test] +fn preserves_braces_inside_strings() { + let input = r#"{"msg": "brace } inside string"}"#; + let sanitized = sanitize_tool_call_arguments(input); + let value: Value = serde_json::from_str(&sanitized).unwrap(); + assert_eq!(value["msg"], "brace } inside string"); +} + +#[test] +fn returns_trimmed_on_failure() { + let input = "```json\n{ broken json\n```"; + let sanitized = sanitize_tool_call_arguments(input); + assert!(!sanitized.contains("```")); + assert!(!sanitized.is_empty()); +} + +#[test] +fn whitespace_only_yields_empty() { + let sanitized = sanitize_tool_call_arguments(" \n\t"); + assert!(sanitized.is_empty()); +} From 29fe62ea056a8c7b9a947613d10eb8ee4161450b Mon Sep 17 00:00:00 2001 From: James Peter Date: Tue, 28 Oct 2025 14:59:44 +1000 Subject: [PATCH 3/3] fix(ci): align preview build toolchain with rust-toolchain --- .github/workflows/preview-build.yml | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/.github/workflows/preview-build.yml b/.github/workflows/preview-build.yml index 2590fead3a8..4450c3233e5 100644 --- a/.github/workflows/preview-build.yml +++ b/.github/workflows/preview-build.yml @@ -63,15 +63,22 @@ jobs: - name: Install Rust toolchain shell: bash run: | + TOOLCHAIN=1.90.0 rustup set profile minimal - rustup toolchain install 1.89.0 --profile minimal --target ${{ matrix.target }} - rustup default 1.89.0 + rustup toolchain install "$TOOLCHAIN" --profile minimal + rustup default "$TOOLCHAIN" + + if [[ "$RUNNER_OS" == "Linux" ]]; then + rustup target add x86_64-unknown-linux-musl aarch64-unknown-linux-musl + fi + + rustup target add "${{ matrix.target }}" - name: Rust cache (target + registries) uses: Swatinem/rust-cache@v2 with: prefix-key: v1-preview - shared-key: preview-${{ matrix.target }}-rust-1.89 + shared-key: preview-${{ matrix.target }}-rust-1.90 workspaces: | code-rs -> target codex-rs -> target