feat(diagnostics): bridge generate failures to diagnose() hypotheses (#130) #165

Merged
hqhq1025 merged 2 commits into main from worktree-agent-a75d65c7
Apr 23, 2026
@hqhq1025

Summary

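Test-connection failures already go through `diagnose()` and produce actionable hypotheses (for example, 404 → `missingV1` plus automatically appending `/v1`), but real generate failures were not wired into this system at all: users only saw the raw IPC error text, with no indication of what to do next.

This PR wires generate failures into the diagnostic hypothesis system, covering several 404/4xx/5xx scenarios, and adds an automatic fix button for the 404 missing-`/v1` case.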

What changed

main process

  • `apps/desktop/src/main/index.ts`: the generate catch block attaches `upstream_status` / `upstream_provider` / `upstream_baseurl` / `upstream_wire` metadata to the thrown error
  • New helper `extractUpstreamHttpStatus()`

shared

  • New `diagnoseGenerateFailure()` in `packages/shared/src/diagnostics.ts`:
    • 400 + "instructions" keyword → `openaiResponsesMisconfigured` + `switchWire` hint
    • 5xx + "not implemented" / "page not found" → `gatewayIncompatible` + `switchWire` hint
    • Generic 5xx → `serverError` + `waitAndRetry`
    • 404 (including the bare "404 page not found" message, the original form of #130, "bug(desktop): generation fails with codesign:v1:generate 404 on Win11 relay setup") → `missingV1` + automatic `/v1` fix
    • 401/403/429/402 → delegated to the existing `diagnose()`
    • Anything else → `unknown` (the toast skips it to avoid noise)
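
The mapping above can be read as an ordered decision list. The following is a minimal sketch of that ordering only, not the code in `packages/shared/src/diagnostics.ts`: the `Hypothesis` and `GenerateFailure` shapes, the `classifyGenerateFailure` name, and the `"delegated"` / `"addV1"` markers are all hypothetical stand-ins for whatever the real `DiagnosticHypothesis` type carries.

```typescript
// Hypothetical shapes for illustration; the real DiagnosticHypothesis type may differ.
type Cause =
  | "openaiResponsesMisconfigured"
  | "gatewayIncompatible"
  | "serverError"
  | "missingV1"
  | "delegated" // stands in for "hand off to the existing diagnose()"
  | "unknown";

interface Hypothesis {
  cause: Cause;
  fix?: "switchWire" | "waitAndRetry" | "addV1";
}

interface GenerateFailure {
  status?: number;
  message: string;
}

// Statuses the PR description says are delegated to diagnose().
const DELEGATED_STATUSES = new Set<number>([401, 402, 403, 429]);

function classifyGenerateFailure(f: GenerateFailure): Hypothesis {
  const msg = f.message.toLowerCase();
  const status = f.status;

  // 400 whose body mentions "instructions"
  if (status === 400 && msg.includes("instructions")) {
    return { cause: "openaiResponsesMisconfigured", fix: "switchWire" };
  }
  // 5xx: gateway-specific wording first, then the generic server error
  if (status !== undefined && status >= 500 && status <= 599) {
    if (msg.includes("not implemented") || msg.includes("page not found")) {
      return { cause: "gatewayIncompatible", fix: "switchWire" };
    }
    return { cause: "serverError", fix: "waitAndRetry" };
  }
  // 404, or the bare "404 page not found" text with no status attached
  if (status === 404 || msg.includes("404 page not found")) {
    return { cause: "missingV1", fix: "addV1" };
  }
  if (status !== undefined && DELEGATED_STATUSES.has(status)) {
    return { cause: "delegated" };
  }
  return { cause: "unknown" };
}
```

Checking the 5xx branch before the 404 branch matters in this sketch: a 502 whose body says "404 page not found" is classified as a gateway problem rather than a missing `/v1`.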

renderer

  • `apps/desktop/src/renderer/src/store.ts` `applyGenerateError`: calls `diagnoseGenerateFailure` and appends the cause sentence to the toast description; the 404 missing-`/v1` case adds an "Apply fix" action button that automatically appends `/v1` via `config:v1:update-provider`
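
As an illustration of the base URL transform behind the "Apply fix" button, here is a small sketch. The helper name `withV1Suffix` is hypothetical, and the exact normalization rules (trailing slashes, already-present `/v1`) are assumptions rather than the behaviour of the code in this PR.

```typescript
// Hypothetical helper: returns the base URL with "/v1" appended to its path,
// or the input unchanged when the path already ends in "/v1".
function withV1Suffix(baseUrl: string): string {
  const url = new URL(baseUrl);
  // Drop trailing slashes so "/api/" and "/api" are treated the same.
  const path = url.pathname.replace(/\/+$/, "");
  if (/\/v1$/.test(path)) {
    return baseUrl; // already points at /v1, nothing to fix
  }
  url.pathname = `${path}/v1`;
  return url.toString();
}

console.log(withV1Suffix("https://relay.example.com"));
console.log(withV1Suffix("https://relay.example.com/api/"));
```

Keeping the transform idempotent means a second click on the button, or a stale toast, would not produce `/v1/v1`.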

i18n (en + zh-CN)

  • `diagnostics.cause.{gatewayIncompatible,openaiResponsesMisconfigured,serverError}`
  • `diagnostics.fix.switchWire`
  • `notifications.{generationFailedApplyFix,generationFailedBaseUrlUpdated}`

Design trade-offs

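  • Added a new `diagnoseGenerateFailure` instead of extending `diagnose()`: connection-test passes an `ErrorCode`, while generate passes `{status, message, wire, ...}`, so an overload would be awkward
  • Status extraction: read `err.upstream_status` first, then fall back to a message regex, because Electron IPC strips custom Error props in some versions
  • The 404 auto-fix button is implemented only for missing-`/v1`; other hypotheses get hint text only (updateKey/switchWire/addCredits require user judgment)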
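
The status-extraction trade-off can be sketched as follows. This assumes the tagged property is a number and that the fallback looks for the first standalone 4xx/5xx code in the message; the function name `readUpstreamStatus` and the regex are illustrative assumptions, not the PR's actual helper.

```typescript
// Hypothetical reader: prefer the tagged property, fall back to the message text.
function readUpstreamStatus(err: unknown): number | undefined {
  if (typeof err === "object" && err !== null) {
    const tagged = (err as { upstream_status?: unknown }).upstream_status;
    if (typeof tagged === "number" && Number.isInteger(tagged)) {
      return tagged;
    }
  }
  // Fallback for the case where IPC dropped the custom property:
  // take the first standalone 4xx/5xx number in the message (assumed pattern).
  const message = err instanceof Error ? err.message : String(err);
  const match = /\b([45]\d{2})\b/.exec(message);
  return match ? Number(match[1]) : undefined;
}

const tagged = Object.assign(new Error("boom"), { upstream_status: 404 });
console.log(readUpstreamStatus(tagged)); // 404, read from the property
console.log(readUpstreamStatus(new Error("terminated"))); // undefined
```

A message-based fallback is inherently heuristic, so a sketch like this would misread messages that happen to contain an unrelated three-digit number starting with 4 or 5.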

Test plan

  • 10 new tests in `diagnostics.test.ts` (25/25 pass)
  • 870/870 desktop tests green
  • `pnpm typecheck` + `pnpm lint` clean

Follow-up

Closes #130
Refs #124, #134, #158

@github-actions github-actions Bot added docs Documentation area:desktop apps/desktop (Electron shell, renderer) labels Apr 22, 2026
…130)

Main catch in codesign:v1:generate now tags the thrown error with
upstream_status / upstream_provider / upstream_baseurl / upstream_wire so
the renderer can reason about the failure without re-parsing err.message.

New diagnoseGenerateFailure() in @open-codesign/shared maps generate-time
failures to the DiagnosticHypothesis shape already used by the
connection-test path:
  - 404 (or bare "404 page not found" message) -> missing /v1 hint + baseUrl transform
  - 5xx with "not implemented" / "page not found" body -> gatewayIncompatible
  - 400 with "instructions" body -> openaiResponsesMisconfigured
  - 401 / 403 / 429 -> reuse existing hypotheses via diagnose(String(status))

Renderer applyGenerateError appends the most-likely-cause sentence to the
toast description and, for the missing-/v1 case, attaches an "Apply fix"
action that updates the provider baseUrl via config:v1:update-provider —
one-click fix for the #130 Win11 relay-gateway failure instead of a
dead-end error message.

Deliberately out of scope (owned by parallel PRs): retry.ts RetryDecision,
core/errors.ts remapProviderError, error-codes.ts additions, toPiContext,
core/agent.ts first-turn retry.

Signed-off-by: hqhq1025 <1506751656@qq.com>
@hqhq1025 hqhq1025 force-pushed the worktree-agent-a75d65c7 branch from 3771d08 to 48b63c7 Compare April 22, 2026 14:19

@github-actions github-actions Bot left a comment

Findings

  • [Blocker] Silent fallback when fix-apply IPC is unavailable — violates the project rule that errors must surface instead of failing silently; if window.codesign?.config?.updateProvider is missing, clicking “Apply fix” does nothing with no user-visible error, evidence apps/desktop/src/renderer/src/store.ts:1149
    Suggested fix:
    const api = window.codesign?.config?.updateProvider;
    if (api === undefined) {
      get().reportableErrorToast({
        code: 'GENERATE_FIX_APPLY_FAILED',
        scope: 'generate',
        title: tr('notifications.generationFailed'),
        description: tr('errors.rendererDisconnected'),
      });
      return;
    }

Summary

  • Review mode: initial
  • 1 blocker found in modified lines.
  • docs/VISION.md and docs/PRINCIPLES.md: Not found in repo/docs.

Testing

  • Not run (automation)
  • Suggested tests: add a Vitest case for applyGenerateBaseUrlFix path where window.codesign?.config?.updateProvider is unavailable, asserting an error toast is emitted.

open-codesign Bot

providerId: string,
nextBaseUrl: string,
): Promise<void> {
const api = window.codesign?.config?.updateProvider;
[Blocker] This introduces a silent fallback (if (api === undefined) return;). Project constraints require surfacing every error. If preload/API wiring is missing, clicking Apply fix currently no-ops with no feedback.

Suggested minimal change:

const api = window.codesign?.config?.updateProvider;
if (api === undefined) {
  get().reportableErrorToast({
    code: 'GENERATE_FIX_APPLY_FAILED',
    scope: 'generate',
    title: tr('notifications.generationFailed'),
    description: tr('errors.rendererDisconnected'),
  });
  return;
}

…ack (#130)

The Apply-fix toast button silently dropped clicks when
`window.codesign?.config?.updateProvider` was undefined (older app
builds) — violating the project rule that errors must surface, not
fail silently. Also tightened the error title when IPC throws so
users see "Could not apply fix" instead of the generic
"Generation failed".

- Missing IPC -> reportable error toast with code
  GENERATE_FIX_APPLY_UNAVAILABLE and a hint to edit Base URL in
  Settings.
- IPC rejection -> reportable error toast with its own title, carries
  message + stack for the Report dialog.
- Add three vitest cases covering success, rejection, and missing IPC.
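
The three paths in this commit (success, IPC rejection, missing IPC) can be sketched
with the IPC function and the toast reporter passed in as parameters. Everything
here is an assumption except the GENERATE_FIX_APPLY_UNAVAILABLE code and the
"Could not apply fix" title quoted above: the `UpdateProvider` signature, the
`Toast` shape, the GENERATE_FIX_APPLY_FAILED code (borrowed from the review
suggestion), and the description strings are hypothetical.

```typescript
// Hypothetical shapes; the real store and preload API may look different.
type UpdateProvider = (providerId: string, patch: { baseUrl: string }) => Promise<void>;

interface Toast {
  code: string;
  title: string;
  description: string;
  stack?: string;
}

async function applyBaseUrlFix(
  api: UpdateProvider | undefined,
  report: (toast: Toast) => void,
  providerId: string,
  nextBaseUrl: string,
): Promise<boolean> {
  // Missing IPC: surface an error instead of silently dropping the click.
  if (api === undefined) {
    report({
      code: "GENERATE_FIX_APPLY_UNAVAILABLE",
      title: "Could not apply fix",
      description: "Edit the Base URL in Settings instead.",
    });
    return false;
  }
  try {
    await api(providerId, { baseUrl: nextBaseUrl });
    return true;
  } catch (err) {
    // IPC rejection: keep message and stack so a report dialog could show them.
    const e = err instanceof Error ? err : new Error(String(err));
    report({
      code: "GENERATE_FIX_APPLY_FAILED",
      title: "Could not apply fix",
      description: e.message,
      stack: e.stack,
    });
    return false;
  }
}
```

Injecting `api` and `report` keeps the sketch testable without a `window` object,
which mirrors the three vitest cases listed above.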

Signed-off-by: hqhq1025 <1506751656@qq.com>

@github-actions github-actions Bot left a comment

Findings

  • No actionable issues found in added/modified lines with >=80% confidence.

Summary

  • Review mode: follow-up after new commits
  • No new issues found in the latest diff.
  • docs/VISION.md and docs/PRINCIPLES.md: Not found in repo/docs.
  • Residual risk: diagnoseGenerateFailure relies on message-text heuristics; future provider message variations may reduce classification accuracy.

Testing

  • Not run (automation)
  • Suggested tests: add a renderer-level Vitest for applyGenerateError that asserts (1) 404 message path appends the most-likely-cause sentence and (2) missing-/v1 hypothesis exposes an Apply-fix action.

open-codesign Bot

@hqhq1025 hqhq1025 merged commit 022e1b6 into main Apr 23, 2026
7 checks passed
@hqhq1025 hqhq1025 deleted the worktree-agent-a75d65c7 branch April 23, 2026 02:53
hqhq1025 added a commit that referenced this pull request Apr 23, 2026
Some gateways (older sub2api / claude2api / anyrouter builds) mishandle
OpenAI Responses API SSE events and treat `response.output_text.delta`
/ `response.completed` as `[DONE]`, cutting the stream short with no
HTTP status — the error surfaces as a transport-level "terminated" /
"premature close" / ECONNRESET.

diagnoseGenerateFailure() now recognises this pattern and returns a
`relayStreamingBug` hypothesis with actionable fix copy (upgrade the
relay, switch wire to openai-chat, or use api.openai.com directly).

Detection signal (b): wire=openai-responses AND baseUrl host is not
*.openai.com AND no HTTP status attached AND the message matches a
truncated-stream shape. HTTP-status errors still route to the existing
gatewayIncompatible / serverError / keyInvalid paths.

Refs #167.

Stacked on #165 (introduces diagnoseGenerateFailure).

Signed-off-by: hqhq1025 <1506751656@qq.com>
hqhq1025 added a commit that referenced this pull request Apr 23, 2026
## Summary

Fix #180 (implementation of #167): give users an actionable diagnostic
when a third-party relay (older sub2api / claude2api / anyrouter builds)
mishandles OpenAI Responses API SSE events and cuts the stream short
with no HTTP status.

## What changed

- `packages/shared/src/diagnostics.ts`: new `relayStreamingBug`
  hypothesis in `diagnoseGenerateFailure()`. Triggers when:
  - `wire === 'openai-responses'`
  - `baseUrl` host is **not** `*.openai.com`
  - No HTTP status is attached (transport-level failure, not a 4xx/5xx)
  - Message matches a truncated-stream shape: `stream ended`,
    `premature close`, `terminated`, `ECONNRESET`, `aborted`
- `packages/i18n/src/locales/en.json` + `zh-CN.json`:
  `diagnostics.cause.relayStreamingBug` +
  `diagnostics.fix.relayStreamingBug`.
- `packages/shared/src/diagnostics.test.ts`: 6 new tests covering
  positive + negative cases:
  - openai-responses + custom baseUrl + "terminated" → triggers
  - openai-responses + `api.openai.com` + "terminated" → does NOT trigger
    (official endpoint isn't the culprit)
  - openai-responses + custom baseUrl + 500 status → does NOT trigger
    (routes to `serverError`)
  - anthropic wire + "terminated" → does NOT trigger (wrong wire)
  - "premature close" + "ECONNRESET" message shapes both matched
Detection signal used: **(b)** from the task spec — pattern-matching on
baseUrl + wire + message, since the existing error path
(`packages/providers/src/codex/client.ts`, `retry.ts`) does not yet
attach a "stream closed without response.completed" context field.
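
A minimal sketch of detection signal (b), assuming the four conditions are simply
AND-ed together. The `FailureSignal` shape and both function names are
hypothetical, and treating the bare `openai.com` host as official is an
assumption beyond the `*.openai.com` rule stated above.

```typescript
// Hypothetical input shape for illustration only.
interface FailureSignal {
  wire: string;
  baseUrl: string;
  status?: number;
  message: string;
}

// Message shapes listed in the PR description.
const TRUNCATED_STREAM = /stream ended|premature close|terminated|ECONNRESET|aborted/i;

function isOfficialOpenAiHost(baseUrl: string): boolean {
  try {
    const host = new URL(baseUrl).hostname.toLowerCase();
    return host === "openai.com" || host.endsWith(".openai.com");
  } catch {
    return false; // an unparsable baseUrl is treated as a custom endpoint
  }
}

function looksLikeRelayStreamingBug(s: FailureSignal): boolean {
  return (
    s.wire === "openai-responses" &&
    !isOfficialOpenAiHost(s.baseUrl) &&
    s.status === undefined && // transport-level failure, no 4xx/5xx attached
    TRUNCATED_STREAM.test(s.message)
  );
}
```

Matching on the parsed hostname rather than on the raw string avoids treating a
relay such as `https://openai.com.relay.example` as the official endpoint.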

## Stacked on #165

This PR builds on the `diagnoseGenerateFailure()` function introduced in
#165 (branch `worktree-agent-a75d65c7`). **Merge #165 first**, then this
PR. Rebasing onto main after #165 lands will be a no-op.

## Four-principle check

- Compatibility: green — pure addition, no signature changes
- Upgradeability: green — hypothesis code-path is additive, falls
through to existing `serverError` / `unknown` when signals don't match
- No bloat: green — ~25 LOC + 2 i18n strings per locale
- Elegance: green — same shape as existing hypotheses, reuses the
DiagnosticHypothesis surface

## Test plan

- [x] `pnpm --filter @open-codesign/shared test` — 174 passed (8 files)
- [x] `pnpm typecheck` — 10 packages green
- [x] `pnpm lint` — biome clean (360 files)
- [ ] Manual: reproduce with a gateway that truncates `response.*` SSE
events, confirm the diagnostic panel renders the new cause + fix strings

Refs #167, closes #180.

Signed-off-by: hqhq1025 <1506751656@qq.com>
Development

Successfully merging this pull request may close these issues.

bug(desktop): generation fails with codesign:v1:generate 404 on Win11 relay setup
