feat: Add GPT-5 Pro with background mode auto-resume and polling #8608
Conversation
Not sure yet. Will look at it after we get Native Tool Calling out later this week. Sorry for the delay.
…for long-running models (e.g., gpt-5-pro)
- Introduce ModelInfo.disableTimeout to opt out of request timeouts on a per-model basis
- Apply in OpenAI-compatible, Ollama, and LM Studio providers (timeout=0 when the flag is true)
- Preserve global “API Request Timeout” behavior (0 still disables globally); the per-model flag takes precedence for that model
- Motivation: gpt-5-pro often requires longer runtimes; a per-model override avoids forcing a global setting that impacts all models
- Add/extend unit tests to validate provider behavior
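As a rough illustration of the precedence rule, a timeout-resolution helper could look like this (a minimal sketch; the function name and the 600 s default are assumptions, not the PR's exact code):

```typescript
// Minimal sketch of per-model timeout resolution (names are illustrative).
interface ModelInfoLike {
	disableTimeout?: boolean // per-model opt-out introduced by this PR
}

function resolveRequestTimeoutMs(info: ModelInfoLike, globalTimeoutSec: number): number {
	// Per-model flag takes precedence: 0 disables the timeout for this model only.
	if (info.disableTimeout) return 0
	// Global "API Request Timeout" keeps its existing semantics: 0 disables globally.
	return globalTimeoutSec * 1000
}

// e.g. gpt-5-pro opts out without forcing a global change:
resolveRequestTimeoutMs({ disableTimeout: true }, 600) // => 0
resolveRequestTimeoutMs({}, 600) // => 600000
```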
…nd non‑streaming notice
- Add GPT‑5 Pro to the model registry with:
  - contextWindow: 400k, maxTokens: 272k
  - supportsImages: true, supportsPromptCache: true, supportsVerbosity: true, supportsTemperature: false
  - reasoningEffort: high (Responses API only)
  - pricing: $15/1M input tokens, $120/1M output tokens
- Set disableTimeout: true to avoid requiring a global timeout override
- Description clarifies: this is a slow, reasoning‑focused model designed for tough problems; requests may take several minutes; it does not stream (UI may appear idle until completion)
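Plugging in the numbers above, the registry entry would look roughly like this (a sketch; field names follow the ModelInfo conventions referenced in the commit and may not match the code exactly):

```typescript
// Sketch of the GPT-5 Pro entry, using the values listed above.
const gpt5Pro = {
	contextWindow: 400_000,
	maxTokens: 272_000,
	supportsImages: true,
	supportsPromptCache: true,
	supportsVerbosity: true,
	supportsTemperature: false,
	reasoningEffort: "high", // Responses API only
	inputPrice: 15, // $15 per 1M input tokens
	outputPrice: 120, // $120 per 1M output tokens
	disableTimeout: true, // avoid requiring a global timeout override
	description:
		"Slow, reasoning-focused model for tough problems; requests may take several minutes and do not stream.",
}
```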
…-5-pro model entry (server-side timeouts). Prep for background mode approach.
Enable OpenAI Responses background mode with resilient streaming for GPT‑5 Pro and any model flagged via metadata.
Key changes:
- Background mode enablement
• Auto-enable for models with info.backgroundMode === true (e.g., gpt-5-pro-2025-10-06) defined in [packages/types/src/providers/openai.ts](packages/types/src/providers/openai.ts).
• Also respects manual override (openAiNativeBackgroundMode) from ProviderSettings/ApiHandlerOptions.
- Request shape (Responses API)
• background:true, stream:true, store:true set in [OpenAiNativeHandler.buildRequestBody()](src/api/providers/openai-native.ts:224).
- Streaming UX and status events
• New ApiStreamStatusChunk in [src/api/transform/stream.ts](src/api/transform/stream.ts) with statuses: queued, in_progress, completed, failed, canceled, reconnecting, polling.
• Provider emits status chunks in SDK + SSE paths via [OpenAiNativeHandler.processEvent()](src/api/providers/openai-native.ts:1100) and [OpenAiNativeHandler.handleStreamResponse()](src/api/providers/openai-native.ts:651).
• UI spinner shows background lifecycle labels in [webview-ui/src/components/chat/ChatRow.tsx](webview-ui/src/components/chat/ChatRow.tsx) using [webview-ui/src/utils/backgroundStatus.ts](webview-ui/src/utils/backgroundStatus.ts).
- Resilience: auto-resume + poll fallback
• On stream drop for background tasks, attempt SSE resume using response.id and the last sequence_number with exponential backoff in [OpenAiNativeHandler.attemptResumeOrPoll()](src/api/providers/openai-native.ts:1215) (see the sketch after this list).
• If resume fails, poll GET /v1/responses/{id} every 2s until terminal and synthesize final output/usage.
• Deduplicate resumed events via resumeCutoffSequence in [handleStreamResponse()](src/api/providers/openai-native.ts:737).
- Settings (no new UI switch)
• Added optional provider settings and ApiHandlerOptions: autoResume, resumeMaxRetries, resumeBaseDelayMs, pollIntervalMs, pollMaxMinutes in [packages/types/src/provider-settings.ts](packages/types/src/provider-settings.ts) and [src/shared/api.ts](src/shared/api.ts).
- Cleanup
• Removed VS Code contributes toggle for background mode; behavior now model-driven + programmatic override.
- Tests
• Provider: coverage for background status emission, auto-resume success, resume→poll fallback, and a negative case for non-background models in [src/api/providers/__tests__/openai-native.spec.ts](src/api/providers/__tests__/openai-native.spec.ts).
• Usage parity validated as unchanged in [src/api/providers/__tests__/openai-native-usage.spec.ts](src/api/providers/__tests__/openai-native-usage.spec.ts).
• UI: label mapping tests for background statuses in [webview-ui/src/utils/__tests__/backgroundStatus.spec.ts](webview-ui/src/utils/__tests__/backgroundStatus.spec.ts).
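To make the resume/poll behavior above concrete, here is a condensed sketch under stated assumptions: `fetchSse`, `sleep`, and the relative URL strings are stand-ins for the real SSE plumbing, and the option names mirror the new settings (`resumeMaxRetries`, `resumeBaseDelayMs`, `pollIntervalMs`); the actual implementation lives in `attemptResumeOrPoll()` in src/api/providers/openai-native.ts.

```typescript
// Condensed sketch of the background-mode flow (illustrative, not the PR's exact code).
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))
declare function fetchSse(url: string): Promise<void> // stand-in for the SSE consumer

// Request shape for flagged models: background tasks must be stored (store: true)
// so they can later be resumed or polled by response.id.
function buildBackgroundRequestBody(model: string, input: unknown) {
	return { model, input, background: true, stream: true, store: true }
}

// On stream drop: attempt SSE resume with exponential backoff, then fall back to polling.
async function attemptResumeOrPoll(
	responseId: string,
	lastSeq: number,
	opts: { resumeMaxRetries?: number; resumeBaseDelayMs?: number; pollIntervalMs?: number } = {},
): Promise<unknown> {
	const maxRetries = opts.resumeMaxRetries ?? 3
	const baseDelayMs = opts.resumeBaseDelayMs ?? 1000

	for (let attempt = 0; attempt < maxRetries; attempt++) {
		try {
			// Resume after the last sequence_number we saw; events replayed at or
			// below this cutoff are deduplicated by the caller (resumeCutoffSequence).
			return await fetchSse(`/v1/responses/${responseId}?stream=true&starting_after=${lastSeq}`)
		} catch {
			await sleep(baseDelayMs * 2 ** attempt) // exponential backoff between retries
		}
	}

	// Resume exhausted: poll the stored response every 2s until it reaches a
	// terminal status, then synthesize the final output/usage from the snapshot.
	for (;;) {
		const res = await (await fetch(`/v1/responses/${responseId}`)).json()
		if (res.status === "completed" || res.status === "failed" || res.status === "canceled") {
			return res
		}
		await sleep(opts.pollIntervalMs ?? 2000)
	}
}
```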
Notes:
- Aligns with TEMP_OPENAI_BACKGROUND_TASK_DOCS.MD: background requires store=true; supports streaming resume via response.id + sequence_number.
- Default behavior unchanged for non-background models; no breaking changes.
…description, remove duplicate test, revert gitignore
…ded dep to useMemo; test: remove duplicate GPT-5 Pro background-mode test; chore(core): remove temp debug log
…l description for clarity
…background labels; fix deps warning in ChatRow useMemo
…assify permanent vs transient errors; chore(task): remove temporary debug log
…core/task): avoid full-state refresh on each background status chunk to reduce re-renders
Force-pushed from 3a0add7 to f138361
@hannesrudolph Thank you so much for your effort. I am looking forward to using this feature soon in production :)
Feel free to give it a try in its current form. I have not tested it since I just updated it and have to run out! BBL
@hannesrudolph Hi, I’ve been testing this branch with Azure OpenAI. For long-running background responses, the initial request, the SSE resume, and the poll fallback all hard-code a `/v1` segment onto the base URL. That extra `/v1` breaks base URLs that already end in a version segment. I tested the following change on my fork:
```diff
-	private normalizeUsage(usage: any, model: OpenAiNativeModel): ApiStreamUsageChunk | undefined {
+	private buildResponsesUrl(path: string): string {
+		const rawBase = this.options.openAiNativeBaseUrl || "https://api.openai.com"
+		// Normalize base by trimming trailing slashes
+		const normalizedBase = rawBase.replace(/\/+$/, "")
+		// If the base already ends with a version segment (e.g. /v1), do not append another
+		const hasVersion = /\/v\d+(?:\.\d+)?$/.test(normalizedBase)
+		const baseWithVersion = hasVersion ? normalizedBase : `${normalizedBase}/v1`
+		const normalizedPath = path.startsWith("/") ? path : `/${path}`
+		return `${baseWithVersion}${normalizedPath}`
+	}
+
+	private normalizeUsage(usage: any, model: OpenAiNativeModel): ApiStreamUsageChunk | undefined {
@@
-		const apiKey = this.options.openAiNativeApiKey ?? "not-provided"
-		const baseUrl = this.options.openAiNativeBaseUrl || "https://api.openai.com"
-		const url = `${baseUrl}/v1/responses`
+		const apiKey = this.options.openAiNativeApiKey ?? "not-provided"
+		const url = this.buildResponsesUrl("responses")
@@
-		const apiKey = this.options.openAiNativeApiKey ?? "not-provided"
-		const baseUrl = this.options.openAiNativeBaseUrl || "https://api.openai.com"
+		const apiKey = this.options.openAiNativeApiKey ?? "not-provided"
 		const resumeMaxRetries = this.options.openAiNativeBackgroundResumeMaxRetries ?? 3
 		const resumeBaseDelayMs = this.options.openAiNativeBackgroundResumeBaseDelayMs ?? 1000
@@
-		const resumeUrl = `${baseUrl}/v1/responses/${responseId}?stream=true&starting_after=${lastSeq}`
+		const resumeUrl = this.buildResponsesUrl(
+			`responses/${responseId}?stream=true&starting_after=${lastSeq}`,
+		)
@@
-		const pollRes = await fetch(`${baseUrl}/v1/responses/${responseId}`, {
+		const pollRes = await fetch(this.buildResponsesUrl(`responses/${responseId}`), {
 			method: "GET",
 			headers: {
 				Authorization: `Bearer ${apiKey}`,
 			},
```
Behavior-wise:
- Base URLs without a version segment still get `/v1` appended, so the default `https://api.openai.com` behavior is unchanged.
- Base URLs that already end in a version segment (e.g. `/v1`) are used as-is, with no duplicated `/v1`.
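For reference, a standalone version of the helper with example inputs and outputs (the Azure-style base URL below is illustrative):

```typescript
// Standalone copy of the helper for illustration.
function buildResponsesUrl(base: string, path: string): string {
	const normalizedBase = base.replace(/\/+$/, "")
	const hasVersion = /\/v\d+(?:\.\d+)?$/.test(normalizedBase)
	const baseWithVersion = hasVersion ? normalizedBase : `${normalizedBase}/v1`
	return `${baseWithVersion}${path.startsWith("/") ? path : `/${path}`}`
}

buildResponsesUrl("https://api.openai.com", "responses")
// => "https://api.openai.com/v1/responses" (default behavior unchanged)

buildResponsesUrl("https://example.openai.azure.com/openai/v1", "responses")
// => "https://example.openai.azure.com/openai/v1/responses" (no duplicated /v1)
```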
If you’d prefer, I can also open a small PR from my fork targeting this branch. After applying the change above, everything has been working great with Azure OpenAI.
Closing for now as it has drifted too far from the current codebase.
Summary
Adds GPT-5 Pro model with OpenAI Responses API background mode support. Background requests can take several minutes, so this implements resilient streaming with automatic recovery.
Why
GPT-5 Pro is a slow, reasoning-focused model that can take several minutes to respond. The standard streaming approach times out or appears stuck. OpenAI's Responses API background mode is designed for these long-running requests.
What Changed
Model Addition
- `gpt-5-pro-2025-10-06` with `backgroundMode: true` flag in model metadata

Background Mode Implementation
- `background: true`, `stream: true`, `store: true` set for flagged models
- Status progression: `queued` → `in_progress` → `completed`/`failed` (see the sketch below)
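A rough sketch of how the status chunk and the UI label mapping could fit together (illustrative; the real definitions live in src/api/transform/stream.ts and webview-ui/src/utils/backgroundStatus.ts, and the chunk shape and labels here are placeholders):

```typescript
// Illustrative shape of the status chunk and UI label mapping.
type BackgroundStatus =
	| "queued"
	| "in_progress"
	| "completed"
	| "failed"
	| "canceled"
	| "reconnecting"
	| "polling"

// Assumed chunk shape; the real ApiStreamStatusChunk may differ.
interface ApiStreamStatusChunk {
	type: "status"
	status: BackgroundStatus
}

// e.g. a spinner-label lookup like the one ChatRow uses (labels are placeholders):
const backgroundStatusLabels: Record<BackgroundStatus, string> = {
	queued: "Queued…",
	in_progress: "Working…",
	completed: "Completed",
	failed: "Failed",
	canceled: "Canceled",
	reconnecting: "Reconnecting…",
	polling: "Checking status…",
}
```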
Resilient Streaming
- Auto-resume dropped streams via `GET /v1/responses/{id}?starting_after={seq}`, with polling as a fallback

Files Changed
- `packages/types/src/providers/openai.ts` - Model metadata
- `src/api/providers/openai-native.ts` - Background mode logic, auto-resume, polling
- `src/core/task/Task.ts` - Status event handling
- `webview-ui/src/utils/backgroundStatus.ts` - Status label mapping
- `webview-ui/src/components/chat/*` - UI status display

Testing
Important
Adds GPT-5 Pro model with background mode, implementing resilient streaming and UI updates for status tracking.
- `gpt-5-pro-2025-10-06` model with `backgroundMode: true` in `openai.ts`.
- Background mode, auto-resume, and polling logic in `openai-native.ts`.
- `ChatRow.tsx` and `ChatView.tsx` for background status display.
- `ChatRow.tsx` and `ChatView.tsx` to handle new status labels.
- `backgroundStatus.spec.ts` for status label mapping.
- Tests in `openai-native.spec.ts`.