Fix local Ollama timeout by removing global resilience handler from client#25

Open
Copilot wants to merge 9 commits into main from copilot/fix-llm-timeouts-local

Conversation

Contributor

Copilot AI commented Jan 27, 2026

Fix LLM Timeout Issue for Local Ollama ✅

This PR fixes timeouts when executing against a local Ollama-based LLM: HTTP requests were being cancelled after approximately 10 seconds, surfacing as TaskCanceledException errors.

Root Cause Analysis:

The fundamental issue is that the global AddStandardResilienceHandler() cannot be effectively overridden per-client in .NET Aspire. When a custom timeout is configured for a specific named HttpClient while a global resilience handler is in place, the policies stack rather than replace one another, and whichever timeout is shorter wins. This is a documented limitation in .NET.
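
For context, a minimal sketch of how such a global handler is typically registered in the shared service defaults (the exact registration in this repo may differ):

// Applies to ALL HttpClients created through the factory
builder.Services.ConfigureHttpClientDefaults(http =>
{
    // Standard resilience: retry, circuit breaker, and a 10-second
    // per-attempt timeout by default
    http.AddStandardResilienceHandler();
});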

Final Solution:

  • Added configurable timeout to AIConfig: a TimeoutSeconds property with a default value of 90 seconds (see the sketch after this list)

  • Removed AddStandardResilienceHandler from the Ollama client:

    • Do NOT call AddStandardResilienceHandler on the Ollama HttpClient
    • Instead, set HttpClient.Timeout directly to the desired value
    • Clear any additional handlers added by ConfigureHttpClientDefaults to prevent the global resilience handler from interfering
    • This allows streaming LLM responses to take as long as needed (up to the configured timeout)
  • Documented configuration: Added the TimeoutSeconds setting to appsettings.json
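
A minimal sketch of the updated config class; Provider and ModelId are taken from the appsettings example below, while the rest of the class shape is an assumption:

public class AIConfig
{
    public string Provider { get; set; } = "Local";
    public string ModelId { get; set; } = string.Empty;

    // New in this PR: maximum time to wait for an LLM response, in seconds
    public int TimeoutSeconds { get; set; } = 90;
}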

Technical Details:

Problem: The global AddStandardResilienceHandler() in ConfigureHttpClientDefaults applies a 10-second timeout to all HTTP clients. Attempting to override this per-client doesn't work because:

  1. The resilience handler policies stack rather than replace
  2. Whichever timeout is shorter (global or per-client) will trigger first
  3. There's no supported way to exclude specific clients from the global handler

Solution: For the Ollama client specifically, configure it WITHOUT the standard resilience handler:

// Configure the named HttpClient WITHOUT AddStandardResilienceHandler
builder.Services.AddHttpClient($"{OllamaConnectionName}_httpClient", client =>
{
    // Set timeout directly on the HttpClient
    client.Timeout = TimeSpan.FromSeconds(timeoutSeconds);
})
// Clear any handlers that were added by ConfigureHttpClientDefaults
.ConfigureAdditionalHttpMessageHandlers((handlers, _) =>
{
    handlers.Clear(); // Removes the standard resilience handler
});

// THEN create the Ollama client - it will use the pre-configured HttpClient
builder.AddOllamaApiClient(OllamaConnectionName)...

This approach ensures that:

  • Only the HttpClient.Timeout applies (no conflicting resilience policies)
  • Streaming responses can take up to 90 seconds (or configured value)
  • No circuit breaker validation errors

Testing:

  • ✅ The solution builds successfully with no errors or warnings
  • ✅ Existing tests continue to pass

Configuration:

Users can configure the timeout in appsettings.json:

"AIConfig": {
  "Provider": "Local",
  "ModelId": "llama32",
  "TimeoutSeconds": 90
}
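
How the value might be read at startup (a sketch; the actual binding code in this repo may differ):

// Bind the "AIConfig" section; fall back to defaults if it is missing
var aiConfig = builder.Configuration.GetSection("AIConfig").Get<AIConfig>() ?? new AIConfig();
var timeoutSeconds = aiConfig.TimeoutSeconds; // 90 unless overridden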

The default of 90 seconds provides a good balance between accommodating slower local LLMs and maintaining reasonable responsiveness.

Original prompt

This section details the original issue you should resolve

<issue_title>LLM timeouts when running locally</issue_title>
<issue_description>When executing against a local Ollama-based LLM, the chats often time out.

ApiService throws System.Threading.Tasks.TaskCanceledException with the following stack trace:

System.Threading.Tasks.TaskCanceledException: The operation was canceled.
 ---> System.IO.IOException: Unable to read data from the transport connection: The I/O operation has been aborted because of either a thread exit or an application request..
 ---> System.Net.Sockets.SocketException (995): The I/O operation has been aborted because of either a thread exit or an application request.
   --- End of inner exception stack trace ---
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource<System.Int32>.GetResult(Int16 token)
   at System.Net.Http.HttpConnection.InitialFillAsync(Boolean async)
   at System.Net.Http.HttpConnection.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.HttpConnection.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.Metrics.MetricsHandler.SendAsyncWithMetrics(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.DiagnosticsHandler.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)

In the Docker Ollama container I see these logs:

[GIN] 2026/01/26 - 19:20:14 | 500 | 10.001327917s |      172.18.0.1 | POST     "/api/chat"
time=2026-01-26T19:20:21.847Z level=INFO source=runner.go:681 msg="aborting completion request due to client closing the connection"

This is intermittent but seems a little more common on low-performance machines.</issue_description>

Comments on the Issue (you are @copilot in this section)



Copilot AI and others added 4 commits January 27, 2026 06:28
Copilot AI changed the title from "[WIP] Fix LLM timeouts when running locally" to "Fix local Ollama timeout by configuring HttpClient timeout to 5 minutes" on Jan 27, 2026
Copilot AI requested a review from hammar January 27, 2026 06:35
@hammar hammar marked this pull request as ready for review January 27, 2026 06:43
Owner

@hammar hammar left a comment


This looks promising but unfortunately the issue still occurs. Below is an example of the error from the ApiService side:

System.Threading.Tasks.TaskCanceledException: The operation was canceled.
 ---> System.IO.IOException: Unable to read data from the transport connection: Operation canceled.
 ---> System.Net.Sockets.SocketException (89): Operation canceled
   --- End of inner exception stack trace ---
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource<System.Int32>.GetResult(Int16 token)
   at System.Net.Http.HttpConnection.InitialFillAsync(Boolean async)
   at System.Net.Http.HttpConnection.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.HttpConnection.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.Metrics.MetricsHandler.SendAsyncWithMetrics(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.DiagnosticsHandler.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)

Is there something else you can try?

Contributor Author

Copilot AI commented Jan 27, 2026

This looks promising but unfortunately the issue still occurs. Below is an example of the error from the ApiService side:

The initial fix didn't work because the global AddStandardResilienceHandler() timeout (10 seconds) takes precedence over HttpClient.Timeout.

I've updated the implementation in commit 6fe62cf to configure the resilience handler BEFORE the Ollama client is created. The fix now:

  1. Pre-registers the named HttpClient with custom resilience settings
  2. Sets HttpClient.Timeout to Timeout.InfiniteTimeSpan so only the resilience handler timeout applies
  3. Configures both TotalRequestTimeout and AttemptTimeout to the desired value (default: 300 seconds)

This approach overrides the global 10-second default for the Ollama client specifically.
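
A sketch of that intermediate approach, assuming the same names as the final code (the exact registration in commit 6fe62cf may differ):

builder.Services.AddHttpClient($"{OllamaConnectionName}_httpClient", client =>
{
    // Let the resilience handler own the timeout entirely
    client.Timeout = Timeout.InfiniteTimeSpan;
})
.AddStandardResilienceHandler(options =>
{
    var timeout = TimeSpan.FromSeconds(timeoutSeconds); // default was 300s here
    options.TotalRequestTimeout.Timeout = timeout; // overall budget across retries
    options.AttemptTimeout.Timeout = timeout;      // budget for a single attempt
});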

Copilot AI changed the title from "Fix local Ollama timeout by configuring HttpClient timeout to 5 minutes" to "Fix local Ollama timeout by configuring resilience handler before client creation" on Jan 27, 2026
Copilot AI requested a review from hammar January 27, 2026 06:49
Owner

@hammar hammar left a comment


Two problems:

  1. 300 seconds min timeout seems a bit excessive.
  2. When launched ApiService errors out with the following:

Microsoft.Extensions.Options.OptionsValidationException: The sampling duration of circuit breaker strategy needs to be at least double of an attempt timeout strategy's timeout interval, in order to be effective. Sampling Duration: 30s, Attempt Timeout: 300s

Contributor Author

Copilot AI commented Jan 28, 2026

Two problems:

  1. 300 seconds min timeout seems a bit excessive....

Fixed both issues in commit 8260912:

  1. Reduced default timeout: Changed from 300s to 90s for better responsiveness while still accommodating slower local LLMs
  2. Fixed validation error: Added circuit breaker SamplingDuration configuration set to 2.5x the attempt timeout (validation requires it to be at least double)

The configuration now:

  • Default timeout: 90 seconds (configurable via AIConfig.TimeoutSeconds)
  • Circuit breaker sampling duration: dynamically set to 2.5x the attempt timeout (see the sketch below)
  • Minimum timeout: 30 seconds for practical LLM operations
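
A sketch of the adjusted resilience options (hedged; the actual code in commit 8260912 may differ):

builder.Services.AddHttpClient($"{OllamaConnectionName}_httpClient", client =>
{
    client.Timeout = Timeout.InfiniteTimeSpan;
})
.AddStandardResilienceHandler(options =>
{
    var timeout = TimeSpan.FromSeconds(timeoutSeconds); // default 90
    options.TotalRequestTimeout.Timeout = timeout;
    options.AttemptTimeout.Timeout = timeout;
    // Validation requires SamplingDuration >= 2x the attempt timeout;
    // 2.5x leaves headroom
    options.CircuitBreaker.SamplingDuration =
        TimeSpan.FromSeconds(timeoutSeconds * 2.5);
});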

Copilot AI requested a review from hammar January 28, 2026 07:19
Owner

@hammar hammar left a comment


Unfortunately the issue keeps happening.

Contributor Author

Copilot AI commented Jan 28, 2026

Unfortunately the issue keeps happening.

The root cause is that the global AddStandardResilienceHandler() cannot be overridden per-client in .NET - the policies stack and whichever timeout is shorter wins. This is a documented limitation.

Fixed in commit 51bbb70 by removing AddStandardResilienceHandler from the Ollama HttpClient entirely. Instead:

  1. Set HttpClient.Timeout directly to the configured value (default: 90s)
  2. Clear any additional handlers added by ConfigureHttpClientDefaults to prevent interference
  3. This allows streaming LLM responses to run for the full timeout duration without the global 10-second limit

The Ollama client now operates without the standard resilience handler, using only the direct HttpClient timeout.

Copilot AI changed the title from "Fix local Ollama timeout by configuring resilience handler before client creation" to "Fix local Ollama timeout by removing global resilience handler from client" on Jan 28, 2026
Copilot AI requested a review from hammar January 28, 2026 16:33

Development

Successfully merging this pull request may close these issues.

LLM timeouts when running locally