
Fix Ollama API timeout for local LLM inference#23

Closed
Copilot wants to merge 3 commits into main from copilot/fix-llm-timeout-issues

Conversation

Contributor

Copilot AI commented Jan 26, 2026

Fix LLM Timeouts for Local Ollama ✅

Successfully resolved the issue where local Ollama-based LLM operations were timing out after ~10 seconds by configuring extended HTTP timeouts for the Ollama API client.

Problem

  • ApiService was throwing System.Threading.Tasks.TaskCanceledException after ~10 seconds
  • Default HTTP resilience handler timeout was 10 seconds, too short for local LLM inference
  • Issue was intermittent but more common on low-performance machines

Solution

  • Rebased on latest main branch with Azure AI Foundry support
  • Configured timeout extension to apply only for Local Ollama provider (not Azure)
  • Increased TotalRequestTimeout to 5 minutes (from the 10-second default); see the configuration sketch after this list
  • Set AttemptTimeout to 2 minutes (leaving 3 minutes for retry logic)
  • Set HttpClient.Timeout to InfiniteTimeSpan to avoid conflicts with resilience pipeline
  • Configured connection pooling for better performance with long-running requests
  • Verified build succeeds with both Local and Azure configurations
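
For illustration only, here is a minimal sketch of this kind of configuration in Program.cs, assuming the standard resilience handler from the Microsoft.Extensions.Http.Resilience package. The client name ("ollama"), the endpoint URL, and the connection-pool values are placeholders rather than this PR's exact code:

```csharp
// Sketch of the extended-timeout registration for the local Ollama client.
// Requires the Microsoft.Extensions.Http.Resilience package.
var builder = WebApplication.CreateBuilder(args);

builder.Services.AddHttpClient("ollama", client =>
    {
        client.BaseAddress = new Uri("http://localhost:11434"); // placeholder Ollama endpoint
        // Hand cancellation over to the resilience pipeline instead of HttpClient's own timer.
        client.Timeout = Timeout.InfiniteTimeSpan;
    })
    .ConfigurePrimaryHttpMessageHandler(() => new SocketsHttpHandler
    {
        // Connection pooling tuned for long-running local inference requests (placeholder values).
        PooledConnectionLifetime = TimeSpan.FromMinutes(10),
        PooledConnectionIdleTimeout = TimeSpan.FromMinutes(5)
    })
    .AddStandardResilienceHandler(options =>
    {
        // Overall budget for the request including retries.
        options.TotalRequestTimeout.Timeout = TimeSpan.FromMinutes(5);
        // Per-attempt budget; the remaining 3 minutes are left for retry logic.
        options.AttemptTimeout.Timeout = TimeSpan.FromMinutes(2);
        // The standard options validate that the circuit breaker's sampling window is
        // at least twice the attempt timeout, so raise it alongside the attempt timeout.
        options.CircuitBreaker.SamplingDuration = TimeSpan.FromMinutes(4);
    });

var app = builder.Build();
app.Run();
```

Setting HttpClient.Timeout to InfiniteTimeSpan matters because the client's own 100-second default would otherwise still cancel requests before the 5-minute resilience budget is reached.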

Changes Made

  • Timeout configuration now applies only when using the Local Ollama provider (inside the else block at lines 66-86); see the scoping sketch after this list
  • Azure AI Foundry provider uses default timeout settings (which are appropriate for cloud services)
  • Configuration is properly scoped within the provider-specific conditional block
  • Added clarifying comment to indicate scope of timeout configuration
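
Roughly, the provider-scoped structure looks like the sketch below; the configuration key ("AIProvider") and the helper names are hypothetical stand-ins for the actual identifiers in Program.cs:

```csharp
var builder = WebApplication.CreateBuilder(args);

// Placeholder helpers standing in for the real client registrations.
static void AddAzureFoundryChatClient(WebApplicationBuilder b) { /* default resilience timeouts */ }
static void AddOllamaChatClientWithExtendedTimeouts(WebApplicationBuilder b) { /* see previous sketch */ }

var provider = builder.Configuration["AIProvider"]; // placeholder configuration key

if (string.Equals(provider, "AzureAIFoundry", StringComparison.OrdinalIgnoreCase))
{
    // Azure AI Foundry: keep the default timeouts, which suit cloud latency.
    AddAzureFoundryChatClient(builder);
}
else
{
    // Local Ollama: the extended timeouts apply only in this branch,
    // so the change cannot affect the Azure code path.
    AddOllamaChatClientWithExtendedTimeouts(builder);
}
```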

Security Summary

No security vulnerabilities were introduced. The changes only involve configuration of timeout values and connection pooling settings for HTTP client communication with the local Ollama service.

Original prompt

This section details the original issue you should resolve.

<issue_title>LLM timeouts when running locally</issue_title>
<issue_description>When executing against a local Ollama-based LLM, the chats often time out.

ApiService throws System.Threading.Tasks.TaskCanceledException with the following stack trace:

System.Threading.Tasks.TaskCanceledException: The operation was canceled.
 ---> System.IO.IOException: Unable to read data from the transport connection: The I/O operation has been aborted because of either a thread exit or an application request.
 ---> System.Net.Sockets.SocketException (995): The I/O operation has been aborted because of either a thread exit or an application request.
   --- End of inner exception stack trace ---
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
   at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.System.Threading.Tasks.Sources.IValueTaskSource<System.Int32>.GetResult(Int16 token)
   at System.Net.Http.HttpConnection.InitialFillAsync(Boolean async)
   at System.Net.Http.HttpConnection.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.HttpConnection.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.Metrics.MetricsHandler.SendAsyncWithMetrics(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.DiagnosticsHandler.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)

In the Docker Ollama container I see these logs:

[GIN] 2026/01/26 - 19:20:14 | 500 | 10.001327917s |      172.18.0.1 | POST     "/api/chat"
time=2026-01-26T19:20:21.847Z level=INFO source=runner.go:681 msg="aborting completion request due to client closing the connection"

This is intermittent but seems a little more common on low-performance machines.</issue_description>

Comments on the Issue (you are @copilot in this section)



Copilot AI and others added 2 commits January 26, 2026 19:28
…meouts

Co-authored-by: hammar <68027+hammar@users.noreply.github.com>
Co-authored-by: hammar <68027+hammar@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Fix LLM timeouts when executing locally" to "Fix Ollama API timeout for local LLM inference" Jan 26, 2026
Copilot AI requested a review from hammar January 26, 2026 19:33
Owner

@hammar hammar left a comment


Looks promising but please rebase this on the latest main branch (that supports Azure AI Foundry) and ensure that these changes only kick in if we are using a local Ollama client, rather than the AI Foundry client.

Contributor Author

Copilot AI commented Jan 26, 2026

Looks promising but please rebase this on the latest main branch (that supports Azure AI Foundry) and ensure that these changes only kick in if we are using a local Ollama client, rather than the AI F...

Done! Rebased on latest main (commit ed8ddab) which includes Azure AI Foundry support. The timeout configuration now only applies when using the Local Ollama provider (lines 66-87 in Program.cs, inside the else block). Azure AI Foundry uses default timeout settings.

Changes in commits 8ef3299, ffcc634.

@hammar
Owner

hammar commented Jan 27, 2026

Restarting work on this PR.

@hammar hammar closed this Jan 27, 2026
