fix(local_ai): Windows Ollama discovery + DirectML GPU for GLiNER RelEx#416

Merged
senamakel merged 1 commit into tinyhumansai:main from sanil-23:fix/ollama-windows-directml-gpu
Apr 7, 2026

@sanil-23 sanil-23 commented Apr 7, 2026

Summary

  • Ollama not found on Windows: find_system_ollama_binary now checks %LOCALAPPDATA%\Programs\Ollama and %PROGRAMFILES%\Ollama — matching macOS/Linux which already had common-path fallbacks. Also logs spawn errors instead of silently swallowing them, and falls back to system paths after the NSIS installer.
  • GLiNER RelEx GPU acceleration: ONNX session now offers DirectML (Windows, any DX12 GPU), CoreML (macOS), and CUDA execution providers with automatic CPU fallback. Updated release to v0.5-onnx.2 with a DirectML-enabled onnxruntime.dll.
  • Bundle integrity: bundle_complete now requires the platform ORT DLL to exist. managed_bundle_complete verifies SHA256 checksums so stale CPU-only DLLs trigger re-download.
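The bundle-integrity rule in the last bullet can be sketched roughly as follows. This is a minimal illustration, not the code in relex.rs: the `_sketch` function names, the `model_files` parameter, and the `digest_of` closure (standing in for a real SHA256 helper) are all assumptions made for the example.

```rust
use std::path::Path;

/// File name of the ONNX Runtime shared library on the current platform.
fn platform_ort_library() -> &'static str {
    if cfg!(target_os = "windows") {
        "onnxruntime.dll"
    } else if cfg!(target_os = "macos") {
        "libonnxruntime.dylib"
    } else {
        "libonnxruntime.so"
    }
}

/// Hypothetical stand-in for `bundle_complete`: every model file AND the
/// platform runtime library must be present in `dir`.
fn bundle_complete_sketch(dir: &Path, model_files: &[&str]) -> bool {
    for name in model_files {
        if !dir.join(name).is_file() {
            return false;
        }
    }
    dir.join(platform_ort_library()).is_file()
}

/// Hypothetical stand-in for `managed_bundle_complete`: on top of the
/// presence check, the runtime library's digest must match the expected
/// value, so a stale library is reported as incomplete and re-downloaded.
/// `digest_of` is an assumed hook where a SHA256 routine would be plugged in.
fn managed_bundle_complete_sketch(
    dir: &Path,
    model_files: &[&str],
    expected_ort_digest: &str,
    digest_of: impl Fn(&Path) -> Option<String>,
) -> bool {
    if !bundle_complete_sketch(dir, model_files) {
        return false;
    }
    match digest_of(&dir.join(platform_ort_library())) {
        Some(actual) => actual.eq_ignore_ascii_case(expected_ort_digest),
        None => false,
    }
}
```

Passing the digest function as a closure keeps the sketch free of external crates; a real implementation would hash the file contents directly.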

Files changed

| File | What |
| --- | --- |
| src/openhuman/local_ai/install.rs | Add Windows common Ollama paths |
| src/openhuman/local_ai/service/ollama_admin.rs | Log spawn errors + system path fallback |
| src/openhuman/memory/relex.rs | DirectML/CoreML/CUDA EPs, bundle checks, new release hashes |

Test plan

  • cargo check passes (17 pre-existing warnings, 0 new)
  • Ollama RPC local_ai_status returns "state": "ready" on Windows
  • local_ai_chat returns valid response via Ollama
  • GLiNER RelEx ingest works with DirectML DLL auto-downloaded
  • DirectML onnxruntime.dll confirmed: DmlExecutionProvider present in binary
  • Verify on macOS (CoreML dylib from v0.5-onnx.2)
  • Verify on Linux (unchanged, CPU)

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added platform-specific GPU acceleration support (DirectML on Windows, CoreML on macOS, CUDA on any platform).
    • Enhanced model bundle integrity validation with SHA256 checksum verification.
  • Bug Fixes

    • Improved Ollama binary discovery on Windows by checking standard installation paths.
    • Better error handling when Ollama installation is missing; system binary now used as fallback.
    • Clearer error messages with guidance for resolving configuration issues.

…or GLiNER RelEx

Ollama was not found on Windows because find_system_ollama_binary lacked
common Windows install paths (%LOCALAPPDATA%\Programs\Ollama). The server
spawn also silently swallowed errors, and the NSIS installer fallback
didn't check system paths after install.
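A rough sketch of the Windows path fallback described above, assuming the executable is named `ollama.exe` inside each directory (the source names only the directories). The function names are hypothetical and do not mirror the real find_system_ollama_binary signature.

```rust
use std::path::PathBuf;

/// Builds candidate install locations from the values of %LOCALAPPDATA%
/// and %PROGRAMFILES%. The `ollama.exe` file name is an assumption.
fn windows_ollama_candidates(
    local_app_data: Option<&str>,
    program_files: Option<&str>,
) -> Vec<PathBuf> {
    let mut candidates = Vec::new();
    if let Some(base) = local_app_data {
        candidates.push(
            PathBuf::from(base)
                .join("Programs")
                .join("Ollama")
                .join("ollama.exe"),
        );
    }
    if let Some(base) = program_files {
        candidates.push(PathBuf::from(base).join("Ollama").join("ollama.exe"));
    }
    candidates
}

/// Returns the first candidate that exists on disk, if any.
fn find_ollama_sketch() -> Option<PathBuf> {
    let local = std::env::var("LOCALAPPDATA").ok();
    let program_files = std::env::var("PROGRAMFILES").ok();
    windows_ollama_candidates(local.as_deref(), program_files.as_deref())
        .into_iter()
        .find(|path| path.is_file())
}
```

Keeping the candidate list separate from the environment lookup makes the path logic testable on any platform.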

GLiNER RelEx ONNX sessions were CPU-only — no execution providers were
configured. Now offers DirectML (Windows), CoreML (macOS), and CUDA as
GPU backends with automatic fallback. Updated the release to v0.5-onnx.2
with a DirectML-enabled onnxruntime.dll. Bundle completeness now requires
the platform DLL and verifies checksums to trigger re-download on update.
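The provider selection with CPU fallback might look roughly like this. The sketch deliberately avoids the `ort` crate: the `Provider` enum, both function names, and the ordering (platform-specific backend first, then CUDA, then CPU) are assumptions for illustration, not the contents of platform_execution_providers().

```rust
/// Hypothetical stand-in for an execution-provider handle.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Provider {
    DirectMl,
    CoreMl,
    Cuda,
    Cpu,
}

/// Preference order for a given OS name (same values as `std::env::consts::OS`).
/// Assumed order: platform-specific GPU backend, then CUDA, then CPU.
fn preferred_providers(target_os: &str) -> Vec<Provider> {
    let mut order = Vec::new();
    match target_os {
        "windows" => order.push(Provider::DirectMl),
        "macos" => order.push(Provider::CoreMl),
        _ => {}
    }
    order.push(Provider::Cuda);
    order.push(Provider::Cpu);
    order
}

/// Picks the first provider that reports itself available; CPU is the
/// final fallback even if the probe rejects everything.
fn select_provider(order: &[Provider], is_available: impl Fn(Provider) -> bool) -> Provider {
    order
        .iter()
        .copied()
        .find(|provider| is_available(*provider))
        .unwrap_or(Provider::Cpu)
}
```

A caller would pass `std::env::consts::OS` and an availability probe, so a machine without a usable GPU backend falls through to CPU.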

Changes:
- Add Windows common paths to find_system_ollama_binary (install.rs)
- Log and return spawn errors in start_and_wait_for_server (ollama_admin.rs)
- Fall back to find_system_ollama_binary after Windows installer (ollama_admin.rs)
- Add platform_execution_providers() with DirectML/CoreML/CUDA (relex.rs)
- Require ORT DLL in bundle_complete check (relex.rs)
- Verify platform DLL checksums in managed_bundle_complete (relex.rs)
- Update release URL and SHA256 hashes for v0.5-onnx.2 (relex.rs)
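The spawn-error and fallback items in the list above can be sketched as below. Function names and the log prefix are hypothetical; the real start_and_wait_for_server presumably does more (for example waiting for the server to respond), which is omitted here.

```rust
use std::path::{Path, PathBuf};
use std::process::{Child, Command, Stdio};

/// Spawns `<binary> serve`, logging and returning the error instead of
/// discarding it. The error string format is an assumption.
fn spawn_server_sketch(binary: &Path) -> Result<Child, String> {
    Command::new(binary)
        .arg("serve")
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .spawn()
        .map_err(|err| {
            let msg = format!("failed to spawn {}: {err}", binary.display());
            eprintln!("[local_ai] {msg}");
            msg
        })
}

/// Prefers the workspace-installed binary when it exists on disk and
/// otherwise falls back to whatever the system lookup returns.
fn resolve_binary_sketch(
    workspace: Option<PathBuf>,
    system_lookup: impl FnOnce() -> Option<PathBuf>,
) -> Option<PathBuf> {
    workspace.filter(|path| path.is_file()).or_else(system_lookup)
}
```

Returning the error as a value lets the caller surface it to the user rather than reporting a generic "server not ready" state.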

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

coderabbitai Bot commented Apr 7, 2026

Walkthrough

The PR enhances system binary discovery for Ollama on Windows by checking additional installation paths, improves Ollama server spawn error handling with better logging, and updates ONNX runtime configuration with platform-specific GPU providers and stricter bundle validation.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Windows Ollama Binary Discovery: src/openhuman/local_ai/install.rs | Added Windows-specific search paths for the Ollama executable under the LOCALAPPDATA and PROGRAMFILES directories, extending the existing binary discovery logic. |
| Ollama Server Lifecycle Management: src/openhuman/local_ai/service/ollama_admin.rs | Enhanced spawn error handling with debug/warning logging and a fallback to the system Ollama binary when the workspace installation is missing; expanded error messages with setup guidance. |
| ONNX Runtime Configuration & Validation: src/openhuman/memory/relex.rs | Updated the bundled ONNX model version, added platform-specific GPU execution provider selection (DirectML/CoreML/CUDA), and tightened bundle completeness validation with SHA256 checksum matching. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • Fix Ollama workspace install flow #310: Addresses related Ollama binary resolution and fallback logic—this PR adds Windows-specific discovery paths while that PR handles workspace install/resolution restructuring.

Poem

🐰 Through Windows paths the bunny hops,
Finding Ollama where it stops,
With GPU speed and checksums true,
The bundle's verified anew!
When servers spawn, no errors hide—
Just fallback grace and logging pride.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly and clearly summarizes the main changes: Windows Ollama discovery improvements and DirectML GPU acceleration for GLiNER RelEx, which are the primary objectives of the PR. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
src/openhuman/memory/relex.rs (1)

575-584: Consider using spawn_blocking or streaming SHA256 for large files.

The managed_bundle_complete function is called from async contexts (resolve_bundle_dir, ensure_managed_bundle) but performs synchronous file reads via file_matches_sha256_sync. ORT DLLs can be 100+ MB, which could block the async runtime.

Since this is an initialization path that runs infrequently and files are local, the impact is limited. However, for robustness:

  1. Wrap the call in tokio::task::spawn_blocking, or
  2. Use a streaming hash approach to avoid loading the entire file into memory.
♻️ Option 1: Use spawn_blocking in resolve_bundle_dir
 // 3. Check managed bundle in user home directory.
 let managed_dir = default_managed_bundle_dir();
-if managed_bundle_complete(&managed_dir) {
+let managed_dir_clone = managed_dir.clone();
+let is_complete = tokio::task::spawn_blocking(move || managed_bundle_complete(&managed_dir_clone))
+    .await
+    .unwrap_or(false);
+if is_complete {
     return Some(managed_dir);
 }
♻️ Option 2: Streaming SHA256 to reduce memory footprint
fn file_matches_sha256_sync(path: &Path, expected: &str) -> bool {
    if expected.is_empty() {
        return path.exists();
    }
    let Ok(file) = std::fs::File::open(path) else {
        return false;
    };
    let mut reader = std::io::BufReader::new(file);
    let mut hasher = Sha256::new();
    let mut buffer = [0u8; 8192];
    loop {
        let Ok(n) = std::io::Read::read(&mut reader, &mut buffer) else {
            return false;
        };
        if n == 0 {
            break;
        }
        hasher.update(&buffer[..n]);
    }
    let actual = hex::encode(hasher.finalize());
    actual.eq_ignore_ascii_case(expected)
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/openhuman/memory/relex.rs` around lines 575 - 584,
managed_bundle_complete is sync but called from async paths and uses
file_matches_sha256_sync which can block on large files; update the call path so
hashing runs off the async runtime: either make managed_bundle_complete async
and wrap the checksum checks in tokio::task::spawn_blocking (invoking
file_matches_sha256_sync inside spawn_blocking) from
resolve_bundle_dir/ensure_managed_bundle, or replace file_matches_sha256_sync
with a non-blocking streaming implementation (reading via a BufReader and
incremental Sha256) and call that from managed_bundle_complete; reference the
functions managed_bundle_complete, file_matches_sha256_sync, resolve_bundle_dir,
and ensure_managed_bundle when applying the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/openhuman/memory/relex.rs`:
- Around line 441-464: Update the ort dependency in Cargo.toml to enable the
execution-provider features required by platform_execution_providers: add the
"cuda", "directml", and "coreml" features so that ep::CUDA, ep::DirectML and
ep::CoreML can register (keep existing version, default-features=false and other
features like "std","ndarray","load-dynamic"), then run cargo check and cargo
fmt to ensure the build and formatting pass.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 71b63262-19ce-41cb-af04-c4a2bd456d4b

📥 Commits

Reviewing files that changed from the base of the PR and between c3dd137 and eec1ca8.

📒 Files selected for processing (3)
  • src/openhuman/local_ai/install.rs
  • src/openhuman/local_ai/service/ollama_admin.rs
  • src/openhuman/memory/relex.rs

Comment thread src/openhuman/memory/relex.rs
@senamakel senamakel merged commit 000b40b into tinyhumansai:main Apr 7, 2026
8 of 9 checks passed
AusAgentSmith pushed a commit to AusAgentSmith/openhuman that referenced this pull request May 23, 2026
…or GLiNER RelEx (tinyhumansai#416)
