fix(voice): enable GPU detection and Metal acceleration for whisper (#558)#571
Conversation
📝 WalkthroughWalkthroughDevice GPU detection was expanded (Apple Metal, Intel Mac, NVIDIA via nvidia-smi) and GPU availability/description are now passed through service bootstrap/speech into the Whisper engine; load_engine was extended to accept and apply GPU settings (use_gpu, flash_attn). Changes
Sequence Diagram(s)sequenceDiagram
participant App
participant DeviceDetector
participant Service
participant WhisperEngine
participant System as "Host (nvidia-smi)"
App->>Service: request init/load model
Service->>DeviceDetector: detect_device_profile()
DeviceDetector->>System: (on Windows/Linux) run `nvidia-smi`
System-->>DeviceDetector: GPU name or failure
DeviceDetector-->>Service: DeviceProfile(has_gpu, gpu_description)
Service->>WhisperEngine: spawn_blocking load_engine(handle, model, has_gpu, gpu_description)
WhisperEngine-->>Service: load result (success/failure)
Service-->>App: return initialization result
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
…inyhumansai#558) Add Intel Mac detection (no Metal GPU for whisper), nvidia-smi probe for Windows/Linux NVIDIA GPUs, and diagnostic tracing at each decision point in detect_gpu(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ai#558) Thread DeviceProfile has_gpu and gpu_description through both the bootstrap (startup) and speech (lazy) whisper engine load calls so the engine can configure Metal or CUDA acceleration at runtime. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
bb070cc to
9a7e75e
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/openhuman/local_ai/device.rs (1)
99-124: Consider adding a timeout to thenvidia-smisubprocess.The synchronous
Command::output()call could block indefinitely ifnvidia-smihangs (e.g., due to driver issues). While rare, this would prevent startup from completing. A timeout would improve resilience.♻️ Suggested timeout approach
fn probe_nvidia_smi() -> Option<String> { use std::time::Duration; let mut child = std::process::Command::new("nvidia-smi") .args(["--query-gpu=name", "--format=csv,noheader"]) .stdout(std::process::Stdio::piped()) .stderr(std::process::Stdio::null()) .spawn() .ok()?; // Wait up to 5 seconds for nvidia-smi to complete. let start = std::time::Instant::now(); loop { match child.try_wait() { Ok(Some(status)) if status.success() => break, Ok(Some(_)) => return None, // Non-zero exit Ok(None) if start.elapsed() > Duration::from_secs(5) => { let _ = child.kill(); tracing::debug!("nvidia-smi timed out"); return None; } Ok(None) => std::thread::sleep(Duration::from_millis(50)), Err(_) => return None, } } let output = child.wait_with_output().ok()?; let name = String::from_utf8_lossy(&output.stdout) .lines() .next()? .trim() .to_string(); if name.is_empty() { return None; } Some(format!("NVIDIA {name} (CUDA)")) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/local_ai/device.rs` around lines 99 - 124, The probe_nvidia_smi function should avoid blocking indefinitely by replacing the synchronous Command::output() call with spawn() and a timed wait: spawn the process in probe_nvidia_smi, poll with child.try_wait() in a short sleep loop up to a chosen timeout (e.g., 5s), handle Ok(Some(status)) (return None on non-zero), on timeout kill the child and return None, then call child.wait_with_output() to read stdout and proceed with the existing UTF-8/empty checks; include a debug log (e.g., tracing::debug!) when a timeout occurs and ensure all error cases still return None.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/openhuman/local_ai/service/whisper_engine.rs`:
- Around line 69-80: The code calls WhisperContextParameters.use_gpu(has_gpu)
and WhisperContextParameters.flash_attn(true) but those methods may not exist on
the custom whisper-rs-sys fork; inspect the forked definition of
WhisperContextParameters and confirm the presence and signatures of use_gpu and
flash_attn, and if absent either add equivalent setter methods to the fork
(preserving API and safety) or change this code to use the supported GPU
configuration mechanism provided by whisper-rs (e.g., setting explicit fields on
WhisperContextParameters, using a builder or passing GPU/backend options to the
context creation function); specifically update the call sites around params,
WhisperContextParameters and the has_gpu logic to match the verified API.
---
Nitpick comments:
In `@src/openhuman/local_ai/device.rs`:
- Around line 99-124: The probe_nvidia_smi function should avoid blocking
indefinitely by replacing the synchronous Command::output() call with spawn()
and a timed wait: spawn the process in probe_nvidia_smi, poll with
child.try_wait() in a short sleep loop up to a chosen timeout (e.g., 5s), handle
Ok(Some(status)) (return None on non-zero), on timeout kill the child and return
None, then call child.wait_with_output() to read stdout and proceed with the
existing UTF-8/empty checks; include a debug log (e.g., tracing::debug!) when a
timeout occurs and ensure all error cases still return None.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 96b46e0f-6b3a-4954-aedc-f0796553d49e
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (5)
Cargo.tomlsrc/openhuman/local_ai/device.rssrc/openhuman/local_ai/service/bootstrap.rssrc/openhuman/local_ai/service/speech.rssrc/openhuman/local_ai/service/whisper_engine.rs
| let mut params = WhisperContextParameters::default(); | ||
|
|
||
| // Explicitly configure GPU acceleration based on device profile. | ||
| // The default `use_gpu` is `cfg!(feature = "_gpu")` which is only true | ||
| // when a GPU backend feature (metal, cuda, etc.) is compiled in. | ||
| params.use_gpu(has_gpu); | ||
|
|
||
| // Enable flash attention when GPU is available — improves throughput | ||
| // on both Metal and CUDA backends. | ||
| if has_gpu { | ||
| params.flash_attn(true); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
whisper-rs WhisperContextParameters use_gpu method signature Rust
💡 Result:
WhisperContextParameters does not have a public use_gpu method. The official documentation at docs.rs/whisper-rs/latest/whisper_rs/struct.WhisperContextParameters.html lists no such method. Usage examples consistently use WhisperContextParameters::default without modification, as in WhisperContext::new_with_params(model_path, WhisperContextParameters::default). GPU support is enabled via crate feature flags like cuda, metal, vulkan, etc., which implicitly set GPU flags at runtime per the README. A set_use_gpu method exists on WhisperVadContextParams, but that's for VAD only. Code snippets setting ctx_params.use_gpu = true appear to be incorrect or from forks/other crates.
Citations:
- 1: https://docs.rs/whisper-rs/latest/whisper_rs/struct.WhisperVadContextParams.html
- 2: https://docs.rs/whisper-rs/latest/whisper_rs/struct.WhisperContext.html
- 3: https://docs.rs/whisper-rs/latest/whisper_rs/
- 4: https://docs.rs/whisper-rs
- 5: https://crates.io/crates/whisper-rs
- 6: https://github.com/tazz4843/whisper-rs
- 7: https://docs.rs/whisper-rs/latest/src/whisper_rs/lib.rs.html
🏁 Script executed:
# First, find and check the whisper_engine.rs file
cat -n src/openhuman/local_ai/service/whisper_engine.rs | head -100Repository: tinyhumansai/openhuman
Length of output: 4169
🏁 Script executed:
# Check the Cargo.toml for whisper-rs version
grep -r "whisper-rs" Cargo.toml app/src-tauri/Cargo.toml 2>/dev/nullRepository: tinyhumansai/openhuman
Length of output: 343
🏁 Script executed:
# Search for any other uses of WhisperContextParameters in the codebase
rg "WhisperContextParameters" --type rust -B 2 -A 2Repository: tinyhumansai/openhuman
Length of output: 930
🏁 Script executed:
# Search for any trait implementations or extensions to WhisperContextParameters in the codebase
rg "impl.*WhisperContextParameters|trait.*Whisper" --type rust -B 2 -A 5Repository: tinyhumansai/openhuman
Length of output: 48
🏁 Script executed:
# Check if there are any builder pattern methods defined locally
rg "fn use_gpu|fn flash_attn" --type rust -B 2 -A 3Repository: tinyhumansai/openhuman
Length of output: 48
🏁 Script executed:
# Look for whisper-rs imports and usage patterns
rg "use_gpu|flash_attn" --type rust -B 1 -A 1Repository: tinyhumansai/openhuman
Length of output: 1041
🏁 Script executed:
# Check the Cargo.lock or any dependency information for whisper-rs versions
find . -name "Cargo.lock" -type f | head -1 | xargs grep -A 5 "whisper-rs" 2>/dev/null | head -30Repository: tinyhumansai/openhuman
Length of output: 48
🏁 Script executed:
# Check if there's any git history or information about the custom fork
git log --oneline -- src/openhuman/local_ai/service/whisper_engine.rs 2>/dev/null | head -10Repository: tinyhumansai/openhuman
Length of output: 138
🏁 Script executed:
# Look for test files related to whisper
fd ".*whisper.*test.*" --type f 2>/dev/nullRepository: tinyhumansai/openhuman
Length of output: 48
🏁 Script executed:
# Check if there are any comments or documentation about WhisperContextParameters methods
rg "WhisperContextParameters|use_gpu|flash_attn" -B 3 -A 3 --type rustRepository: tinyhumansai/openhuman
Length of output: 2645
🏁 Script executed:
# Check the git config or remotes to see if the custom fork URL is documented
git config --list 2>/dev/null | grep -i whisper || echo "No git config info"Repository: tinyhumansai/openhuman
Length of output: 84
🏁 Script executed:
# Get the full commit details to see what changed
git show 9a7e75e --stat 2>/dev/null | head -50Repository: tinyhumansai/openhuman
Length of output: 2935
🏁 Script executed:
# Check the actual diff of the whisper_engine.rs changes
git show 9a7e75e -- src/openhuman/local_ai/service/whisper_engine.rs 2>/dev/null | head -100Repository: tinyhumansai/openhuman
Length of output: 3726
🏁 Script executed:
# Check if there's any documentation or notes about the custom fork in the repo
find . -name "*.md" -type f | xargs grep -l "whisper-rs-sys\|whisper_rs" 2>/dev/null | head -5Repository: tinyhumansai/openhuman
Length of output: 85
🏁 Script executed:
# Check the memory file for any context
cat ./.claude/memory.md | head -100Repository: tinyhumansai/openhuman
Length of output: 14761
🏁 Script executed:
# Try to find any references to whisper-rs fork or custom implementations
rg "use_gpu|flash_attn" --type rs -B 5 -A 5 | head -80Repository: tinyhumansai/openhuman
Length of output: 92
🏁 Script executed:
# Check if there's a recent successful build
find target -name "libwhisper_engine*" -o -name "*whisper_engine*.rlib" 2>/dev/null | head -5Repository: tinyhumansai/openhuman
Length of output: 48
🏁 Script executed:
# Look for any GitHub issues or PRs mentioning these methods
rg "use_gpu|flash_attn" --type md --type txt 2>/dev/null | head -20Repository: tinyhumansai/openhuman
Length of output: 48
🏁 Script executed:
# Check if the code actually compiles by looking for the compiled file
ls -la target/debug/deps/ 2>/dev/null | grep whisper || echo "No compiled artifact found"Repository: tinyhumansai/openhuman
Length of output: 92
🏁 Script executed:
# Look for any git submodules or fork references
find . -name ".gitmodules" -o -name "*.toml" | xargs grep -l "whisper-rs-sys" 2>/dev/null | head -5Repository: tinyhumansai/openhuman
Length of output: 78
🏁 Script executed:
# Check the actual Cargo.toml to see the custom fork reference
cat app/src-tauri/Cargo.toml | grep -A 3 "whisper-rs"Repository: tinyhumansai/openhuman
Length of output: 48
🏁 Script executed:
# Search for any documentation about the custom fork
find . -type f \( -name "*.md" -o -name "*.txt" \) -exec grep -l "whisper-rs-sys\|github.com/tinyhumansai" {} \; 2>/dev/nullRepository: tinyhumansai/openhuman
Length of output: 415
🏁 Script executed:
# Check BUILDING.md for information about whisper-rs setup
grep -i "whisper" docs/BUILDING.mdRepository: tinyhumansai/openhuman
Length of output: 48
🏁 Script executed:
# Check the main Cargo.toml to see the fork reference context
cat Cargo.toml | grep -B 2 -A 5 "whisper-rs-sys"Repository: tinyhumansai/openhuman
Length of output: 683
🏁 Script executed:
# Check the architecture documentation
grep -i "whisper" gitbooks/technology/architecture.md | head -10Repository: tinyhumansai/openhuman
Length of output: 48
🏁 Script executed:
# Look for any mentions in README
grep -i "whisper" README.md | head -10Repository: tinyhumansai/openhuman
Length of output: 48
🏁 Script executed:
# Check if there's any GitHub issue or discussion about the fork
grep -r "use_gpu\|flash_attn" --include="*.md" --include="*.rs" .Repository: tinyhumansai/openhuman
Length of output: 491
🏁 Script executed:
# Look for any build.rs files that might indicate what the fork does
find . -name "build.rs" -type f -exec cat {} \;Repository: tinyhumansai/openhuman
Length of output: 1444
🏁 Script executed:
# Check if there are any feature flags related to GPU configuration
rg "use_gpu|flash_attn|gpu.*acceleration" --type rs -B 3 -A 3Repository: tinyhumansai/openhuman
Length of output: 92
Verify that use_gpu() and flash_attn() methods exist on WhisperContextParameters.
The code calls params.use_gpu(has_gpu) and params.flash_attn(true) without capturing return values (lines 74 and 79). However, the official whisper-rs 0.16 API documentation does not list these methods on WhisperContextParameters. While this project uses a custom fork of whisper-rs-sys for Windows MSVC compatibility, that fork's documented changes do not include adding GPU or flash attention configuration methods.
Verify:
- These methods exist on the custom fork's
WhisperContextParameters - If they don't exist, either add them to the fork or use an alternative GPU configuration approach supported by whisper-rs
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/openhuman/local_ai/service/whisper_engine.rs` around lines 69 - 80, The
code calls WhisperContextParameters.use_gpu(has_gpu) and
WhisperContextParameters.flash_attn(true) but those methods may not exist on the
custom whisper-rs-sys fork; inspect the forked definition of
WhisperContextParameters and confirm the presence and signatures of use_gpu and
flash_attn, and if absent either add equivalent setter methods to the fork
(preserving API and safety) or change this code to use the supported GPU
configuration mechanism provided by whisper-rs (e.g., setting explicit fields on
WhisperContextParameters, using a builder or passing GPU/backend options to the
context creation function); specifically update the call sites around params,
WhisperContextParameters and the has_gpu logic to match the verified API.
There was a problem hiding this comment.
These methods exist on WhisperContextParameters in whisper-rs 0.16. See whisper_ctx.rs:487-493:
pub fn use_gpu(&mut self, use_gpu: bool) -> &mut Self {
self.use_gpu = use_gpu;
self
}
pub fn flash_attn(&mut self, flash_attn: bool) -> &mut Self {
self.flash_attn = flash_attn;
self
}The web query result was incorrect — these are public setter methods, not fields on WhisperVadContextParams. Code compiles and Metal acceleration is confirmed working (~159ms for 3.1s audio on Apple M2).
There was a problem hiding this comment.
@oxoxDev, thank you for the correction and for pointing directly to the upstream source. You're right — use_gpu() and flash_attn() are confirmed public setter methods on WhisperContextParameters in whisper-rs v0.16.0, and my web search result was incorrect (it confused WhisperVadContextParams with WhisperContextParameters). The implementation is valid, and the confirmed ~159ms transcription of 3.1s audio on Apple M2 is an excellent result. Apologies for the noise.
✏️ Learnings added
Learnt from: oxoxDev
Repo: tinyhumansai/openhuman PR: 571
File: src/openhuman/local_ai/service/whisper_engine.rs:69-80
Timestamp: 2026-04-14T19:59:04.826Z
Learning: In whisper-rs v0.16.0 (crate `whisper-rs`), `WhisperContextParameters` has two public builder-style setter methods: `use_gpu(&mut self, use_gpu: bool) -> &mut Self` and `flash_attn(&mut self, flash_attn: bool) -> &mut Self` (see whisper-rs/blob/v0.16.0/src/whisper_ctx.rs:487-493). Do not flag calls to these methods as missing/invalid — they exist in the upstream crate and are not limited to `WhisperVadContextParams`.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: graycyrus
Repo: tinyhumansai/openhuman PR: 271
File: app/src/pages/onboarding/Onboarding.tsx:65-85
Timestamp: 2026-04-02T13:53:41.803Z
Learning: When reviewing `local_ai` Tauri backend RPCs in `src/openhuman/local_ai/**`, note that `local_ai_download` and `local_ai_download_all_assets` both delegate to the same `service.download_all_models()` obtained from `local_ai::global(config)`, which uses shared internal state (via a shared `Arc`) to make concurrent/duplicate downloads an in-flight no-op. Because the backend is already idempotent for concurrent calls, don’t require or introduce separate frontend cross-component/concurrency guards to prevent duplicate downloads.
Learnt from: sanil-23
Repo: tinyhumansai/openhuman PR: 416
File: src/openhuman/memory/relex.rs:441-464
Timestamp: 2026-04-07T15:49:51.275Z
Learning: When using the `ort` Rust crate v2.x with the `load-dynamic` feature enabled, don’t require individual execution-provider feature flags (e.g., `directml`, `coreml`, `cuda`) alongside `load-dynamic` to get EP registration code. The `ort` crate already compiles EP registration via `#[cfg(any(feature = "load-dynamic", feature = "<ep_name>"))]` guards, and adding per-EP feature flags can pull in static-linking dependencies that conflict with the dynamic loading approach. At runtime, EP availability is determined by what the dynamically loaded ONNX Runtime library (`onnxruntime.dll`/`.so`/`.dylib`) supports; ort docs indicate providers like `directml`/`xnnpack`/`coreml` are available in builds when the platform supports them.
Learnt from: sanil-23
Repo: tinyhumansai/openhuman PR: 416
File: src/openhuman/memory/relex.rs:441-464
Timestamp: 2026-04-07T15:49:51.275Z
Learning: When integrating the `ort` Rust crate v2.x with the `load-dynamic` feature enabled, do NOT also require/enable individual provider EP Cargo features like `directml`, `coreml`, or `cuda`. In `ort` v2.x, EP registration for providers (e.g., DirectML, CoreML, CUDA, etc.) is already compiled in under source-level `#[cfg(any(feature = "load-dynamic", feature = "<provider>"))]` guards, such as in `ep/directml.rs`. Adding provider feature flags alongside `load-dynamic` can pull in static-linking dependencies that conflict with the dynamic-loading approach. Provider availability should be treated as runtime-determined by what the loaded `onnxruntime` library (`onnxruntime.dll`/`libonnxruntime.so`/`libonnxruntime.dylib`) actually supports.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
src/openhuman/local_ai/device.rs (2)
79-95: Add structured log prefixes for grep-friendly diagnostics.The debug logs at lines 79, 85, 91, and 95 lack a consistent prefix. Per coding guidelines, logs should use stable prefixes like
[device]or[local_ai]for easier filtering and correlation. The relatedwhisper_engine.rsuses aLOG_PREFIXpattern.♻️ Suggested fix: add a structured prefix
+const LOG_PREFIX: &str = "[device]"; + fn detect_gpu(cpu_brand: &str, os_name: &str) -> (bool, Option<String>) { let brand_lower = cpu_brand.to_ascii_lowercase(); let os_lower = os_name.to_ascii_lowercase(); // Apple Silicon detection: brand contains "apple" or we're on macOS with an ARM chip. if brand_lower.contains("apple") || (os_lower.contains("mac") && brand_lower.contains("arm")) { - tracing::debug!("GPU detected: Apple Silicon (Metal)"); + tracing::debug!("{LOG_PREFIX} GPU detected: Apple Silicon (Metal)"); return (true, Some("Apple Silicon (Metal)".to_string())); } // Intel Mac: macOS with Intel CPU — no Metal GPU acceleration for whisper. if os_lower.contains("mac") { - tracing::debug!("Intel Mac detected — no GPU acceleration available for whisper"); + tracing::debug!("{LOG_PREFIX} Intel Mac detected — no GPU acceleration available for whisper"); return (false, Some("Intel Mac (no Metal GPU)".to_string())); } // Windows / Linux: probe for NVIDIA GPU via nvidia-smi. if let Some(desc) = probe_nvidia_smi() { - tracing::debug!("GPU detected via nvidia-smi: {desc}"); + tracing::debug!("{LOG_PREFIX} GPU detected via nvidia-smi: {desc}"); return (true, Some(desc)); } - tracing::debug!("no GPU detected — voice model will use CPU"); + tracing::debug!("{LOG_PREFIX} no GPU detected — voice model will use CPU"); (false, None) }As per coding guidelines: "Use structured, grep-friendly context in logs with stable prefixes (e.g., [domain], [rpc], [ui-flow]) and include correlation fields."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/local_ai/device.rs` around lines 79 - 95, The device detection debug logs lack the standardized grep-friendly prefix; update the tracing::debug! calls in src/openhuman/local_ai/device.rs (the Apple Silicon, Intel Mac, nvidia-smi branch, and final no-GPU message) to include a stable prefix such as "[device]" (or reuse an existing LOG_PREFIX constant if available, e.g., LOG_PREFIX) and keep the original message and any variable interpolation (e.g., include {desc} in the nvidia-smi log); ensure each call uses the same prefix and optionally include a correlation field like device_id or trace_id if present in scope.
101-124: Add trace-level logging fornvidia-smiprobe failures.When
nvidia-smiis not found or fails, the function silently returnsNone. Adding a trace or debug log on the failure paths would help diagnose GPU detection issues on Windows/Linux systems without NVIDIA drivers or with misconfigured environments.♻️ Suggested fix: log probe failures
fn probe_nvidia_smi() -> Option<String> { - let output = std::process::Command::new("nvidia-smi") + let output = match std::process::Command::new("nvidia-smi") .args(["--query-gpu=name", "--format=csv,noheader"]) .stdout(std::process::Stdio::piped()) .stderr(std::process::Stdio::null()) .output() - .ok()?; + { + Ok(o) => o, + Err(e) => { + tracing::trace!("{LOG_PREFIX} nvidia-smi not available: {e}"); + return None; + } + }; if !output.status.success() { + tracing::trace!("{LOG_PREFIX} nvidia-smi exited with non-zero status"); return None; } let name = String::from_utf8_lossy(&output.stdout) .lines() .next()? .trim() .to_string(); if name.is_empty() { + tracing::trace!("{LOG_PREFIX} nvidia-smi returned empty GPU name"); return None; } Some(format!("NVIDIA {name} (CUDA)")) }As per coding guidelines: "include logs at... external calls... and error handling paths."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/openhuman/local_ai/device.rs` around lines 101 - 124, probe_nvidia_smi currently returns None silently on failures; add trace-level logs on each failure path to aid debugging: when Command::new("nvidia-smi").output() fails (log the io::Error), when output.status.success() is false (log the exit status and maybe stdout/stderr), and when the parsed name is empty (log the raw output bytes from output.stdout). Use the existing tracing/log facility in the project (e.g., trace! or log::trace!) and reference probe_nvidia_smi, the Command::new("nvidia-smi") call, output.status, and output.stdout when adding these messages.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/openhuman/local_ai/device.rs`:
- Around line 79-95: The device detection debug logs lack the standardized
grep-friendly prefix; update the tracing::debug! calls in
src/openhuman/local_ai/device.rs (the Apple Silicon, Intel Mac, nvidia-smi
branch, and final no-GPU message) to include a stable prefix such as "[device]"
(or reuse an existing LOG_PREFIX constant if available, e.g., LOG_PREFIX) and
keep the original message and any variable interpolation (e.g., include {desc}
in the nvidia-smi log); ensure each call uses the same prefix and optionally
include a correlation field like device_id or trace_id if present in scope.
- Around line 101-124: probe_nvidia_smi currently returns None silently on
failures; add trace-level logs on each failure path to aid debugging: when
Command::new("nvidia-smi").output() fails (log the io::Error), when
output.status.success() is false (log the exit status and maybe stdout/stderr),
and when the parsed name is empty (log the raw output bytes from output.stdout).
Use the existing tracing/log facility in the project (e.g., trace! or
log::trace!) and reference probe_nvidia_smi, the Command::new("nvidia-smi")
call, output.status, and output.stdout when adding these messages.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: a5ae07ef-5f23-44f4-9c1b-c9f906b687eb
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (5)
Cargo.tomlsrc/openhuman/local_ai/device.rssrc/openhuman/local_ai/service/bootstrap.rssrc/openhuman/local_ai/service/speech.rssrc/openhuman/local_ai/service/whisper_engine.rs
✅ Files skipped from review due to trivial changes (2)
- Cargo.toml
- src/openhuman/local_ai/service/whisper_engine.rs
🚧 Files skipped from review as they are similar to previous changes (2)
- src/openhuman/local_ai/service/speech.rs
- src/openhuman/local_ai/service/bootstrap.rs
…inyhumansai#558) (tinyhumansai#571) * fix(device): expand GPU detection with Intel Mac and NVIDIA probes (tinyhumansai#558) Add Intel Mac detection (no Metal GPU for whisper), nvidia-smi probe for Windows/Linux NVIDIA GPUs, and diagnostic tracing at each decision point in detect_gpu(). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * build(cargo): enable whisper-rs Metal feature on macOS (tinyhumansai#558) Add target-specific dependency for macOS that enables the `metal` feature on whisper-rs, compiling whisper.cpp with Metal GPU support. Cargo merges features from both declarations so non-macOS builds are unaffected. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(whisper): configure GPU params from device profile (tinyhumansai#558) Accept has_gpu and gpu_description in load_engine() and explicitly set use_gpu and flash_attn on WhisperContextParameters instead of relying on the compile-time default. Log the selected acceleration backend at startup. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix(local_ai): pass GPU info to whisper engine load paths (tinyhumansai#558) Thread DeviceProfile has_gpu and gpu_description through both the bootstrap (startup) and speech (lazy) whisper engine load calls so the engine can configure Metal or CUDA acceleration at runtime. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Problem
Voice model GPU detection existed (
DeviceProfile.has_gpu) but was never passed to the whisper engine. Meanwhile,whisper-rswas compiled without any GPU backend feature, soWhisperContextParameters::default()always setuse_gpu: false. Result: voice transcription always ran on CPU regardless of available hardware acceleration.Root causes:
Cargo.toml:83—whisper-rs = "0.16"declared without GPU features, so the internal_gpuflag was never setwhisper_engine.rs:63—WhisperContextParameters::default()used without configuring GPU params from the detected device profiledevice.rs:73-84— GPU detection was Apple Silicon only; no Intel Mac distinction, no NVIDIA probe for Windows/LinuxSolution
Cargo.toml— Added[target.'cfg(target_os = "macos")'.dependencies]section enablingmetalfeature on whisper-rs. Cargo merges features from both declarations so non-macOS builds are unaffected.device.rs— Expandeddetect_gpu()with Intel Mac detection (no Metal for whisper),nvidia-smisubprocess probe for Windows/Linux NVIDIA GPUs, andtracing::debugat each decision point.whisper_engine.rs— Updatedload_engine()to accepthas_gpuandgpu_descriptionparams; explicitly configuresuse_gpuandflash_attnonWhisperContextParametersinstead of relying on compile-time defaults.bootstrap.rs— PassesDeviceProfileGPU info to whisper engine at bootstrap startup.speech.rs— Detects device profile at lazy-load time and passes GPU info through.Submission Checklist
cargo testpasses (11/11 for device, whisper_engine, bootstrap modules)tauri devwithRUST_LOG=debugload_engine()anddetect_gpu()Manual verification:
whisper_backend_init_gpu: using Metal backendin logsGPU name: Apple M2,MTLGPUFamilyMetal3in Metal initImpact
cudafeature enablementload_enginesignature change is crate-internalRelated
Co-Authored-By: Claude Opus 4.6 noreply@anthropic.com
Summary by CodeRabbit
New Features
Bug Fixes