Skip to content

feat(voice): add mic input device selector and stabilize media capture for composer#1616

Merged
graycyrus merged 11 commits into
tinyhumansai:mainfrom
YellowSnnowmann:fix/voice-issue
May 13, 2026
Merged

feat(voice): add mic input device selector and stabilize media capture for composer#1616
graycyrus merged 11 commits into
tinyhumansai:mainfrom
YellowSnnowmann:fix/voice-issue

Conversation

@YellowSnnowmann
Copy link
Copy Markdown
Contributor

@YellowSnnowmann YellowSnnowmann commented May 13, 2026

Summary

  • Added an audio input device selector to MicCloudComposer so users can choose which microphone to record from.

  • Implemented device enumeration and selection wiring for capture setup in the voice composer flow.

  • Improved fake media stream handling to make audio capture behavior more reliable in test/dev and edge capture paths.

  • Removed unused user_profile database indexes to reduce schema clutter and maintenance overhead.

  • Applied Prettier formatting updates across touched files for consistency.


Problem

  • Voice capture in the composer did not expose microphone selection, which can cause wrong-device recording on multi-input systems.

  • Media capture behavior around fake/virtual streams could lead to unstable or inconsistent audio capture outcomes.

  • Unused profile table indexes increased schema complexity without clear runtime benefit.


Solution

  • Extended MicCloudComposer with audio device discovery, selector UI/state, and selected-device capture wiring.

  • Refactored media-capture internals to better preserve audio stream integrity when fake streams are involved.

  • Cleaned profile persistence code by removing unused user_profile index declarations.

  • Kept scope focused to existing flows (composer + capture plumbing) with minimal surface-area changes outside impacted modules.


Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy

  • Diff coverage ≥ 80% — changed lines (Vitest + cargo-llvm-cov merged via diff-cover) meet the gate enforced by .github/workflows/coverage.yml. Run pnpm test:coverage and pnpm test:rust locally; PRs below 80% on changed lines will not merge.

  • Coverage matrix updated — added/removed/renamed feature rows in docs/TEST-COVERAGE-MATRIX.md reflect this change (or N/A: behaviour-only change)

  • All affected feature IDs from the matrix are listed in the PR description under ## Related

  • No new external network dependencies introduced (mock backend used per Testing Strategy)

  • Manual smoke checklist updated if this touches release-cut surfaces (docs/RELEASE-MANUAL-SMOKE.md)

  • Linked issue closed via Closes #NNN in the ## Related section


Impact

  • Platform/runtime: Desktop app UI + capture flow (app/) and minor core schema cleanup (src/openhuman/...); no new platform targets.

  • Performance: Potentially improved by removing unused DB indexes; no expected regressions in normal capture paths.

  • Security/privacy: No new external dependencies; device selection increases user control over recording source.

  • Compatibility: Backward-compatible behavior with added optional device-selection capability.


Related

Summary by CodeRabbit

  • New Features

    • Optional microphone device selector: choose a specific audio input when recording.
  • Bug Fixes

    • Video injection preserved via file-based fake-video; permission prompts auto-granted without replacing real audio capture.
    • More reliable recording startup and device switching behavior.
  • Tests

    • Expanded coverage for getUserMedia failure modes and device-selection flows.
  • Chores

    • Profile storage schema updated for phased migrations.

Review Change Stack

…tion logic

- Introduced a new interface for audio input devices.
- Added a  prop to enable microphone device selection.
- Implemented device enumeration and selection handling in the MicCloudComposer component.
- Updated UI to display a dropdown for selecting available audio input devices when the selector is enabled.
… audio capture integrity

- Removed the  flag to prevent audio capture device replacement.
- Added comments to clarify the rationale behind the changes and the use of  for video input.
…able

- Deleted the idx_profile_state and idx_profile_class indexes to streamline the database schema.
- This change aims to improve performance and reduce redundancy in the user_profile table.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 13, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 6ddc8b2d-fc31-4cde-a5ba-0b8e5641d264

📥 Commits

Reviewing files that changed from the base of the PR and between 5143c77 and f13c223.

📒 Files selected for processing (1)
  • app/src/features/human/MicCloudComposer.test.tsx

📝 Walkthrough

Walkthrough

Adds an optional microphone device selector to MicCloudComposer (enumeration, selection, exact-device getUserMedia, and tests), enables it in Conversations, removes a CEF fake-audio flag to preserve real audio devices while injecting Y4M video, and narrows Phase 3 profile migration to column-only updates.

Changes

Microphone Device Selection Feature

Layer / File(s) Summary
Types and props
app/src/features/human/MicCloudComposer.tsx
Define AudioInputDevice and add showDeviceSelector?: boolean prop.
State & enumeration
app/src/features/human/MicCloudComposer.tsx
Track devices and selectedDeviceId; add useEffect to enumerate audio inputs and handle devicechange.
Recording constraints & permission refresh
app/src/features/human/MicCloudComposer.tsx
When a device is selected, startRecording uses exact deviceId constraint; refresh device labels after permission grant and map getUserMedia errors to clearer messages.
Device selector UI
app/src/features/human/MicCloudComposer.tsx
Render conditional microphone <select> (shown when enabled and devices exist, disabled unless idle) above the record control.
Conversations wiring
app/src/pages/Conversations.tsx
Enable device selector by passing showDeviceSelector={true} to MicCloudComposer.
Tests for device selector and errors
app/src/features/human/MicCloudComposer.test.tsx
Add tests: getUserMedia error cases, enumeration rendering, default-hidden selector, single-device disable, fallback labels, selecting device triggers exact deviceId getUserMedia, label refresh after permission, and enumerateDevices rejection handling.

CEF Fake-Camera Configuration

Layer / File(s) Summary
Fake camera startup flags
app/src-tauri/src/lib.rs
Remove --use-fake-device-for-media-stream; inject Y4M via --use-file-for-fake-video-capture and keep --use-fake-ui-for-media-stream so real audio devices remain unchanged.

Profile Table Schema Migration

Layer / File(s) Summary
Phase 3 columns
src/openhuman/memory/store/unified/profile.rs, src/openhuman/memory/store/unified/init.rs
Add Phase 3 columns to initial CREATE TABLE and provide PHASE3_COLUMNS_SQL for idempotent column migration; remove execution of Phase 3 index creation from the migration loop.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • tinyhumansai/openhuman#1616: The main PR makes the same mic device-selector/capture wiring changes (MicCloudComposer + Conversations + MicCloudComposer tests) and the same supporting adjustments to fake-media-stream CLI flags and Phase 3 profile schema/index migrations as the retrieved PR.
  • tinyhumansai/openhuman#1359: Both PRs touch the Tauri/CEF startup configuration for “fake camera”/Y4M video capture flags.

Suggested reviewers

  • senamakel

Poem

🐰 A rabbit's note on sound and schema
Pick the mic that hears you true,
Video fed by Y4M, audio stays you-know-who,
Flags trimmed, permissions gently auto-accepted,
Profile columns bloom, migrations tidy and new.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Out of Scope Changes check ❓ Inconclusive The PR includes changes to profile schema initialization (removing unused indexes and refactoring Phase 3 columns). While schema cleanup is mentioned in PR objectives, its necessity for the voice feature is unclear. Clarify whether database schema changes are necessary for voice capture or are unrelated cleanup that should be in a separate PR.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: adding a microphone input device selector and improving media capture in the composer component.
Linked Issues check ✅ Passed The PR addresses core objectives from #1610: adds device selector for input verification, improves capture setup, and includes comprehensive tests for error handling and device selection behavior.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@YellowSnnowmann YellowSnnowmann marked this pull request as ready for review May 13, 2026 10:14
@YellowSnnowmann YellowSnnowmann requested a review from a team May 13, 2026 10:14
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
app/src/features/human/MicCloudComposer.tsx (1)

234-259: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Prevent stream leak and misclassified mic errors in startRecording.

Line 247 can throw after Line 234 already granted a stream; current catch path (Line 253 onward) returns without stopping that stream, which can leave the mic active. It also labels every failure as permission-denied, including device selection/enumeration failures.

💡 Proposed fix
-    let stream: MediaStream;
+    let stream: MediaStream;
     try {
       // Audio constraints tuned for STT accuracy:
@@
       stream = await navigator.mediaDevices.getUserMedia({
         audio: {
           ...(selectedDeviceId ? { deviceId: { exact: selectedDeviceId } } : {}),
@@
         },
       });
-      // After the first successful grant, refresh device labels (they are
-      // blank until the user has given permission).
-      if (showDeviceSelector) {
-        const all = await navigator.mediaDevices.enumerateDevices();
-        const inputs = all
-          .filter(d => d.kind === 'audioinput')
-          .map((d, i) => ({ deviceId: d.deviceId, label: d.label || `Microphone ${i + 1}` }));
-        setDevices(inputs);
-      }
     } catch (err) {
       startInFlightRef.current = false;
       const msg = err instanceof Error ? err.message : String(err);
       composerLog('getUserMedia rejected: %s', msg);
-      onError?.(`Microphone permission denied: ${msg}`);
+      const name = err instanceof DOMException ? err.name : '';
+      if (name === 'NotAllowedError' || name === 'SecurityError') {
+        onError?.(`Microphone permission denied: ${msg}`);
+      } else if (name === 'NotFoundError' || name === 'OverconstrainedError') {
+        onError?.('Selected microphone is unavailable. Choose a different input device.');
+      } else {
+        onError?.(`Failed to access microphone: ${msg}`);
+      }
       return;
     }
+
+    // After the first successful grant, refresh device labels (best-effort).
+    if (showDeviceSelector && navigator.mediaDevices?.enumerateDevices) {
+      try {
+        const all = await navigator.mediaDevices.enumerateDevices();
+        const inputs = all
+          .filter(d => d.kind === 'audioinput')
+          .map((d, i) => ({ deviceId: d.deviceId, label: d.label || `Microphone ${i + 1}` }));
+        setDevices(inputs);
+      } catch (err) {
+        composerLog('post-grant enumerateDevices failed: %s', err);
+      }
+    }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src/features/human/MicCloudComposer.tsx` around lines 234 - 259, In
startRecording (around getUserMedia, enumerateDevices and the catch block)
ensure any MediaStream acquired is stopped if a later step throws: capture the
stream returned by navigator.mediaDevices.getUserMedia in a local variable, and
in the catch path call stop() on all its tracks
(stream.getTracks().forEach(t=>t.stop())) before returning and before resetting
startInFlightRef.current; also improve the error classification sent to onError
by inspecting the thrown error (e.g., DOMException.name or message) so only true
permission errors produce "permission denied" and other errors (enumerateDevices
failures, device-not-found, etc.) forward a precise message (include the actual
err.message/string) instead of always saying permission denied.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@app/src/features/human/MicCloudComposer.tsx`:
- Around line 234-259: In startRecording (around getUserMedia, enumerateDevices
and the catch block) ensure any MediaStream acquired is stopped if a later step
throws: capture the stream returned by navigator.mediaDevices.getUserMedia in a
local variable, and in the catch path call stop() on all its tracks
(stream.getTracks().forEach(t=>t.stop())) before returning and before resetting
startInFlightRef.current; also improve the error classification sent to onError
by inspecting the thrown error (e.g., DOMException.name or message) so only true
permission errors produce "permission denied" and other errors (enumerateDevices
failures, device-not-found, etc.) forward a precise message (include the actual
err.message/string) instead of always saying permission denied.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c9ee3629-74fe-4593-8b73-a2c5abd466c9

📥 Commits

Reviewing files that changed from the base of the PR and between 9871ccf and 798736d.

📒 Files selected for processing (4)
  • app/src-tauri/src/lib.rs
  • app/src/features/human/MicCloudComposer.tsx
  • app/src/pages/Conversations.tsx
  • src/openhuman/memory/store/unified/profile.rs
💤 Files with no reviewable changes (1)
  • src/openhuman/memory/store/unified/profile.rs

coderabbitai[bot]
coderabbitai Bot previously approved these changes May 13, 2026
Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

The CEF flag fix in lib.rs is the correct root cause fix — --use-fake-device-for-media-stream was silently replacing the real microphone with a sine-wave test tone. That one-line removal and the excellent explanatory comment are solid.

The device selector in MicCloudComposer is a useful addition for multi-input systems, but ships 107 new lines of stateful device-enumeration logic with no new tests (the existing 14 tests don't cover any of the new code paths). The PR checklist says "Tests added or updated" — this is not satisfied.

Missing test coverage

At minimum before merge:

  • showDeviceSelector=true renders the <select> when enumerateDevices returns multiple devices
  • The selected deviceId is passed as { exact: selectedDeviceId } in getUserMedia
  • Changing the selector updates the constraint
  • devicechange event triggers re-enumeration
  • enumerateDevices throwing doesn't crash the component
  • Post-grant label refresh calls enumerateDevices again

The existing beforeEach already stubs navigator.mediaDevices — just add enumerateDevices, addEventListener, and removeEventListener to the stub.

Issue #1610 coverage

This PR addresses 2 of 9 acceptance criteria (device selection + beep root cause). Remaining: permission state UI, specific failure messages, diagnostics/Sentry, beep text handling. Consider noting in the PR body which criteria are deferred to follow-up work.

Comment thread src/openhuman/memory/store/unified/profile.rs
Comment thread app/src/features/human/MicCloudComposer.tsx
Comment thread app/src/features/human/MicCloudComposer.tsx
Comment thread app/src/features/human/MicCloudComposer.tsx
Adds 6 new tests for the device enumeration and selector introduced in
the previous commit — covering the loadDevices useEffect, fallback
labels, deviceId constraint forwarding, post-permission label refresh,
and enumerateDevices error handling.

Closes the coverage gap reported by diff-cover (was 33%; new lines all
covered).
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
app/src/features/human/MicCloudComposer.test.tsx (1)

316-333: ⚡ Quick win

Avoid fixed sleeps in this test path (flaky + slower).

Line 330 uses an arbitrary setTimeout(50); this makes the test timing-dependent. Prefer asserting the absence/presence behavior directly without wall-clock waits.

Suggested change
-    // Give any async effects a chance to run
-    await new Promise(r => setTimeout(r, 50));
     expect(screen.queryByRole('combobox', { name: /microphone device/i })).not.toBeInTheDocument();
     expect(enumerateDevicesMock).not.toHaveBeenCalled();

As per coding guidelines: "Prefer behavior over implementation in tests; use helpers from app/src/test/".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src/features/human/MicCloudComposer.test.tsx` around lines 316 - 333, The
test uses a fixed sleep (await new Promise(r => setTimeout(r, 50))) which is
flaky; replace this with a behavior-based wait (e.g., use RTL's waitFor or
existing helpers from app/src/test/) to await stable UI state and then assert
absence of the combobox and that enumerateDevicesMock was not called. Locate the
test for MicCloudComposer in MicCloudComposer.test.tsx and replace the arbitrary
timeout with a wait that checks screen.queryByRole('combobox', { name:
/microphone device/i }) remains null (or use waitFor(() =>
expect(...).not.toBeInTheDocument())) and then assert enumerateDevicesMock
not.toHaveBeenCalled(); use the project test helpers instead of wall-clock
sleeps.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@app/src/features/human/MicCloudComposer.test.tsx`:
- Around line 316-333: The test uses a fixed sleep (await new Promise(r =>
setTimeout(r, 50))) which is flaky; replace this with a behavior-based wait
(e.g., use RTL's waitFor or existing helpers from app/src/test/) to await stable
UI state and then assert absence of the combobox and that enumerateDevicesMock
was not called. Locate the test for MicCloudComposer in
MicCloudComposer.test.tsx and replace the arbitrary timeout with a wait that
checks screen.queryByRole('combobox', { name: /microphone device/i }) remains
null (or use waitFor(() => expect(...).not.toBeInTheDocument())) and then assert
enumerateDevicesMock not.toHaveBeenCalled(); use the project test helpers
instead of wall-clock sleeps.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 9d17c535-7b45-4cdc-b6ce-4f9314cc3eee

📥 Commits

Reviewing files that changed from the base of the PR and between 798736d and dc6026b.

📒 Files selected for processing (1)
  • app/src/features/human/MicCloudComposer.test.tsx

coderabbitai[bot]
coderabbitai Bot previously approved these changes May 13, 2026
- Updated existing test for permission-denied errors to specify NotAllowedError.
- Added new tests for handling OverconstrainedError, NotReadableError, and generic errors from getUserMedia.
- Included a test to verify the device selector is disabled when only one device is available.
- Improved error messaging in the MicCloudComposer component to provide more specific feedback based on the error type.
…aths

Removes idx_profile_state and idx_profile_class from PHASE3_INDEXES_SQL
and the init.rs migration chain to match the earlier removal from
PROFILE_INIT_SQL. Both indexes are now fully absent from new installs
and existing-DB migrations alike.
…ability

- Simplified the assertion for error handling in the MicCloudComposer tests.
- Enhanced the formatting of the mock device enumeration to improve clarity.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/src/features/human/MicCloudComposer.test.tsx`:
- Around line 349-366: Replace the fixed sleep used to wait for async effects in
the test for MicCloudComposer with React Testing Library's waitFor: remove the
new Promise(r => setTimeout(r, 50)) and instead call await waitFor(() =>
expect(screen.queryByRole('combobox', { name: /microphone device/i
})).not.toBeInTheDocument()); keep the existing mocks (enumerateDevicesMock and
getUserMediaMock) and the final assertion that enumerateDevicesMock was not
called so the test polls reliably instead of relying on a fixed timeout.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b4e2d302-512e-4d0a-b9db-e71418673bd9

📥 Commits

Reviewing files that changed from the base of the PR and between dc6026b and 053f819.

📒 Files selected for processing (4)
  • app/src/features/human/MicCloudComposer.test.tsx
  • app/src/features/human/MicCloudComposer.tsx
  • src/openhuman/memory/store/unified/init.rs
  • src/openhuman/memory/store/unified/profile.rs
💤 Files with no reviewable changes (1)
  • src/openhuman/memory/store/unified/profile.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • app/src/features/human/MicCloudComposer.tsx

Comment thread app/src/features/human/MicCloudComposer.test.tsx
graycyrus
graycyrus previously approved these changes May 13, 2026
Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All four findings addressed cleanly — index removal completed, error classification differentiated by DOMException.name, selector visible on single-mic systems, and devicechange listener properly wrapped. Tests added for each. LGTM.

@graycyrus graycyrus merged commit 10a726d into tinyhumansai:main May 13, 2026
17 of 26 checks passed
AusAgentSmith pushed a commit to AusAgentSmith/openhuman that referenced this pull request May 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Human tab microphone does not pick up voice input

2 participants