Integrate cactus-based local STT pipeline and shared LLM types #4003

Open
yujonglee wants to merge 16 commits into main from feat/cactus-stt-integration

@yujonglee
Contributor

Summary

  • Add cactus-sys, cactus, and transcribe-cactus crates to support cactus-backed local STT with both streaming and batch service layers.
  • Extract shared parser/message primitives into a new llm-types crate and wire dependent crates/plugins to consume the new interfaces.
  • Remove older VAD-specific integration points and update desktop/plugin STT configuration and supervisor flows to the new internal server path.

Test plan

  • cargo check
  • Run local STT plugin flow end-to-end and verify streaming transcription
  • Validate desktop STT settings UI selection/configuration behavior

Made with Cursor

@netlify

netlify bot commented Feb 16, 2026

Deploy Preview for hyprnote-storybook canceled.

Name Link
🔨 Latest commit 1253078
🔍 Latest deploy log https://app.netlify.com/projects/hyprnote-storybook/deploys/6995db44e552fd00077d75e0

@netlify

netlify bot commented Feb 16, 2026

Deploy Preview for hyprnote canceled.

Name Link
🔨 Latest commit 1253078
🔍 Latest deploy log https://app.netlify.com/projects/hyprnote/deploys/6995db447432de000709c164

yujonglee and others added 3 commits February 18, 2026 18:53
Introduce new cactus/cactus-sys/transcribe-cactus crates, wire plugins to the new server path, and retire the older VAD-specific integration points.

Co-authored-by: Cursor <cursoragent@cursor.com>
Track vendor/cactus by gitlink so checkout and sync follow .gitmodules rather than vendoring full source into the monorepo.

Co-authored-by: Cursor <cursoragent@cursor.com>
Align the vendored cactus dependency to the latest 1.7 release and clear local submodule drift so builds use a reproducible upstream tag.

Co-authored-by: Cursor <cursoragent@cursor.com>
@yujonglee force-pushed the feat/cactus-stt-integration branch from 25c6501 to 162e667 on February 18, 2026 09:53
@yujonglee marked this pull request as ready for review on February 18, 2026 13:43
lfs: true
fetch-depth: 0
fetch-tags: true
submodules: recursive

ARM64 Linux runner builds x86_64 binaries

High Severity

The Linux build job now runs on depot-ubuntu-24.04-arm-8 (ARM64 runner) but still builds for x86_64-unknown-linux-gnu target. Cross-compilation from ARM64 to x86_64 on Linux typically requires additional toolchain setup that isn't present in the workflow, causing build failures or producing incorrect binaries.

/>
<button
onClick={handleUseCactus}
disabled={active || !modelPath}

Cactus button remains disabled without model path

Low Severity

The "Use Cactus" button is disabled when !modelPath, but the Rust backend has fallback logic that tries multiple paths including /tmp/cactus-model and models_dir/whisper-small. Users with valid fallback paths can't activate cactus because the UI incorrectly requires the setting to be populated, despite the backend working without it.
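
One way to keep the UI and backend in agreement is to expose the backend's path resolution and enable the button whenever it yields a usable path. The following is a minimal sketch, not the PR's actual code: `resolve_model_path` is a hypothetical helper, the candidate order (configured path first, then `/tmp/cactus-model`, then `models_dir/whisper-small`) is an assumption based on the paths named above, and the `exists` callback is injected only so the logic can be checked without touching the filesystem.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not from the PR): returns the first candidate model
/// path for which `exists` reports true, or `None` if no candidate is usable.
///
/// Assumed order: the configured path (if any), then `/tmp/cactus-model`,
/// then `<models_dir>/whisper-small`.
pub fn resolve_model_path(
    configured: Option<&Path>,
    models_dir: &Path,
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    let fallbacks = [
        PathBuf::from("/tmp/cactus-model"),
        models_dir.join("whisper-small"),
    ];

    configured
        .map(Path::to_path_buf)
        .into_iter()
        .chain(fallbacks)
        .find(|candidate| exists(candidate.as_path()))
}
```

In production the callback would presumably be `|p| p.exists()`. The button's `disabled` condition could then depend on whether this resolution returns `Some`, rather than on the `modelPath` setting alone.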

.map_err(|e| format!("failed to flush temp file: {}", e))?;

let model =
hypr_cactus::Model::new(model_path).map_err(|e| format!("failed to load model: {}", e))?;

Batch transcription loads model on every request

Medium Severity

transcribe_batch calls hypr_cactus::Model::new(model_path) on every HTTP batch request, loading the model from disk each time. Model initialization is typically an expensive operation (seconds), making the batch endpoint impractically slow for any repeated use. The streaming (WebSocket) path loads the model once per connection in run_transcriber, but the batch path has no such caching. The TranscribeService already holds model_path; a shared, lazily-initialized model instance would avoid the repeated load cost.
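
The shared, lazily-initialized instance suggested above could look roughly like the sketch below. It assumes only the standard library; `Model` here is a stand-in for `hypr_cactus::Model` (whose real constructor and thread-safety guarantees are not shown in this thread), and `ModelCache` is a hypothetical name. A `Mutex<Option<Arc<_>>>` is used instead of `OnceLock` because loading can fail and the error should be returned to the caller without poisoning later attempts.

```rust
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};

/// Stand-in for `hypr_cactus::Model`; the real type loads weights from disk.
pub struct Model {
    path: PathBuf,
}

impl Model {
    pub fn new(path: &Path) -> Result<Self, String> {
        // The expensive load would happen here in the real implementation.
        Ok(Model { path: path.to_path_buf() })
    }

    pub fn path(&self) -> &Path {
        &self.path
    }
}

/// Hypothetical cache: loads the model on first use and hands out clones of
/// the same `Arc` afterwards.
pub struct ModelCache {
    model_path: PathBuf,
    slot: Mutex<Option<Arc<Model>>>,
}

impl ModelCache {
    pub fn new(model_path: impl Into<PathBuf>) -> Self {
        ModelCache {
            model_path: model_path.into(),
            slot: Mutex::new(None),
        }
    }

    /// Returns the cached model, loading it if this is the first call.
    /// A failed load leaves the slot empty, so a later call retries.
    pub fn get(&self) -> Result<Arc<Model>, String> {
        let mut slot = self
            .slot
            .lock()
            .map_err(|_| "model cache lock poisoned".to_string())?;

        if let Some(model) = slot.as_ref() {
            return Ok(Arc::clone(model));
        }

        let model = Arc::new(
            Model::new(&self.model_path)
                .map_err(|e| format!("failed to load model: {}", e))?,
        );
        *slot = Some(Arc::clone(&model));
        Ok(model)
    }
}
```

Holding the lock during the load serializes concurrent first requests, which avoids loading the model twice at the cost of blocking other callers until the load finishes. In an async handler the load would likely need to run on a blocking thread; that detail is left out of this sketch.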

"{:04}-{:02}-{:02}T{:02}:{:02}:{:02}.{:03}Z",
year, month, day, hours, minutes, seconds, millis
)
}

Custom calendar arithmetic reimplements standard library functionality

Low Severity

format_timestamp_now is a 55-line manual implementation of epoch-to-ISO-8601 conversion, including leap-year calculation and month-day table iteration. The chrono crate is already in the workspace dependency tree and would replace this entire function with a single expression like Utc::now().to_rfc3339_opts(SecondsFormat::Millis, true). Hand-rolled calendar math is hard to review and maintain.

.collect::<String>();

assert_eq!(restored, text);
}

Flaky test asserts wrong value on successful parse

Medium Severity

The second block of test_summary asserts restored == text where text includes raw <think>...</think> tags. But when the parser successfully parses a think block, parse_think_block strips the tags and trims the content. The restored string then omits the tags and surrounding whitespace, making the assertion fail. The test only passes by accident (~80% of runs) when random chunk sizes split the <think> tag, causing the parser to fail and preserve raw tags as TextDelta.

permissions:
contents: write
- runs-on: depot-ubuntu-24.04-8
+ runs-on: depot-ubuntu-24.04-arm-8

Linux builds switched from x86_64 to ARM

High Severity

The Linux runner changed from depot-ubuntu-24.04-8 (x86_64) to depot-ubuntu-24.04-arm-8 (ARM) in CD, and similarly in CI. Since the Linux build job has no explicit --target cross-compilation flag, it produces binaries matching the host architecture. This means release artifacts will now be ARM-only, breaking the app for all x86_64 Linux users.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

}
return true;
});
const models = configuredProviders?.[providerId]?.models ?? [];

Removed filter allows selecting non-downloaded models

Medium Severity

The previous code filtered out Quantized* models that weren't yet downloaded, preventing users from selecting unavailable models. That filter was entirely removed, so all models now appear in the dropdown regardless of download status. Users can select a Quantized model that hasn't been downloaded, which will fail at runtime when the server tries to load a nonexistent model file.
