fix(build): stabilize cross-platform tts packaging and ci #6
rishiskhare wants to merge 7 commits into main from
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings.
📝 Walkthrough

Adds a vendored Kokoro ONNX TTS engine and runtime bundling; moves selection-capture into a new module with accessibility/clipboard strategies and new settings/commands; threads a bundled ONNX Runtime path into TTS initialization; expands UI settings; and updates multiple GitHub Actions workflows and Windows NSIS packaging options.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant App as Tauri App
    participant Init as initialize_core_logic
    participant Resolver as resolve_bundled_onnxruntime
    participant TTS as TTSManager
    participant Loader as Model Loader Thread
    participant ORT as ONNX Runtime
    App->>Init: start/run(cli_args)
    Init->>Resolver: resolve_bundled_onnxruntime(app_handle)
    Resolver-->>Init: Option<PathBuf> (bundled onnx path)
    Init->>TTS: TTSManager::new(..., onnxruntime_path)
    TTS->>Loader: spawn model loader thread
    Loader->>ORT: ensure_onnxruntime_initialized(onnxruntime_path)
    ORT-->>Loader: init result (OnceLock)
    Loader->>TTS: load model / emit events or emit tts-error
```
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks: ✅ Passed checks (3 passed)
Actionable comments posted: 15
🧹 Nitpick comments (5)
src/components/settings/SelectionCaptureMethod.tsx (2)
26-49: Consider memoizing the options array.

Similar to SelectionClipboardHandling.tsx, the options array could be memoized since it depends on `isMacOS` and `t`.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/components/settings/SelectionCaptureMethod.tsx` around lines 26 - 49, The options array in SelectionCaptureMethod depends on isMacOS and t and should be memoized to avoid recreating it on each render; update the component (SelectionCaptureMethod) to wrap the options creation in a useMemo that lists [isMacOS, t] as dependencies so the array (and its objects with values "auto", "accessibility", "clipboard") is only recomputed when those inputs change.
22-24: Consider memoizing the macOS detection.

The `isMacOS` check runs on every render but the user agent doesn't change. Consider moving this outside the component or memoizing it.

♻️ Optional improvement

```tsx
// Move outside component or use useMemo
const isMacOS =
  typeof window !== "undefined" && /mac/i.test(window.navigator.userAgent || "");

// Or if you need it inside the component:
const isMacOS = useMemo(
  () => typeof window !== "undefined" && /mac/i.test(window.navigator.userAgent || ""),
  [],
);
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/components/settings/SelectionCaptureMethod.tsx` around lines 22 - 24, The macOS detection assigned to isMacOS in SelectionCaptureMethod.tsx is recomputed on every render; move the expression const isMacOS = typeof window !== "undefined" && /mac/i.test(window.navigator.userAgent || ""); outside the component (module scope) or wrap it in useMemo with an empty dependency array inside the component and import useMemo if needed so the userAgent check runs once and not on every render.

src/components/settings/SelectionClipboardHandling.tsx (1)
21-36: Consider memoizing the options array.

The `options` array is recreated on every render. While this is a minor performance concern, memoizing it with `useMemo` would be consistent with best practices for components wrapped in `React.memo`.

♻️ Optional improvement

```diff
+import React, { useMemo } from "react";
-import React from "react";
 // ...
-  const options = [
+  const options = useMemo(() => [
     {
       value: "dont_modify",
       label: t(
         "settings.advanced.clipboardHandling.options.dontModify",
         "Don't Modify Clipboard",
       ),
     },
     {
       value: "copy_to_clipboard",
       label: t(
         "settings.advanced.clipboardHandling.options.copyToClipboard",
         "Copy Selection To Clipboard",
       ),
     },
-  ];
+  ], [t]);
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src/components/settings/SelectionClipboardHandling.tsx` around lines 21 - 36, The options array in SelectionClipboardHandling is recreated on every render; wrap its creation with React's useMemo to memoize it (e.g., const options = useMemo(() => [ ... ], [t]) ) so it only recalculates when dependencies change; reference the options variable and the i18n translator function t and ensure SelectionClipboardHandling imports React/useMemo and uses the memoized options where currently used.

src-tauri/src/lib.rs (1)
105-110: Hardcoded ONNX Runtime version in candidate filenames.

The version `1.23.1` is hardcoded in the candidate library names. When updating the bundled ONNX Runtime, these will need to be updated manually. Consider extracting this as a constant or relying solely on the fallback scan.

♻️ Consider version-agnostic approach

The fallback scan (lines 120-146) already handles version mismatches, so the versioned candidates primarily serve as an optimization for the common case. This is a reasonable trade-off, but documenting the version coupling would help future maintainers.

```diff
+// NOTE: Update these version numbers when upgrading the bundled ONNX Runtime
 #[cfg(target_os = "macos")]
 let candidates = ["libonnxruntime.dylib", "libonnxruntime.1.23.1.dylib"];
```

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@src-tauri/src/lib.rs` around lines 105 - 110, The candidate library filenames in lib.rs are hardcoding ONNX Runtime version "1.23.1" (the candidates array used under #[cfg(target_os = "...")]), which forces manual updates when the bundled runtime changes; update this by introducing a single version constant (e.g., ONNX_RUNTIME_VERSION) and build the versioned candidate names from that constant (or remove the versioned entries and rely only on the existing fallback scan), then replace the current candidates arrays in the blocks referencing candidates so future updates require changing only the constant (or none at all if relying on the scan).

vendor/tts-rs/src/engines/kokoro/model.rs (1)
189-206: Consider reducing code duplication for speed tensor handling.

The `if/else` branches duplicate the entire input construction and session run, differing only in the speed tensor type.

♻️ Reduce duplication with a helper or macro
One option is to extract a helper that abstracts over the speed value type, or use a trait object. However, given the ORT API constraints and the relatively small duplication, the current approach is acceptable for clarity.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@vendor/tts-rs/src/engines/kokoro/model.rs` around lines 189 - 206, The two branches duplicate inputs construction and session.run; instead, create the speed tensor once and reuse it: match on self.speed_is_int32 to build a TensorRef for "speed" (e.g. create speed_arr as ndarray::arr1(&[speed as i32]) or ndarray::arr1(&[speed]) and wrap with TensorRef::from_array_view), store that TensorRef in a variable (e.g. speed_tensor), then build inputs using self.tokens_input_name, "style", and the speed_tensor and call self.session.run(inputs) once; reference symbols: self.speed_is_int32, TensorRef::from_array_view, tokens_arr, style_view, self.tokens_input_name, and self.session.run.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src-tauri/src/selection.rs`:
- Around line 90-106: The buffer sizing in unsafe fn cf_to_string is incorrect
because CFStringGetLength returns UTF-16 code units; replace the fixed len*4+1
calculation with a call to CFStringGetMaximumSizeForEncoding(len,
kCFStringEncodingUTF8) (or the equivalent constant in your bindings) plus one
for the NUL terminator to compute buf_size, allocate the Vec accordingly, call
CFStringGetCString as before, and keep CFRelease(ptr); this ensures cf_to_string
handles surrogate pairs and supplementary Unicode correctly.
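The unit mismatch behind this fix can be shown in plain Rust, without any Core Foundation code: a supplementary-plane character is one `char`, two UTF-16 code units, and four UTF-8 bytes, so a buffer sized from the wrong unit count can go wrong. This is only an illustration of the length semantics, not the vendored function.

```rust
// A supplementary-plane character (outside the Basic Multilingual Plane)
// occupies one char, two UTF-16 code units, and four UTF-8 bytes.
// CFStringGetLength counts UTF-16 code units, which is why the review
// recommends CFStringGetMaximumSizeForEncoding for UTF-8 buffer sizing.
fn main() {
    let s = "😀"; // U+1F600
    let utf16_units = s.encode_utf16().count();
    let utf8_bytes = s.len();
    let chars = s.chars().count();
    println!("utf16 units: {utf16_units}, utf8 bytes: {utf8_bytes}, chars: {chars}");
}
```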
In `@src-tauri/src/text_normalization.rs`:
- Around line 371-425: normalize_quote_spacing currently does repeated scans
over chars (prev_non_space/next_non_space and position/find), causing O(n²)
behavior; make it linear by precomputing nearest non-space character info
(indices or Option<char>) for every index before the main loop or by converting
the logic into a single-pass state machine. Specifically, in
normalize_quote_spacing compute arrays like nearest_non_space_left and
nearest_non_space_right (or equivalent index maps) once, then replace all uses
of chars[..].iter().rev().find(), chars[idx+1..].iter().position(), and
chars[..].iter().rev().copied().find() with O(1) lookups into those arrays so
the existing checks that call is_quote_char, is_opening_quote, and
should_trim_space_before_opening_quote remain unchanged but run in linear time.
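The linear-time precompute described above can be sketched in isolation. The function name `nearest_non_space` mirrors the names suggested in the prompt but is hypothetical; this sketch only skips ASCII spaces, while the real code may treat more characters as whitespace.

```rust
// Precompute, for each index, the nearest non-space character to the left and
// to the right in two O(n) passes, so later per-index lookups are O(1) instead
// of rescanning with iter().rev().find() / iter().position().
fn nearest_non_space(chars: &[char]) -> (Vec<Option<char>>, Vec<Option<char>>) {
    let n = chars.len();
    let mut left = vec![None; n];
    let mut right = vec![None; n];
    let mut last = None;
    for i in 0..n {
        left[i] = last;
        if chars[i] != ' ' {
            last = Some(chars[i]);
        }
    }
    last = None;
    for i in (0..n).rev() {
        right[i] = last;
        if chars[i] != ' ' {
            last = Some(chars[i]);
        }
    }
    (left, right)
}

fn main() {
    let chars: Vec<char> = "a \"b".chars().collect();
    let (left, right) = nearest_non_space(&chars);
    // At the quote (index 2), nearest non-space left is 'a', right is 'b'.
    println!("{:?} {:?}", left[2], right[2]);
}
```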
- Around line 427-442: The branch that inserts a missing space before opening
quotes only checks smart single quotes (matches!(ch, '‘' | '’'))—add smart
double quote and straight double quote to that match (e.g., include '“' and '"'
in the matches! set) so the same logic applies to opening double quotes; also
extend the replacement chain on out.replace(...) to include the
colon-to-opening-double-quote cases (e.g., add .replace(":“", ": “") and
optionally .replace(":\"", ": \"") ) so sequences like “:“quoted”” or : "quoted"
get the missing space inserted.
In `@vendor/tts-rs/.cargo_vcs_info.json`:
- Around line 1-6: The vendored JSON metadata (keys "git", "sha1",
"path_in_vcs") is failing Prettier; fix by either adding the expected formatting
(ensure the JSON ends with a newline and matches Prettier style) in this file or
update Prettier configuration to ignore vendored cargo metadata (add a rule in
.prettierignore to exclude files like vendor/**/.cargo_vcs_info.json or
vendor/**/.cargo_vcs_info*). Modify the file to include the trailing newline and
valid JSON formatting if choosing to edit the file, otherwise add the ignore
pattern to the project's Prettier ignore/config so CI stops checking these
generated vendor metadata files.
In `@vendor/tts-rs/Cargo.toml`:
- Around line 60-68: The ort dependency is pinned to "2.0.0-rc.11" in
Cargo.toml; update the version string for the [dependencies.ort] entry to
"2.0.0-rc.12" if you want the latest RC, or alternatively evaluate production
readiness and pin to a stable release (or a known-good 1.x) instead—ensure you
update the version value for the ort dependency and run cargo update + cargo
test to validate compatibility with features ("std","ndarray","load-dynamic")
and optional/default-features settings used in the current manifest.
In `@vendor/tts-rs/examples/kokoro.rs`:
- Around line 4-7: The example imports from the wrong crate name: replace the
non-existent "transcribe_rs" path with the vendored crate name "tts_rs" wherever
KokoroEngine, KokoroInferenceParams, KokoroModelParams and SynthesisEngine are
imported (check the use statement that references
KokoroEngine/KokoroInferenceParams/KokoroModelParams and SynthesisEngine) so the
example compiles against the actual library defined in Cargo.toml.
In `@vendor/tts-rs/README.md`:
- Line 1: The README.md in the tts-rs package is failing Prettier; run the
repository's formatter on that file (e.g., execute prettier --write README.md or
your project's formatting script) to normalize markdown (wrapping, spacing, line
endings) so it passes prettier --check ., then commit the formatted README.md to
unblock CI.
- Around line 15-18: The README shows the wrong dependency version and an
outdated synthesize call signature; update the toml dependency version from
2026.2.1 to 2026.2.3 and change the example call using engine.synthesize so it
passes a single Option<SynthesisParams> as the second argument (e.g., build a
SynthesisParams with the desired voice "af_heart" and pass Some(params) instead
of two separate args), referencing the synthesize function and the
SynthesisParams struct in the example.
In `@vendor/tts-rs/src/engines/kokoro/mod.rs`:
- Around line 52-62: The doc examples import the wrong crate name; update the
example imports to use the actual crate/lib name (tts_rs or tts-rs as used by
the crate) instead of transcribe_rs so doctests compile; specifically change the
use statements that reference transcribe_rs::{SynthesisEngine,
engines::kokoro::{KokoroEngine, KokoroInferenceParams}} to use tts_rs (matching
the crate's lib name) in both occurrences surrounding the KokoroEngine and
KokoroInferenceParams examples.
In `@vendor/tts-rs/src/engines/kokoro/phonemizer.rs`:
- Around line 30-44: The function voice_lang uses byte slicing
(&voice[..voice.len().min(2)]) which can panic on multi-byte UTF‑8 names; change
the prefix extraction in voice_lang to be character-safe (e.g., take the first
two chars using voice.chars().take(2).collect::<String>() or an equivalent
char-indexed slice) and then match on prefix.as_str() (or otherwise convert to
&str) instead of the current byte-slice; keep the rest of the match arms and
returned &'static str values unchanged.
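The char-safe prefix extraction can be sketched as follows. The `lang_for` mapping here is illustrative only, not the crate's actual voice table.

```rust
// Take the first two *characters*, not the first two *bytes*.
fn voice_prefix(voice: &str) -> String {
    voice.chars().take(2).collect()
}

// Illustrative mapping; the real match arms live in voice_lang.
fn lang_for(voice: &str) -> &'static str {
    match voice_prefix(voice).as_str() {
        "af" | "am" => "en-us",
        "bf" | "bm" => "en-gb",
        _ => "en-us",
    }
}

fn main() {
    // Byte slicing &voice[..2] would yield just "ß" here (a wrong prefix) and
    // would panic outright on names starting with a 3-byte char such as 'あ'.
    println!("{}", lang_for("ßf_voice"));
    println!("{}", lang_for("af_heart"));
}
```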
In `@vendor/tts-rs/src/engines/kokoro/vocab.rs`:
- Around line 22-31: The loop that constructs the vocab map currently truncates
multi-character keys by taking only the first char (in the for (k, v) in
vocab_obj block), which can overwrite entries; update the validation in that
loop to reject any key whose character count is not exactly one (e.g., check
k.chars().count() != 1) and return a KokoroError::Config with a clear message
like "Vocab key must be a single character" including the offending key, keeping
the existing checks for empty keys and non-integer values (the error should be
raised before inserting into map and before parsing v with as_i64), so
map.insert(ch, id) only runs for valid single-character keys.
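The single-character validation can be sketched like this, with a plain `String` standing in for `KokoroError::Config` and integer values already parsed:

```rust
use std::collections::HashMap;

// Only insert keys that are exactly one character; anything else errors out
// instead of silently truncating to the first char and overwriting entries.
fn build_vocab(entries: &[(&str, i64)]) -> Result<HashMap<char, i64>, String> {
    let mut map = HashMap::new();
    for (k, v) in entries {
        let mut it = k.chars();
        match (it.next(), it.next()) {
            // Exactly one character: safe to use as the map key.
            (Some(ch), None) => {
                map.insert(ch, *v);
            }
            // Empty or multi-character key: reject before inserting.
            _ => return Err(format!("Vocab key must be a single character: {k:?}")),
        }
    }
    Ok(map)
}

fn main() {
    assert!(build_vocab(&[("ab", 1)]).is_err());
    let map = build_vocab(&[("a", 1), ("b", 2)]).unwrap();
    println!("{} entries", map.len());
}
```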
In `@vendor/tts-rs/src/engines/kokoro/voices.rs`:
- Around line 34-43: The voice name extraction currently uses
raw_name.trim_end_matches('/') and trim_end_matches(".npy") which leaves
directory prefixes for nested entries; update the logic around
entry.name().to_string() and voice_name so you first get the path basename
(e.g., via Path::new(&raw_name).file_name()) and then strip the ".npy" extension
(or validate the extension) to derive the voice_name; keep the existing guard
that skips empty names or directory entries (raw_name ending with '/'), and
reference raw_name, voice_name, and entry.name().to_string() when making the
change.
- Around line 61-69: get_style may panic when a found voice has an empty styles
vector because clamping uses saturating_sub(1) then indexing; guard by checking
if styles.is_empty() after retrieving with self.voices.get(voice) and return an
Err instead of indexing: e.g., return
Err(KokoroError::NoStyles(voice.to_string())) or add a suitable KokoroError
variant (update the KokoroError enum and any callers), otherwise compute clamped
= idx.min(styles.len() - 1) and return styles[clamped].
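The guarded clamp can be sketched as a free function, with `String` standing in for the suggested `KokoroError` variant:

```rust
// Clamp a style index into range, but error on an empty styles vector instead
// of indexing: saturating_sub(1) on an empty slice still yields index 0,
// which panics.
fn get_style(styles: &[f32], idx: usize) -> Result<f32, String> {
    if styles.is_empty() {
        return Err("voice has no styles".to_string());
    }
    Ok(styles[idx.min(styles.len() - 1)])
}

fn main() {
    println!("{:?}", get_style(&[0.1, 0.2], 5)); // clamps to the last style
    println!("{:?}", get_style(&[], 0));         // errors instead of panicking
}
```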
- Around line 97-106: The parser reads header_len from bytes 8..9 assuming numpy
1.0 but never checks the version bytes at positions 6..7; update the logic
(around header_len, data_offset, and the KokoroError::VoiceParse check) to first
read the version bytes (data[6], data[7]) and branch: if major==1 read a 2-byte
little-endian u16 header_len from data[8..10], if major==2 read a 4-byte
little-endian u32 header_len from data[8..12] (cast to usize), then set
data_offset = 10 + header_len for v1 and data_offset = 12 + header_len for v2;
add bounds checks to ensure data has those version/header bytes before reading
and return a clear KokoroError::VoiceParse on unsupported/unknown major versions
or truncated header.
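The version-aware header parse can be sketched as follows (error type simplified to `String`). The offsets follow the .npy format: 6-byte magic, then major/minor version at bytes 6-7, then a 2-byte little-endian length for v1.x or a 4-byte one for v2.x.

```rust
// Compute the offset of the array data in a .npy buffer, honoring the
// format version instead of assuming a v1.0 2-byte header length.
fn npy_data_offset(data: &[u8]) -> Result<usize, String> {
    if data.len() < 10 {
        return Err("truncated .npy header".into());
    }
    let major = data[6];
    match major {
        1 => {
            let header_len = u16::from_le_bytes([data[8], data[9]]) as usize;
            Ok(10 + header_len)
        }
        2 => {
            if data.len() < 12 {
                return Err("truncated .npy v2 header".into());
            }
            let header_len =
                u32::from_le_bytes([data[8], data[9], data[10], data[11]]) as usize;
            Ok(12 + header_len)
        }
        v => Err(format!("unsupported .npy major version {v}")),
    }
}

fn main() {
    let mut header = vec![0u8; 16];
    header[6] = 1;   // version 1.0
    header[8] = 118; // header_len = 118, little-endian u16
    println!("{:?}", npy_data_offset(&header));
}
```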
In `@vendor/tts-rs/src/lib.rs`:
- Around line 62-66: The duration_secs method can divide by zero if sample_rate
== 0; update the duration_secs(&self) function to guard against a zero
sample_rate (e.g., if self.sample_rate == 0 { return 0.0 } or otherwise handle
gracefully) so that calculating self.samples.len() as f64 / self.sample_rate as
f64 never produces infinity/NaN; locate and modify the duration_secs method on
the struct that holds samples and sample_rate to implement this check and return
a sensible default when sample_rate is zero.
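The guard itself is a one-liner; a standalone sketch, with `AudioData` as a hypothetical stand-in for the actual struct holding `samples` and `sample_rate`:

```rust
// Guard against sample_rate == 0 before dividing, so duration_secs never
// returns infinity or NaN.
struct AudioData {
    samples: Vec<f32>,
    sample_rate: u32,
}

impl AudioData {
    fn duration_secs(&self) -> f64 {
        if self.sample_rate == 0 {
            return 0.0; // sensible default rather than a division by zero
        }
        self.samples.len() as f64 / self.sample_rate as f64
    }
}

fn main() {
    let a = AudioData { samples: vec![0.0; 24000], sample_rate: 24000 };
    println!("{} s", a.duration_secs());
    let broken = AudioData { samples: vec![0.0; 100], sample_rate: 0 };
    println!("{} s", broken.duration_secs());
}
```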
---
Nitpick comments:
In `@src-tauri/src/lib.rs`:
- Around line 105-110: The candidate library filenames in lib.rs are hardcoding
ONNX Runtime version "1.23.1" (the candidates array used under #[cfg(target_os =
"...")]), which forces manual updates when the bundled runtime changes; update
this by introducing a single version constant (e.g., ONNX_RUNTIME_VERSION) and
build the versioned candidate names from that constant (or remove the versioned
entries and rely only on the existing fallback scan), then replace the current
candidates arrays in the blocks referencing candidates so future updates require
changing only the constant (or none at all if relying on the scan).
In `@src/components/settings/SelectionCaptureMethod.tsx`:
- Around line 26-49: The options array in SelectionCaptureMethod depends on
isMacOS and t and should be memoized to avoid recreating it on each render;
update the component (SelectionCaptureMethod) to wrap the options creation in a
useMemo that lists [isMacOS, t] as dependencies so the array (and its objects
with values "auto", "accessibility", "clipboard") is only recomputed when those
inputs change.
- Around line 22-24: The macOS detection assigned to isMacOS in
SelectionCaptureMethod.tsx is recomputed on every render; move the expression
const isMacOS = typeof window !== "undefined" &&
/mac/i.test(window.navigator.userAgent || ""); outside the component (module
scope) or wrap it in useMemo with an empty dependency array inside the component
and import useMemo if needed so the userAgent check runs once and not on every
render.
In `@src/components/settings/SelectionClipboardHandling.tsx`:
- Around line 21-36: The options array in SelectionClipboardHandling is
recreated on every render; wrap its creation with React's useMemo to memoize it
(e.g., const options = useMemo(() => [ ... ], [t]) ) so it only recalculates
when dependencies change; reference the options variable and the i18n translator
function t and ensure SelectionClipboardHandling imports React/useMemo and uses
the memoized options where currently used.
In `@vendor/tts-rs/src/engines/kokoro/model.rs`:
- Around line 189-206: The two branches duplicate inputs construction and
session.run; instead, create the speed tensor once and reuse it: match on
self.speed_is_int32 to build a TensorRef for "speed" (e.g. create speed_arr as
ndarray::arr1(&[speed as i32]) or ndarray::arr1(&[speed]) and wrap with
TensorRef::from_array_view), store that TensorRef in a variable (e.g.
speed_tensor), then build inputs using self.tokens_input_name, "style", and the
speed_tensor and call self.session.run(inputs) once; reference symbols:
self.speed_is_int32, TensorRef::from_array_view, tokens_arr, style_view,
self.tokens_input_name, and self.session.run.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4c10f6a9-e3df-4dee-98a4-9678c91d222b
⛔ Files ignored due to path filters (6)
- src-tauri/Cargo.lock is excluded by `!**/*.lock`
- vendor/tts-rs/output.wav is excluded by `!**/*.wav`
- vendor/tts-rs/samples/dots.wav is excluded by `!**/*.wav`
- vendor/tts-rs/samples/jfk.wav is excluded by `!**/*.wav`
- vendor/tts-rs/samples/product_names.wav is excluded by `!**/*.wav`
- vendor/tts-rs/samples/russian.wav is excluded by `!**/*.wav`
📒 Files selected for processing (41)
- .github/workflows/build-test.yml
- .github/workflows/build.yml
- .github/workflows/lint.yml
- .github/workflows/playwright.yml
- .github/workflows/pr-test-build.yml
- .github/workflows/prettier.yml
- .github/workflows/release.yml
- .github/workflows/test.yml
- docs/windows-tts-process-test-plan.md
- src-tauri/Cargo.toml
- src-tauri/src/actions.rs
- src-tauri/src/lib.rs
- src-tauri/src/managers/tts.rs
- src-tauri/src/selection.rs
- src-tauri/src/settings.rs
- src-tauri/src/shortcut/mod.rs
- src-tauri/src/text_normalization.rs
- src-tauri/tauri.conf.json
- src/bindings.ts
- src/components/settings/ModelUnloadTimeout.tsx
- src/components/settings/SelectionCaptureMethod.tsx
- src/components/settings/SelectionClipboardHandling.tsx
- src/components/settings/advanced/AdvancedSettings.tsx
- src/stores/settingsStore.ts
- vendor/tts-rs/.cargo-ok
- vendor/tts-rs/.cargo/config.toml
- vendor/tts-rs/.cargo_vcs_info.json
- vendor/tts-rs/.gitignore
- vendor/tts-rs/Cargo.toml
- vendor/tts-rs/Cargo.toml.orig
- vendor/tts-rs/LICENSE
- vendor/tts-rs/README.md
- vendor/tts-rs/examples/kokoro.rs
- vendor/tts-rs/src/engines/kokoro/engine.rs
- vendor/tts-rs/src/engines/kokoro/mod.rs
- vendor/tts-rs/src/engines/kokoro/model.rs
- vendor/tts-rs/src/engines/kokoro/phonemizer.rs
- vendor/tts-rs/src/engines/kokoro/vocab.rs
- vendor/tts-rs/src/engines/kokoro/voices.rs
- vendor/tts-rs/src/engines/mod.rs
- vendor/tts-rs/src/lib.rs
@coderabbitai autofix
Note: Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it.

Fixes Applied Successfully

Fixed 13 file(s) based on 15 unresolved review comments.

Files modified:
Commit: The changes have been pushed to the
Time taken:
Fixed 13 file(s) based on 15 unresolved review comments. Co-authored-by: CodeRabbit <noreply@coderabbit.ai>
Actionable comments posted: 5
♻️ Duplicate comments (1)
vendor/tts-rs/src/engines/kokoro/phonemizer.rs (1)
208-210: ⚠️ Potential issue | 🟠 Major

Preserve the existing `LD_LIBRARY_PATH` when adding the bundled directory.

This replaces the parent process's library search path outright. If the app was already launched with required entries in `LD_LIBRARY_PATH`, the bundled espeak invocation can fail even though those paths were valid before. Prepend/append `bin_dir` instead of overwriting it.

💡 Suggested fix

```diff
 #[cfg(target_os = "linux")]
 if let Some(bin_dir) = espeak.bin_path.as_deref().and_then(|p| p.parent()) {
-    cmd.env("LD_LIBRARY_PATH", bin_dir);
+    let library_path = if let Some(existing) = std::env::var_os("LD_LIBRARY_PATH") {
+        let mut paths = vec![bin_dir.as_os_str().to_owned()];
+        paths.extend(std::env::split_paths(&existing).map(|p| p.into_os_string()));
+        std::env::join_paths(paths).map_err(|e| {
+            KokoroError::Config(format!("Invalid LD_LIBRARY_PATH: {e}"))
+        })?
+    } else {
+        bin_dir.as_os_str().to_owned()
+    };
+    cmd.env("LD_LIBRARY_PATH", library_path);
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@vendor/tts-rs/src/engines/kokoro/phonemizer.rs` around lines 208 - 210, The current code overwrites LD_LIBRARY_PATH with only the espeak parent dir; change it to preserve the existing value by reading the parent process env (std::env::var_os("LD_LIBRARY_PATH")), and set cmd.env("LD_LIBRARY_PATH", combined) where combined is bin_dir prepended (or appended) to the existing value separated by ':' (use bin_dir.as_os_str() and join with the existing OsString safely); keep the same conditional block around espeak.bin_path and use cmd.env rather than env_remove so prior library paths are retained.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src-tauri/src/selection.rs`:
- Line 139: The code currently treats any failure of clipboard.read_text() as
None (previous_clipboard) and later clears the clipboard when previous_clipboard
is None, which can destructively erase unreadable/non-text clipboard contents
(especially in DontModify mode); change the logic so previous_clipboard is only
None when there truly was no clipboard content and NOT simply when read_text()
failed—capture read_text() success into Some(text) and on failure preserve a
sentinel meaning "unreadable" (or use a tri-state) and when restoring only call
clipboard.set_text(...) if previous_clipboard is Some(text); ensure the
DontModify/path that currently clears on None instead skips any write when the
previous content was unreadable or read failed, updating the variables/branches
around previous_clipboard and the restore/clear code to only clear or set_text
when you have a concrete String from clipboard.read_text().
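The tri-state described above can be sketched independently of Tauri's clipboard API. Here `capture` takes an already-obtained `Result`, with `()` standing in for the real error type, and `restore_action` models the restore decision; names are illustrative.

```rust
// A tri-state for prior clipboard contents so a failed read (e.g. an image on
// the clipboard) is never conflated with an empty clipboard.
#[derive(Debug, PartialEq)]
enum ClipboardState {
    Text(String),
    Empty,
    Unreadable,
}

fn capture(read_result: Result<String, ()>) -> ClipboardState {
    match read_result {
        Ok(text) if text.is_empty() => ClipboardState::Empty,
        Ok(text) => ClipboardState::Text(text),
        Err(_) => ClipboardState::Unreadable,
    }
}

// Only hand back concrete text to restore; Unreadable must be a no-op so
// non-text contents are never clobbered. (Real code would clear on Empty.)
fn restore_action(state: &ClipboardState) -> Option<&str> {
    match state {
        ClipboardState::Text(t) => Some(t.as_str()),
        ClipboardState::Empty | ClipboardState::Unreadable => None,
    }
}

fn main() {
    let prev = capture(Err(()));
    println!("restore? {:?}", restore_action(&prev));
}
```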
- Around line 154-165: The code uses fallible operations
(app.try_state::<EnigoState>()? and .lock().ok()? ) after priming the clipboard,
so those early returns can skip restore_clipboard; change those ?/ok() usages to
explicit matches that on Err/None call restore_clipboard(&clipboard,
previous_clipboard.as_ref()) and then return None. Specifically, in the block
using EnigoState and send_copy_ctrl_c, replace app.try_state::<EnigoState>()?
and let mut enigo = enigo_state.0.lock().ok()? with match/if let forms that
restore the clipboard before returning, keeping the existing send_copy_ctrl_c
error path intact.
In `@src-tauri/src/text_normalization.rs`:
- Around line 510-512: The match arm in normalize_quote_spacing() currently
treats '"' and '”' as requiring a separator after an alphanumeric left (pattern
(_, Some(left), Some(right @ ('"' | '”' | '’'))) ...), which forces an extra
space before closing quotes; update that arm to only match the curly apostrophe
'’' (e.g. (_, Some(left), Some(right @ ('’'))) if left.is_alphanumeric() => { /*
same logic */ }) so that '”' and '"' are no longer treated as needing a
separator and closing quotes won't get an inserted space.
- Around line 464-469: The global .replace call on out that does
.replace("'and", "' and") is too broad and corrupts tokens like 'and' or words
like android; update the post-pass so you only insert a space when 'and' is a
standalone quoted token or token boundary. Replace that chained .replace("'and",
"' and") with a contextual check over out (either a small manual scan that
checks surrounding chars or a targeted regex) that only matches "'and" when the
following character is a word-boundary or a closing quote/whitespace/punctuation
(e.g., match patterns like "'and'" or "'and " or start-of-string cases) and
leave occurrences inside larger words (like "android") untouched; modify the
code operating on the out variable where the replace chain appears to use that
contextual replacement.
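One way to make the replacement contextual can be sketched as a standalone pass. The function name is hypothetical, and "boundary" here means any non-alphanumeric follower or end of string, per the patterns listed above.

```rust
// Replace "'and" with "' and" only when "and" ends at a token boundary,
// leaving occurrences inside larger words (like "'android") untouched.
fn fix_quote_and(out: &str) -> String {
    let chars: Vec<char> = out.chars().collect();
    let mut result = String::new();
    let mut i = 0;
    while i < chars.len() {
        if chars[i] == '\''
            && chars.get(i + 1) == Some(&'a')
            && chars.get(i + 2) == Some(&'n')
            && chars.get(i + 3) == Some(&'d')
            // Boundary check: end of string, whitespace, quote, or punctuation.
            && chars.get(i + 4).map_or(true, |c| !c.is_alphanumeric())
        {
            result.push_str("' and");
            i += 4;
        } else {
            result.push(chars[i]);
            i += 1;
        }
    }
    result
}

fn main() {
    println!("{}", fix_quote_and("she said 'hello'and left"));
    println!("{}", fix_quote_and("an 'android' phone"));
}
```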
In `@vendor/tts-rs/src/engines/kokoro/phonemizer.rs`:
- Around line 112-118: The loop over text.char_indices() treats '\r' and '\n'
separately causing two boundary tokens for CRLF; modify the block around
map_boundary_punctuation / is_numeric_connector_between_digits so that when ch
== '\r' and the next character is '\n' you treat them as one boundary:
consume/skip the '\r' and handle only the combined CRLF as a single punctuation
token (ensure you account for combined length when calling
is_numeric_connector_between_digits and when calling flush_text_part/parts.push
so only one TextPart::Punct is produced). Make this change inside the same for
(idx, ch) in text.char_indices() loop using the existing helpers
(map_boundary_punctuation, is_numeric_connector_between_digits, flush_text_part,
parts, current).
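The CRLF folding can be sketched with a peekable iterator. This version just counts boundaries rather than building `TextPart` tokens, but the consume-the-`\n` step is the same.

```rust
// Count newline boundaries, treating a "\r\n" pair as a single boundary
// instead of two.
fn newline_boundaries(text: &str) -> usize {
    let mut count = 0;
    let mut chars = text.chars().peekable();
    while let Some(ch) = chars.next() {
        if ch == '\r' {
            // Fold the '\n' of a CRLF pair into the same boundary.
            if chars.peek() == Some(&'\n') {
                chars.next();
            }
            count += 1;
        } else if ch == '\n' {
            count += 1;
        }
    }
    count
}

fn main() {
    println!("{}", newline_boundaries("line one\r\nline two\r\n"));
}
```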
---
Duplicate comments:
In `@vendor/tts-rs/src/engines/kokoro/phonemizer.rs`:
- Around line 208-210: The current code overwrites LD_LIBRARY_PATH with only the
espeak parent dir; change it to preserve the existing value by reading the
parent process env (std::env::var_os("LD_LIBRARY_PATH")), and set
cmd.env("LD_LIBRARY_PATH", combined) where combined is bin_dir prepended (or
appended) to the existing value separated by ':' (use bin_dir.as_os_str() and
join with the existing OsString safely); keep the same conditional block around
espeak.bin_path and use cmd.env rather than env_remove so prior library paths
are retained.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 84504253-c954-43fb-94e4-0b039134beae
⛔ Files ignored due to path filters (1)
- src-tauri/Cargo.lock is excluded by `!**/*.lock`
📒 Files selected for processing (12)
- .prettierignore
- src-tauri/Cargo.toml
- src-tauri/src/selection.rs
- src-tauri/src/text_normalization.rs
- vendor/tts-rs/Cargo.toml
- vendor/tts-rs/README.md
- vendor/tts-rs/examples/kokoro.rs
- vendor/tts-rs/src/engines/kokoro/mod.rs
- vendor/tts-rs/src/engines/kokoro/phonemizer.rs
- vendor/tts-rs/src/engines/kokoro/vocab.rs
- vendor/tts-rs/src/engines/kokoro/voices.rs
- vendor/tts-rs/src/lib.rs
✅ Files skipped from review due to trivial changes (5)
- .prettierignore
- src-tauri/Cargo.toml
- vendor/tts-rs/examples/kokoro.rs
- vendor/tts-rs/Cargo.toml
- vendor/tts-rs/README.md
🚧 Files skipped from review as they are similar to previous changes (2)
- vendor/tts-rs/src/engines/kokoro/mod.rs
- vendor/tts-rs/src/lib.rs
@CodeRabbit autofix the rest of the issues!
Note: Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it.

Fixes Applied Successfully

Fixed 3 file(s) based on 5 unresolved review comments.

Files modified:
Commit: The changes have been pushed to the
Time taken:
Note: Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it. The branch was updated while autofix was in progress. Please try again.
Fixed 3 file(s) based on 5 unresolved review comments. Co-authored-by: CodeRabbit <noreply@coderabbit.ai>
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src-tauri/src/text_normalization.rs (1)
212-243: ⚠️ Potential issue | 🟠 Major

Track trailing-fragment whitespace instead of guessing before `’`.

`had_leading_whitespace` only preserves separators when the incoming fragment starts with whitespace. When the separator was trimmed off the previous fragment, both join sites fall back to `needs_space_between()`, and the `Some(right @ '’')` branch then turns joins like `don` + `’t` or `user` + `’s` into `don ’t` / `user ’s`. Please carry a trailing-whitespace bit from the previous raw fragment (or per-buffer state) and use that to decide whether a separator existed, rather than inferring it from the apostrophe shape.

Suggested direction

- capture `had_trailing_whitespace` before calling `normalize_inline_whitespace()`
- store that bit alongside `SpeechTextRenderer.output`, `LinkContext.text`, and `ImageContext.alt_text`
- consult that state before `needs_space_between(...)`
- drop the unconditional "incoming `’` after alnum means insert a space" heuristic
- add a regression case for a split fragment such as `don` + `’t`

Also applies to: 324-342, 527-543
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src-tauri/src/text_normalization.rs` around lines 212 - 243, The push_text logic uses had_leading_whitespace only and then falls back to needs_space_between which mis-detects joins before right single-quote (’); capture a had_trailing_whitespace boolean from the incoming raw fragment before calling normalize_inline_whitespace() and store that trailing-whitespace bit alongside the buffer state for SpeechTextRenderer.output, LinkContext.text, and ImageContext.alt_text (i.e. where append_segment appends); when deciding to insert a separator consult the previous buffer's had_trailing_whitespace rather than inferring from the upcoming apostrophe, remove the special-case “incoming ’ after alnum => insert space” heuristic in needs_space_between usage, and add a regression test covering a split fragment like "don" + "’t" to ensure no spurious space is inserted.
♻️ Duplicate comments (1)
src-tauri/src/selection.rs (1)
149-155: ⚠️ Potential issue | 🟠 Major

`DontModify` still overwrites unreadable clipboard contents.

Line 149 collapses every `read_text()` failure into `ClipboardState::Unreadable`, but Line 166 still primes the clipboard. Because `restore_clipboard()` intentionally no-ops for `Unreadable`, the default `DontModify` path leaves image/binary clipboard contents replaced by the sentinel or copied text. `ClipboardState::Empty` is also never constructed, so the clear branch is currently dead.

🛠️ Suggested fix

```diff
 let previous_clipboard = match clipboard.read_text() {
-    Ok(text) => ClipboardState::Text(text),
-    Err(_) => {
-        // Try to determine if clipboard has content but is unreadable (e.g., image)
-        // vs. truly empty. Since we can't reliably distinguish, assume unreadable.
-        ClipboardState::Unreadable
-    }
+    Ok(text) if text.is_empty() => ClipboardState::Empty,
+    Ok(text) => ClipboardState::Text(text),
+    Err(err) => {
+        debug!("Unable to read existing clipboard text: {}", err);
+        if handling == ClipboardHandling::DontModify {
+            warn!("Clipboard is unreadable; skipping capture in DontModify mode");
+            return None;
+        }
+        ClipboardState::Unreadable
+    }
 };
```

Also applies to: 166-213, 222-237
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src-tauri/src/selection.rs` around lines 149 - 155, The code collapses all read_text() failures into ClipboardState::Unreadable and never produces ClipboardState::Empty, causing DontModify paths to prime/overwrite unreadable clipboards; change the read_text handling in the block that sets previous_clipboard so that: when read_text() returns Ok(text) and text.is_empty() set previous_clipboard = ClipboardState::Empty, when Ok(non-empty) set ClipboardState::Text(text), and when Err keep ClipboardState::Unreadable; then update the logic that primes/sets the clipboard (the code that runs before calling restore_clipboard() and any DontModify handling) to only write the sentinel or copied text when previous_clipboard == ClipboardState::Text or previous_clipboard == ClipboardState::Empty (i.e., never prime when previous_clipboard == ClipboardState::Unreadable). Ensure restore_clipboard() behavior stays unchanged for Unreadable.
🧹 Nitpick comments (2)
src-tauri/src/selection.rs (1)
197-204: Poll for clipboard change instead of sleeping once.

A single 120 ms sleep makes this path timing-dependent. On slower apps or remote sessions the clipboard can update after the read, which turns a successful copy into a spurious `None`.

♻️ Suggested approach

```diff
-    std::thread::sleep(std::time::Duration::from_millis(CLIPBOARD_COPY_DELAY_MS));
-
-    let copied_text = clipboard.read_text().ok();
+    let deadline =
+        std::time::Instant::now() + std::time::Duration::from_millis(CLIPBOARD_COPY_DELAY_MS);
+    let copied_text = loop {
+        let text = clipboard.read_text().ok();
+        if text.as_deref() != Some(sentinel.as_str()) || std::time::Instant::now() >= deadline {
+            break text;
+        }
+        std::thread::sleep(std::time::Duration::from_millis(15));
+    };
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src-tauri/src/selection.rs` around lines 197 - 204, Replace the single sleep-based wait with a short polling loop that repeatedly calls clipboard.read_text() and compares the result to the sentinel until it changes or a timeout elapses (use CLIPBOARD_COPY_DELAY_MS as the total timeout and a small per-iteration interval); update the logic around clipboard.read_text(), copied_text, and captured so the code returns the first non-empty, trimmed, non-sentinel value when observed or None after timeout, ensuring you preserve the sentinel comparison and string trimming/map-to-owned behavior.

vendor/tts-rs/src/engines/kokoro/phonemizer.rs (1)
316-355: Add a CRLF regression test.

The implementation now collapses `\r\n` into a single `TextPart::Punct('.')`, but the split tests still only cover plain punctuation and numeric separators. A small `Hello\r\nworld` assertion would keep that Windows-specific fix from regressing.

🧪 Suggested test

```diff
 #[test]
 fn still_splits_comma_when_not_between_digits() {
     let parts = split_text_parts("Value 2, next");
     assert_eq!(
         parts,
         vec![
             TextPart::Text("Value 2".to_string()),
             TextPart::Punct(','),
             TextPart::Text("next".to_string()),
         ]
     );
 }
+
+#[test]
+fn collapses_crlf_into_single_boundary() {
+    let parts = split_text_parts("Hello\r\nworld");
+    assert_eq!(
+        parts,
+        vec![
+            TextPart::Text("Hello".to_string()),
+            TextPart::Punct('.'),
+            TextPart::Text("world".to_string()),
+        ]
+    );
+}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@vendor/tts-rs/src/engines/kokoro/phonemizer.rs` around lines 316 - 355, Add a new unit test that ensures split_text_parts treats Windows CRLF ("\r\n") as the single punctuation produced by the current implementation: call split_text_parts("Hello\r\nworld") and assert the result equals vec![TextPart::Text("Hello".to_string()), TextPart::Punct('.'), TextPart::Text("world".to_string())]; name the test something like splits_crlf_into_single_punct and place it alongside the other tests for split_text_parts.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src-tauri/src/text_normalization.rs`:
- Around line 503-516: The issue: straight apostrophes produced by
decode_html_entity("apos") are not recognized as quotes, so update the quote
helpers to treat '\'' as a quote: add '\'' to the match arms in is_opening_quote
(treat it like '‘'/'’' handling) and include '\'' in is_quote_char's matches so
the quote-tightening pass recognizes ' as a quote; alternatively, normalize
' to ’ at decode time — pick one consistent approach and update the
corresponding code paths (is_opening_quote and is_quote_char or the HTML decode)
so straight single quotes are handled.
In `@vendor/tts-rs/src/engines/kokoro/phonemizer.rs`:
- Around line 237-260: The espeak-ng child process in phonemizer.rs (spawned
into variable child) is currently unbounded and can block in write_all() and
wait_with_output(); update phonemizer.rs to perform the write + wait under a
hard timeout (use std::time::Duration and a timeout mechanism such as
std::thread + join with timeout or a wait_timeout helper), ensure stdin (after
canonicalize_espeak_stdin_payload) is flushed/closed before waiting, and if the
timeout expires kill the child (child.kill()/child.wait()) and return a new
KokoroError variant (e.g., KokoroError::Timeout) instead of hanging; wire this
into callers like phonemize() so a wedged espeak-ng fails fast.
- Around line 226-229: The current block that sets LD_LIBRARY_PATH via
cmd.env("LD_LIBRARY_PATH", bin_dir) overwrites any inherited value; change it to
prepend bin_dir to the existing LD_LIBRARY_PATH instead: read the current
LD_LIBRARY_PATH from std::env::var_os (or Command's inherited env), build a new
value that is "bin_dir:<existing>" when an existing value is present (or just
"bin_dir" when not), and pass that combined string to cmd.env. Update the code
around espeak.bin_path.as_deref().and_then(|p| p.parent()) and the cmd.env call
so the bundled directory is prepended rather than clobbering the parent's
LD_LIBRARY_PATH.
---
Outside diff comments:
In `@src-tauri/src/text_normalization.rs`:
- Around line 212-243: The push_text logic uses had_leading_whitespace only and
then falls back to needs_space_between which mis-detects joins before right
single-quote (’); capture a had_trailing_whitespace boolean from the incoming
raw fragment before calling normalize_inline_whitespace() and store that
trailing-whitespace bit alongside the buffer state for
SpeechTextRenderer.output, LinkContext.text, and ImageContext.alt_text (i.e.
where append_segment appends); when deciding to insert a separator consult the
previous buffer's had_trailing_whitespace rather than inferring from the
upcoming apostrophe, remove the special-case “incoming ’ after alnum => insert
space” heuristic in needs_space_between usage, and add a regression test
covering a split fragment like "don" + "’t" to ensure no spurious space is
inserted.
---
Duplicate comments:
In `@src-tauri/src/selection.rs`:
- Around line 149-155: The code collapses all read_text() failures into
ClipboardState::Unreadable and never produces ClipboardState::Empty, causing
DontModify paths to prime/overwrite unreadable clipboards; change the read_text
handling in the block that sets previous_clipboard so that: when read_text()
returns Ok(text) and text.is_empty() set previous_clipboard =
ClipboardState::Empty, when Ok(non-empty) set ClipboardState::Text(text), and
when Err keep ClipboardState::Unreadable; then update the logic that primes/sets
the clipboard (the code that runs before calling restore_clipboard() and any
DontModify handling) to only write the sentinel or copied text when
previous_clipboard == ClipboardState::Text or previous_clipboard ==
ClipboardState::Empty (i.e., never prime when previous_clipboard ==
ClipboardState::Unreadable). Ensure restore_clipboard() behavior stays unchanged
for Unreadable.
---
Nitpick comments:
In `@src-tauri/src/selection.rs`:
- Around line 197-204: Replace the single sleep-based wait with a short polling
loop that repeatedly calls clipboard.read_text() and compares the result to the
sentinel until it changes or a timeout elapses (use CLIPBOARD_COPY_DELAY_MS as
the total timeout and a small per-iteration interval); update the logic around
clipboard.read_text(), copied_text, and captured so the code returns the first
non-empty, trimmed, non-sentinel value when observed or None after timeout,
ensuring you preserve the sentinel comparison and string trimming/map-to-owned
behavior.
In `@vendor/tts-rs/src/engines/kokoro/phonemizer.rs`:
- Around line 316-355: Add a new unit test that ensures split_text_parts treats
Windows CRLF ("\r\n") as the single punctuation produced by the current
implementation: call split_text_parts("Hello\r\nworld") and assert the result
equals vec![TextPart::Text("Hello".to_string()), TextPart::Punct('.'),
TextPart::Text("world".to_string())]; name the test something like
splits_crlf_into_single_punct and place it alongside the other tests for
split_text_parts.
🪄 Autofix (Beta)
✅ Autofix completed
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 36c74298-c5e2-425d-85ab-dc54716530c8
📒 Files selected for processing (3)
- src-tauri/src/selection.rs
- src-tauri/src/text_normalization.rs
- vendor/tts-rs/src/engines/kokoro/phonemizer.rs
|
@CodeRabbit fix the major issues |
|
Here are the 4 remaining unresolved major issues I'll fix:
Remaining Major Issues
Failed to handle agent chat message. Please try again. |
|
Note: Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it.

Fixes Applied Successfully: fixed 3 file(s) based on 3 unresolved review comments.

Files modified:
Commit: The changes have been pushed to the Time taken: |
Fixed 3 file(s) based on 3 unresolved review comments. Co-authored-by: CodeRabbit <noreply@coderabbit.ai>
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
vendor/tts-rs/src/engines/kokoro/phonemizer.rs (1)
259-320: ⚠️ Potential issue | 🔴 Critical

The subprocess timeout can still hang instead of failing fast.

The 30s deadline only starts after Line 265 returns, so a wedged child can still block indefinitely in `write_all()`. Then, if `recv_timeout()` fires, the timeout branch on Line 315 tries to lock the same mutex that the waiter thread is holding across `wait_with_output()`, so the kill path can deadlock and never return `KokoroError::Timeout(_)`.

Please restructure this so the timeout covers the full write+wait lifecycle and the thread responsible for timing out still owns the `Child` handle it needs to kill.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@vendor/tts-rs/src/engines/kokoro/phonemizer.rs` around lines 259 - 320, The current timeout doesn't cover the blocking write and deadlocks because the waiter thread holds the Child; fix by moving the entire write+wait lifecycle into the spawned worker so the main thread can enforce the timeout and kill by PID: change wait_with_timeout (and its call site where canonicalize_espeak_stdin_payload is used) so it captures child.id() first, then spawn a thread that takes ownership of the Child and the stdin payload, performs write_all(stdin_payload) and then wait_with_output() and sends the Result over a channel; in the main thread use recv_timeout(timeout) to wait, and on timeout invoke a platform-appropriate kill of the child by the saved child_id (then return KokoroError::Timeout), ensuring no mutex is held across blocking waits and the timeout covers both write and wait.
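As a rough sketch of that restructuring in plain std: the worker thread takes ownership of the `Child` for the whole write+wait lifecycle, while the caller keeps only the PID so the timeout path can still kill the process without touching any shared lock. The `run_with_timeout` helper is hypothetical (not the vendored crate's API), and the kill fallback assumes a Unix environment:

```rust
use std::io::Write;
use std::process::{Command, Output, Stdio};
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Sketch: the worker owns the Child, so the caller's timeout covers both
// the blocking write_all() and wait_with_output().
fn run_with_timeout(mut cmd: Command, payload: Vec<u8>, timeout: Duration) -> Result<Output, String> {
    let mut child = cmd
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .spawn()
        .map_err(|e| e.to_string())?;
    // Capture the PID before moving the Child into the worker.
    let pid = child.id();

    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        let result = (|| {
            let mut stdin = child.stdin.take().ok_or_else(|| "no stdin".to_string())?;
            stdin.write_all(&payload).map_err(|e| e.to_string())?;
            drop(stdin); // close stdin so the child sees EOF
            child.wait_with_output().map_err(|e| e.to_string())
        })();
        let _ = tx.send(result);
    });

    match rx.recv_timeout(timeout) {
        Ok(result) => result,
        Err(_) => {
            // No mutex is held here; kill by PID since the worker owns the Child.
            #[cfg(unix)]
            let _ = Command::new("kill").arg("-9").arg(pid.to_string()).status();
            Err(format!("child {pid} timed out"))
        }
    }
}

fn main() {
    // `cat` echoes stdin back, so a healthy child returns within the deadline.
    let out = run_with_timeout(Command::new("cat"), b"hello".to_vec(), Duration::from_secs(5)).unwrap();
    println!("{}", String::from_utf8_lossy(&out.stdout));
}
```

Because nothing blocks on the caller side except `recv_timeout()`, a wedged child fails fast with an error instead of deadlocking the kill path.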
🧹 Nitpick comments (1)
vendor/tts-rs/src/engines/kokoro/model.rs (1)
235-241: Make the fallback `.onnx` selection deterministic.

`std::fs::read_dir()` does not guarantee ordering. If `kokoro-quant-convinteger.onnx` is absent and the directory contains multiple `.onnx` files, different filesystems can pick different models on different platforms. Sorting the candidates before choosing the fallback would make packaging behavior stable.

💡 Example refactor

```diff
-    // Scan for any .onnx file
-    for entry in std::fs::read_dir(model_dir)? {
-        let entry = entry?;
-        let path = entry.path();
-        if path.extension().and_then(|e| e.to_str()) == Some("onnx") {
-            log::info!("Using ONNX file: {}", path.display());
-            return Ok(path);
-        }
-    }
+    let mut candidates = std::fs::read_dir(model_dir)?
+        .filter_map(|entry| entry.ok().map(|e| e.path()))
+        .filter(|path| path.extension().and_then(|e| e.to_str()) == Some("onnx"))
+        .collect::<Vec<_>>();
+    candidates.sort();
+
+    if let Some(path) = candidates.into_iter().next() {
+        log::info!("Using ONNX file: {}", path.display());
+        return Ok(path);
+    }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@vendor/tts-rs/src/engines/kokoro/model.rs` around lines 235 - 241, The current loop using std::fs::read_dir(model_dir) in vendor/tts-rs/src/engines/kokoro/model.rs returns the first found .onnx file non-deterministically; change the logic in that scanning block (the code that iterates entries and returns a path) to collect all entries whose extension is "onnx" into a Vec<PathBuf>, sort them deterministically (e.g., by file_name() or display() string), then choose the first sorted candidate (after the existing explicit check for "kokoro-quant-convinteger.onnx" is still applied) and return it, so the fallback selection is stable across filesystems.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@vendor/tts-rs/src/engines/kokoro/model.rs`:
- Around line 337-358: split_chunks currently hardcodes punctuation IDs
([1,2,3,4,5,6]) which breaks on vocab layouts loaded from config; change the API
so the caller passes the punctuation ID set (derived from self.vocab) into
split_chunks (or add a new split_chunks_with_punct(ids: &[i64], punct_ids:
&[i64]) and use that) and update calls where split_chunks is used to compute
punct_ids from self.vocab (e.g., map characters like ';',':',',','.','!','?' to
their IDs) before invoking the chunker, keeping MAX_PHONEME_LEN behavior
unchanged.
---
Duplicate comments:
In `@vendor/tts-rs/src/engines/kokoro/phonemizer.rs`:
- Around line 259-320: The current timeout doesn't cover the blocking write and
deadlocks because the waiter thread holds the Child; fix by moving the entire
write+wait lifecycle into the spawned worker so the main thread can enforce the
timeout and kill by PID: change wait_with_timeout (and its call site where
canonicalize_espeak_stdin_payload is used) so it captures child.id() first, then
spawn a thread that takes ownership of the Child and the stdin payload, performs
write_all(stdin_payload) and then wait_with_output() and sends the Result over a
channel; in the main thread use recv_timeout(timeout) to wait, and on timeout
invoke a platform-appropriate kill of the child by the saved child_id (then
return KokoroError::Timeout), ensuring no mutex is held across blocking waits
and the timeout covers both write and wait.
---
Nitpick comments:
In `@vendor/tts-rs/src/engines/kokoro/model.rs`:
- Around line 235-241: The current loop using std::fs::read_dir(model_dir) in
vendor/tts-rs/src/engines/kokoro/model.rs returns the first found .onnx file
non-deterministically; change the logic in that scanning block (the code that
iterates entries and returns a path) to collect all entries whose extension is
"onnx" into a Vec<PathBuf>, sort them deterministically (e.g., by file_name() or
display() string), then choose the first sorted candidate (after the existing
explicit check for "kokoro-quant-convinteger.onnx" is still applied) and return
it, so the fallback selection is stable across filesystems.
🪄 Autofix (Beta)
✅ Autofix completed
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3fdc6627-3ce9-4eaa-87b5-4b54aab2d352
📒 Files selected for processing (3)
- src-tauri/src/text_normalization.rs
- vendor/tts-rs/src/engines/kokoro/model.rs
- vendor/tts-rs/src/engines/kokoro/phonemizer.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- src-tauri/src/text_normalization.rs
|
Note: Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it.

Fixes Applied Successfully: fixed 1 file(s) based on 1 unresolved review comment.

Files modified:
Commit: The changes have been pushed to the Time taken: |
Fixed 1 file(s) based on 1 unresolved review comment. Co-authored-by: CodeRabbit <noreply@coderabbit.ai>
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@vendor/tts-rs/src/engines/kokoro/model.rs`:
- Around line 241-247: The current loop over std::fs::read_dir(model_dir)
returns the first .onnx entry in nondeterministic filesystem order; instead
collect all paths whose extension == "onnx" into a Vec, sort them
deterministically (e.g., by file_name() or display string using
to_string_lossy()), and then return the first element of the sorted list; update
the code around the read_dir loop that uses entry.path() and the ONNX check to
perform collection + sort before returning to ensure deterministic model
selection.
- Around line 318-339: Change detect_tokens_input(session: &Session) and
detect_speed_type(session: &Session) to return Result<T, KokoroError> instead of
silently returning defaults: detect_tokens_input should return Result<String,
KokoroError> and return Err(KokoroError::MissingInput("input_ids/tokens")) if
neither "input_ids" nor "tokens" is present; detect_speed_type should return
Result<bool, KokoroError> and return Err(KokoroError::MissingInput("speed")) if
"speed" is absent (or Err with a clear message if dtype cannot be determined).
Update the callers in the model loader to use the ? operator to propagate errors
during model load so missing inputs fail fast. Ensure you reference the Session
input iteration (session.inputs()), input.name(), and input.dtype() when
implementing the checks and error construction.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4bf1ea05-ace5-4e06-b6b6-cb63f0e632f3
📒 Files selected for processing (1)
vendor/tts-rs/src/engines/kokoro/model.rs
```rust
for entry in std::fs::read_dir(model_dir)? {
    let entry = entry?;
    let path = entry.path();
    if path.extension().and_then(|e| e.to_str()) == Some("onnx") {
        log::info!("Using ONNX file: {}", path.display());
        return Ok(path);
    }
```
There was a problem hiding this comment.
Fallback ONNX selection is nondeterministic.
Line 241-Line 247 picks the first .onnx from read_dir, but directory iteration order is not stable across platforms/filesystems. This can load different models in CI/package environments.
💡 Suggested fix
```diff
-    // Scan for any .onnx file
-    for entry in std::fs::read_dir(model_dir)? {
-        let entry = entry?;
-        let path = entry.path();
-        if path.extension().and_then(|e| e.to_str()) == Some("onnx") {
-            log::info!("Using ONNX file: {}", path.display());
-            return Ok(path);
-        }
-    }
+    // Scan for any .onnx file deterministically
+    let mut candidates = Vec::new();
+    for entry in std::fs::read_dir(model_dir)? {
+        let entry = entry?;
+        let path = entry.path();
+        if path.extension().and_then(|e| e.to_str()) == Some("onnx") {
+            candidates.push(path);
+        }
+    }
+    candidates.sort();
+    if let Some(path) = candidates.into_iter().next() {
+        log::info!("Using ONNX file: {}", path.display());
+        return Ok(path);
+    }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@vendor/tts-rs/src/engines/kokoro/model.rs` around lines 241 - 247, The
current loop over std::fs::read_dir(model_dir) returns the first .onnx entry in
nondeterministic filesystem order; instead collect all paths whose extension ==
"onnx" into a Vec, sort them deterministically (e.g., by file_name() or display
string using to_string_lossy()), and then return the first element of the sorted
list; update the code around the read_dir loop that uses entry.path() and the
ONNX check to perform collection + sort before returning to ensure deterministic
model selection.
```rust
/// Detect the token input name ("input_ids" or "tokens") from session inputs.
fn detect_tokens_input(session: &Session) -> String {
    for input in session.inputs() {
        if input.name() == "input_ids" || input.name() == "tokens" {
            return input.name().to_string();
        }
    }
    // Default to "input_ids" if neither is found
    "input_ids".to_string()
}

/// Detect whether the speed input expects int32 (true) or float32 (false).
fn detect_speed_type(session: &Session) -> bool {
    for input in session.inputs() {
        if input.name() == "speed" {
            // Check the type description
            let type_str = format!("{:?}", input.dtype());
            return type_str.contains("Int32") || type_str.contains("int32");
        }
    }
    // Default: modern Kokoro models use int32
    true
```
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's verify the file exists and check the exact code at those lines
cat -n vendor/tts-rs/src/engines/kokoro/model.rs | sed -n '318,340p'

Repository: rishiskhare/parrot
Length of output: 1053
🏁 Script executed:
# Search for KokoroError definition to verify the Config variant exists
rg "enum KokoroError|struct KokoroError" vendor/tts-rs/src/
Length of output: 126
🏁 Script executed:
# Find all usages of detect_tokens_input and detect_speed_type
rg "detect_tokens_input|detect_speed_type" vendor/tts-rs/src/
Length of output: 454
🏁 Script executed:
# Get the full KokoroError enum definition
rg -A 20 "pub enum KokoroError" vendor/tts-rs/src/engines/kokoro/model.rs
Length of output: 870
🏁 Script executed:
# Find the context where these functions are called (get surrounding lines)
rg -B 5 -A 5 "let tokens_input_name = detect_tokens_input" vendor/tts-rs/src/engines/kokoro/model.rs
Length of output: 498
🏁 Script executed:
# Get the function signature of the calling function
rg -B 20 "let tokens_input_name = detect_tokens_input" vendor/tts-rs/src/engines/kokoro/model.rs | head -25
Length of output: 846
Fail fast if required model inputs are not found.
The functions detect_tokens_input() and detect_speed_type() use silent defaults ("input_ids" and int32) that mask model schema mismatches. When the model lacks expected inputs, the functions succeed but inference fails later with obscure errors. Since these are called during model load—which already returns Result<Self, KokoroError>—both functions should return Result<T, KokoroError> and propagate errors via the ? operator instead.
Suggested fix
-fn detect_tokens_input(session: &Session) -> String {
+fn detect_tokens_input(session: &Session) -> Result<String, KokoroError> {
for input in session.inputs() {
if input.name() == "input_ids" || input.name() == "tokens" {
- return input.name().to_string();
+ return Ok(input.name().to_string());
}
}
- "input_ids".to_string()
+ Err(KokoroError::Config(
+ "Missing required token input: expected 'input_ids' or 'tokens'".to_string(),
+ ))
}
-fn detect_speed_type(session: &Session) -> bool {
+fn detect_speed_type(session: &Session) -> Result<bool, KokoroError> {
for input in session.inputs() {
if input.name() == "speed" {
let type_str = format!("{:?}", input.dtype());
- return type_str.contains("Int32") || type_str.contains("int32");
+ return Ok(type_str.contains("Int32") || type_str.contains("int32"));
}
}
- true
+ Err(KokoroError::Config(
+ "Missing required 'speed' input in model graph".to_string(),
+ ))
}

- let tokens_input_name = detect_tokens_input(&session);
- let speed_is_int32 = detect_speed_type(&session);
+ let tokens_input_name = detect_tokens_input(&session)?;
+ let speed_is_int32 = detect_speed_type(&session)?;

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Original:

```rust
/// Detect the token input name ("input_ids" or "tokens") from session inputs.
fn detect_tokens_input(session: &Session) -> String {
    for input in session.inputs() {
        if input.name() == "input_ids" || input.name() == "tokens" {
            return input.name().to_string();
        }
    }
    // Default to "input_ids" if neither is found
    "input_ids".to_string()
}

/// Detect whether the speed input expects int32 (true) or float32 (false).
fn detect_speed_type(session: &Session) -> bool {
    for input in session.inputs() {
        if input.name() == "speed" {
            // Check the type description
            let type_str = format!("{:?}", input.dtype());
            return type_str.contains("Int32") || type_str.contains("int32");
        }
    }
    // Default: modern Kokoro models use int32
    true
```

Suggested:

```rust
/// Detect the token input name ("input_ids" or "tokens") from session inputs.
fn detect_tokens_input(session: &Session) -> Result<String, KokoroError> {
    for input in session.inputs() {
        if input.name() == "input_ids" || input.name() == "tokens" {
            return Ok(input.name().to_string());
        }
    }
    Err(KokoroError::Config(
        "Missing required token input: expected 'input_ids' or 'tokens'".to_string(),
    ))
}

/// Detect whether the speed input expects int32 (true) or float32 (false).
fn detect_speed_type(session: &Session) -> Result<bool, KokoroError> {
    for input in session.inputs() {
        if input.name() == "speed" {
            // Check the type description
            let type_str = format!("{:?}", input.dtype());
            return Ok(type_str.contains("Int32") || type_str.contains("int32"));
        }
    }
    Err(KokoroError::Config(
        "Missing required 'speed' input in model graph".to_string(),
    ))
}
```
```rust
let tokens_input_name = detect_tokens_input(&session)?;
let speed_is_int32 = detect_speed_type(&session)?;
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@vendor/tts-rs/src/engines/kokoro/model.rs` around lines 318 - 339, Change
detect_tokens_input(session: &Session) and detect_speed_type(session: &Session)
to return Result<T, KokoroError> instead of silently returning defaults:
detect_tokens_input should return Result<String, KokoroError> and return
Err(KokoroError::MissingInput("input_ids/tokens")) if neither "input_ids" nor
"tokens" is present; detect_speed_type should return Result<bool, KokoroError>
and return Err(KokoroError::MissingInput("speed")) if "speed" is absent (or Err
with a clear message if dtype cannot be determined). Update the callers in the
model loader to use the ? operator to propagate errors during model load so
missing inputs fail fast. Ensure you reference the Session input iteration
(session.inputs()), input.name(), and input.dtype() when implementing the checks
and error construction.
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Chores