feat(transcribe): --prompt-file flag for Scribe keyterms vocabulary bias #47
Open
mckila wants to merge 1 commit into
Conversation
Pure post-process fuzzy correction can't recover phonetic / acronym leaps where the raw transcript shares no characters with the canonical phrase. The fix has to happen at the engine's input.

ElevenLabs Scribe supports `keyterms`, an array of phrases that biases the decoder toward the supplied vocabulary. This change exposes that as a `--prompt-file <path>` CLI flag: one phrase per line, `#` for comments, blanks ignored. A missing file is a warn + skip, so callers without a vocabulary still transcribe. Oversize phrases (>50 chars or >5 words) are filtered to satisfy Scribe's per-keyterm limits, and the total list is truncated to the 1000-keyterm cap.

Behaviour is engine-agnostic at the CLI surface: the same file format also feeds OpenAI Whisper's `initial_prompt` if the helper is ever swapped.

Motivating incident, the IUIC "show MMC" mistranscription (2026-05-26): on clean speaker audio, the canonical IUIC salutation "shalom Most High in Christ Bless" was transcribed as "show MMC in Christ Bless". "Most High" -> "MMC" has near-zero string similarity, so no downstream fuzzy matcher can recover it. Passing the IUIC vocabulary as `keyterms` gives Scribe the prior it needs.

Tests bootstrap a `tests/` tree (none existed) with a pytest suite: the parser strips comments + blanks, a three-term fixture round-trips into the mocked Scribe call, a missing file warns cleanly, oversize phrases get filtered, empty keyterms omit the `keyterms` field (no 20% surcharge), `--help` surfaces the flag, plus a real-execution slice gated on IUIC_RUN_REAL_SCRIBE=1 that synthesises an IUIC salutation via macOS `say` and verifies Scribe lands on the canonical form.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
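The parsing rules in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the name `load_keyterms` is borrowed from the test names, but its signature, the constant names, the logger, and the choice to treat only whole-line `#` comments are all assumptions.

```python
import logging
from pathlib import Path

log = logging.getLogger("transcribe")  # hypothetical logger name

# Limits as quoted in the commit message; the constant names are assumptions.
MAX_KEYTERM_CHARS = 50
MAX_KEYTERM_WORDS = 5
MAX_KEYTERMS = 1000


def load_keyterms(path):
    """Return usable phrases from a prompt file, or [] if the file is missing."""
    file = Path(path)
    if not file.is_file():
        # Warn + skip: callers without a vocabulary still transcribe.
        log.warning("prompt file %s not found; continuing without keyterms", path)
        return []

    terms = []
    for raw in file.read_text(encoding="utf-8").splitlines():
        phrase = raw.strip()
        # Blank lines and whole-line comments are ignored (assumed semantics).
        if not phrase or phrase.startswith("#"):
            continue
        # Drop phrases that exceed the per-keyterm limits.
        if len(phrase) > MAX_KEYTERM_CHARS or len(phrase.split()) > MAX_KEYTERM_WORDS:
            log.warning("skipping oversize keyterm: %r", phrase)
            continue
        terms.append(phrase)

    # Truncate to the overall keyterm cap.
    return terms[:MAX_KEYTERMS]
```

Filtering before truncating means an oversize phrase never takes up one of the 1000 slots; whether the PR orders the two steps this way is not stated.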
1 issue found across 5 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="tests/test_transcribe_prompt_file.py">
<violation number="1" location="tests/test_transcribe_prompt_file.py:205">
P2: The live Scribe regression test is too permissive: it can pass without validating recovery of the biased keyterm.</violation>
</file>
    # The canonical phrase has to show up; we don't compare against an
    # un-biased call (Scribe is non-deterministic enough that a head-to-head
    # in a single CI run is noisy). The presence assertion is the regression.
    assert "most high" in biased_text or "shalom" in biased_text, biased
P2: The live Scribe regression test is too permissive: it can pass without validating recovery of the biased keyterm.
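One way the flagged assertion could be tightened is sketched below. The `or "shalom"` branch lets the test pass even when "Most High" is still misheard, so this hypothetical helper requires the canonical phrase and rejects the known misreading. The helper name, its parameters, and the choice of "mmc" as the only known-bad token are assumptions for illustration, not part of the PR.

```python
import re


def keyterm_recovered(transcript, canonical="most high", known_bad=("mmc",)):
    """True only if the canonical phrase is present and no known misreading is."""
    text = transcript.lower()
    has_canonical = canonical in text
    # Match misreadings as whole words so unrelated substrings do not trip it.
    has_misreading = any(
        re.search(r"\b" + re.escape(bad) + r"\b", text) for bad in known_bad
    )
    return has_canonical and not has_misreading
```

In the gated test this might be used as `assert keyterm_recovered(biased_text), biased_text`, which fails on "show MMC in Christ Bless" even though that transcript would have satisfied a looser check on surrounding words.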
Summary
Adds a `--prompt-file <path>` flag to `helpers/transcribe.py` that loads a vocabulary file (one phrase per line, `#` for comments, blanks ignored) and passes the phrases to ElevenLabs Scribe's `keyterms` parameter. Scribe biases its decoder toward the supplied vocabulary, which is the right layer to fix mistranscriptions of proper nouns and domain terms (brand names, people, places): pure post-process fuzzy correction can't recover phonetic or acronym leaps where the raw transcript shares no characters with the canonical phrase.

The same prompt-file format is engine-agnostic on purpose: if the helper is ever swapped for OpenAI Whisper, joining the lines with commas feeds straight into Whisper's `initial_prompt`. No CLI change needed.

Motivating incident
On a real 2026-05-26 IUIC session, on clean speaker audio, the canonical salutation "shalom Most High in Christ Bless" was transcribed as "show MMC in Christ Bless".

`Most High` -> `MMC` has near-zero string similarity, so no downstream fuzzy matcher can fix it. Passing the IUIC vocabulary as `keyterms` gives Scribe the prior it needs at the decoder layer.

Surface
`vocabulary.txt`: one phrase per line, `#` for comments, blanks ignored.

Behaviour:
- With no keyterms, the `keyterms` field is omitted from the upload (no 20% Scribe surcharge for callers that haven't opted in).

Tests
This repo had no `tests/` tree; this PR bootstraps one alongside the new feature.
- `test_load_keyterms_strips_comments_and_blanks`
- `test_load_keyterms_three_term_fixture_round_trip`: mocks `requests.post` and asserts all three terms travel through to the Scribe `keyterms` payload.
- `test_load_keyterms_missing_file_is_clean_skip`
- `test_load_keyterms_filters_oversize_phrases`
- `test_load_keyterms_truncates_to_scribe_limit`
- `test_call_scribe_omits_keyterms_when_empty`
- `test_help_text_documents_prompt_file_flag`
- `test_keyterms_recover_iuic_phrase_on_real_scribe`: real-execution slice gated on `IUIC_RUN_REAL_SCRIBE=1` + `ELEVENLABS_API_KEY`; synthesises an IUIC salutation via macOS `say` and verifies Scribe lands on the canonical form.

`pytest tests/`: 8 passed, 1 skipped (gated real-execution slice).

Test plan
- `pytest tests/test_transcribe_prompt_file.py -v` clean
- `python helpers/transcribe.py --help` shows the new flag

Generated with Claude Code.
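The "omit `keyterms` when empty" behaviour and the Whisper fallback can be illustrated with two small pure functions. This is a sketch under assumptions: both function names are hypothetical, and the `model_id` field name and the encoding of the array as a repeated `keyterms` form field are guesses about the upload shape that should be checked against the Scribe API reference.

```python
def build_scribe_form(keyterms, model_id):
    """Build form fields for the upload, adding keyterms only when there are any."""
    # Assumed field name; the model identifier is supplied by the caller.
    fields = [("model_id", model_id)]
    # An empty list adds nothing, so the `keyterms` field never appears
    # for callers that have not opted in.
    fields.extend(("keyterms", term) for term in keyterms)
    return fields


def to_whisper_prompt(keyterms):
    """Join the same phrases with commas for use as a Whisper initial prompt."""
    return ", ".join(keyterms)
```

A list of tuples is a shape that `requests.post(..., data=fields)` accepts, which is why repeated keys are modelled this way rather than as a dict.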
Summary by cubic
Adds a `--prompt-file <path>` flag to `helpers/transcribe.py` that loads a vocabulary file and sends its phrases to ElevenLabs Scribe as `keyterms` to improve proper noun/domain term recognition. Handles missing files, enforces Scribe limits, and updates help text.
- `--prompt-file <path>`: one phrase per line; `#` comments; blanks ignored.
- Sends the phrases as `keyterms`; same file also works for Whisper `initial_prompt`.
- Omits `keyterms` when empty to avoid surcharge.
- Documents the flag in `--help` and `SKILL.md`.

Written for commit e88fce1. Summary will update on new commits.