Skip to content

feat(transcribe): --prompt-file flag for Scribe keyterms vocabulary bias#47

Open
mckila wants to merge 1 commit into
browser-use:mainfrom
mckila:feat/transcribe-prompt-file-keyterms
Open

feat(transcribe): --prompt-file flag for Scribe keyterms vocabulary bias#47
mckila wants to merge 1 commit into
browser-use:mainfrom
mckila:feat/transcribe-prompt-file-keyterms

Conversation

@mckila
Copy link
Copy Markdown

@mckila mckila commented May 27, 2026

Summary

Adds a --prompt-file <path> flag to helpers/transcribe.py that loads a vocabulary file (one phrase per line, # for comments, blanks ignored) and passes the phrases to ElevenLabs Scribe's keyterms parameter. Scribe biases its decoder toward the supplied vocabulary, which is the right layer to fix mistranscriptions of proper nouns / domain terms (brand names, people, places) — pure post-process fuzzy correction can't recover phonetic or acronym leaps where the raw transcript shares no characters with the canonical phrase.

The same prompt-file format is engine-agnostic on purpose: if the helper is ever swapped for OpenAI Whisper, joining the lines with commas feeds straight into Whisper's initial_prompt. No CLI change needed.

Motivating incident

On a real 2026-05-26 IUIC session, on clean speaker audio, the canonical salutation

shalom Most High in Christ Bless

transcribed as

show MMC in Christ Bless

Most High -> MMC has near-zero string distance to recover, so no downstream fuzzy matcher can fix it. Passing the IUIC vocabulary as keyterms gives Scribe the prior it needs at the decoder layer.

Surface

python helpers/transcribe.py <video> --prompt-file vocabulary.txt

vocabulary.txt:

# IUIC core
Most High
shalom
shalom family
Israel United in Christ
Yahawah

Behaviour:

  • Missing file is a warn + skip, not an error — a caller without a vocabulary still transcribes.
  • Oversize phrases (>50 chars or >5 words) are filtered to satisfy Scribe's per-keyterm limits, with a warn count.
  • The total list is truncated to Scribe's 1000-keyterm cap.
  • Empty keyterms => the keyterms field is omitted from the upload (no 20% Scribe surcharge for callers that haven't opted in).

Tests

This repo had no tests/ tree; this PR bootstraps one alongside the new feature.

  • test_load_keyterms_strips_comments_and_blanks
  • test_load_keyterms_three_term_fixture_round_trip — mocks requests.post and asserts all three terms travel through to the Scribe keyterms payload.
  • test_load_keyterms_missing_file_is_clean_skip
  • test_load_keyterms_filters_oversize_phrases
  • test_load_keyterms_truncates_to_scribe_limit
  • test_call_scribe_omits_keyterms_when_empty
  • test_help_text_documents_prompt_file_flag
  • test_keyterms_recover_iuic_phrase_on_real_scribe — real-execution slice gated on IUIC_RUN_REAL_SCRIBE=1 + ELEVENLABS_API_KEY; synthesises an IUIC salutation via macOS say and verifies Scribe lands on the canonical form.

pytest tests/ — 8 passed, 1 skipped (gated real-execution slice).

Test plan

  • pytest tests/test_transcribe_prompt_file.py -v clean
  • python helpers/transcribe.py --help shows the new flag
  • Real-execution slice with live Scribe (gated; run before relying on this in production)

Generated with Claude Code.


Summary by cubic

Adds a --prompt-file <path> flag to helpers/transcribe.py that loads a vocabulary file and sends its phrases to ElevenLabs Scribe as keyterms to improve proper noun/domain term recognition. Handles missing files, enforces Scribe limits, and updates help text.

  • New Features
    • --prompt-file <path>: one phrase per line; # comments; blanks ignored.
    • Sends phrases as Scribe keyterms; same file also works for Whisper initial_prompt.
    • Missing file is a warn + skip (still transcribes).
    • Filters >50 chars or >5 words; truncates to 1000 terms.
    • Omits keyterms when empty to avoid surcharge.
    • Updated --help and SKILL.md.

Written for commit e88fce1. Summary will update on new commits. Review in cubic

Pure post-process fuzzy correction can't recover phonetic / acronym leaps
where the raw transcript shares no characters with the canonical phrase.
The fix has to happen at the engine's input.

ElevenLabs Scribe supports `keyterms` — an array of phrases that biases
the decoder toward the supplied vocabulary. This change exposes that as a
`--prompt-file <path>` CLI flag: one phrase per line, `#` for comments,
blanks ignored. Missing file is a warn + skip so callers without a
vocabulary still transcribe. Oversize phrases (>50 chars or >5 words) are
filtered to satisfy Scribe's per-keyterm limits, and the total list is
truncated to the 1000-keyterm cap. Behaviour is engine-agnostic at the
CLI surface — the same file format also feeds OpenAI Whisper's
`initial_prompt` if the helper is ever swapped.

Motivating incident — IUIC "show MMC" mistranscription (2026-05-26): on
clean speaker audio, the canonical IUIC salutation
  "shalom Most High in Christ Bless"
transcribed as
  "show MMC in Christ Bless"
"Most High" -> "MMC" has near-zero string distance, so no downstream
fuzzy matcher can recover it. Passing the IUIC vocabulary as `keyterms`
gives Scribe the prior it needs.

Tests bootstrap a `tests/` tree (none existed) with a pytest suite:
parser strips comments + blanks, three-term fixture round-trips into the
mocked Scribe call, missing file warns cleanly, oversize phrases get
filtered, empty keyterms omit the `keyterms` field (no 20% surcharge),
`--help` surfaces the flag, plus a real-execution slice gated on
IUIC_RUN_REAL_SCRIBE=1 that synthesises an IUIC salutation via macOS
`say` and verifies Scribe lands on the canonical form.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 5 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="tests/test_transcribe_prompt_file.py">

<violation number="1" location="tests/test_transcribe_prompt_file.py:205">
P2: The live Scribe regression test is too permissive: it can pass without validating recovery of the biased keyterm.</violation>
</file>

Reply with feedback, questions, or to request a fix.

Fix all with cubic | Re-trigger cubic

# The canonical phrase has to show up; we don't compare against an
# un-biased call (Scribe is non-deterministic enough that a head-to-head
# in a single CI run is noisy). The presence assertion is the regression.
assert "most high" in biased_text or "shalom" in biased_text, biased
Copy link
Copy Markdown

@cubic-dev-ai cubic-dev-ai Bot May 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: The live Scribe regression test is too permissive: it can pass without validating recovery of the biased keyterm.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At tests/test_transcribe_prompt_file.py, line 205:

<comment>The live Scribe regression test is too permissive: it can pass without validating recovery of the biased keyterm.</comment>

<file context>
@@ -0,0 +1,205 @@
+    # The canonical phrase has to show up; we don't compare against an
+    # un-biased call (Scribe is non-deterministic enough that a head-to-head
+    # in a single CI run is noisy). The presence assertion is the regression.
+    assert "most high" in biased_text or "shalom" in biased_text, biased
</file context>
Fix with Cubic

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant