feat(transcribe): --prompt-file flag for Scribe keyterms vocabulary bias by mckila · Pull Request #47 · browser-use/video-use

mckila · 2026-05-27T04:59:01Z

Summary

Adds a --prompt-file <path> flag to helpers/transcribe.py that loads a vocabulary file (one phrase per line, # for comments, blanks ignored) and passes the phrases to ElevenLabs Scribe's keyterms parameter. Scribe biases its decoder toward the supplied vocabulary, which is the right layer to fix mistranscriptions of proper nouns / domain terms (brand names, people, places) — pure post-process fuzzy correction can't recover phonetic or acronym leaps where the raw transcript shares no characters with the canonical phrase.

The same prompt-file format is engine-agnostic on purpose: if the helper is ever swapped for OpenAI Whisper, joining the lines with commas feeds straight into Whisper's initial_prompt. No CLI change needed.

Motivating incident

On a real 2026-05-26 IUIC session, on clean speaker audio, the canonical salutation

shalom Most High in Christ Bless

transcribed as

show MMC in Christ Bless

Most High -> MMC has near-zero string distance to recover, so no downstream fuzzy matcher can fix it. Passing the IUIC vocabulary as keyterms gives Scribe the prior it needs at the decoder layer.

Surface

python helpers/transcribe.py <video> --prompt-file vocabulary.txt

vocabulary.txt:

# IUIC core
Most High
shalom
shalom family
Israel United in Christ
Yahawah

Behaviour:

Missing file is a warn + skip, not an error — a caller without a vocabulary still transcribes.
Oversize phrases (>50 chars or >5 words) are filtered to satisfy Scribe's per-keyterm limits, with a warn count.
The total list is truncated to Scribe's 1000-keyterm cap.
Empty keyterms => the keyterms field is omitted from the upload (no 20% Scribe surcharge for callers that haven't opted in).

Tests

This repo had no tests/ tree; this PR bootstraps one alongside the new feature.

test_load_keyterms_strips_comments_and_blanks
test_load_keyterms_three_term_fixture_round_trip — mocks requests.post and asserts all three terms travel through to the Scribe keyterms payload.
test_load_keyterms_missing_file_is_clean_skip
test_load_keyterms_filters_oversize_phrases
test_load_keyterms_truncates_to_scribe_limit
test_call_scribe_omits_keyterms_when_empty
test_help_text_documents_prompt_file_flag
test_keyterms_recover_iuic_phrase_on_real_scribe — real-execution slice gated on IUIC_RUN_REAL_SCRIBE=1 + ELEVENLABS_API_KEY; synthesises an IUIC salutation via macOS say and verifies Scribe lands on the canonical form.

pytest tests/ — 8 passed, 1 skipped (gated real-execution slice).

Test plan

pytest tests/test_transcribe_prompt_file.py -v clean
python helpers/transcribe.py --help shows the new flag
Real-execution slice with live Scribe (gated; run before relying on this in production)

Generated with Claude Code.

Summary by cubic

Adds a --prompt-file <path> flag to helpers/transcribe.py that loads a vocabulary file and sends its phrases to ElevenLabs Scribe as keyterms to improve proper noun/domain term recognition. Handles missing files, enforces Scribe limits, and updates help text.

New Features
- --prompt-file <path>: one phrase per line; # comments; blanks ignored.
- Sends phrases as Scribe keyterms; same file also works for Whisper initial_prompt.
- Missing file is a warn + skip (still transcribes).
- Filters >50 chars or >5 words; truncates to 1000 terms.
- Omits keyterms when empty to avoid surcharge.
- Updated --help and SKILL.md.

^{Written for commit e88fce1. Summary will update on new commits. Review in cubic}

Pure post-process fuzzy correction can't recover phonetic / acronym leaps where the raw transcript shares no characters with the canonical phrase. The fix has to happen at the engine's input. ElevenLabs Scribe supports `keyterms` — an array of phrases that biases the decoder toward the supplied vocabulary. This change exposes that as a `--prompt-file <path>` CLI flag: one phrase per line, `#` for comments, blanks ignored. Missing file is a warn + skip so callers without a vocabulary still transcribe. Oversize phrases (>50 chars or >5 words) are filtered to satisfy Scribe's per-keyterm limits, and the total list is truncated to the 1000-keyterm cap. Behaviour is engine-agnostic at the CLI surface — the same file format also feeds OpenAI Whisper's `initial_prompt` if the helper is ever swapped. Motivating incident — IUIC "show MMC" mistranscription (2026-05-26): on clean speaker audio, the canonical IUIC salutation "shalom Most High in Christ Bless" transcribed as "show MMC in Christ Bless" "Most High" -> "MMC" has near-zero string distance, so no downstream fuzzy matcher can recover it. Passing the IUIC vocabulary as `keyterms` gives Scribe the prior it needs. Tests bootstrap a `tests/` tree (none existed) with a pytest suite: parser strips comments + blanks, three-term fixture round-trips into the mocked Scribe call, missing file warns cleanly, oversize phrases get filtered, empty keyterms omit the `keyterms` field (no 20% surcharge), `--help` surfaces the flag, plus a real-execution slice gated on IUIC_RUN_REAL_SCRIBE=1 that synthesises an IUIC salutation via macOS `say` and verifies Scribe lands on the canonical form. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

cubic-dev-ai

1 issue found across 5 files

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="tests/test_transcribe_prompt_file.py">

<violation number="1" location="tests/test_transcribe_prompt_file.py:205">
P2: The live Scribe regression test is too permissive: it can pass without validating recovery of the biased keyterm.</violation>
</file>

_{Reply with feedback, questions, or to request a fix.

Fix all with cubic | Re-trigger cubic}

cubic-dev-ai · 2026-05-27T05:04:03Z

+    # The canonical phrase has to show up; we don't compare against an
+    # un-biased call (Scribe is non-deterministic enough that a head-to-head
+    # in a single CI run is noisy). The presence assertion is the regression.
+    assert "most high" in biased_text or "shalom" in biased_text, biased


P2: The live Scribe regression test is too permissive: it can pass without validating recovery of the biased keyterm.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At tests/test_transcribe_prompt_file.py, line 205: <comment>The live Scribe regression test is too permissive: it can pass without validating recovery of the biased keyterm.</comment> <file context> @@ -0,0 +1,205 @@ + # The canonical phrase has to show up; we don't compare against an + # un-biased call (Scribe is non-deterministic enough that a head-to-head + # in a single CI run is noisy). The presence assertion is the regression. + assert "most high" in biased_text or "shalom" in biased_text, biased </file context>

cubic-dev-ai Bot reviewed May 27, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(transcribe): --prompt-file flag for Scribe keyterms vocabulary bias#47

feat(transcribe): --prompt-file flag for Scribe keyterms vocabulary bias#47
mckila wants to merge 1 commit into
browser-use:mainfrom
mckila:feat/transcribe-prompt-file-keyterms

mckila commented May 27, 2026 •

edited by cubic-dev-ai Bot

Loading

Uh oh!

cubic-dev-ai Bot left a comment

Uh oh!

cubic-dev-ai Bot May 27, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

mckila commented May 27, 2026 • edited by cubic-dev-ai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Motivating incident

Surface

Tests

Test plan

Summary by cubic

Uh oh!

cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

cubic-dev-ai Bot May 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

mckila commented May 27, 2026 •

edited by cubic-dev-ai Bot

Loading

cubic-dev-ai Bot May 27, 2026 •

edited

Loading