ECHO-258 switch to custom runpod deployment for whisper #168
Conversation
Refactor RunPod Whisper transcription configuration and improve error handling

- Removed the unused RUNPOD_WHISPER_MODEL environment variable from the configuration.
- Updated the scheduling for RunPod transcription responses to trigger every 30 seconds instead of every minute.
- Changed the transcription response handling to use 'joined_text' instead of 'transcription'.
- Enhanced error handling in the transcription process to log and raise exceptions when fetching conversation chunks and details fails.
- Updated prompt templates for multiple languages with new technical presentation examples.
Walkthrough

This change removes the unused RUNPOD_WHISPER_MODEL environment variable and reworks the RunPod transcription flow.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Scheduler
    participant TaskProcessor
    participant RunPodAPI
    participant Directus
    Scheduler->>TaskProcessor: Trigger task_update_runpod_transcription_response (every 30s)
    TaskProcessor->>RunPodAPI: Fetch transcription response
    RunPodAPI-->>TaskProcessor: Return response with joined_text
    TaskProcessor->>Directus: Update conversation chunk with transcript
```

```mermaid
sequenceDiagram
    participant User
    participant Transcriber
    participant Directus
    participant RunPodAPI
    User->>Transcriber: Request transcription
    Transcriber->>Directus: Fetch conversation chunk/details/metadata
    alt On error
        Transcriber->>Transcriber: Log error and raise ValueError
    else Success
        Transcriber->>RunPodAPI: Queue transcription (no model specified)
        RunPodAPI-->>Transcriber: Acknowledge request
    end
```
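At the consuming end of the first flow, the polling task only needs to pull `joined_text` out of a completed RunPod status response. A minimal, stdlib-only sketch of that step (the response shape follows RunPod's job-status convention with a `status` field and handler output under `output`; the `joined_text` field is specific to this custom deployment, and the helper name is ours):

```python
from typing import Optional


def extract_joined_text(status_response: dict) -> Optional[str]:
    """Return the transcript once a RunPod job has completed, else None.

    RunPod status responses carry a `status` field (IN_QUEUE, IN_PROGRESS,
    COMPLETED, FAILED, ...) and, on success, the handler's return value
    under `output`. The custom Whisper deployment puts the transcript in
    `output.joined_text` (previously `transcription`).
    """
    if status_response.get("status") != "COMPLETED":
        return None
    return status_response.get("output", {}).get("joined_text")
```

The scheduled task can then skip chunks whose jobs are still in flight and update Directus only when this returns a string.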
Actionable comments posted: 5
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (9)

- echo/server/dembrane/config.py (0 hunks)
- echo/server/dembrane/scheduler.py (1 hunks)
- echo/server/dembrane/tasks.py (1 hunks)
- echo/server/dembrane/transcribe.py (5 hunks)
- echo/server/prompt_templates/default_whisper_prompt.de.jinja (1 hunks)
- echo/server/prompt_templates/default_whisper_prompt.en.jinja (1 hunks)
- echo/server/prompt_templates/default_whisper_prompt.es.jinja (1 hunks)
- echo/server/prompt_templates/default_whisper_prompt.fr.jinja (1 hunks)
- echo/server/prompt_templates/default_whisper_prompt.nl.jinja (1 hunks)
💤 Files with no reviewable changes (1)
- echo/server/dembrane/config.py
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: ci-check-server
🔇 Additional comments (7)
echo/server/prompt_templates/default_whisper_prompt.nl.jinja (1)
1-1: LGTM! Solid prompt engineering here.

This Dutch technical presentation example with mixed English technical terms is much more contextually rich than the previous simple instruction. The realistic bilingual technical jargon will definitely improve Whisper's transcription accuracy for Dutch speakers in tech environments.
echo/server/dembrane/scheduler.py (1)
29-29: LGTM! Frequency optimization looks solid.

Bumping the RunPod transcription polling from every minute to every 30 seconds will improve response times for users waiting on their transcriptions. Clean implementation using APScheduler's CronTrigger.
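The 30-second cadence corresponds to an APScheduler `CronTrigger(second="*/30")`, which fires on the :00 and :30 second boundaries of every minute rather than on a sliding 30-second interval. A stdlib-only sketch of that firing rule (the helper name is ours, not from the codebase):

```python
from datetime import datetime, timedelta


def next_cron_30s_fire(now: datetime) -> datetime:
    """Next fire time for a cron-style second="*/30" schedule:
    the first :00 or :30 second boundary strictly after `now`."""
    base = now.replace(microsecond=0)
    for delta in range(1, 31):
        candidate = base + timedelta(seconds=delta)
        if candidate.second % 30 == 0:
            return candidate
    return base + timedelta(seconds=30)  # not reached; the loop always finds a boundary
```

In the worst case a finished RunPod job therefore waits just under 30 seconds before the next poll picks it up, half the previous worst-case latency.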
echo/server/dembrane/tasks.py (1)
937-937: LGTM! Clean adaptation to the new RunPod response schema.

Switching from `transcription` to `joined_text` matches the custom RunPod deployment's response format perfectly. The change is surgical and maintains all the existing error handling around it.

echo/server/prompt_templates/default_whisper_prompt.fr.jinja (1)

1-1: Details
✅ Verification successful
Solid contextual prompt engineering, but verify token limits. LGTM.
This rich contextual prompt should significantly improve transcription accuracy for technical content mixing French and English. The example-driven approach is proper engineering.
Whisper has prompt token limitations that could affect performance. Run this to verify the prompt length:
🏁 Script executed:
```shell
#!/bin/bash
# Check prompt template lengths across all languages to ensure they don't exceed Whisper limits
echo "Checking prompt template lengths..."
for template in echo/server/prompt_templates/default_whisper_prompt.*.jinja; do
  if [ -f "$template" ]; then
    chars=$(cat "$template" | wc -c)
    words=$(cat "$template" | wc -w)
    echo "$(basename $template): $chars characters, $words words"
  fi
done

# Whisper typically handles up to ~224 tokens (roughly 1000-1500 characters)
echo "Note: Whisper prompt limit is typically ~224 tokens (~1000-1500 characters)"
```

Length of output: 2876
Prompt Length Verification Complete – All Systems Go
All `default_whisper_prompt.*.jinja` templates clock in between 534–588 characters (72–87 words), well below the ~1000–1500 character (~224 token) Whisper limit. No adjustments needed; proceed to launch! LGTM.

echo/server/prompt_templates/default_whisper_prompt.es.jinja (1)
1-1: Consistent prompt engineering across languages. LGTM.

The Spanish template follows the same solid pattern as the French version: contextual technical presentation framing with bilingual terminology. This consistency across language templates is proper architecture.
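The prompt-length verification above rests on a rough ~4-characters-per-token heuristic for Latin-script text. That guard could be captured in a tiny check run at template load time (a sketch under that heuristic; the names are ours, not from the codebase):

```python
WHISPER_PROMPT_TOKEN_LIMIT = 224  # Whisper keeps only the last ~224 prompt tokens


def approx_token_count(text: str) -> int:
    """Very rough heuristic: ~4 characters per token for Latin-script text."""
    return max(1, len(text) // 4)


def prompt_fits(text: str) -> bool:
    """True when the prompt is comfortably inside Whisper's prompt window."""
    return approx_token_count(text) <= WHISPER_PROMPT_TOKEN_LIMIT
```

At the measured 534–588 characters the templates land around 130–150 estimated tokens, consistent with the shell verification.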
echo/server/dembrane/transcribe.py (2)
193-193: Clean prompt construction refactor. LGTM.Simplifying the string concatenation with a space prefix is cleaner than the multiline approach. Good refactor.
205-216: Details
✅ Verification successful
More robust error handling. LGTM.
Adding error handling around the conversation chunk lookup is solid defensive programming.
Verify that removing RUNPOD_WHISPER_MODEL doesn't break the transcription requests:
🏁 Script executed:
```shell
#!/bin/bash
# Check if RUNPOD_WHISPER_MODEL is still referenced anywhere in the codebase
echo "Searching for RUNPOD_WHISPER_MODEL references..."
rg -i "runpod.*whisper.*model|whisper.*model.*runpod" --type py

# Check the config file to see what RunPod-related variables remain
echo -e "\nRunPod configuration variables:"
rg "RUNPOD_WHISPER" echo/server/dembrane/config.py || echo "Config file not available for review"

# Check if model parameter was being used in payload construction
echo -e "\nChecking payload construction patterns:"
ast-grep --pattern 'input_payload = { $$$ }'
```

Length of output: 1853
Sequence clear: Defensive error handling ✅ & RUNPOD_WHISPER_MODEL removal safe 🚀
- Wrapped `directus.get_items` in `try/except` (echo/server/dembrane/transcribe.py lines 205–216) to log errors and rethrow as `ValueError`.
- Ripgrep scan found no `RUNPOD_WHISPER_MODEL` references across Python files.
- Payload construction patterns show no usage of a model parameter.

All systems green; this change is ready for warp speed. LGTM.
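With `RUNPOD_WHISPER_MODEL` gone, the request body sent to the endpoint carries no model field at all; the custom deployment bakes the model in. A sketch of the payload construction (the `audio`, `initial_prompt`, and `language` field names are taken from the debug-logging suggestion elsewhere in this review; treat the exact schema as an assumption):

```python
def build_runpod_request(audio_url: str, initial_prompt: str, language: str) -> dict:
    """Body for the RunPod run request; note that no "model" key is sent."""
    input_payload = {
        "audio": audio_url,          # signed URL to the chunk's audio file
        "initial_prompt": initial_prompt,
        "language": language,
    }
    return {"input": input_payload}
```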
```diff
@@ -1 +1 @@
-System: Du bist eine hilfreiche Transkribent. Bitte transkribiere das folgende Audio *IN DEUTSCH*.
\ No newline at end of file
+Hier is een transcriptie van een technische presentatie. De spreker gebruikt natuurlijk Nederlands met enkele Engelse technical terms zoals gebruikelijk. De presentator zegt: "In een vrij gemakkelijk network is het mogelijk om de requirements voor een change te evalueren. We gebruiken modern tooling voor ons system management. Het is quite possible dat een enkele wijziging impact heeft, maar we blijven altijd in het Nederlands praten over onze processes en workflows. De change management procedure helpt ons om disruptions te voorkomen en stability te behouden in ons environment.
\ No newline at end of file
```
Critical issue: Wrong language content detected!
This German prompt template (`.de.jinja`) contains Dutch text instead of German content. This will seriously mess up German transcriptions: Whisper will be primed for Dutch when it should be expecting German.
Fix this by replacing with proper German content:
```diff
-Hier is een transcriptie van een technische presentatie. De spreker gebruikt natuurlijk Nederlands met enkele Engelse technical terms zoals gebruikelijk. De presentator zegt: "In een vrij gemakkelijk network is het mogelijk om de requirements voor een change te evalueren. We gebruiken modern tooling voor ons system management. Het is quite possible dat een enkele wijziging impact heeft, maar we blijven altijd in het Nederlands praten over onze processes en workflows. De change management procedure helpt ons om disruptions te voorkomen en stability te behouden in ons environment.
+Hier ist eine Transkription einer technischen Präsentation. Der Sprecher verwendet natürlich Deutsch mit einigen englischen technical terms wie üblich. Der Präsentator sagt: "In einem ziemlich einfachen network ist es möglich, die requirements für eine change zu evaluieren. Wir verwenden moderne tooling für unser system management. Es ist quite possible, dass eine einzelne Änderung impact hat, aber wir sprechen immer auf Deutsch über unsere processes und workflows. Das change management procedure hilft uns, disruptions zu vermeiden und stability in unserem environment zu behalten.
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Suggested content for `default_whisper_prompt.de.jinja`:

```
Hier ist eine Transkription einer technischen Präsentation. Der Sprecher verwendet natürlich Deutsch mit einigen englischen technical terms wie üblich. Der Präsentator sagt: "In einem ziemlich einfachen network ist es möglich, die requirements für eine change zu evaluieren. Wir verwenden moderne tooling für unser system management. Es ist quite possible, dass eine einzelne Änderung impact hat, aber wir sprechen immer auf Deutsch über unsere processes und workflows. Das change management procedure hilft uns, disruptions zu vermeiden und stability in unserem environment zu behalten.
```
🤖 Prompt for AI Agents
In echo/server/prompt_templates/default_whisper_prompt.de.jinja at line 1, the
file contains Dutch text instead of German, which is incorrect for a German
prompt template. Replace the entire Dutch text with an accurate German
transcription prompt that matches the intended use for German language content
to ensure Whisper processes German transcriptions correctly.
```diff
@@ -1 +1 @@
-System: You are a helpful transcriptionist. Please transcribe the following audio *IN ENGLISH*.
\ No newline at end of file
+Here is a transcript from a business presentation about technology and change management. The speaker maintains English throughout while occasionally using technical terms. The presenter explains: "In our network environment, it's essential to carefully plan any changes. Our system administrators use modern tools for network management. The change management process helps us prevent disruptions and maintain system stability. We've implemented quality assurance procedures and risk assessment protocols. Even when discussing complex technical concepts, we continue speaking in English.
\ No newline at end of file
```
🧹 Nitpick (assertive)
Strong contextual prompt but might be too domain-specific. LGTM with consideration.
The business/technical presentation context is solid engineering, but this template is quite specific to enterprise scenarios. Consider if this fits all your audio content types or if you need more generic fallbacks.
🤖 Prompt for AI Agents
In echo/server/prompt_templates/default_whisper_prompt.en.jinja at line 1, the
prompt is very specific to business and technical presentations, which may not
suit all audio content types. Review your use cases and consider creating
additional, more generic prompt templates as fallbacks to ensure broader
applicability beyond enterprise scenarios.
```python
try:
    conversation = directus.get_items(
        "conversation",
        {
            "query": {
                "filter": {"id": {"_eq": chunk["conversation_id"]}},
                "fields": [
                    "id",
                    "project_id",
                    "project_id.language",
                    "project_id.default_conversation_transcript_prompt",
                ],
            },
        },
    )
except Exception as e:
    logger.error(f"Failed to get conversation for {conversation_chunk_id}: {e}")
    raise ValueError(f"Failed to get conversation for {conversation_chunk_id}: {e}") from e
```
🧹 Nitpick (assertive)
Consistent error handling pattern. LGTM.
Good consistency with the error handling pattern established in the previous block.
Same suggestion about exception type consistency - consider using TranscriptionError instead of ValueError for consistency with the module's exception hierarchy.
🤖 Prompt for AI Agents
In echo/server/dembrane/transcribe.py around lines 156 to 173, the exception
raised on failure to get the conversation uses ValueError, which is inconsistent
with the module's exception hierarchy. Replace the ValueError with
TranscriptionError to maintain consistent error handling across the module.
Ensure to import TranscriptionError if not already imported.
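The `TranscriptionError` suggestion is straightforward to apply. Here is a hedged sketch of the pattern (the class definition and helper name are our illustration, not the module's actual code):

```python
import logging

logger = logging.getLogger(__name__)


class TranscriptionError(Exception):
    """Module-level error type the review suggests raising instead of ValueError."""


def fetch_conversation(directus, conversation_chunk_id: str, conversation_id: str):
    """Fetch conversation rows, wrapping any client failure in TranscriptionError."""
    try:
        return directus.get_items(
            "conversation",
            {"query": {"filter": {"id": {"_eq": conversation_id}}}},
        )
    except Exception as e:
        logger.error(f"Failed to get conversation for {conversation_chunk_id}: {e}")
        raise TranscriptionError(
            f"Failed to get conversation for {conversation_chunk_id}: {e}"
        ) from e
```

Callers can then catch one exception type for all transcription-pipeline failures, while `raise ... from e` preserves the original error in `__cause__` for debugging.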
```python
try:
    chunks = directus.get_items(
        "conversation_chunk",
        {
            "query": {
                "filter": {"id": {"_eq": conversation_chunk_id}},
                "fields": ["id", "path", "conversation_id", "timestamp"],
            },
        },
    )
except Exception as e:
    logger.error(f"Failed to get chunks for {conversation_chunk_id}: {e}")
    raise ValueError(f"Failed to get chunks for {conversation_chunk_id}: {e}") from e
```
🧹 Nitpick (assertive)
Solid error handling improvement. LGTM.
Adding try-catch blocks around external API calls is proper defensive programming. The error logging provides good debugging context.
Consider using a more specific exception type instead of generic ValueError:
```diff
-    raise ValueError(f"Failed to get chunks for {conversation_chunk_id}: {e}") from e
+    raise TranscriptionError(f"Failed to get chunks for {conversation_chunk_id}: {e}") from e
```

📝 Committable suggestion
```python
try:
    chunks = directus.get_items(
        "conversation_chunk",
        {
            "query": {
                "filter": {"id": {"_eq": conversation_chunk_id}},
                "fields": ["id", "path", "conversation_id", "timestamp"],
            },
        },
    )
except Exception as e:
    logger.error(f"Failed to get chunks for {conversation_chunk_id}: {e}")
    raise TranscriptionError(f"Failed to get chunks for {conversation_chunk_id}: {e}") from e
```
🤖 Prompt for AI Agents
In echo/server/dembrane/transcribe.py around lines 111 to 123, replace the
generic ValueError raised in the except block with a more specific custom
exception or a relevant built-in exception that better describes the failure
context of the external API call. Define a custom exception class if needed, and
raise that instead to improve error clarity and handling downstream.
```python
data = {"input": input_payload}

logger.debug(f"***data: {data}")
```
Debug logging exposes sensitive data. Critical issue.
The debug log outputs the entire request payload which likely contains signed URLs and API keys. This is a security risk in production logs.
Apply this diff to sanitize the debug output:
```diff
-        logger.debug(f"***data: {data}")
+        # Log payload structure without sensitive data
+        safe_data = {
+            "input": {
+                "audio": "[SIGNED_URL_REDACTED]",
+                "initial_prompt": data["input"].get("initial_prompt", "")[:100] + "..." if len(data["input"].get("initial_prompt", "")) > 100 else data["input"].get("initial_prompt", ""),
+                "language": data["input"].get("language")
+            }
+        }
+        logger.debug(f"Transcription request payload: {safe_data}")
```

📝 Committable suggestion
```python
# Log payload structure without sensitive data
safe_data = {
    "input": {
        "audio": "[SIGNED_URL_REDACTED]",
        "initial_prompt": (
            data["input"].get("initial_prompt", "")[:100] + "..."
            if len(data["input"].get("initial_prompt", "")) > 100
            else data["input"].get("initial_prompt", "")
        ),
        "language": data["input"].get("language"),
    }
}
logger.debug(f"Transcription request payload: {safe_data}")
```
🤖 Prompt for AI Agents
In echo/server/dembrane/transcribe.py at line 56, the debug log prints the
entire request payload including sensitive information like signed URLs and API
keys. To fix this, modify the debug statement to sanitize or redact sensitive
fields before logging, ensuring no secrets are exposed in production logs.
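More generally than the inline `safe_data` suggestion, a small recursive redaction helper keeps the policy in one place instead of at every log call site (a sketch; the key set and helper name are illustrative, not from the codebase):

```python
SENSITIVE_KEYS = {"audio", "api_key", "authorization"}  # illustrative; extend as needed


def redact(payload, sensitive=frozenset(SENSITIVE_KEYS)):
    """Return a copy of `payload` with values under sensitive keys replaced."""
    if isinstance(payload, dict):
        return {
            key: "[REDACTED]" if key.lower() in sensitive else redact(value, sensitive)
            for key, value in payload.items()
        }
    if isinstance(payload, list):
        return [redact(item, sensitive) for item in payload]
    return payload
```

Logging `redact(data)` instead of `data` then never leaks the signed URL, no matter how the payload grows; the original dict is left untouched.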