VER-278: Use Gemini 2.5 Flash Latest as fallback model in Stage 3 #35
Conversation
Add resilient error handling to Stage 3 analysis by implementing automatic fallback from Gemini 2.5 Pro to Flash on recoverable API errors.

- Extract analysis logic into an `analyze_snippet()` task function
- Fall back to Flash on server errors (5xx) and on client errors except auth failures (401, 403)
- Use `HTTPStatus` enum constants for clearer error code handling

This improves processing reliability while maximizing use of the more capable Pro model when available.
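The fallback policy described above can be sketched as follows. This is a minimal, self-contained illustration: `ServerError`, `ClientError`, and `run_analysis` are stand-ins for `google.genai.errors` and the pipeline's `Stage3Executor.run`, and the `main_error` parameter exists only so the demo can simulate failures.

```python
from http import HTTPStatus

FATAL_AUTH_CODES = {HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN}  # 401, 403

class ServerError(Exception):
    """Stand-in for google.genai.errors.ServerError (5xx)."""
    def __init__(self, code):
        self.code = code

class ClientError(Exception):
    """Stand-in for google.genai.errors.ClientError (4xx)."""
    def __init__(self, code):
        self.code = code

def run_analysis(model, fail_with=None):
    # Stand-in for the real Gemini call; raises on demand so the demo is testable.
    if fail_with is not None:
        raise fail_with
    return f"analysis from {model}"

def analyze_with_fallback(main_model, fallback_model, main_error=None):
    try:
        return run_analysis(main_model, fail_with=main_error)
    except ServerError:
        return run_analysis(fallback_model)   # 5xx: retry on the fallback model
    except ClientError as e:
        if e.code in FATAL_AUTH_CODES:
            raise                             # 401/403: surface immediately
        return run_analysis(fallback_model)   # other 4xx: retry on the fallback model
```

With this shape, an auth failure propagates to the caller, while any other recoverable error transparently retries on the cheaper model.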
Note: Other AI code review bot(s) detected. CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Added

Changes
Sequence Diagram(s)

sequenceDiagram
participant Proc as process_snippet
participant Analyze as analyze_snippet
participant Primary as Primary Gemini
participant Fallback as Gemini 2.5 Flash Latest
Proc->>Analyze: analyze_snippet(gemini_key, audio_file, metadata)
Analyze->>Primary: request analysis
alt Primary returns success
Primary-->>Analyze: response + grounding
else Primary raises ServerError
Primary-->>Analyze: ServerError
Analyze->>Fallback: retry with fallback model
Fallback-->>Analyze: response + grounding
else Primary raises ClientError
alt Unauthorized/Forbidden
Primary-->>Analyze: ClientError (401/403)
Analyze-->>Proc: re-raise error
else Other ClientError
Primary-->>Analyze: ClientError (other)
Analyze->>Fallback: retry with fallback model
Fallback-->>Analyze: response + grounding
end
end
Analyze-->>Proc: return response + grounding metadata
Proc->>Proc: post-processing, status updates, finalization
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (4 passed)
📜 Recent review details. Configuration used: CodeRabbit UI. Review profile: CHILL. Plan: Pro. Disabled knowledge base sources:
📒 Files selected for processing (1)
🧰 Additional context used

🧬 Code graph analysis (1): src/processing_pipeline/stage_3.py (3)
🔇 Additional comments (3)
Warning: There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Pylint (4.0.2)

src/processing_pipeline/stage_3.py: ************* Module .pylintrc ... [truncated 19746 characters] ... ": "src.processing_pipeline.stage_3",
Summary of Changes

Hello @quancao-ea, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the resilience of the AI model integration within the processing pipeline's Stage 3. It introduces a strategic fallback mechanism for Gemini API interactions, allowing the system to automatically switch from the primary

Highlights
Code Review
The pull request introduces a new `analyze_snippet` function to handle the logic of calling the Gemini API with a fallback mechanism. This improves the robustness of the `process_snippet` function by gracefully handling `ServerError` and `ClientError` from the Gemini API, allowing it to retry with a fallback model. The changes also include the necessary imports for `HTTPStatus` and `google.genai.errors`.
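For reference, `HTTPStatus` members are an `IntEnum`, so they compare equal to plain integers. That is why the membership check in the new error handler works even when the exception's `code` attribute is a bare `int` from the API response:

```python
from http import HTTPStatus

# IntEnum members compare equal to their integer codes.
assert HTTPStatus.UNAUTHORIZED == 401
assert HTTPStatus.FORBIDDEN == 403

# So a plain int code matches the membership test used in the handler.
assert 403 in [HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN]
assert 429 not in [HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN]

print(HTTPStatus.FORBIDDEN.phrase)  # Forbidden
```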
```python
except errors.ServerError as e:
    print(f"Server error with {main_model} (code {e.code}): {e.message}")
    print(f"Falling back to {fallback_model}")
    return Stage3Executor.run(
        gemini_key=gemini_key,
        model_name=fallback_model,
        audio_file=audio_file,
        metadata=metadata,
    )
except errors.ClientError as e:
    if e.code in [HTTPStatus.UNAUTHORIZED, HTTPStatus.FORBIDDEN]:
        print(f"Auth error with {main_model} (code {e.code}): {e.message}")
        raise
    else:
        print(f"Client error with {main_model} (code {e.code}): {e.message}")
        print(f"Falling back to {fallback_model}")
        return Stage3Executor.run(
            gemini_key=gemini_key,
            model_name=fallback_model,
            audio_file=audio_file,
            metadata=metadata,
        )
```
The ClientError handling distinguishes between authentication/authorization errors (which should raise immediately) and other client errors, for which a fallback is attempted. This is a robust approach to error management, ensuring that critical configuration issues are surfaced quickly while other client-side problems can leverage the fallback.
```diff
@@ -1,7 +1,9 @@
 from datetime import datetime
+from http import HTTPStatus
 import os
 import time
 from google import genai
+from google.genai import errors
```

```python
@optional_task(log_prints=True)
def analyze_snippet(gemini_key, audio_file, metadata):
    main_model = GeminiModel.GEMINI_2_5_PRO
    fallback_model = GeminiModel.GEMINI_FLASH_LATEST
```

```python
@optional_task(log_prints=True)
def process_snippet(supabase_client, snippet, local_file, gemini_key, skip_review: bool):
    print(f"Processing snippet: {local_file}")
```
Caution
Changes requested ❌
Reviewed everything up to 37bcb4c in 2 minutes and 55 seconds. Click for details.

- Reviewed 71 lines of code in 1 file
- Skipped 0 files when reviewing
- Skipped posting 4 draft comments. View those below.
- Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.
1. src/processing_pipeline/stage_3.py:145
   - Draft comment: Add a docstring for `analyze_snippet` to describe its purpose and parameters.
   - Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 30% vs. threshold = 50%. This is a code quality suggestion about documentation. The rules state that "Comments that suggest code quality refactors are good! But only if they are actionable and clear." However, I need to consider: (1) Is this comment about a change in the diff? Yes, `analyze_snippet` is a new function. (2) Is it actionable? Yes, adding a docstring is clear and actionable. (3) Is it important enough? Looking at the file, most functions don't have docstrings - only the complex `Stage3Executor.run` method has one. The `analyze_snippet` function is relatively straightforward with clear parameter names and a simple fallback pattern. (4) The rules say not to make comments that are "obvious or unimportant" - adding docstrings to every function could be considered standard practice but not critical. The function's purpose is fairly clear from its name and implementation. While adding docstrings is generally good practice, this comment might be too prescriptive for a code review tool. The function is relatively simple and self-explanatory, and most other similar functions in the file lack docstrings. This could be seen as an "obvious" or low-priority suggestion that doesn't require immediate action. However, the function does implement non-trivial error handling logic with fallback behavior, and documenting the parameters and return value could be valuable. The comment is actionable and clear. But given that the codebase doesn't consistently use docstrings for similar functions, and the rules emphasize only commenting when there's a clear code change required, this seems more like a nice-to-have rather than a must-have. This is a borderline case. The comment is actionable and about new code, but it's a minor code quality suggestion rather than identifying a bug or important issue. Given the inconsistent use of docstrings in the existing codebase and the relatively straightforward nature of the function, this comment should likely be deleted as it's not critical enough to warrant inclusion.
2. src/processing_pipeline/stage_3.py:150
   - Draft comment: Replace print statements with a structured logging solution.
   - Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%. This comment is a general code quality suggestion about using structured logging instead of print statements. While this might be a valid improvement, it's not specific to the changes in this PR - the entire codebase uses print statements consistently. The rules state that comments should be about clear code changes required, not general refactoring suggestions unless they are actionable and clear. This comment doesn't point to a specific problem with the new code; it's more of a general architectural suggestion. Additionally, the code is using Prefect's `log_prints=True` which already provides some structure to the logging. The comment doesn't provide specific guidance on what structured logging solution to use or why it's necessary for this particular change. The comment could be valid if there's a specific reason why structured logging is needed for this particular error handling and fallback logic - perhaps for better monitoring or debugging of the model fallback behavior. The new code does add error handling that might benefit from structured logging with proper log levels (ERROR, WARNING, INFO). While structured logging might be beneficial, this comment applies equally to the entire file (and likely the entire codebase), not specifically to the changes made in this PR. The comment doesn't explain why this particular change requires structured logging more than the existing code. It's a general refactoring suggestion rather than a specific issue with the new code. This comment should be deleted. It's a general code quality suggestion that applies to the entire codebase, not specifically to the changes in this PR. The rules explicitly state that comments should be actionable and clear, and this is more of a broad architectural suggestion without specific guidance.
3. src/processing_pipeline/stage_3.py:157
   - Draft comment: Refactor duplicated fallback logic in the exception handlers to reduce redundancy.
   - Reason this comment was not posted: Comment looked like it was already resolved.
4. src/processing_pipeline/stage_3.py:167
   - Draft comment: Consider logging the full exception stack trace (e.g., using logging.exception) instead of just printing error messages.
   - Reason this comment was not posted: Decided after close inspection that this draft comment was likely wrong and/or not actionable: usefulness confidence = 20% vs. threshold = 50%. The comment is suggesting a code quality improvement (using logging instead of print). However, I need to consider: 1) The entire file uses `print()` statements consistently, not `logging`, 2) The `@optional_task(log_prints=True)` decorator suggests this is intentional - Prefect tasks capture print output, 3) The comment doesn't account for the existing patterns in the codebase, 4) This would require importing the logging module which isn't currently used, 5) The comment is about code that WAS changed (new error handling), so it is about changes. However, suggesting to change the logging approach when the entire file and framework uses print statements seems like it's not respecting the existing patterns. The comment is technically about changed code (the new error handling), so it does relate to the diff. It's also a reasonable suggestion in general - logging.exception() does provide stack traces which can be helpful for debugging. Maybe the codebase should be using proper logging? While logging.exception() is generally better practice, this comment ignores the clear pattern in the codebase where ALL error handling uses print() and the Prefect decorator explicitly captures print output with `log_prints=True`. Changing just this one line would be inconsistent. If logging should be used, it would need to be a broader refactor across the entire file, not just this one line. The comment doesn't acknowledge this context. This comment should be deleted. While it's technically about changed code, it suggests breaking from the established pattern in the file (using print() with Prefect's log_prints=True) without acknowledging that context. The suggestion would require importing logging and would be inconsistent with the rest of the file's error handling approach.
Workflow ID: wflow_xV6Ofh91E7DwutqO
You can customize by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.
```python
response, grounding_metadata = analyze_snippet(
    gemini_key=gemini_key,
    model_name=GeminiModel.GEMINI_2_5_PRO,
    audio_file=local_file,
```
Use consistent naming for the audio file parameter (audio_file vs local_file) for clarity.
Actionable comments posted: 1
🧹 Nitpick comments (2)
src/processing_pipeline/stage_3.py (2)
151-165: Consider efficiency: file uploaded twice when fallback is triggered.

`Stage3Executor.run` uploads the audio file each time it's called (line 308). When the main model fails and triggers the fallback, the same file is uploaded again. For better efficiency, consider refactoring to upload once and pass the uploaded file reference to both attempts. Also applies to: 173-178
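One way to implement this suggestion is to hoist the upload out of the try/except so both attempts reuse the same handle. The `upload` and `run_model` callables below are hypothetical stand-ins for the real upload step inside `Stage3Executor.run`, whose actual API may differ:

```python
def analyze_snippet_upload_once(audio_file, metadata, upload, run_model,
                                main_model="pro", fallback_model="flash"):
    # Upload exactly once; both the main and fallback attempts reuse the handle.
    uploaded = upload(audio_file)
    try:
        return run_model(main_model, uploaded, metadata)
    except Exception:
        return run_model(fallback_model, uploaded, metadata)

# Demo doubles: count uploads and force the primary model to fail.
calls = {"uploads": 0}

def upload(path):
    calls["uploads"] += 1
    return f"handle:{path}"

def run_model(model, handle, metadata):
    if model == "pro":
        raise RuntimeError("server error")
    return (model, handle)

result = analyze_snippet_upload_once("snippet.mp3", {}, upload, run_model)
assert result == ("flash", "handle:snippet.mp3")
assert calls["uploads"] == 1  # the fallback did not trigger a second upload
```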
157-178: Preserve main error context when fallback also fails.

If both the main and fallback models fail, only the fallback exception propagates, losing context from the initial failure. Consider capturing the main error and either logging it or including it in a chained exception to aid debugging.
Example approach:
```diff
 except errors.ServerError as e:
     print(f"Server error with {main_model} (code {e.code}): {e.message}")
     print(f"Falling back to {fallback_model}")
+    try:
-    return Stage3Executor.run(
-        gemini_key=gemini_key,
-        model_name=fallback_model,
-        audio_file=audio_file,
-        metadata=metadata,
-    )
+        return Stage3Executor.run(
+            gemini_key=gemini_key,
+            model_name=fallback_model,
+            audio_file=audio_file,
+            metadata=metadata,
+        )
+    except Exception as fallback_error:
+        print(f"Fallback also failed: {fallback_error}")
+        raise fallback_error from e
```
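For reference, `raise fallback_error from e` stores the original failure on `__cause__`, so both errors appear in the traceback. A minimal demonstration of the chaining behavior (not the pipeline's actual code):

```python
def analyze():
    try:
        raise RuntimeError("primary model failed")       # main model attempt
    except RuntimeError as e:
        try:
            raise RuntimeError("fallback model failed")  # fallback attempt
        except RuntimeError as fallback_error:
            raise fallback_error from e                  # chain both errors

try:
    analyze()
except RuntimeError as err:
    print(err)            # fallback model failed
    print(err.__cause__)  # primary model failed
```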
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
src/processing_pipeline/stage_3.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/processing_pipeline/stage_3.py (4)

- src/processing_pipeline/constants.py (1): GeminiModel (5-11)
- src/processing_pipeline/stage_1.py (2): run (505-544), run (552-573)
- src/processing_pipeline/stage_1_preprocess.py (2): run (33-90), run (100-153)
- src/utils.py (1): optional_task (6-40)
🔇 Additional comments (3)
src/processing_pipeline/stage_3.py (3)
2-2: LGTM! Imports support the new error handling.

The `HTTPStatus` and `errors` imports are correctly used in the new `analyze_snippet` function to handle Gemini API errors and implement the fallback mechanism. Also applies to: 6-6

189-193: Clean integration with the new fallback mechanism.

The refactoring to use `analyze_snippet` correctly delegates the analysis while maintaining the existing error handling and status update logic. The change is well-contained and doesn't affect the overall flow of `process_snippet`.

144-144: Verify: no retries configured for `analyze_snippet`.

Unlike other tasks in this file (e.g., `fetch_a_specific_snippet_from_supabase` with `retries=3`), `analyze_snippet` has no retry configuration. This appears intentional since the function implements its own fallback logic, but please confirm this is the desired behavior. Adding retries here would multiply attempts (retries × fallback), potentially causing excessive API calls.
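The multiplication effect is easy to demonstrate: if a retry wrapper (a toy stand-in here for a task runner's `retries=3` policy, not Prefect itself) wraps a function that internally falls back once, each outer attempt performs two API calls:

```python
def with_retries(fn, retries):
    # Minimal stand-in for a task runner's retry policy.
    last_error = None
    for _ in range(retries + 1):
        try:
            return fn()
        except Exception as e:
            last_error = e
    raise last_error

api_calls = {"n": 0}

def analyze_with_fallback():
    api_calls["n"] += 1  # primary model call
    # primary "fails" in this demo, so the fallback is also tried...
    api_calls["n"] += 1  # fallback model call
    raise RuntimeError("both models failed")  # ...and it fails too

try:
    with_retries(analyze_with_fallback, retries=3)
except RuntimeError:
    pass

assert api_calls["n"] == 8  # (1 primary + 1 fallback) x 4 attempts
```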
```python
def analyze_snippet(gemini_key, audio_file, metadata):
    main_model = GeminiModel.GEMINI_2_5_PRO
    fallback_model = GeminiModel.GEMINI_FLASH_LATEST
```
🧩 Analysis chain
Verify the fallback model matches the PR objective.
The PR title specifies "Gemini 2.5 Flash Latest" as the fallback model, but the code uses GEMINI_FLASH_LATEST which maps to "gemini-flash-latest". From the constants file, there's also GEMINI_2_5_FLASH = "gemini-2.5-flash". Please confirm whether GEMINI_FLASH_LATEST is the correct model or if you intended to use a model that explicitly identifies as version 2.5.
🌐 Web query:
What is the difference between Google Gemini models "gemini-flash-latest" and "gemini-2.5-flash"?
💡 Result:
Short answer: gemini-flash-latest is an alias that always points to the most recent "flash" release (it can be stable, preview or experimental and may be hot‑swapped), whereas gemini-2.5-flash is the specific stable Gemini 2.5 Flash model (a fixed release with documented capabilities and limits). [1][2]
Details (concise):
- Alias vs fixed name: gemini-flash-latest = latest release alias; gemini-2.5-flash = specific stable model. [1].
- Capabilities/token limits: Gemini 2.5 Flash (gemini-2.5-flash) documents a 1,048,576 input token limit and 65,536 output token limit and lists its supported modalities and features. Use the stable model string for production to avoid unexpected changes from alias updates. [1][2]
Sources:
[1] Google Gemini models (model/version naming & gemini-2.5-flash).
[2] Google Gemini models (description of "latest" alias and guidance).
Clarify fallback model intent—consider using the stable model for production consistency.
The PR title states "Gemini 2.5 Flash Latest," but the code uses GEMINI_FLASH_LATEST which is an alias that can change and may not always remain Gemini 2.5. The specific gemini-2.5-flash model is the stable Gemini 2.5 Flash version with documented capabilities and limits, whereas gemini-flash-latest is a rolling alias that can be hot-swapped and may be preview or experimental.
If your intent is to use Gemini 2.5 Flash specifically (as the PR title suggests), consider using GEMINI_2_5_FLASH instead to ensure consistent, documented behavior in production. If the alias is intentional, update the PR title to clarify that you want the latest flash model regardless of version.
🤖 Prompt for AI Agents
In src/processing_pipeline/stage_3.py around line 147, the fallback_model is set
to the rolling alias GEMINI_FLASH_LATEST which can change; if the intent is to
pin to the stable Gemini 2.5 Flash for production consistency, replace the alias
with the explicit GEMINI_2_5_FLASH constant (or the string identifier for
gemini-2.5-flash) so the code always selects that stable model; if you
intentionally want the rolling latest, update the PR title/message to state that
fact instead of referencing Gemini 2.5 Flash.
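The alias-vs-pinned distinction can be made concrete in code. The constant names and string values below come from the project's `GeminiModel` enum in `constants.py`; the `str, Enum` base is an assumption about how that enum is defined:

```python
from enum import Enum

class GeminiModel(str, Enum):
    # Pinned release: fixed, documented capabilities and token limits.
    GEMINI_2_5_FLASH = "gemini-2.5-flash"
    # Rolling alias: Google may repoint it to a newer (even preview) Flash model.
    GEMINI_FLASH_LATEST = "gemini-flash-latest"

# A str-based enum member compares equal to the raw model identifier,
# so either constant can be passed wherever a model name string is expected.
assert GeminiModel.GEMINI_2_5_FLASH == "gemini-2.5-flash"
assert GeminiModel.GEMINI_FLASH_LATEST == "gemini-flash-latest"
```

Switching the fallback to `GeminiModel.GEMINI_2_5_FLASH` would pin the behavior; keeping `GEMINI_FLASH_LATEST` accepts whatever Google currently serves under the alias.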
…back-model-in-stage-3
Important

Adds fallback model `GeminiModel.GEMINI_FLASH_LATEST` in `analyze_snippet()` for error handling in `stage_3.py`.

- Adds `analyze_snippet()` in `stage_3.py` to handle snippet analysis with fallback to `GeminiModel.GEMINI_FLASH_LATEST` if `GeminiModel.GEMINI_2_5_PRO` fails.
- Handles `ServerError` and `ClientError` in `analyze_snippet()`, falling back unless the error is `UNAUTHORIZED` or `FORBIDDEN`.
- Moves analysis logic from `process_snippet()` to `analyze_snippet()` in `stage_3.py`.
- Imports `HTTPStatus` and `errors` in `stage_3.py`.

This description was created by Ellipsis for 37bcb4c. You can customize this summary. It will automatically update as commits are pushed.
Summary by CodeRabbit