Integrate Gemini 2.5 into the Analysis Pipeline #16
Conversation
Walkthrough

Adds Google GenAI v2 client usage and Gemini 2.5 model identifiers across the pipeline, replaces the Gemini 1206 transcription generator with a Gemini 2.5 Pro implementation, migrates stages to `client.models.generate_content` with `GenerateContentConfig`/`SafetySetting`, updates the Supabase and main flows, and bumps dependencies and tests.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant Caller as Main / Test
    participant Supabase as SupabaseClient
    participant Stage4 as Stage4Executor
    participant GenAI as google.genai.Client
    participant Tool as GoogleSearchTool
    Caller->>Supabase: fetch_snippet(id)
    Supabase-->>Caller: snippet (transcription, metadata, analysis_json, disinformation_snippet)
    Caller->>Stage4: run(transcription, disinformation_snippet, metadata, analysis_json)
    Stage4->>GenAI: client.models.generate_content(model=GEMINI_2_5_PRO, contents, config{tools, safety,...})
    Note right of GenAI: config may include GoogleSearch Tool and SafetySetting list
    GenAI-->>Stage4: response (text, candidates...)
    alt response.text parses as JSON
        Stage4->>Stage4: extract JSON and grounding_metadata
    else parse fails
        Stage4->>GenAI: client.models.generate_content(model=GEMINI_2_5_PRO, contents=validation_prompt, config{schema/mime})
        GenAI-->>Stage4: validated JSON text
        Stage4->>Stage4: extract JSON and grounding_metadata
    end
    Stage4->>Supabase: submit_snippet_review(result, reviewed_by=GEMINI_2_5_PRO)
    Supabase-->>Stage4: ack
    Stage4-->>Caller: processed result
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Warning: There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it is a critical failure. 🔧 Pylint (3.3.7) on src/processing_pipeline/stage_4.py (output truncated).
Actionable comments posted: 0
🧹 Nitpick comments (5)
src/processing_pipeline/stage_4.py (2)
6-14: Remove unused imports.
`HarmCategory` and `HarmBlockThreshold` are never referenced in the code. They can be safely removed to reduce clutter and align with the static analysis suggestions.
🧰 Tools
🪛 Ruff (0.8.2)
11-11: `google.genai.types.HarmCategory` imported but unused. Remove unused import (F401)
12-12: `google.genai.types.HarmBlockThreshold` imported but unused. Remove unused import (F401)
318-318: Potential JSON parsing exception.
`json.loads(response.text)` will raise if the response is invalid JSON. Though it is caught by the `"is_convertible": false` logic next, consider logging the partial text for debugging if an unhandled exception occurs.
src/processing_pipeline/constants.py (2)
47-47: Consider a file handling guard.
In `get_timestamped_transcription_generation_prompt()`, if the file is missing or locked, an exception is raised. Consider a `try/except` or `with open(...) as f` pattern for safer file operations.
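A minimal sketch of that guard, using a hypothetical `load_prompt` helper rather than the module's actual loader functions:

```python
from pathlib import Path

def load_prompt(path: str, fallback: str = "") -> str:
    """Read a prompt file, returning a fallback instead of raising on IO errors."""
    try:
        # Path.read_text opens and closes the file for us, unlike a bare open(...).read()
        return Path(path).read_text(encoding="utf-8")
    except OSError as e:
        print(f"Warning: could not read prompt file {path}: {e}")
        return fallback
```

Whether to return a fallback or re-raise depends on how critical the prompt file is to the pipeline stage.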
51-51: Add fallback logic for missing JSON file.Similar to the prompt file, consider gracefully handling file IO errors.
src/processing_pipeline/supabase_utils.py (1)
255-255: Avoid hard-coding in production.
`"reviewed_by": GEMINI_2_0_FLASH_EXP` is explicitly set. Consider passing the model name as a parameter or storing it in configuration.
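One way to do that, sketched with a hypothetical `build_review_payload` helper (the constant value is assumed from the PR context):

```python
# Assumed constant; in the real code this would live in processing_pipeline/constants.py
GEMINI_2_0_FLASH_EXP = "gemini-2.0-flash-exp"

def build_review_payload(result: dict, reviewed_by: str = GEMINI_2_0_FLASH_EXP) -> dict:
    """Attach the reviewing model's name as a parameter instead of a hard-coded literal."""
    return {**result, "reviewed_by": reviewed_by}
```

Callers can then override the model name per call, and the default stays in one place.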
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- requirements.txt (1 hunks)
- src/processing_pipeline/constants.py (2 hunks)
- src/processing_pipeline/stage_4.py (5 hunks)
- src/processing_pipeline/supabase_utils.py (2 hunks)
- tests/processing_pipeline/test_stage_4.py (8 hunks)
- tests/test_supabase_utils.py (2 hunks)
✅ Files skipped from review due to trivial changes (1)
- requirements.txt
🧰 Additional context used
🪛 Ruff (0.8.2)
src/processing_pipeline/stage_4.py
11-11: google.genai.types.HarmCategory imported but unused
Remove unused import
(F401)
12-12: google.genai.types.HarmBlockThreshold imported but unused
Remove unused import
(F401)
🔇 Additional comments (19)
src/processing_pipeline/stage_4.py (5)
18-18: Model constant added.
The definition of GEMINI_2_0_FLASH_EXP ensures alignment with the new model version. This change looks good.
208-210: Ensure consistent environment usage.
We set gemini_key from GOOGLE_GEMINI_PAID_KEY. Confirm that future references (e.g., in __ensure_json_format) safely rely on the same or a compatible environment variable.
221-251: Confirm safety settings approach.
All categories are set to "BLOCK_NONE". Verify if this is the intended behavior, as it permits any content.
268-269: Different environment variable for second prompt.
Here we read GOOGLE_GEMINI_KEY instead of GOOGLE_GEMINI_PAID_KEY. Ensure that using a separate key for the JSON-formatting prompt is intentional.
287-315: High token limit.
max_output_tokens=8192 may be memory-intensive. Confirm that you won't encounter performance or quota issues.
src/processing_pipeline/constants.py (1)
5-5: Added new model constant.
GEMINI_2_0_FLASH_EXP = "gemini-2.0-flash-exp" looks good.
src/processing_pipeline/supabase_utils.py (1)
3-3: Introduced new constant import.
Pulling in GEMINI_2_0_FLASH_EXP aligns with the move to the new model.
tests/processing_pipeline/test_stage_4.py (10)
3-3: Extended mock import usage.
Importing ANY from unittest.mock enhances test flexibility. This increment in mocking scope looks good.
17-17: Retain legacy constant for coverage.
GEMINI_1_5_PRO is still used in tests to confirm backward compatibility or second call usage.
36-45: Mocking the new client approach.
These lines correctly patch google.genai.Client and configure the model. Ensure any additional calls to client in the code are also mocked if needed.
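A minimal sketch of such a mock (the helper name is hypothetical; the attribute shape mirrors the new `client.models.generate_content` surface described above):

```python
from unittest.mock import MagicMock

def make_mock_genai_client(response_text: str) -> MagicMock:
    """Build a stand-in for google.genai.Client exposing models.generate_content."""
    client = MagicMock()
    response = MagicMock()
    response.text = response_text
    response.candidates = []  # grounding metadata can be attached here when a test needs it
    client.models.generate_content.return_value = response
    return client
```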
102-103: Multiple response candidates.
Adding candidates helps test grounding metadata usage. Good improvement for capturing more test scenarios.
104-108: Safe usage of the models.generate_content mock.
Switching to models.generate_content for test calls is consistent with the library changes.
459-459: Multiple calls in single test.
Using side_effect with a list is effective for testing consecutive calls. Nice approach to verifying each call in sequence.
475-476: Validate first model call.
Ensuring the first call uses GEMINI_2_0_FLASH_EXP demonstrates that the pipeline picks the correct primary model.
480-482: Validate second model call.
The second call uses GEMINI_1_5_PRO. If this fallback is intentional for JSON validation, the test is correct.
598-598: Simulate empty response.
"is_convertible": false is a good scenario to ensure the snippet is not updated. This negative path coverage is valuable.
744-744: API error tested.
Raising Exception("API Error") confirms Stage4Executor.run logic handles generative call failures.
tests/test_supabase_utils.py (2)
4-4: Updated constant import.
Importing GEMINI_2_0_FLASH_EXP ensures tests align with the new model references in production code.
699-699: Confirm new reviewed_by constant usage.
The test’s updated "reviewed_by" field matches the main code’s move to GEMINI_2_0_FLASH_EXP.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/main.py (1)
37-38: Remove or clarify the outdated comment.
The comment about changing the paid key to the free key appears to be a TODO or note. Since the code doesn't actually perform this change, the comment should either be removed or clarified with more context about when/how this should be done.
```diff
- # We need to change the paid key to the free key in stage 4
+ # TODO: Consider using free tier key for testing to avoid costs
```
Or simply remove it if it's no longer relevant.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (8)
- prompts/Stage_4_review_prompt.md (3 hunks)
- requirements.txt (1 hunks)
- src/main.py (1 hunks)
- src/processing_pipeline/constants.py (2 hunks)
- src/processing_pipeline/stage_4.py (5 hunks)
- src/processing_pipeline/supabase_utils.py (2 hunks)
- tests/processing_pipeline/test_stage_4.py (8 hunks)
- tests/test_supabase_utils.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- requirements.txt
- src/processing_pipeline/constants.py
- tests/test_supabase_utils.py
🧰 Additional context used
🧬 Code graph analysis (2)
src/main.py (2)
src/processing_pipeline/stage_4.py (3)
- Stage4Executor (189-334)
- prepare_snippet_for_review (27-50)
- run (196-270)
src/processing_pipeline/supabase_utils.py (1)
- get_snippet_by_id (29-31)
tests/processing_pipeline/test_stage_4.py (2)
tests/processing_pipeline/test_stage_3.py (1)
- mock_gemini_model (40-59)
tests/processing_pipeline/test_stage_1.py (1)
- mock_gemini_model (71-77)
🪛 markdownlint-cli2 (0.17.2)
prompts/Stage_4_review_prompt.md
161-161: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
174-174: Unordered list indentation
Expected: 2; Actual: 4
(MD007, ul-indent)
🪛 Ruff (0.12.2)
src/processing_pipeline/stage_4.py
259-259: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (15)
prompts/Stage_4_review_prompt.md (3)
145-145: LGTM!
The header change from "Primacy of Transcription and Metadata" to "Transcription and Metadata" maintains clarity while removing unnecessary emphasis. The substantive guideline remains unchanged.
161-161: Important accuracy directive added.
The new instruction to verify and correct inaccurate claims in the Analysis JSON is crucial for Stage 4's review function and aligns well with the PR's integration of the new Gemini model. This ensures reviewers are explicitly prompted to validate rather than accept claims at face value.
174-174: Simplification of wording is appropriate.
Changing "well-written" to "good" maintains the intent while using simpler language that's less subjective.
src/processing_pipeline/supabase_utils.py (1)
3-3: Model constant update looks good.
The import and usage of `GEMINI_2_5_PRO` correctly replaces `GEMINI_1_5_PRO` for the `reviewed_by` field, maintaining consistency with the new model integration across the pipeline.
Also applies to: 255-255
src/processing_pipeline/stage_4.py (6)
1-12: LGTM! Clean migration to Google GenAI v2 API.
The imports correctly reflect the new API surface with `google.genai.Client` and the necessary configuration types. The removal of the old `generativeai` imports is complete.
207-209: API migration implemented correctly.
The transition from `GenerativeModel` to the new `Client`-based API is properly implemented. The `google_search_tool` is correctly instantiated for grounding support.
220-251: Comprehensive safety settings and proper API usage.
The `generate_content` call correctly uses the new API structure with:
- Proper `GenerateContentConfig` setup
- Google Search tool integration for grounding
- Comprehensive safety settings including `HARM_CATEGORY_CIVIC_INTEGRITY`
- Appropriate response modality set to TEXT
This aligns well with the PR's goal of integrating the new Gemini model.
253-265: Good JSON extraction logic with proper fallback.
The implementation correctly:
- Attempts to extract JSON from the response text
- Handles cases where the response might contain additional text around the JSON
- Falls back to `__ensure_json_format` for validation when initial parsing fails
This robust approach ensures reliable JSON output.
278-279: Verify the use of different API keys for fallback.
The fallback method uses `GOOGLE_GEMINI_KEY` while the main method uses `GOOGLE_GEMINI_PAID_KEY`. Also note that the fallback uses `GEMINI_1_5_PRO` instead of `GEMINI_2_5_PRO`. Is it intentional to use different API keys and models for the fallback validation? If the paid key has higher limits, it might make sense to use it for both calls to ensure consistency. Also consider whether using the older model for validation might introduce discrepancies.
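If consistency is the goal, one option is to centralize the choice in a small helper. This is only a sketch; the helper name is hypothetical, and the env var and model names are taken from the review context above:

```python
import os

def resolve_gemini_call(fallback: bool = False) -> tuple[str, str]:
    """Return the (api_key, model) pair for a call, so the primary and
    fallback paths diverge only deliberately, in one place."""
    if fallback:
        return os.environ.get("GOOGLE_GEMINI_KEY", ""), "gemini-1.5-pro-002"
    return os.environ.get("GOOGLE_GEMINI_PAID_KEY", ""), "gemini-2.5-pro"
```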
297-327: Excellent structured JSON validation approach.
The fallback validation correctly:
- Uses `response_mime_type` and `response_schema` for strict JSON output
- Includes the same comprehensive safety settings
- Properly validates convertibility before returning
This ensures the output always conforms to the expected schema.
src/main.py (2)
1-7: LGTM! Clean migration to Supabase-based workflow.
The imports correctly reflect the removal of AWS/S3 dependencies and the addition of Supabase and Stage 4 components, aligning with the PR's objectives.
15-35: Test harness properly updated for new Stage 4 API.
The test function correctly:
- Fetches a snippet from Supabase using the new client
- Calls `prepare_snippet_for_review` with the correct signature
- Passes all four required parameters to `Stage4Executor.run`
- Includes the new `disinformation_snippet` parameter
The structured output formatting with JSON pretty-printing improves readability.
tests/processing_pipeline/test_stage_4.py (3)
36-109: Test mock correctly updated for new GenAI Client API.
The mock setup properly reflects the new API structure:
- Creates a `Client` mock with a `models` attribute
- Configures `models.generate_content` to return the expected response
- Includes comprehensive test data matching the expected schema
- Sets up `candidates` with grounding metadata
This ensures tests accurately validate the new integration.
459-459: Test assertions correctly validate the new API calls.
The test properly verifies:
- The mock client is called twice (main + fallback)
- The first call uses `GEMINI_2_5_PRO` with the new API signature
- The second call uses `GEMINI_1_5_PRO` for JSON validation
- Both calls use the correct parameter structure with `model`, `contents`, and `config`
Also applies to: 475-482
598-641: Test coverage for error scenarios is comprehensive.
The tests properly handle:
- Empty/invalid responses with `is_convertible: false`
- API errors with appropriate exception handling
- Missing fields and invalid data
The mock setup correctly simulates these error conditions using the new API structure.
Also applies to: 744-744
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/processing_pipeline/constants.py (1)
1-7: Fix FD leaks and set UTF-8 encoding for all prompt/schema loaders.
The current `open(...).read()` / `json.load(open(...))` calls leak file descriptors and rely on locale encodings. Use `Path` helpers with explicit UTF-8.
```diff
+from pathlib import Path
 import json

 GEMINI_1_5_PRO = "gemini-1.5-pro-002"
 GEMINI_1_5_FLASH = "gemini-1.5-flash"
 GEMINI_2_5_FLASH = "gemini-2.5-flash"
 GEMINI_2_5_PRO = "gemini-2.5-pro"

+def _read_text(path: str) -> str:
+    return Path(path).read_text(encoding="utf-8")
+
+def _read_json(path: str):
+    return json.loads(Path(path).read_text(encoding="utf-8"))

 def get_transcription_prompt_for_stage_1_preprocess():
-    return open("prompts/Stage_1_Preprocess_transcription_prompt.md", "r").read()
+    return _read_text("prompts/Stage_1_Preprocess_transcription_prompt.md")

 def get_detection_prompt_for_stage_1():
-    return open("prompts/Stage_1_detection_prompt.md", "r").read()
+    return _read_text("prompts/Stage_1_detection_prompt.md")

 def get_detection_prompt_for_stage_1_preprocess():
-    return open("prompts/Stage_1_Preprocess_detection_prompt.md", "r").read()
+    return _read_text("prompts/Stage_1_Preprocess_detection_prompt.md")

 def get_system_instruction_for_stage_1():
-    return open("prompts/Stage_1_system_instruction.md", "r").read()
+    return _read_text("prompts/Stage_1_system_instruction.md")

 def get_system_instruction_for_stage_1_preprocess():
-    return open("prompts/Stage_1_Preprocess_system_instruction.md", "r").read()
+    return _read_text("prompts/Stage_1_Preprocess_system_instruction.md")

 def get_output_schema_for_stage_1():
-    return json.load(open("prompts/Stage_1_output_schema.json", "r"))
+    return _read_json("prompts/Stage_1_output_schema.json")

 def get_output_schema_for_stage_1_preprocess():
-    return json.load(open("prompts/Stage_1_Preprocess_output_schema.json", "r"))
+    return _read_json("prompts/Stage_1_Preprocess_output_schema.json")

 def get_user_prompt_for_stage_3():
-    return open("prompts/Stage_3_analysis_prompt.md", "r").read()
+    return _read_text("prompts/Stage_3_analysis_prompt.md")

 def get_system_instruction_for_stage_3():
-    return open("prompts/Stage_3_system_instruction.md", "r").read()
+    return _read_text("prompts/Stage_3_system_instruction.md")

 def get_output_schema_for_stage_3():
-    return json.load(open("prompts/Stage_3_output_schema.json", "r"))
+    return _read_json("prompts/Stage_3_output_schema.json")

 def get_timestamped_transcription_generation_prompt():
-    return open("prompts/Timestamped_transcription_generation_prompt.md", "r").read()
+    return _read_text("prompts/Timestamped_transcription_generation_prompt.md")

 def get_timestamped_transcription_generation_output_schema():
-    return json.load(open("prompts/Timestamped_transcription_generation_output_schema.json", "r"))
+    return _read_json("prompts/Timestamped_transcription_generation_output_schema.json")

 def get_system_instruction_for_stage_4():
-    return open("prompts/Stage_4_system_instruction.md", "r").read()
+    return _read_text("prompts/Stage_4_system_instruction.md")

 def get_user_prompt_for_stage_4():
-    return open("prompts/Stage_4_review_prompt.md", "r").read()
+    return _read_text("prompts/Stage_4_review_prompt.md")

 def get_output_schema_for_stage_4():
-    return json.load(open("prompts/Stage_4_output_schema.json", "r"))
+    return _read_json("prompts/Stage_4_output_schema.json")

 def get_gemini_2_5_pro_transcription_generation_prompt():
-    return open("prompts/Gemini_2_5_pro_transcription_generation_prompt.md", "r").read()
+    return _read_text("prompts/Gemini_2_5_pro_transcription_generation_prompt.md")
```
Also applies to: 9-71
🧹 Nitpick comments (27)
tests/processing_pipeline/test_timestamped_transcription_generator.py (3)
63-66: Relax strict length assertions for MP3 rounding jitter.
Exact equality on MP3 durations is flaky across encoders/platforms. Use a tolerance.
Apply:
```diff
-assert abs(len(audio) - segment_length_ms) == 0
+assert abs(len(audio) - segment_length_ms) <= 150  # allow small encoder jitter
```
```diff
-assert len(audio1) == 10000  # The first part must be rounded down to the nearest multiple of 10 seconds
-assert len(audio2) == 20000  # The remaining part of the audio
+assert abs(len(audio1) - 10000) <= 150  # rounded down to nearest 10s
+assert abs(len(audio2) - 20000) <= 150  # remainder
```
Also applies to: 78-83
84-98: Patching and wiring look correct; add a minimal call assertion.
Assert we invoked the client once to catch regressions.
```diff
 mock_client_class.return_value = mock_client
+mock_client.models.generate_content.assert_not_called()
 ...
 result = TimestampedTranscriptionGenerator.transcribe_segments(
     segments, "fake-api-key"
 )
+mock_client.models.generate_content.assert_called_once()
```
132-136: Avoid brittle exact error string match.
Error messages vary by OS/ffmpeg. Match on the errno phrase only.
```diff
 with pytest.raises(FileNotFoundError) as exc_info:
     TimestampedTranscriptionGenerator.run("non_existed_file.mp3", "fake-api-key", 10)
-assert str(exc_info.value) == "[Errno 2] No such file or directory: 'non_existed_file.mp3'"
+assert "No such file or directory" in str(exc_info.value)
```
src/processing_pipeline/timestamped_transcription_generator.py (3)
87-91: Use splat instead of list concatenation (nit).
Slightly cleaner and avoids creating an intermediate list.
```diff
 result = client.models.generate_content(
     model=GEMINI_2_5_PRO,
-    contents=[cls.USER_PROMPT] + segments,
+    contents=[cls.USER_PROMPT, *segments],
```
111-113: Harden JSON parsing and empty responses.
Guard against missing/invalid `result.text` to produce clearer errors.
```diff
-return json.loads(result.text)["segments"]
+if not getattr(result, "text", None):
+    raise ValueError("Empty response from Gemini when transcribing segments")
+try:
+    return json.loads(result.text)["segments"]
+except (json.JSONDecodeError, KeyError) as e:
+    raise ValueError(f"Invalid JSON in Gemini response: {e}") from e
```
95-109: Review safety/thinking settings for Stage 4 parity and policy.
All thresholds are `BLOCK_NONE` and `include_thoughts=True`. If that's intentional, consider surfacing them via config/env to align with Stage 4 defaults and compliance posture.
Would you like these values to be driven by env vars (e.g., GEMINI_SAFETY_BLOCK_NONE=1)?
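A sketch of that env-driven approach. The `GEMINI_SAFETY_THRESHOLD` variable name is hypothetical, and the dicts merely stand in for `SafetySetting(...)` instances:

```python
import os

# Category names as used elsewhere in this PR's safety settings
HARM_CATEGORIES = [
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
    "HARM_CATEGORY_CIVIC_INTEGRITY",
]

def build_safety_settings() -> list[dict]:
    """Read the block threshold from the environment, defaulting to BLOCK_NONE."""
    threshold = os.environ.get("GEMINI_SAFETY_THRESHOLD", "BLOCK_NONE")
    return [{"category": c, "threshold": threshold} for c in HARM_CATEGORIES]
```

Each stage could then share one `build_safety_settings()` call instead of repeating the list inline.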
tests/processing_pipeline/test_stage_3.py (1)
163-167: Patch the object where it's used for isolation.
Patching `google.genai.Client` works, but patching the import site is more robust.
```diff
-@patch('google.genai.Client')
+@patch('processing_pipeline.stage_3.genai.Client')
```
Apply similarly to other tests in this file.
Also applies to: 205-207, 228-230, 315-317, 343-344
tests/processing_pipeline/test_stage_1_preprocess.py (1)
49-65: Add safety settings assertions for a stronger contract.
Also assert the SafetySetting categories/thresholds to catch regressions.
Apply this diff:
```diff
 # Check config
 config = kwargs['config']
 assert config.response_mime_type == "application/json"
 assert config.response_schema == Stage1PreprocessTranscriptionExecutor.OUTPUT_SCHEMA
 assert config.max_output_tokens == 8192
+# Assert safety settings for completeness
+categories = {s.category for s in config.safety_settings}
+assert categories == {
+    HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
+    HarmCategory.HARM_CATEGORY_HATE_SPEECH,
+    HarmCategory.HARM_CATEGORY_HARASSMENT,
+    HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
+    HarmCategory.HARM_CATEGORY_CIVIC_INTEGRITY,
+}
```
src/processing_pipeline/stage_3.py (3)
141-147: Outdated log message ("Gemini Pro 1.5").
Update to reflect 2.5 usage.
```diff
-print(f"Processing snippet: {local_file} with Gemini Pro 1.5")
+print(f"Processing snippet: {local_file} with Gemini 2.5 Pro")
```
249-253: Add a timeout/attempt cap to the file-processing poll.
Avoid a potential infinite wait if the upload never leaves PROCESSING.
```diff
-while audio_file.state.name == "PROCESSING":
-    print("Processing the uploaded audio file...")
-    time.sleep(1)
-    audio_file = client.files.get(name=audio_file.name)
+start_time = time.time()
+max_wait_seconds = 300
+while audio_file.state.name == "PROCESSING":
+    if time.time() - start_time > max_wait_seconds:
+        raise TimeoutError("Audio processing timed out")
+    print("Processing the uploaded audio file...")
+    time.sleep(1)
+    audio_file = client.files.get(name=audio_file.name)
```
269-286: Align safety settings with other stages (add CIVIC_INTEGRITY).
Stage 1/4 include CIVIC_INTEGRITY; Stage 3 should, too, for consistency.
```diff
 SafetySetting(
     category=HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
     threshold=HarmBlockThreshold.BLOCK_NONE,
 ),
+SafetySetting(
+    category=HarmCategory.HARM_CATEGORY_CIVIC_INTEGRITY,
+    threshold=HarmBlockThreshold.BLOCK_NONE,
+),
```
src/processing_pipeline/stage_4.py (3)
219-249: Use enums for SafetySetting and enforce JSON in the primary call.
- Swap string categories/thresholds to HarmCategory/HarmBlockThreshold for type safety.
- Ask the model for JSON directly with response_mime_type/response_schema to reduce fallback usage.
- Pass contents as a list for consistency with other stages.
```diff
-from google.genai.types import (
-    GenerateContentConfig,
-    GoogleSearch,
-    SafetySetting,
-    Tool,
-)
+from google.genai.types import (
+    GenerateContentConfig,
+    GoogleSearch,
+    SafetySetting,
+    Tool,
+    HarmCategory,
+    HarmBlockThreshold,
+)
@@
-        response = client.models.generate_content(
-            model=model_id,
-            contents=user_prompt,
-            config=GenerateContentConfig(
+        response = client.models.generate_content(
+            model=model_id,
+            contents=[user_prompt],
+            config=GenerateContentConfig(
                 tools=[google_search_tool],
                 response_modalities=["TEXT"],
+                response_mime_type="application/json",
+                response_schema=cls.OUTPUT_SCHEMA,
                 system_instruction=cls.SYSTEM_INSTRUCTION,
                 max_output_tokens=8192,
                 safety_settings=[
-                    SafetySetting(
-                        category="HARM_CATEGORY_SEXUALLY_EXPLICIT",
-                        threshold="BLOCK_NONE",
-                    ),
-                    SafetySetting(
-                        category="HARM_CATEGORY_HATE_SPEECH",
-                        threshold="BLOCK_NONE",
-                    ),
-                    SafetySetting(
-                        category="HARM_CATEGORY_HARASSMENT",
-                        threshold="BLOCK_NONE",
-                    ),
-                    SafetySetting(
-                        category="HARM_CATEGORY_DANGEROUS_CONTENT",
-                        threshold="BLOCK_NONE",
-                    ),
-                    SafetySetting(
-                        category="HARM_CATEGORY_CIVIC_INTEGRITY",
-                        threshold="BLOCK_NONE",
-                    ),
+                    SafetySetting(category=HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT, threshold=HarmBlockThreshold.BLOCK_NONE),
+                    SafetySetting(category=HarmCategory.HARM_CATEGORY_HATE_SPEECH, threshold=HarmBlockThreshold.BLOCK_NONE),
+                    SafetySetting(category=HarmCategory.HARM_CATEGORY_HARASSMENT, threshold=HarmBlockThreshold.BLOCK_NONE),
+                    SafetySetting(category=HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT, threshold=HarmBlockThreshold.BLOCK_NONE),
+                    SafetySetting(category=HarmCategory.HARM_CATEGORY_CIVIC_INTEGRITY, threshold=HarmBlockThreshold.BLOCK_NONE),
                 ],
             ),
         )
```
296-326: Use enums for SafetySetting in the fallback, too.
Match the primary call's types for consistency.
```diff
-safety_settings=[
-    SafetySetting(
-        category="HARM_CATEGORY_SEXUALLY_EXPLICIT",
-        threshold="BLOCK_NONE",
-    ),
-    SafetySetting(
-        category="HARM_CATEGORY_HATE_SPEECH",
-        threshold="BLOCK_NONE",
-    ),
-    SafetySetting(
-        category="HARM_CATEGORY_HARASSMENT",
-        threshold="BLOCK_NONE",
-    ),
-    SafetySetting(
-        category="HARM_CATEGORY_DANGEROUS_CONTENT",
-        threshold="BLOCK_NONE",
-    ),
-    SafetySetting(
-        category="HARM_CATEGORY_CIVIC_INTEGRITY",
-        threshold="BLOCK_NONE",
-    ),
-],
+safety_settings=[
+    SafetySetting(category=HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT, threshold=HarmBlockThreshold.BLOCK_NONE),
+    SafetySetting(category=HarmCategory.HARM_CATEGORY_HATE_SPEECH, threshold=HarmBlockThreshold.BLOCK_NONE),
+    SafetySetting(category=HarmCategory.HARM_CATEGORY_HARASSMENT, threshold=HarmBlockThreshold.BLOCK_NONE),
+    SafetySetting(category=HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT, threshold=HarmBlockThreshold.BLOCK_NONE),
+    SafetySetting(category=HarmCategory.HARM_CATEGORY_CIVIC_INTEGRITY, threshold=HarmBlockThreshold.BLOCK_NONE),
+],
```
333-334: Shorten exception message per TRY003 and include context in the message payload instead.
Keep the error readable and avoid overly long exception strings.
```diff
-raise ValueError(f"[Stage 4] The provided text is not convertible to a valid JSON object:\n{text}")
+raise ValueError("[Stage 4] The provided text is not convertible to a valid JSON object")
```
src/processing_pipeline/stage_1_preprocess.py (1)
40-46: Consider providing mime_type and add a timeout to polling.
- Supplying mime_type can speed up/clarify file processing.
- Bound the polling loop to avoid indefinite waits.
```diff
-audio_file = client.files.upload(file=audio_file)
+audio_file = client.files.upload(file=audio_file)  # optionally: config={"mime_type": "audio/mp3"}
@@
-while audio_file.state.name == "PROCESSING":
-    print("Processing the uploaded audio file...")
-    time.sleep(1)
-    audio_file = client.files.get(name=audio_file.name)
+start_time = time.time()
+max_wait_seconds = 300
+while audio_file.state.name == "PROCESSING":
+    if time.time() - start_time > max_wait_seconds:
+        raise TimeoutError("Audio processing timed out")
+    print("Processing the uploaded audio file...")
+    time.sleep(1)
+    audio_file = client.files.get(name=audio_file.name)
```
tests/processing_pipeline/test_gemini_2_5_pro_transcription_generator.py (5)
60-62: Avoid hardcoding the model string; assert against the constant.
```diff
 from processing_pipeline.gemini_2_5_pro_transcription_generator import Gemini25ProTranscriptionGenerator
+from processing_pipeline.constants import GEMINI_2_5_PRO
@@
-assert kwargs["model"] == "gemini-2.5-pro"
+assert kwargs["model"] == GEMINI_2_5_PRO
```
76-87: Remove unused fixture argument.
`mock_audio_file` isn't used in this test.
```diff
-def test_run_with_processing_audio(self, mock_genai, mock_audio_file):
+def test_run_with_processing_audio(self, mock_genai):
```
103-107: Remove unused fixture argument.
`mock_genai` isn't used here.
```diff
-def test_run_without_api_key(self, mock_genai):
+def test_run_without_api_key(self):
```
174-175: Drop unused local variable (Ruff F841).
```diff
-result = Gemini25ProTranscriptionGenerator.run("test.mp3", "fake-api-key")
+Gemini25ProTranscriptionGenerator.run("test.mp3", "fake-api-key")
```
145-162: Naming nit: this test checks "thinking_config," not timeouts.
Consider renaming to reflect intent (optional).
```diff
-def test_timeout_configuration(self, mock_genai, mock_audio_file):
-    """Test timeout configuration in generate_content"""
+def test_thinking_config_configuration(self, mock_genai, mock_audio_file):
+    """Test thinking_config in generate_content"""
```
tests/processing_pipeline/test_stage_4.py (2)
36-40: Patch where it's used to avoid aliasing fragility.
Stage4Executor likely does `from google import genai` and then `genai.Client(...)`. Patching `"processing_pipeline.stage_4.genai.Client"` is safer than `"google.genai.Client"`.
```diff
-with patch("google.genai.Client") as mock:
+with patch("processing_pipeline.stage_4.genai.Client") as mock:
```
459-477: Grounding type assertion is inconsistent with process_snippet usage.
process_snippet forwards grounding_metadata as a dict to Supabase. Assert dict, not str.
```diff
-assert isinstance(grounding, str)
+assert isinstance(grounding, dict)
```
709-717: Update expected transcriptor and logs to 2.5 Pro to match the code migration.
Tests still assert "gemini-1206". The code path now uses the 2.5 Pro transcriptor; align expectations.
```diff
-call(1, {"timestamped_transcription": "Test transcription"}, "gemini-1206"),
-call(2, {"timestamped_transcription": "Test transcription"}, "gemini-1206"),
+call(1, {"timestamped_transcription": "Test transcription"}, "gemini-2.5-pro"),
+call(2, {"timestamped_transcription": "Test transcription"}, "gemini-2.5-pro"),
```
Also applies to: 728-731, 732-753, 755-761
138-138: Silence Ruff ARG002 (unused fixture args) without losing fixture effects.
Access or delete the fixture vars at the start of each test.
```diff
 def test_transcribe_with_gemini_success(self, mock_environment):
+    del mock_environment
     """Test successful transcription with Gemini"""
```
```diff
 def test_transcribe_with_gemini_2_5_pro_success(self, mock_environment):
+    del mock_environment
```
```diff
 def test_transcribe_with_gemini_2_5_pro_rate_limit(self, mock_environment):
+    del mock_environment
```
```diff
 def test_initial_detection_success(self, mock_environment):
+    del mock_environment
```
```diff
 def test_disinformation_detection_success(self, mock_environment):
+    del mock_environment
```
```diff
 def test_run_success(self, mock_environment, mock_genai):
+    del mock_environment
```
```diff
-def test_disinformation_detection_with_unicode_handling(self, mock_supabase_client, mock_genai):
+def test_disinformation_detection_with_unicode_handling(self, mock_genai):
```
Also applies to: 164-164, 174-174, 183-183, 194-194, 209-209, 634-634
src/processing_pipeline/stage_1.py (3)
246-251: Two stray string literals (missing print).
These lines won't log; convert them to print calls.
```diff
 if len(flagged_snippets) == 0:
-    "No flagged snippets found, inserting the llm response with status 'Processed'"
+    print("No flagged snippets found, inserting the llm response with status 'Processed'")
@@
 else:
-    "Flagged snippets found, inserting the llm response with status 'New'"
+    print("Flagged snippets found, inserting the llm response with status 'New'")
```
Also applies to: 265-287
428-434: Update log to reflect 2.5 Pro.
```diff
-print("Processing the timestamped transcription with Gemini 1.5 Pro")
+print("Processing the timestamped transcription with Gemini 2.5 Pro")
```
485-498: Regenerate flow: transcriptor label and logs still reference 1206/1.5.
Align with 2.5 Pro; this also requires updating tests that assert the transcriptor.
```diff
 try:
-    transcriptor = "gemini-1206"
+    transcriptor = "gemini-2.5-pro"
     timestamped_transcription = transcribe_audio_file_with_gemini_2_5_pro(local_file)
 except google_exceptions.ResourceExhausted as e:
     print(
-        f"Failed to transcribe the audio file with Gemini 1206 due to Rate Limit: {e}\n"
+        f"Failed to transcribe the audio file with Gemini 2.5 Pro due to Rate Limit: {e}\n"
         "Falling back to the custom timestamped-transcript generator"
     )
@@
-print("Processing the timestamped transcription with Gemini 1.5 Pro")
+print("Processing the timestamped transcription with Gemini 2.5 Pro")
```
Please also update tests/processing_pipeline/test_stage_1.py lines 749-751 to expect "gemini-2.5-pro" (see separate test comment).
Also applies to: 500-505
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (15)
- src/processing_pipeline/constants.py (4 hunks)
- src/processing_pipeline/gemini_1206_transcription_generator.py (0 hunks)
- src/processing_pipeline/gemini_2_5_pro_transcription_generator.py (1 hunks)
- src/processing_pipeline/stage_1.py (12 hunks)
- src/processing_pipeline/stage_1_preprocess.py (3 hunks)
- src/processing_pipeline/stage_3.py (3 hunks)
- src/processing_pipeline/stage_4.py (5 hunks)
- src/processing_pipeline/timestamped_transcription_generator.py (2 hunks)
- tests/processing_pipeline/test_gemini_1206_transcription_generator.py (0 hunks)
- tests/processing_pipeline/test_gemini_2_5_pro_transcription_generator.py (1 hunks)
- tests/processing_pipeline/test_stage_1.py (23 hunks)
- tests/processing_pipeline/test_stage_1_preprocess.py (7 hunks)
- tests/processing_pipeline/test_stage_3.py (4 hunks)
- tests/processing_pipeline/test_stage_4.py (14 hunks)
- tests/processing_pipeline/test_timestamped_transcription_generator.py (1 hunks)
💤 Files with no reviewable changes (2)
- tests/processing_pipeline/test_gemini_1206_transcription_generator.py
- src/processing_pipeline/gemini_1206_transcription_generator.py
🧰 Additional context used
🧬 Code graph analysis (8)
src/processing_pipeline/gemini_2_5_pro_transcription_generator.py (3)
- src/processing_pipeline/constants.py (1): get_gemini_2_5_pro_transcription_generation_prompt (69-70)
- src/processing_pipeline/stage_1.py (1): run (535-579)
- src/processing_pipeline/timestamped_transcription_generator.py (1): run (30-65)

tests/processing_pipeline/test_stage_4.py (2)
- src/processing_pipeline/stage_4.py (2): process_snippet (92-129), analysis_review (154-185)
- src/processing_pipeline/supabase_utils.py (3): set_snippet_status (69-79), submit_snippet_review (238-260), create_new_label (322-340)

tests/processing_pipeline/test_stage_1_preprocess.py (2)
- src/processing_pipeline/stage_1_preprocess.py (4): Stage1PreprocessTranscriptionExecutor (22-81), Stage1PreprocessDetectionExecutor (84-135), run (32-81), run (91-135)
- tests/processing_pipeline/test_stage_1.py (1): mock_genai (71-77)

tests/processing_pipeline/test_gemini_2_5_pro_transcription_generator.py (2)
- src/processing_pipeline/gemini_2_5_pro_transcription_generator.py (2): Gemini25ProTranscriptionGenerator (10-59), run (15-59)
- src/processing_pipeline/stage_1.py (1): run (535-579)

src/processing_pipeline/timestamped_transcription_generator.py (1)
- src/processing_pipeline/constants.py (2): get_timestamped_transcription_generation_output_schema (53-54), get_timestamped_transcription_generation_prompt (49-50)

src/processing_pipeline/stage_1.py (5)
- src/processing_pipeline/constants.py (3): get_system_instruction_for_stage_1 (21-22), get_output_schema_for_stage_1 (29-30), get_detection_prompt_for_stage_1 (13-14)
- src/processing_pipeline/gemini_2_5_pro_transcription_generator.py (2): Gemini25ProTranscriptionGenerator (10-59), run (15-59)
- src/processing_pipeline/stage_3.py (1): run (241-290)
- src/processing_pipeline/timestamped_transcription_generator.py (1): run (30-65)
- src/processing_pipeline/stage_1_preprocess.py (3): run (32-81), run (91-135), Stage1PreprocessDetectionExecutor (84-135)

tests/processing_pipeline/test_stage_3.py (2)
- src/processing_pipeline/stage_3.py (3): process_snippet (139-175), Stage3Executor (234-290), run (241-290)
- src/processing_pipeline/supabase_utils.py (2): update_snippet (185-227), set_snippet_status (69-79)

tests/processing_pipeline/test_stage_1.py (4)
- src/processing_pipeline/stage_1.py (11): transcribe_audio_file_with_gemini_2_5_flash (80-83), transcribe_audio_file_with_custom_timestamped_transcription_generator (139-143), initial_disinformation_detection_with_gemini_2_5_pro (147-150), disinformation_detection_with_gemini_2_5_pro (154-168), transcribe_audio_file_with_gemini_2_5_pro (131-135), run (535-579), Stage1Executor (528-579), initial_disinformation_detection (298-342), regenerate_timestamped_transcript (443-525), update_stage_1_llm_response_timestamped_transcription (351-352), process_audio_file (204-294)
- tests/processing_pipeline/test_gemini_2_5_pro_transcription_generator.py (2): mock_genai (10-25), test_run_success (35-74)
- tests/processing_pipeline/test_stage_1_preprocess.py (4): mock_genai (13-18), mock_genai (111-116), test_run_success (27-67), test_run_success (118-160)
- tests/test_main.py (1): mock_environment (19-29)
🪛 Ruff (0.12.2)
src/processing_pipeline/gemini_2_5_pro_transcription_generator.py
17-17: Avoid specifying long messages outside the exception class
(TRY003)
55-55: Avoid specifying long messages outside the exception class
(TRY003)
tests/processing_pipeline/test_gemini_2_5_pro_transcription_generator.py
76-76: Unused method argument: mock_audio_file
(ARG002)
103-103: Unused method argument: mock_genai
(ARG002)
174-174: Local variable result is assigned to but never used
Remove assignment to unused variable result
(F841)
src/processing_pipeline/timestamped_transcription_generator.py
89-89: Consider [cls.USER_PROMPT, *segments] instead of concatenation
Replace with [cls.USER_PROMPT, *segments]
(RUF005)
src/processing_pipeline/stage_4.py
258-258: Avoid specifying long messages outside the exception class
(TRY003)
tests/processing_pipeline/test_stage_1.py
138-138: Unused method argument: mock_environment
(ARG002)
164-164: Unused method argument: mock_environment
(ARG002)
174-174: Unused method argument: mock_environment
(ARG002)
183-183: Unused method argument: mock_environment
(ARG002)
194-194: Unused method argument: mock_environment
(ARG002)
209-209: Unused method argument: mock_environment
(ARG002)
634-634: Unused method argument: mock_supabase_client
(ARG002)
🔇 Additional comments (24)
src/processing_pipeline/gemini_2_5_pro_transcription_generator.py (1)
28-59: Overall flow LGTM. Upload, wait, generate, and cleanup are correctly structured. Once polling is fixed, this should be robust.
Consider adding a max wait (e.g., 5–10 min) to prevent infinite loops on stalled processing.
tests/processing_pipeline/test_stage_3.py (1)
247-255: LGTM: Stage3Executor smoke test. Asserting the result is JSON-parseable and using GEMINI_2_5_PRO matches the migration.
tests/processing_pipeline/test_stage_1_preprocess.py (3)
15-18: Good mocking pattern and result assertions. Client/file ops and result checks look correct for the v2 client path.
Also applies to: 29-37
78-94: Nice polling test. The processing→processed flow and sleep assertions are spot on.
101-108: Error paths covered well. Upload and generation error propagation are correctly tested.
Also applies to: 167-174
src/processing_pipeline/stage_1_preprocess.py (1)
48-79: Stage 1 v2 migration looks solid. Model selection, config, safety settings, and cleanup are consistent and correct.
Also applies to: 103-135
src/processing_pipeline/stage_4.py (1)
202-209: Unify Gemini env var usage: prefer GOOGLE_GEMINI_KEY or document the paid variant.
- Found mismatch: stage_4.run reads GOOGLE_GEMINI_PAID_KEY (src/processing_pipeline/stage_4.py:202–204) while Stage4.__ensure_json_format reads GOOGLE_GEMINI_KEY (src/processing_pipeline/stage_4.py:273–275).
- Majority of code, README and tests use GOOGLE_GEMINI_KEY (src/main.py, src/processing_pipeline/stage_1.py, src/processing_pipeline/stage_3.py, README.md, tests/).
- Action: either change stage_4 to use GOOGLE_GEMINI_KEY for consistency, or keep GOOGLE_GEMINI_PAID_KEY but update README/tests and add a short comment explaining why a separate paid key is required.
tests/processing_pipeline/test_stage_4.py (6)
3-3: LGTM — imports updated for v2 client and new model constant.
Also applies to: 17-17

41-106: LGTM — solid client/models/response scaffolding. Mocked response payload and grounding metadata shape look correct for downstream code.

331-336: LGTM — end-to-end process_snippet happy path is exercised.
Also applies to: 341-353, 355-363

545-557: LGTM — analysis_review loop wiring and calls look correct (non-repeating path).

589-651: LGTM — empty/invalid JSON fallback path behavior is verified correctly.
Also applies to: 640-648, 645-651

718-721: LGTM — error, invalid snippet, API error, specific IDs, and empty-category cases covered.
Also applies to: 732-735, 743-751, 763-781, 786-810
tests/processing_pipeline/test_stage_1.py (7)
12-16: LGTM — imports aligned with 2.5 Flash/Pro surfaces.
Also applies to: 24-25

71-77: LGTM — v2 client fixture covers models.generate_content; good default JSON.

138-153: LGTM — Stage 1 transcription/detection tests updated for v2 API path and pass-through JSON.
Also applies to: 164-173, 174-180, 183-193, 194-206, 209-221

233-239: LGTM — main-flow hooks (initial/redo/regenerate) patch points and assertions look right.
Also applies to: 269-275, 304-305

316-354: LGTM — process_audio_file success and error paths validated with 2.5 Flash + preprocess detection.
Also applies to: 373-379

464-505: LGTM — redo_main_detection over multiple responses with v2 client.

634-642: LGTM — Unicode payload handling through v2 client.
Also applies to: 653-661
src/processing_pipeline/stage_1.py (4)
8-14: LGTM — migrated to google.genai v2 client and type imports; model constant updated.
Also applies to: 24-29

79-84: LGTM — Stage 1 public API functions now target Gemini 2.5 Flash/Pro and preprocess executors.
Also applies to: 131-135, 147-151, 154-169

206-260: LGTM — fallback to custom generator on 2.5 Pro rate limit and main detection using Stage1Executor.
Also applies to: 242-259

539-579: LGTM — Stage1Executor.run migrated to v2 API with schema'd JSON and safety settings.
```python
# Check if response.text is a good json
try:
    # Find the first { and last } to extract the JSON object
    start_idx = response.text.find("{")
    end_idx = response.text.rfind("}")
    if start_idx == -1 or end_idx == -1:
        raise ValueError("No JSON object found in response")
    response_text = response.text[start_idx : end_idx + 1]
    result = json.loads(response_text)
except json.JSONDecodeError:
    print("[Stage 4] Response from gemini 2.5 pro is not a valid JSON object.")
    # Use another prompt to ensure the response is a "valid" json
    result = Stage4Executor.__ensure_json_format(response.text)
```
Fallback not triggered when no braces (ValueError escapes).
The try only catches JSONDecodeError; “No JSON object found” raises ValueError and skips fallback.
```diff
 try:
     # Find the first { and last } to extract the JSON object
     start_idx = response.text.find("{")
     end_idx = response.text.rfind("}")
     if start_idx == -1 or end_idx == -1:
         raise ValueError("No JSON object found in response")
     response_text = response.text[start_idx : end_idx + 1]
     result = json.loads(response_text)
-except json.JSONDecodeError:
+except (json.JSONDecodeError, ValueError):
     print("[Stage 4] Response from gemini 2.5 pro is not a valid JSON object.")
     # Use another prompt to ensure the response is a "valid" json
     result = Stage4Executor.__ensure_json_format(response.text)
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
# Check if response.text is a good json
try:
    # Find the first { and last } to extract the JSON object
    start_idx = response.text.find("{")
    end_idx = response.text.rfind("}")
    if start_idx == -1 or end_idx == -1:
        raise ValueError("No JSON object found in response")
    response_text = response.text[start_idx : end_idx + 1]
    result = json.loads(response_text)
except (json.JSONDecodeError, ValueError):
    print("[Stage 4] Response from gemini 2.5 pro is not a valid JSON object.")
    # Use another prompt to ensure the response is a "valid" json
    result = Stage4Executor.__ensure_json_format(response.text)
```
🧰 Tools
🪛 Ruff (0.12.2)
258-258: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
In src/processing_pipeline/stage_4.py around lines 252 to 265, the current
try/except only catches json.JSONDecodeError so the manually raised ValueError
("No JSON object found") bypasses the fallback; update the exception handling to
catch both json.JSONDecodeError and ValueError (e.g., except
(json.JSONDecodeError, ValueError):) so the fallback
Stage4Executor.__ensure_json_format(response.text) is invoked for both malformed
JSON and missing braces, and keep the existing log message or add a concise one
before calling the fallback.
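The recommended pattern can be exercised standalone. The sketch below is illustrative (the helper and fallback names are not from the PR); it shows why both failure modes must route to the same fallback:

```python
import json


def extract_json_or_fallback(text, fallback):
    """Pull the outermost {...} span from `text` and parse it.

    Both failure modes -- no braces at all (ValueError) and malformed
    JSON between the braces (json.JSONDecodeError) -- route to the
    same fallback, which is the fix the review suggests.
    """
    try:
        start = text.find("{")
        end = text.rfind("}")
        if start == -1 or end == -1:
            raise ValueError("No JSON object found in response")
        return json.loads(text[start : end + 1])
    except (json.JSONDecodeError, ValueError):
        return fallback(text)


# Malformed JSON between braces -> fallback
assert extract_json_or_fallback("{not json}", lambda t: {"repaired": True}) == {"repaired": True}
# No braces at all -> fallback (the case the original except clause missed)
assert extract_json_or_fallback("plain text", lambda t: {"repaired": True}) == {"repaired": True}
# Well-formed JSON embedded in prose parses directly
assert extract_json_or_fallback('prefix {"a": 1} suffix', None) == {"a": 1}
```

With only `except json.JSONDecodeError`, the second assertion would raise an uncaught ValueError instead of returning the repaired dict.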
Actionable comments posted: 2
♻️ Duplicate comments (3)
src/processing_pipeline/stage_4.py (1)
254-267: Fallback not triggered when no braces (ValueError escapes). Expand the except clause to include ValueError so the fallback runs when braces are missing. This mirrors the prior review note.

```diff
-except json.JSONDecodeError:
+except (json.JSONDecodeError, ValueError):
     print("[Stage 4] Response from gemini 2.5 pro is not a valid JSON object.")
     # Use another prompt to ensure the response is a "valid" json
     result = Stage4Executor.__ensure_json_format(response.text)
```

src/processing_pipeline/gemini_2_5_pro_transcription_generator.py (2)
34-35: Gate chain-of-thought behind an env flag (default off). Prevents accidental thought leakage to logs/telemetry while allowing opt-in.

```diff
-thinking_config=ThinkingConfig(include_thoughts=False),
+thinking_config=ThinkingConfig(
+    include_thoughts=bool(int(os.getenv("GEMINI_INCLUDE_THOUGHTS", "0")))
+),
```

Apply outside the selected lines:

```python
import os
```
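The flag parsing in that diff is worth a closer look, since `bool(int(...))` rejects anything that is not a plain integer string. A minimal sketch (function name is illustrative):

```python
def include_thoughts(env):
    """Opt-in flag parsing as in the suggested diff: unset or "0" means
    off; "1" enables chain-of-thought in responses. Non-integer values
    (e.g. "true") would raise ValueError with this scheme."""
    return bool(int(env.get("GEMINI_INCLUDE_THOUGHTS", "0")))


assert include_thoughts({}) is False
assert include_thoughts({"GEMINI_INCLUDE_THOUGHTS": "0"}) is False
assert include_thoughts({"GEMINI_INCLUDE_THOUGHTS": "1"}) is True
```

In production code you would pass `os.environ`; a dict is used here so the behavior is easy to verify.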
21-59: Harden upload polling: add timeout, failure-state handling, robust enum access, and guaranteed cleanup. The current loop can hang indefinitely and doesn't handle terminal failure states. It also reuses the `audio_file` arg for the uploaded handle and hardcodes `audio/mp3`. The patch below addresses:

- bounded wait with exponential backoff + timeout
- failure-state detection
- robust enum access (`getattr(..., "name", ...)`)
- keeping the original arg intact (`uploaded_file` variable)
- best-effort cleanup even if polling fails
- correct default MIME (`audio/mpeg`) with detection from filename

```diff
-# Upload the audio file and wait for it to finish processing
-audio_file = client.files.upload(file=audio_file, config={"mime_type": "audio/mp3"})
-while audio_file.state.name == "PROCESSING":
-    print("Processing the uploaded audio file...")
-    time.sleep(1)
-    audio_file = client.files.get(name=audio_file.name)
-
-try:
+# Upload the audio file and wait for it to finish processing
+# Detect MIME type when possible; default to audio/mpeg for MP3
+filename = getattr(audio_file, "name", None)
+mime_type = (mimetypes.guess_type(filename)[0] if filename else None) or "audio/mpeg"
+uploaded_file = client.files.upload(file=audio_file, config={"mime_type": mime_type})
+try:
+    # Poll with bounded wait and robust state handling
+    start = time.time()
+    backoff = 1.0
+    state = getattr(uploaded_file.state, "name", uploaded_file.state)
+    while state == "PROCESSING":
+        print("Processing the uploaded audio file...")
+        time.sleep(backoff)
+        backoff = min(backoff * 1.5, 5.0)
+        uploaded_file = client.files.get(name=uploaded_file.name)
+        state = getattr(uploaded_file.state, "name", uploaded_file.state)
+        if time.time() - start > 300:
+            raise TimeoutError(f"Upload processing timed out after 300s (state={state}).")
+    if state in {"FAILED", "ERROR", "CANCELLED"}:
+        raise RuntimeError(f"Upload failed with terminal state: {state}")
     result = client.models.generate_content(
         model=GEMINI_2_5_PRO,
-        contents=[cls.USER_PROMPT, audio_file],
+        contents=[cls.USER_PROMPT, uploaded_file],
         config=GenerateContentConfig(
             max_output_tokens=8192,
             thinking_config=ThinkingConfig(include_thoughts=False),
             safety_settings=[
                 SafetySetting(
                     category=HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
                     threshold=HarmBlockThreshold.BLOCK_NONE,
                 ),
                 SafetySetting(
                     category=HarmCategory.HARM_CATEGORY_HATE_SPEECH, threshold=HarmBlockThreshold.BLOCK_NONE
                 ),
                 SafetySetting(
                     category=HarmCategory.HARM_CATEGORY_HARASSMENT, threshold=HarmBlockThreshold.BLOCK_NONE
                 ),
                 SafetySetting(
                     category=HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
                     threshold=HarmBlockThreshold.BLOCK_NONE,
                 ),
             ],
         ),
     )
     if not result.text:
         raise ValueError("No content in response - likely truncated or blocked")
     return result.text
 finally:
-    client.files.delete(name=audio_file.name)
+    try:
+        client.files.delete(name=uploaded_file.name)
+    except Exception:
+        # Best-effort cleanup; ignore delete errors
+        pass
```

Apply outside the selected lines:

```python
# top-level imports
import mimetypes
```
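The bounded-polling pattern in that patch can be sketched independently of the GenAI client. Here `fetch_state` is a stand-in for `client.files.get(...).state.name`, and all names are illustrative, not the project's:

```python
import time


def wait_until_done(fetch_state, timeout_s=300.0, initial_backoff=1.0, max_backoff=5.0):
    """Poll `fetch_state()` until it leaves PROCESSING, with a bounded
    wait, exponential backoff, and terminal-state detection -- the
    hardening the review suggests for the upload loop."""
    start = time.monotonic()
    backoff = initial_backoff
    while True:
        state = fetch_state()
        if state != "PROCESSING":
            if state in {"FAILED", "ERROR", "CANCELLED"}:
                raise RuntimeError(f"Upload failed with terminal state: {state}")
            return state
        if time.monotonic() - start > timeout_s:
            raise TimeoutError(f"Still PROCESSING after {timeout_s}s")
        time.sleep(backoff)
        backoff = min(backoff * 1.5, max_backoff)


# Simulate a file that finishes after two polls (zero backoff keeps the demo fast)
states = iter(["PROCESSING", "PROCESSING", "ACTIVE"])
assert wait_until_done(lambda: next(states), initial_backoff=0.0) == "ACTIVE"
```

A terminal state such as `"FAILED"` raises immediately instead of returning, so callers cannot accidentally hand a failed upload to `generate_content`.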
🧹 Nitpick comments (6)
src/processing_pipeline/stage_4.py (3)
221-252: Ask for JSON up front; drop brittle brace slicing. Request JSON with schema in the primary call to avoid substring extraction and reduce failures.

```diff
 response = client.models.generate_content(
     model=model_id,
     contents=user_prompt,
     config=GenerateContentConfig(
         tools=[google_search_tool],
         response_modalities=["TEXT"],
         system_instruction=cls.SYSTEM_INSTRUCTION,
         max_output_tokens=8192,
+        response_mime_type="application/json",
+        response_schema=cls.OUTPUT_SCHEMA,
         safety_settings=[
             SafetySetting(
                 category=HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT,
                 threshold=HarmBlockThreshold.BLOCK_NONE,
             ),
             ...
         ],
     ),
 )
-# Check if response.text is a good json
-try:
-    # Find the first { and last } to extract the JSON object
-    start_idx = response.text.find("{")
-    end_idx = response.text.rfind("}")
-    if start_idx == -1 or end_idx == -1:
-        raise ValueError("No JSON object found in response")
-    response_text = response.text[start_idx : end_idx + 1]
-    result = json.loads(response_text)
+# Parse model JSON response
+try:
+    result = json.loads(response.text)
```

Also applies to: 254-262
269-271: Guard against missing grounding_metadata on candidate. Avoid AttributeError when candidates exist but lack grounding_metadata.

```diff
-grounding_metadata = str(response.candidates[0].grounding_metadata) if response.candidates else None
+candidate = response.candidates[0] if getattr(response, "candidates", None) else None
+gm = getattr(candidate, "grounding_metadata", None) if candidate else None
+grounding_metadata = str(gm) if gm is not None else None
```
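The getattr chain can be verified without the real SDK by mimicking the response shape. The helper name is illustrative; `SimpleNamespace` stands in for the response objects:

```python
from types import SimpleNamespace


def safe_grounding_metadata(response):
    """Defensive extraction of candidate grounding metadata, mirroring
    the getattr chain the review proposes. Works whether `candidates`
    is missing, empty, or present without `grounding_metadata`."""
    candidate = response.candidates[0] if getattr(response, "candidates", None) else None
    gm = getattr(candidate, "grounding_metadata", None) if candidate else None
    return str(gm) if gm is not None else None


# Candidate present but without grounding_metadata -> None, no AttributeError
assert safe_grounding_metadata(SimpleNamespace(candidates=[SimpleNamespace()])) is None
# No candidates at all -> None
assert safe_grounding_metadata(SimpleNamespace(candidates=[])) is None
# Metadata present -> stringified
rich = SimpleNamespace(candidates=[SimpleNamespace(grounding_metadata={"queries": ["x"]})])
assert safe_grounding_metadata(rich) == str({"queries": ["x"]})
```

The middle case is the one the original one-liner would crash on once `candidates` is truthy but the attribute is absent.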
221-252: Confirm token limit, safety thresholds, and add per-call timeout.
- Token limit: Gemini 2.5 Pro max_output_tokens = 65,536 — 8192 is within limit.
- Timeout: GenerateContentConfig supports http_options (genai.types.HttpOptions) with timeout in milliseconds; add a per-call timeout (e.g., HttpOptions(timeout=30000)) on the config or client to avoid long hangs.
- Safety: All thresholds set to BLOCK_NONE — confirm this is intentional.
File: src/processing_pipeline/stage_4.py lines 221–252
src/processing_pipeline/gemini_2_5_pro_transcription_generator.py (3)
36-51: Align safety settings with other stages: include CIVIC_INTEGRITY. Other modules include CIVIC_INTEGRITY; keep parity to avoid unexpected blocks later.

```diff
 SafetySetting(
     category=HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
     threshold=HarmBlockThreshold.BLOCK_NONE,
 ),
+SafetySetting(
+    category=HarmCategory.HARM_CATEGORY_CIVIC_INTEGRITY,
+    threshold=HarmBlockThreshold.BLOCK_NONE,
+),
```
24-24: Prefer logging over print for observability. Switch to `logging` for structured, level-controlled output.

```diff
-print("Processing the uploaded audio file...")
+logger.info("Processing the uploaded audio file...")
```

Apply outside the selected lines:

```python
import logging

logger = logging.getLogger(__name__)
```
17-17: Ruff TRY003: long exception messages. Either shorten the messages or define a small custom exception class to satisfy the linter and keep messages centralized.
Also applies to: 55-55
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (3)
- src/processing_pipeline/gemini_2_5_pro_transcription_generator.py (1 hunks)
- src/processing_pipeline/stage_1.py (12 hunks)
- src/processing_pipeline/stage_4.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/processing_pipeline/stage_1.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/processing_pipeline/gemini_2_5_pro_transcription_generator.py (3)
- src/processing_pipeline/constants.py (1): get_gemini_2_5_pro_transcription_generation_prompt (69-70)
- src/processing_pipeline/stage_1.py (1): run (534-578)
- src/processing_pipeline/stage_4.py (1): run (197-271)
🪛 Ruff (0.12.2)
src/processing_pipeline/gemini_2_5_pro_transcription_generator.py
17-17: Avoid specifying long messages outside the exception class
(TRY003)
55-55: Avoid specifying long messages outside the exception class
(TRY003)
src/processing_pipeline/stage_4.py
260-260: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (1)
src/processing_pipeline/gemini_2_5_pro_transcription_generator.py (1)
31-31: Verified — passing the uploaded File object in contents is supported. Gemini/GenAI v2 docs and the Python SDK show three supported file-attachment methods: (1) pass the uploaded File object directly in contents (e.g., contents=[prompt, uploaded_file]); (2) reference an uploaded file via parts/file_data or Part.fromURI with file_uri; (3) include raw bytes inline (inline_data / Part.fromBytes). The snippet in src/processing_pipeline/gemini_2_5_pro_transcription_generator.py (line 31), contents=[cls.USER_PROMPT, audio_file], is a supported pattern; no change required.
```python
client = genai.Client(api_key=gemini_key)
model_id = GEMINI_2_5_PRO
google_search_tool = Tool(google_search=GoogleSearch())
```
Unify Gemini API key env var across both paths (fallback may 401).
run() uses GOOGLE_GEMINI_PAID_KEY while the fallback uses GOOGLE_GEMINI_KEY. Use a single, consistent lookup.
```diff
-gemini_key = os.getenv("GOOGLE_GEMINI_KEY")
+gemini_key = os.getenv("GOOGLE_GEMINI_PAID_KEY") or os.getenv("GOOGLE_GEMINI_KEY")
 if not gemini_key:
     raise ValueError("Google Gemini API key was not set!")
 client = genai.Client(api_key=gemini_key)
```

Also applies to: 275-281
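The single-lookup-with-fallback idea is easy to isolate into a helper so both code paths cannot drift apart again. A minimal sketch (the function name is illustrative; the env var names match the ones discussed in the comment):

```python
def resolve_gemini_key(env):
    """Single lookup used by every code path: prefer the paid key and
    fall back to the standard one -- the unification the review asks
    for. Raises if neither is set."""
    key = env.get("GOOGLE_GEMINI_PAID_KEY") or env.get("GOOGLE_GEMINI_KEY")
    if not key:
        raise ValueError("Google Gemini API key was not set!")
    return key


assert resolve_gemini_key({"GOOGLE_GEMINI_PAID_KEY": "paid"}) == "paid"
assert resolve_gemini_key({"GOOGLE_GEMINI_KEY": "std"}) == "std"
# Paid key wins when both are set
assert resolve_gemini_key({"GOOGLE_GEMINI_PAID_KEY": "paid", "GOOGLE_GEMINI_KEY": "std"}) == "paid"
```

Callers would pass `os.environ`; a dict is used here so precedence is testable without touching the process environment.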
🤖 Prompt for AI Agents
In src/processing_pipeline/stage_4.py around lines 208-211 and 275-281, the code
reads Gemini API keys from two different environment variables
(GOOGLE_GEMINI_PAID_KEY in run() and GOOGLE_GEMINI_KEY in the fallback), which
can cause authentication errors; standardize on a single env var name across
both code paths by choosing one canonical variable (e.g., GOOGLE_GEMINI_PAID_KEY
or GOOGLE_GEMINI_KEY), update both lookups to use that same variable, and ensure
any helper/utility that reads the key falls back consistently (e.g., try the
primary name then a single documented fallback) so both the main path and
fallback use identical credentials.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/processing_pipeline/stage_4.py (1)
333-335: Avoid leaking full model output in exception messages (privacy + lint). The current raise includes the entire text; it can end up in Supabase status/logs. Use a generic message.

```diff
-raise ValueError(f"[Stage 4] The provided text is not convertible to a valid JSON object:\n{text}")
+raise ValueError("[Stage 4] The provided text is not convertible to the expected JSON schema.")
```

This also satisfies TRY003.
♻️ Duplicate comments (2)
src/processing_pipeline/stage_4.py (2)
204-207: Unify Gemini API key env across both paths (prevent 401 in fallback). Standardize key lookup so main and fallback use identical credentials.

```diff
-gemini_key = os.getenv("GOOGLE_GEMINI_PAID_KEY")
+gemini_key = os.getenv("GOOGLE_GEMINI_PAID_KEY") or os.getenv("GOOGLE_GEMINI_KEY")
 if not gemini_key:
     raise ValueError("Google Gemini API key was not set!")
```

```diff
-gemini_key = os.getenv("GOOGLE_GEMINI_KEY")
+gemini_key = os.getenv("GOOGLE_GEMINI_PAID_KEY") or os.getenv("GOOGLE_GEMINI_KEY")
 if not gemini_key:
     raise ValueError("Google Gemini API key was not set!")
```

Also applies to: 275-281
254-266: Fallback not triggered when no braces (ValueError escapes). Also reported previously; catch both exceptions so the JSON-fixer runs.

```diff
-except json.JSONDecodeError:
+except (json.JSONDecodeError, ValueError):
     print("[Stage 4] Response from gemini 2.5 pro is not a valid JSON object.")
     # Use another prompt to ensure the response is a "valid" json
     result = Stage4Executor.__ensure_json_format(response.text)
```
🧹 Nitpick comments (1)
src/processing_pipeline/stage_4.py (1)
221-251: Emit JSON directly from the primary call (reduce brittle brace parsing). Ask the model for JSON up front; add schema + MIME so you can parse `response.text` directly and avoid substring heuristics.

```diff
 response = client.models.generate_content(
     model=model_id,
     contents=user_prompt,
     config=GenerateContentConfig(
         tools=[google_search_tool],
         response_modalities=["TEXT"],
         system_instruction=cls.SYSTEM_INSTRUCTION,
+        response_mime_type="application/json",
+        response_schema=cls.OUTPUT_SCHEMA,
         max_output_tokens=8192,
         safety_settings=[
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
- src/processing_pipeline/stage_4.py (5 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
src/processing_pipeline/stage_4.py
260-260: Avoid specifying long messages outside the exception class
(TRY003)
🔇 Additional comments (2)
src/processing_pipeline/stage_4.py (2)
306-325: SafetySetting enum usage looks correct. Enums are used consistently in the fallback path; this addresses the earlier type-mismatch issue.

6-14: Good: v2 client/types and enums are correctly imported. Imports align with the v2 API surface and enum usage throughout.