
VER-306: Fix crashing issue due to out of memory in Stage 3 machine#69

Merged
quancao-ea merged 4 commits into main from fix/stage-3-out-of-memory-issue on Mar 20, 2026

Conversation

@quancao-ea (Collaborator) commented Mar 20, 2026

Summary by CodeRabbit

  • New Features

    • Web-search-enabled fact-checking and web-content fetching for richer, more grounded analysis
    • Fully asynchronous processing for faster, non-blocking content analysis and polling
  • Bug Fixes

    • Improved validation, clearer error messages, and more resilient fallback handling during analysis
  • Chores

    • Pinned new runtime dependencies to support async HTTP access and HTML→markdown conversion (aiohttp, html2text)

Implement SearXNG web search and URL content extraction functionality to enable web-based information gathering in Stage 3 processing. These tools provide asynchronous web search capabilities and HTML-to-markdown conversion for content extraction.

Migrate the Stage 3 processing pipeline from synchronous to asynchronous execution with enhanced web search capabilities. Replace the Gemini CLI and Google Search grounding with direct SDK web search tools.

Key changes:
- Convert executor and flow to async/await pattern
- Replace CLI and Google Search with custom searxng_web_search/web_url_read tools
- Add dedicated constants module for model configuration
- Simplify error handling with unified fallback strategy
- Pass gemini_client instance instead of API key
- Improve memory efficiency with streaming operations
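The async/await conversion listed above can be sketched as follows. This is a hypothetical illustration using only names from this PR's summary (`Stage3Executor`, `run_async`); the body is a placeholder, not the actual executor:

```python
import asyncio

class Stage3Executor:
    # Illustrative only: the real executor uploads audio and calls the
    # GenAI SDK; here the awaited call is stubbed with asyncio.sleep(0).
    async def run_async(self, gemini_client, model: str, audio_file: str) -> dict:
        # Awaiting I/O yields the event loop instead of blocking a worker
        # thread, so concurrent snippets don't pile up in memory.
        await asyncio.sleep(0)  # stands in for awaited SDK calls
        return {"model": model, "audio_file": audio_file}

result = asyncio.run(Stage3Executor().run_async(None, "gemini-2.5-pro", "clip.mp3"))
print(result["model"])
```

The caller drives the coroutine with `await` (or `asyncio.run` at the top level), which is the shape the converted flow and task functions take in this PR.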
@linear Bot commented Mar 20, 2026

@coderabbitai Bot commented Mar 20, 2026

Walkthrough

This PR converts Stage 3 to async execution, introduces SDK-based web-search grounding using SearXNG and URL reading, centralizes a shared GenAI client (removing direct API-key passing), adds web tooling and model constants, and refactors executors, flows, and tasks to use the new async flow.

Changes

  • Dependencies — requirements.txt
    Added aiohttp==3.13.3 and html2text==2025.4.15.
  • Model Constants — src/processing_pipeline/stage_3/constants.py
    New module exposing MAIN_MODEL = GeminiModel.GEMINI_2_5_PRO and FALLBACK_MODEL = GeminiModel.GEMINI_2_5_FLASH.
  • Web Tooling — src/processing_pipeline/stage_3/web_tools.py
    New async utilities: searxng_web_search() (SearXNG JSON normalization) and web_url_read() (fetch + HTML→markdown via html2text), sharing an aiohttp timeout and SSL config.
  • Core Executor Refactoring — src/processing_pipeline/stage_3/executors.py
    Replaced sync run() with async run_async(gemini_client: genai.Client, ...); removed API-key handling and the CLI fallback; added SDK-based web-search/function-calling, async schema structuring, and file cleanup via gemini_client.files.delete().
  • Async Flow & Task Updates — src/processing_pipeline/stage_3/flows.py, src/processing_pipeline/stage_3/tasks.py
    Converted in_depth_analysis, process_snippet, and analyze_snippet to async; create and pass a shared genai.Client; updated error handling to preserve auth failures, attempt the fallback model, and build detailed error messages for complex exceptions.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Flow as in_depth_analysis (Flow)
    participant Task as process_snippet (Task)
    participant Executor as Stage3Executor.run_async()
    participant GenAI as GenAI Client (SDK)
    participant Files as GenAI Files API
    participant Search as searxng_web_search (Web Tools)
    participant WebRead as web_url_read (Web Tools)

    Flow->>Task: await process_snippet(gemini_client, snippet)
    Task->>Executor: await run_async(gemini_client, model, audio_file)
    Executor->>GenAI: upload file (aio)
    GenAI->>Files: store file -> returns file_id
    Executor->>GenAI: aio.models.generate_content (with automatic_function_calling)
    GenAI->>Search: call searxng_web_search(query)
    Search-->>GenAI: results
    GenAI->>WebRead: call web_url_read(url)
    WebRead-->>GenAI: markdown content
    GenAI-->>Executor: analysis + grounding
    Executor->>Files: delete(file_id)
    Executor-->>Task: analysis result
    Task-->>Flow: final result
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested reviewers

  • nhphong

"🐰
I hop through code with eager cheer,
Async hops and web-tools near,
SearXNG crumbs and markdown bright,
Gemini hums through day and night,
Cheers — the pipeline's taking flight!"

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage — ⚠️ Warning. Docstring coverage is 44.44%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
  • Title check — ❓ Inconclusive. The title references a specific issue (VER-306) and describes the fix (crashing/out of memory in Stage 3), but the actual changes are a comprehensive refactor from synchronous to asynchronous execution, new web tools, and a restructured Stage 3 pipeline, not solely a memory fix. Resolution: clarify whether the title should emphasize the primary architectural change (the async refactor) or memory optimization; consider a more descriptive title like 'VER-306: Refactor Stage 3 to async execution with web tools' if the async conversion is the main solution.
✅ Passed checks (1 passed)
  • Description Check — ✅ Passed. Check skipped: CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Pylint (4.0.5)
src/processing_pipeline/stage_3/tasks.py

************* Module .pylintrc
.pylintrc:1:0: F0011: error while parsing the configuration: File contains no section headers.
file: '.pylintrc', line: 1
'disable=C0116\n' (config-parse-error)
[
{
"type": "convention",
"module": "src.processing_pipeline.stage_3.tasks",
"obj": "",
"line": 24,
"column": 0,
"endLine": null,
"endColumn": null,
"path": "src/processing_pipeline/stage_3/tasks.py",
"symbol": "line-too-long",
"message": "Line too long (180/100)",
"message-id": "C0301"
},
{
"type": "convention",
"module": "src.processing_pipeline.stage_3.tasks",
"obj": "",
"line": 108,
"column": 0,
"endLine": null,
"endColumn": null,
"path": "src/processing_pipeline/stage_3/tasks.py",
"symbol": "line-too-long",
"message": "Line too long (119/100)",
"message-id": "C0301"
},
{
"type": "convention",

... [truncated 10800 characters] ...

"module": "src.processing_pipeline.stage_3.tasks",
"obj": "process_snippet",
"line": 219,
"column": 11,
"endLine": 219,
"endColumn": 20,
"path": "src/processing_pipeline/stage_3/tasks.py",
"symbol": "broad-exception-caught",
"message": "Catching too general exception Exception",
"message-id": "W0718"
},
{
"type": "error",
"module": "src.processing_pipeline.stage_3.tasks",
"obj": "process_snippet",
"line": 221,
"column": 82,
"endLine": 221,
"endColumn": 94,
"path": "src/processing_pipeline/stage_3/tasks.py",
"symbol": "no-member",
"message": "Instance of 'Exception' has no 'exceptions' member",
"message-id": "E1101"
}
]

src/processing_pipeline/stage_3/web_tools.py

************* Module .pylintrc
.pylintrc:1:0: F0011: error while parsing the configuration: File contains no section headers.
file: '.pylintrc', line: 1
'disable=C0116\n' (config-parse-error)
[
{
"type": "convention",
"module": "src.processing_pipeline.stage_3.web_tools",
"obj": "",
"line": 51,
"column": 0,
"endLine": null,
"endColumn": null,
"path": "src/processing_pipeline/stage_3/web_tools.py",
"symbol": "line-too-long",
"message": "Line too long (120/100)",
"message-id": "C0301"
},
{
"type": "convention",
"module": "src.processing_pipeline.stage_3.web_tools",
"obj": "",
"line": 1,
"column": 0,
"endLine": null,
"endColumn": null,
"path": "src/processing_pipeline/stage_3/web_tools.py",
"symbol": "missing-module-docstring",
"message": "Missing module docstring",
"message-id": "C0114"
},
{
"type": "error",
"module": "src.processing_pipeline.stage_3.web_tools",
"obj": "",
"line": 6,
"column": 0,
"endLine": 6,
"endColumn": 16,
"path": "src/processing_pipeline/stage_3/web_tools.py",
"symbol": "import-error",
"message": "Unable to import 'html2text'",
"message-id": "E0401"
}
]



@gemini-code-assist (Contributor) commented

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical out-of-memory issue in the Stage 3 processing machine by overhauling how it interacts with the Gemini API and external web resources. The primary change involves migrating the entire Stage 3 pipeline to an asynchronous architecture, utilizing asyncio and aiohttp for more efficient resource handling. This refactoring also modernizes the Gemini integration by switching from CLI-based methods to the GenAI SDK's automatic function calling, enabling robust web search and URL content reading capabilities for enhanced fact-checking.

Highlights

  • Asynchronous Processing Migration: Migrated Stage 3 Gemini interactions from synchronous calls to asynchronous operations using asyncio and aiohttp to improve resource management and prevent out-of-memory issues.
  • Gemini SDK Integration with Automatic Function Calling: Replaced the Gemini CLI and Google Search grounding with direct GenAI SDK integration, leveraging automatic function calling for new web tools.
  • New Web Tools for Fact-Checking: Introduced searxng_web_search and web_url_read as asynchronous tools for dynamic fact-checking and content extraction from URLs.
  • Dependency Updates: Added aiohttp and html2text to requirements.txt to support the new asynchronous web tools and HTML parsing capabilities.
  • Centralized Model Configuration: Centralized Gemini model selection (MAIN_MODEL, FALLBACK_MODEL) in a new constants.py file for Stage 3, improving maintainability.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-code-assist Bot left a comment
Code Review

This pull request is a significant and well-executed refactoring of the Stage 3 processing pipeline to address out-of-memory issues. By moving from a synchronous, subprocess-based approach to a fully asynchronous implementation using asyncio and the Google GenAI async SDK, the code is now more efficient, scalable, and robust. The introduction of web_tools.py for handling web requests is a clean separation of concerns. The overall changes are excellent. I have one high-severity suggestion regarding exception handling to make the service more manageable by avoiding a broad BaseException catch.

Comment thread src/processing_pipeline/stage_3/tasks.py Outdated

@coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/processing_pipeline/stage_3/executors.py`:
- Around line 60-64: The processing wait loop using
uploaded_audio_file.state.name == "PROCESSING" can hang indefinitely; modify the
logic around that loop (the block updating uploaded_audio_file and calling
gemini_client.files.get) to enforce a maximum wait: introduce a timeout
parameter (e.g., max_wait_seconds or deadline) and break/raise once elapsed,
checking elapsed time each iteration (or use asyncio.wait_for around a helper
coroutine) and surface a clear error or change state if the timeout is reached;
update references in the same function where uploaded_audio_file and
gemini_client.files.get are used so the loop exits deterministically on timeout.
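The bounded wait loop suggested in that comment could be sketched like this. It is a sketch of the reviewer's suggestion, not the PR's code; the client and file objects are assumed to follow the google-genai Files API shape (`state.name`, `files.get`):

```python
import asyncio
import time

async def wait_for_active(gemini_client, uploaded_audio_file,
                          max_wait_seconds: float = 300.0,
                          poll_interval: float = 2.0):
    """Poll the Files API until the upload leaves PROCESSING, with a deadline."""
    deadline = time.monotonic() + max_wait_seconds
    while uploaded_audio_file.state.name == "PROCESSING":
        if time.monotonic() > deadline:
            # Surface a clear error instead of hanging the flow indefinitely
            raise TimeoutError(
                f"{uploaded_audio_file.name} still PROCESSING after {max_wait_seconds}s"
            )
        await asyncio.sleep(poll_interval)  # avoid a hot polling loop
        uploaded_audio_file = gemini_client.files.get(name=uploaded_audio_file.name)
    return uploaded_audio_file
```

Wrapping the loop in `asyncio.wait_for` would achieve the same effect; the explicit deadline makes the per-iteration elapsed-time check visible.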

In `@src/processing_pipeline/stage_3/web_tools.py`:
- Around line 48-51: searxng_web_search lacks a timeout and can hang; mirror
web_url_read by adding a 10-second aiohttp timeout. Update the call that opens
the client/session (in searxng_web_search) to pass
timeout=aiohttp.ClientTimeout(total=10) either on ClientSession(...) or on
session.get(...), so response.raise_for_status() and await response.json() are
bounded by 10s; keep existing SSL connector usage
(aiohttp.TCPConnector(ssl=_ssl_context)) when adding the timeout.
- Line 8: SEARXNG_URL currently defaults to an empty string which leads to
requests to "/search" and confusing failures; update the module to validate the
configuration and fail fast: change the default to None (or keep env lookup) and
either (a) raise a clear ValueError during import if SEARXNG_URL is falsy, or
(b) add a guard at the start of searxng_web_search(...) that checks "if not
SEARXNG_URL: raise ValueError('SEARXNG_URL environment variable is not set')" so
callers get an explicit error instead of silent broken requests; reference
SEARXNG_URL and searxng_web_search to locate where to add the validation.
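Both web_tools suggestions (the 10-second timeout and the fail-fast SEARXNG_URL guard) could be combined as below. This is a sketch of the reviewer's proposals; the query parameters and result normalization are assumptions:

```python
import asyncio
import os

import aiohttp

SEARXNG_URL = os.environ.get("SEARXNG_URL")  # no empty-string default

async def searxng_web_search(query: str) -> list:
    """Query a SearXNG instance's JSON API, failing fast on bad config."""
    if not SEARXNG_URL:
        raise ValueError("SEARXNG_URL environment variable is not set")
    timeout = aiohttp.ClientTimeout(total=10)  # bound the whole request at 10s
    async with aiohttp.ClientSession(timeout=timeout) as session:
        async with session.get(
            f"{SEARXNG_URL}/search", params={"q": query, "format": "json"}
        ) as response:
            response.raise_for_status()
            payload = await response.json()
    # Normalize to the minimal fields downstream tools need
    return [
        {"title": r.get("title", ""), "url": r.get("url", "")}
        for r in payload.get("results", [])
    ]
```

With the guard up front, a missing configuration surfaces as an explicit ValueError rather than a confusing request to "/search".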

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b2615480-abce-46c6-a62a-a782f94a19df

📥 Commits

Reviewing files that changed from the base of the PR and between 2d51fb3 and d4efd8d.

📒 Files selected for processing (6)
  • requirements.txt
  • src/processing_pipeline/stage_3/constants.py
  • src/processing_pipeline/stage_3/executors.py
  • src/processing_pipeline/stage_3/flows.py
  • src/processing_pipeline/stage_3/tasks.py
  • src/processing_pipeline/stage_3/web_tools.py

Comment thread src/processing_pipeline/stage_3/web_tools.py
Comment thread src/processing_pipeline/stage_3/web_tools.py Outdated
- Narrow BaseException to Exception to avoid catching SystemExit/KeyboardInterrupt
- Validate SEARXNG_URL is set
- Add 10s HTTP timeout to searxng_web_search (matching web_url_read)
@PublicDataWorks PublicDataWorks deleted a comment from coderabbitai Bot Mar 20, 2026

@coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/processing_pipeline/stage_3/tasks.py`:
- Around line 220-223: Add a safe import for ExceptionGroup and use the backport
on older interpreters: in the modules that reference ExceptionGroup (the
exception-handling block building error_message), wrap the import with a
try/except to import builtins.ExceptionGroup and fall back to the exceptiongroup
backport if ImportError; alternatively, if the project requires Python 3.11+,
ensure pyproject.toml specifies that and add the exceptiongroup backport to
dev/test requirements so imports succeed during development.
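The safe-import pattern described in that comment is small enough to show in full. This is a sketch of the reviewer's suggestion; `exceptiongroup` is the PyPI backport it names, and `describe` is a hypothetical helper standing in for the PR's error-message builder:

```python
# Python 3.11+ ships ExceptionGroup as a builtin; older interpreters need
# the `exceptiongroup` backport, imported only when the builtin is absent.
try:
    _ = ExceptionGroup  # probe the builtin
except NameError:
    from exceptiongroup import ExceptionGroup  # type: ignore

def describe(exc: BaseException) -> str:
    """Build a detailed message, flattening exception groups."""
    if isinstance(exc, ExceptionGroup):
        return "; ".join(f"{type(e).__name__}: {e}" for e in exc.exceptions)
    return f"{type(exc).__name__}: {exc}"
```

On 3.11+ the `try` succeeds and the backport is never imported, so pinning the backport only in dev/test requirements (the comment's alternative) also works.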

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 44039503-e8fd-47a7-af97-168b200768e1

📥 Commits

Reviewing files that changed from the base of the PR and between d4efd8d and 761ee60.

📒 Files selected for processing (2)
  • src/processing_pipeline/stage_3/tasks.py
  • src/processing_pipeline/stage_3/web_tools.py
✅ Files skipped from review due to trivial changes (1)
  • src/processing_pipeline/stage_3/web_tools.py

@PublicDataWorks PublicDataWorks deleted a comment from coderabbitai Bot Mar 20, 2026
@quancao-ea quancao-ea merged commit a02f1fd into main Mar 20, 2026
2 checks passed
