
VER-305: Integrate knowledge base with Stage 1 - Initial detection process #67

Merged

quancao-ea merged 1 commit into main from features/integrate-knowledge-base-with-stage-1 on Mar 17, 2026

Conversation


quancao-ea (Collaborator) commented on Mar 16, 2026

Summary by CodeRabbit

  • New Features
    • Integrated knowledge base context retrieval to enhance disinformation detection by cross-referencing against verified facts
    • Enhanced detection workflows with improved contextual guidance for more accurate analysis
    • Added a verification step against verified facts to strengthen detection capabilities
    • Expanded processing pipeline with additional detection stages for comprehensive analysis

Use semantic search to retrieve relevant verified facts from the KB
and inject them into both initial and main detection prompts, helping
Gemini better identify known disinformation and avoid false positives.

- Add kb_context.py: chunked embedding + deduplication across chunks
  to cover all topics in 30-min radio broadcasts
- Refactor detection prompts as templates with .format() placeholders
  for kb_context, metadata, and transcription
- Thread OpenAI client and kb_context through flows, tasks, executors
- Add KB_STAGE1_CHUNK_SIZE and KB_STAGE1_MATCH_COUNT_PER_CHUNK constants
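
For orientation, a minimal sketch of how these pieces could fit together follows. It is reconstructed from this description and the review comments further down, not from the PR diff; the embedding model name, the constant values, the search_kb_entries keyword arguments, and the entry fields ("id", "fact") are illustrative assumptions.

from openai import OpenAI

# Illustrative values; the real constants live in src/processing_pipeline/stage_1/constants.py.
KB_STAGE1_CHUNK_SIZE = 2000
KB_STAGE1_MATCH_COUNT_PER_CHUNK = 5
KB_SEARCH_MATCH_THRESHOLD = 0.75


def _split_into_chunks(text: str, chunk_size: int) -> list[str]:
    # Fixed-size character windows (see the word-boundary review note below).
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def _format_kb_entries(entries: list[dict]) -> str:
    # Render matched facts as a Markdown bullet list for prompt injection.
    return "\n".join(f"- {entry.get('fact', '')}" for entry in entries)


def retrieve_kb_context(supabase_client, openai_client: OpenAI, transcription: str) -> str | None:
    chunks = _split_into_chunks(transcription, KB_STAGE1_CHUNK_SIZE)
    if not chunks:
        return None

    # One embeddings request covers every chunk of the broadcast.
    response = openai_client.embeddings.create(
        model="text-embedding-3-small",  # model name is an assumption
        input=chunks,
    )

    # Deduplicate entries matched by more than one chunk, keyed by entry id.
    seen: dict[str, dict] = {}
    for item in response.data:
        entries = supabase_client.search_kb_entries(  # project helper; signature assumed
            embedding=item.embedding,
            match_threshold=KB_SEARCH_MATCH_THRESHOLD,
            match_count=KB_STAGE1_MATCH_COUNT_PER_CHUNK,
        )
        for entry in entries or []:
            seen.setdefault(entry["id"], entry)

    return _format_kb_entries(list(seen.values())) if seen else None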

gemini-code-assist Bot (Contributor) commented

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the disinformation detection capabilities in Stage 1 of the processing pipeline by integrating a knowledge base. The system can now retrieve verified facts relevant to a given transcription and provide them to the LLM, guiding it to more accurately identify disinformation and avoid flagging truthful content. This integration aims to reduce false positives and improve the overall reliability of the detection process.

Highlights

  • Knowledge Base Integration: Integrated a knowledge base (KB) into the Stage 1 disinformation detection process to improve accuracy by providing verified facts to the LLM.
  • Prompt Template Updates: Modified user prompt templates for both initial and main detection to include a new kb_context placeholder, allowing relevant knowledge base entries to be injected directly into the LLM's input.
  • OpenAI Embedding for KB Search: Introduced the use of OpenAI's embedding API to generate embeddings for transcription chunks, enabling semantic search against the Supabase knowledge base.
  • New KB Context Module: Added a dedicated Python module (kb_context.py) responsible for chunking transcriptions, generating embeddings, searching the knowledge base, and formatting the retrieved facts for LLM consumption.
  • Flow and Task Updates: Updated the processing pipeline's flows and tasks to initialize an OpenAI client, fetch knowledge base context based on the transcription, and pass this context to the Gemini-based detection executors.
Changelog
  • prompts/stage_1/main/detection_user_prompt.md
    • Updated JSON schema curly braces to use {{ and }} for templating (see the brace-escaping sketch after this changelog).
    • Added a new section for 'Knowledge Base: Verified Facts' with a {kb_context} placeholder.
    • Added a 'Transcription Input' section with {metadata} and {timestamped_transcription} placeholders.
  • prompts/stage_1/preprocess/initial_detection_user_prompt.md
    • Added a new section for 'Knowledge Base: Verified Facts' with a {kb_context} placeholder.
    • Updated example JSON curly braces to use {{ and }} for templating.
    • Added a 'Transcription Input' section with {metadata} and {transcription} placeholders.
  • src/processing_pipeline/stage_1/constants.py
    • Added KB_STAGE1_CHUNK_SIZE and KB_STAGE1_MATCH_COUNT_PER_CHUNK constants for knowledge base processing.
  • src/processing_pipeline/stage_1/executors.py
    • Modified run methods in Stage1PreprocessDetectionExecutor and Stage1MainDetectionExecutor to accept kb_context.
    • Updated prompt formatting to dynamically inject kb_context, metadata, and transcription into the user prompt.
  • src/processing_pipeline/stage_1/flows.py
    • Imported OpenAI client.
    • Imported fetch_kb_context task.
    • Added _create_openai_client function to initialize the OpenAI client.
    • Initialized openai_client in initial_disinformation_detection, redo_main_detection, and regenerate_timestamped_transcript flows.
    • Passed openai_client to process_audio_file.
    • Added logic to fetch kb_context using fetch_kb_context and openai_client before calling detection functions.
    • Passed kb_context to disinformation_detection_with_gemini calls.
  • src/processing_pipeline/stage_1/kb_context.py
    • Added new file for knowledge base context retrieval and formatting.
    • Implemented retrieve_kb_context to chunk transcription, generate OpenAI embeddings, search Supabase KB, and format results.
    • Included helper functions _split_into_chunks and _format_kb_entries.
  • src/processing_pipeline/stage_1/tasks.py
    • Imported retrieve_kb_context.
    • Added fetch_kb_context task to retrieve and log knowledge base context.
    • Modified initial_disinformation_detection_with_gemini and disinformation_detection_with_gemini to accept kb_context.
    • Updated process_audio_file to accept openai_client, fetch kb_context, and pass it to subsequent detection tasks.
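
The doubled braces noted in the changelog exist because these prompts are now Python str.format() templates: literal braces in JSON examples must be escaped as {{ and }} so that only the named placeholders are substituted. A minimal illustration (placeholder names match the changelog; the surrounding prompt text is invented):

# Literal JSON braces are doubled so str.format() leaves them intact,
# while single-brace placeholders are substituted.
TEMPLATE = """Respond with JSON matching this schema:
{{"flagged": true, "reason": "..."}}

## Knowledge Base: Verified Facts
{kb_context}

## Transcription Input
{metadata}
{transcription}
"""

prompt = TEMPLATE.format(
    kb_context="- Verified: the broadcast's claimed statistic matches official records.",
    metadata="station: WXYZ, aired 2026-03-10",
    transcription="[00:00] Host: ...",
)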
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature              | Command              | Description
Code Review          | /gemini review       | Performs a code review for the current pull request in its current state.
Pull Request Summary | /gemini summary      | Provides a summary of the current pull request in its current state.
Comment              | @gemini-code-assist  | Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help                 | /gemini help         | Displays a list of available commands.

Customization

To customize Gemini Code Assist for the GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist Bot left a comment

Code Review

This pull request successfully integrates a knowledge base into the Stage 1 detection process. The changes include updating LLM prompts to utilize knowledge base context, adding a new kb_context.py module for retrieving facts, and modifying processing flows to incorporate this new context. The implementation is solid, though I've identified a minor code duplication in src/processing_pipeline/stage_1/flows.py for fetching the knowledge base context. I've suggested a refactoring to improve maintainability. Overall, the changes are well-aligned with the feature's goal.

Comment on lines +185 to +187
# Fetch KB context using the initial transcription
initial_transcription = stage_1_llm_response.get("initial_transcription", "")
kb_context = fetch_kb_context(supabase_client, openai_client, initial_transcription)
Severity: medium

This logic for fetching the knowledge base context is duplicated in the regenerate_timestamped_transcript flow on lines 262-264. To improve maintainability and adhere to the DRY (Don't Repeat Yourself) principle, consider extracting this logic into a private helper function.

For example:

def _get_kb_context_for_response(supabase_client: SupabaseClient, openai_client: OpenAI, stage_1_llm_response: dict) -> str | None:
    """Fetches KB context for a given stage 1 LLM response."""
    initial_transcription = stage_1_llm_response.get("initial_transcription", "")
    return fetch_kb_context(supabase_client, openai_client, initial_transcription)

This helper can then be called in both redo_main_detection and regenerate_timestamped_transcript to reduce redundancy.


coderabbitai Bot commented Mar 16, 2026

Walkthrough

This PR integrates knowledge base context retrieval into the Stage 1 detection pipeline. It introduces a new kb_context module that retrieves and formats KB entries using OpenAI embeddings and Supabase queries, updates prompt templates with knowledge base guidance sections, modifies executor and task signatures to accept kb_context parameters, and wires OpenAI client creation throughout the flow.

Changes

• Prompt Updates — prompts/stage_1/main/detection_user_prompt.md, prompts/stage_1/preprocess/initial_detection_user_prompt.md
  Updated JSON schema formatting with doubled braces; added "Knowledge Base" sections with verified facts guidance; added "Transcription Input" blocks with metadata placeholders; replaced "Self-Review Process" with a knowledge base verification workflow.
• Constants — src/processing_pipeline/stage_1/constants.py
  Added two KB-related constants (KB_STAGE1_CHUNK_SIZE, KB_STAGE1_MATCH_COUNT_PER_CHUNK) and three new enum members to Stage1SubStage (INITIAL_DETECTION, TIMESTAMPED_TRANSCRIPTION, DISINFORMATION_DETECTION).
• KB Context Retrieval — src/processing_pipeline/stage_1/kb_context.py
  New module providing a retrieve_kb_context() function that chunks transcription text, embeds via OpenAI, queries the Supabase KB with deduplication and similarity filtering, and formats results as Markdown output.
• Executor Updates — src/processing_pipeline/stage_1/executors.py
  Added a kb_context: str | None parameter to Stage1PreprocessDetectionExecutor.run() and Stage1Executor.run(); updated prompt construction to use template-based formatting with kb_context placeholder injection.
• Flow Integration — src/processing_pipeline/stage_1/flows.py
  Added a _create_openai_client() helper; wired OpenAI client creation and KB context retrieval across multiple stage 1 flow entry points (initial_disinformation_detection, redo_main_detection, regenerate_timestamped_transcript); passes kb_context to detection routines.
• Task Layer — src/processing_pipeline/stage_1/tasks.py
  Added a fetch_kb_context() task; added a kb_context: str | None parameter to initial_disinformation_detection_with_gemini() and disinformation_detection_with_gemini(); added an openai_client: OpenAI parameter to process_audio_file() to enable KB context retrieval after the initial transcription.
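
The Task Layer entry above describes fetch_kb_context() as a retrieval-and-logging task with task-level retry (see the review comment further down). A minimal sketch follows; the use of Prefect, its retry settings, and the get_run_logger call are assumptions suggested by the flow/task vocabulary, not confirmed by this PR.

from prefect import task, get_run_logger

from processing_pipeline.stage_1.kb_context import retrieve_kb_context


@task(retries=2, retry_delay_seconds=10)  # retry settings are illustrative
def fetch_kb_context(supabase_client, openai_client, transcription: str) -> str | None:
    # Retrieve formatted KB context and log what will be injected into the prompts.
    kb_context = retrieve_kb_context(supabase_client, openai_client, transcription)
    get_run_logger().info("Fetched %d chars of KB context", len(kb_context or ""))
    return kb_context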

Sequence Diagram

sequenceDiagram
    participant Flow as Stage 1 Flow
    participant Audio as process_audio_file()
    participant Initial as initial_disinformation_<br/>detection_with_gemini()
    participant KB as KB Context<br/>Retrieval
    participant OpenAI as OpenAI Client<br/>(Embeddings)
    participant Supabase as Supabase KB<br/>Query
    participant Detect as Disinformation<br/>Detection Executor

    Flow->>Audio: Call with openai_client
    Audio->>Initial: Run with transcription
    Initial->>KB: fetch_kb_context(transcription)
    KB->>OpenAI: Embed text chunks
    OpenAI-->>KB: Return embeddings
    KB->>Supabase: Query KB with embeddings
    Supabase-->>KB: Return matching entries
    KB-->>Initial: Return formatted KB context
    Initial->>Detect: Run with kb_context
    Detect-->>Initial: Return detection results

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • nhphong

Poem

🐰 A knowledge base awakens in the detection flow,
With embeddings dancing through Supabase below,
Chunks and deduplication, chunks galore,
Now disinformation detection knows much more!

🚥 Pre-merge checks | ✅ 2 passed | ❌ 1 failed

❌ Failed checks (1 warning)
  • Docstring Coverage — ⚠️ Warning: Docstring coverage is 0.00%, which is insufficient; the required threshold is 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)
  • Description Check — ✅ Passed: Check skipped because CodeRabbit’s high-level summary is enabled.
  • Title Check — ✅ Passed: The title clearly and specifically describes the main change: integrating knowledge base functionality with Stage 1's initial detection process, which aligns with all the substantial modifications across prompts, constants, executors, flows, and the new kb_context module.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Pylint (4.0.5)

Pylint could not parse the repository's .pylintrc (F0011: error while parsing the configuration: File contains no section headers. file: '.pylintrc', line: 1, 'disable=C0116\n'); the same config-parse-error was reported for every module below.

src/processing_pipeline/stage_1/kb_context.py (JSON findings, abridged)
  • line 23: C0301 line-too-long — Line too long (101/100)
  • line 3: E0401 import-error — Unable to import 'openai'
  ... [truncated 848 characters] ...
  • line 10: E0401 import-error — Unable to import 'processing_pipeline.supabase_utils'
  • line 13: C0116 missing-function-docstring — retrieve_kb_context

src/processing_pipeline/stage_1/constants.py (JSON findings)
  • line 1: C0114 missing-module-docstring
  • line 7: C0115 missing-class-docstring — Stage1SubStage

src/processing_pipeline/stage_1/executors.py (JSON findings, abridged)
  • line 171: C0301 line-too-long — Line too long (102/100)
  • line 194: C0301 line-too-long — Line too long (103/100)
  ... [truncated 12854 characters] ...
  • line 254: C0116 missing-function-docstring — …inal_transcription
  • line 269: C0116 missing-function-docstring — GeminiTimestampTranscriptionGenerator.split_audio_into_segments

  • 2 others


coderabbitai Bot left a comment

🧹 Nitpick comments (3)
src/processing_pipeline/stage_1/flows.py (1)

298-307: Inconsistent error handling between Gemini and OpenAI client creation.

_create_gemini_client() returns None when the API key is missing, while _create_openai_client() raises a ValueError. This inconsistency could lead to confusing behavior:

  • Missing GOOGLE_GEMINI_KEY → silent None → runtime error later when gemini_client is used
  • Missing OPENAI_API_KEY → immediate ValueError with clear message

Consider aligning the behavior for consistency.

♻️ Suggested fix for consistent error handling
 def _create_gemini_client() -> genai.Client | None:
     gemini_key = os.getenv("GOOGLE_GEMINI_KEY")
-    return genai.Client(api_key=gemini_key) if gemini_key else None
+    if not gemini_key:
+        raise ValueError("GOOGLE_GEMINI_KEY environment variable is not set")
+    return genai.Client(api_key=gemini_key)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/processing_pipeline/stage_1/flows.py` around lines 298-307, the Gemini
client creator _create_gemini_client currently returns None when
GOOGLE_GEMINI_KEY is unset, causing inconsistent behavior with
_create_openai_client which raises ValueError; change _create_gemini_client to
validate the env var and raise a ValueError with a clear message if
GOOGLE_GEMINI_KEY is missing (mirror the pattern used in _create_openai_client)
so callers always get a concrete client or an explicit error.
src/processing_pipeline/stage_1/kb_context.py (2)

52-58: Character-based chunking may split words or sentences mid-stream.

The current implementation splits text at fixed character boundaries regardless of word or sentence boundaries. This could result in incomplete or nonsensical chunks being embedded, potentially degrading KB search quality.

Consider splitting on sentence or paragraph boundaries, or at minimum on whitespace near the chunk boundary.

♻️ Suggested improvement for word-boundary-aware chunking
 def _split_into_chunks(text: str, chunk_size: int) -> list[str]:
     chunks = []
-    for i in range(0, len(text), chunk_size):
-        chunk = text[i:i + chunk_size]
-        if chunk:
-            chunks.append(chunk)
+    start = 0
+    while start < len(text):
+        end = start + chunk_size
+        if end < len(text):
+            # Try to find a whitespace near the boundary to avoid splitting words
+            space_idx = text.rfind(' ', start, end)
+            if space_idx > start:
+                end = space_idx + 1
+        chunk = text[start:end].strip()
+        if chunk:
+            chunks.append(chunk)
+        start = end
     return chunks
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/processing_pipeline/stage_1/kb_context.py` around lines 52-58, the
_split_into_chunks function currently slices by fixed character windows and can
cut words/sentences; change it to be boundary-aware by finding a nearest safe
break (sentence or whitespace) before the chunk_size limit: for example, attempt
to split on sentence boundaries (use a sentence tokenizer like
nltk.sent_tokenize or a simple regex to detect sentence-ending punctuation) and
if none are available within the window, fall back to the last whitespace before
chunk_size (use str.rfind(' ', 0, i+chunk_size) or similar) and only hard-split
if no whitespace exists; update the _split_into_chunks implementation to iterate
through text advancing by these boundary-aware cut points and ensure it still
returns list[str].

25-40: Consider handling partial failures during KB search loop.

If the OpenAI embeddings call succeeds but a subsequent search_kb_entries call fails mid-loop, all progress is lost. While the task-level retry (in fetch_kb_context) will restart the operation, this could be inefficient for large transcriptions.

For resilience, consider catching and logging individual search failures while continuing with other chunks.
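
No diff accompanies this suggestion, so here is one shape it could take, slotted into the existing search loop in retrieve_kb_context; the loop over response.data, the seen dict, and the search_kb_entries keyword arguments mirror the agent prompt below, while the logger setup is invented:

import logging

logger = logging.getLogger(__name__)  # logger name/setup is illustrative

for idx, item in enumerate(response.data):
    try:
        entries = supabase_client.search_kb_entries(
            embedding=item.embedding,
            match_threshold=KB_SEARCH_MATCH_THRESHOLD,
            match_count=KB_STAGE1_MATCH_COUNT_PER_CHUNK,
        )
    except Exception:
        # Preserve entries collected from earlier chunks; a total failure is
        # still retried at the task level by fetch_kb_context.
        logger.exception("KB search failed for chunk %d; skipping it", idx)
        continue
    for entry in entries or []:
        seen.setdefault(entry["id"], entry)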

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/processing_pipeline/stage_1/kb_context.py` around lines 25-40, the KB
search loop can fail mid-iteration and lose all progress; wrap the call to
supabase_client.search_kb_entries inside a try/except in the embeddings loop
(the block that iterates over embeddings and calls
supabase_client.search_kb_entries with KB_SEARCH_MATCH_THRESHOLD and
KB_STAGE1_MATCH_COUNT_PER_CHUNK), log the exception (including the embedding
index or a short context) and continue to the next embedding so
already-collected entries in the seen dict are preserved; keep updating seen as
before when results are returned and ensure fetch_kb_context’s task-level retry
still applies for total failure cases.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1ca69a98-2e24-4949-89e9-66df00f4fc77

📥 Commits

Reviewing files that changed from the base of the PR and between 0ed305f and 8c93b6a.

📒 Files selected for processing (7)
  • prompts/stage_1/main/detection_user_prompt.md
  • prompts/stage_1/preprocess/initial_detection_user_prompt.md
  • src/processing_pipeline/stage_1/constants.py
  • src/processing_pipeline/stage_1/executors.py
  • src/processing_pipeline/stage_1/flows.py
  • src/processing_pipeline/stage_1/kb_context.py
  • src/processing_pipeline/stage_1/tasks.py

quancao-ea merged commit ad00c4d into main on Mar 17, 2026
2 checks passed
