Skip to content

ECHO-295-train-dutch-pyannote-model#182

Merged
ArindamRoy23 merged 2 commits intomainfrom
feature/echo-295-train-dutch-pyannote-model
Jun 12, 2025
Merged

ECHO-295-train-dutch-pyannote-model#182
ArindamRoy23 merged 2 commits intomainfrom
feature/echo-295-train-dutch-pyannote-model

Conversation

@ArindamRoy23
Copy link
Copy Markdown
Contributor

@ArindamRoy23 ArindamRoy23 commented Jun 10, 2025

Refactor RunPod diarization process in conversation_health.py. Added language compatabilities

  • Introduced helper functions to streamline fetching chunk data, generating audio URLs, and submitting diarization jobs.
  • Paaing language to runpod to run finetuned models
  • Enhanced error handling and logging for improved observability during the diarization workflow.
  • Updated the main get_runpod_diarization function to utilize new helper methods, improving code readability and maintainability.
  • Implemented job status polling and results updating directly within the main function, ensuring better management of diarization tasks.

Summary by CodeRabbit

  • Refactor
    • Improved the reliability and clarity of the conversation health feature, resulting in more robust handling of diarization jobs and better error management.
    • Enhanced logging for easier troubleshooting and more detailed debug information.

- Introduced helper functions to streamline fetching chunk data, generating audio URLs, and submitting diarization jobs.
- Paaing language to runpod to run finetuned models
- Enhanced error handling and logging for improved observability during the diarization workflow.
- Updated the main get_runpod_diarization function to utilize new helper methods, improving code readability and maintainability.
- Implemented job status polling and results updating directly within the main function, ensuring better management of diarization tasks.
@linear
Copy link
Copy Markdown

linear bot commented Jun 10, 2025

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Jun 10, 2025

Walkthrough

The get_runpod_diarization function in echo/server/dembrane/conversation_health.py was refactored into several focused helper functions. The refactor modularizes the logic for fetching chunk data, generating URLs, checking if diarization should be skipped, submitting and polling RunPod jobs, updating results, and handling job timeouts. Logging and error handling were also improved.

Changes

File(s) Change Summary
echo/server/dembrane/conversation_health.py Refactored get_runpod_diarization into multiple helpers for modularity, clarity, and error handling. Function signature changed to require chunk_id. Logging and job submission logic updated.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant ConversationHealth

    Client->>ConversationHealth: get_runpod_diarization(chunk_id)
    ConversationHealth->>ConversationHealth: _fetch_chunk_data(chunk_id)
    ConversationHealth->>ConversationHealth: _generate_audio_url(audio_file_uri)
    ConversationHealth->>ConversationHealth: _should_skip_diarization(project_language)
    alt Diarization skipped
        ConversationHealth-->>Client: Return (no action)
    else Diarization required
        ConversationHealth->>ConversationHealth: _submit_diarization_job(audio_url, project_language)
        ConversationHealth->>ConversationHealth: _poll_job_status(job_status_link, headers)
        alt Job completed
            ConversationHealth->>ConversationHealth: _update_chunk_with_results(chunk_id, dirz_response_data)
        else Timeout
            ConversationHealth->>ConversationHealth: _cancel_job_on_timeout(job_id)
        end
    end
Loading

Assessment against linked issues

Objective Addressed Explanation
Train Dutch Pyannote model (ECHO-295) No code related to model training or dataset creation is present; only diarization orchestration logic is refactored.

Assessment against linked issues: Out-of-scope changes

Code Change Explanation
Refactor of diarization orchestration logic (echo/server/dembrane/conversation_health.py) The changes refactor diarization job handling; this is unrelated to training or publishing a Dutch Pyannote model as specified in ECHO-295.

LGTM.

✨ Finishing Touches
  • 📝 Generate Docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@ArindamRoy23 ArindamRoy23 enabled auto-merge June 10, 2025 14:48
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e415186 and af7182b.

📒 Files selected for processing (1)
  • echo/server/dembrane/conversation_health.py (2 hunks)
🧰 Additional context used
🪛 Pylint (3.3.7)
echo/server/dembrane/conversation_health.py

[convention] 40-40: Line too long (132/100)

(C0301)


[warning] 42-42: Catching too general exception Exception

(W0718)


[warning] 40-40: Use lazy % formatting in logging functions

(W1203)


[warning] 43-43: Use lazy % formatting in logging functions

(W1203)


[warning] 56-56: Catching too general exception Exception

(W0718)


[warning] 54-54: Use lazy % formatting in logging functions

(W1203)


[warning] 57-57: Use lazy % formatting in logging functions

(W1203)


[warning] 66-66: Use lazy % formatting in logging functions

(W1203)


[convention] 86-86: Trailing whitespace

(C0303)


[convention] 123-123: Trailing whitespace

(C0303)


[convention] 147-147: Trailing whitespace

(C0303)


[convention] 193-193: Trailing whitespace

(C0303)


[convention] 200-200: Trailing whitespace

(C0303)


[convention] 204-204: Line too long (111/100)

(C0301)


[convention] 208-208: Line too long (102/100)

(C0301)


[convention] 210-210: Trailing whitespace

(C0303)


[warning] 95-95: Catching too general exception Exception

(W0718)


[warning] 88-88: Use lazy % formatting in logging functions

(W1203)


[warning] 93-93: Use lazy % formatting in logging functions

(W1203)


[warning] 96-96: Use lazy % formatting in logging functions

(W1203)


[warning] 110-110: Catching too general exception Exception

(W0718)


[warning] 106-106: Use lazy % formatting in logging functions

(W1203)


[warning] 111-111: Use lazy % formatting in logging functions

(W1203)


[warning] 134-134: Use lazy % formatting in logging functions

(W1203)


[warning] 154-154: Catching too general exception Exception

(W0718)


[warning] 150-150: Use lazy % formatting in logging functions

(W1203)


[warning] 153-153: Use lazy % formatting in logging functions

(W1203)


[warning] 155-155: Use lazy % formatting in logging functions

(W1203)


[warning] 199-199: Use lazy % formatting in logging functions

(W1203)


[refactor] 203-209: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)


[warning] 204-204: Use lazy % formatting in logging functions

(W1203)


[warning] 208-208: Use lazy % formatting in logging functions

(W1203)


[refactor] 158-158: Too many return statements (7/6)

(R0911)


[refactor] 158-158: Either all return statements in a function should return an expression, or none of them should.

(R1710)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: ci-check-server
🔇 Additional comments (8)
echo/server/dembrane/conversation_health.py (8)

23-44: LGTM!

Clean extraction of chunk data fetching logic. The tuple return type is slick.

🧰 Tools
🪛 Pylint (3.3.7)

[convention] 40-40: Line too long (132/100)

(C0301)


[warning] 42-42: Catching too general exception Exception

(W0718)


[warning] 40-40: Use lazy % formatting in logging functions

(W1203)


[warning] 43-43: Use lazy % formatting in logging functions

(W1203)


47-59: LGTM!

Solid encapsulation of URL generation. Clean and focused.

🧰 Tools
🪛 Pylint (3.3.7)

[warning] 56-56: Catching too general exception Exception

(W0718)


[warning] 54-54: Use lazy % formatting in logging functions

(W1203)


[warning] 57-57: Use lazy % formatting in logging functions

(W1203)


61-69: LGTM!

Smart guard clause. Love the explicit boolean return.

🧰 Tools
🪛 Pylint (3.3.7)

[warning] 66-66: Use lazy % formatting in logging functions

(W1203)


71-98: Nice language param addition!

The language field in the payload enables the fine-tuned model selection. This is the key change for multilingual diarization support. Ship it.

🧰 Tools
🪛 Pylint (3.3.7)

[convention] 86-86: Trailing whitespace

(C0303)


[warning] 79-79: Use lazy % formatting in logging functions

(W1203)


[warning] 95-95: Catching too general exception Exception

(W0718)


[warning] 88-88: Use lazy % formatting in logging functions

(W1203)


[warning] 93-93: Use lazy % formatting in logging functions

(W1203)


[warning] 96-96: Use lazy % formatting in logging functions

(W1203)


100-113: LGTM!

Clean polling abstraction. Single responsibility principle FTW.

🧰 Tools
🪛 Pylint (3.3.7)

[warning] 110-110: Catching too general exception Exception

(W0718)


[warning] 106-106: Use lazy % formatting in logging functions

(W1203)


[warning] 111-111: Use lazy % formatting in logging functions

(W1203)


115-135: LGTM!

Clean result mapping. The field extraction is crisp.

🧰 Tools
🪛 Pylint (3.3.7)

[convention] 123-123: Trailing whitespace

(C0303)


[warning] 134-134: Use lazy % formatting in logging functions

(W1203)


137-156: LGTM!

Good timeout handling. Prevents runaway jobs.

🧰 Tools
🪛 Pylint (3.3.7)

[convention] 147-147: Trailing whitespace

(C0303)


[warning] 154-154: Catching too general exception Exception

(W0718)


[warning] 150-150: Use lazy % formatting in logging functions

(W1203)


[warning] 153-153: Use lazy % formatting in logging functions

(W1203)


[warning] 155-155: Use lazy % formatting in logging functions

(W1203)


158-161:

Details

❌ Incorrect review comment

Breaking change alert: path parameter removed!

The function signature changed from (chunk_id: str | None = None, path: str | None = None) to (chunk_id: str). Any callers passing the path parameter will break. Ensure all call sites are updated.


🏁 Script executed:

#!/bin/bash
# Description: Find all calls to get_runpod_diarization to verify they don't use the path parameter

# Search for function calls with path parameter
rg -A 2 'get_runpod_diarization\(' --type py | grep -E 'path\s*='

Length of output: 67


Let’s enumerate all call sites to catch any zero-arg or legacy calls:


🏁 Script executed:

#!/bin/bash
# List all invocations of get_runpod_diarization to inspect argument usage
rg -n 'get_runpod_diarization\(' --type py

Length of output: 279


Let’s pull in the full body of get_runpod_diarization to audit its return paths. LGTM.


🏁 Script executed:

#!/bin/bash
# Extract the implementation of get_runpod_diarization for return-value inspection
rg -n -A 200 'def get_runpod_diarization' echo/server/dembrane/conversation_health.py

Length of output: 7817


No breaking change detected for get_runpod_diarization – safe to remove path

After auditing all call sites and return paths:

  • Only invocation lives in echo/server/dembrane/tasks.py and calls get_runpod_diarization(chunk_id) without a path argument.
  • Every return in get_runpod_diarization is either return or return None and the signature declares -> None, so return values are consistent.

You can safely ignore the earlier “breaking change” alert and the return-value inconsistency flag. LGTM.

Likely an incorrect or invalid review comment.

🧰 Tools
🪛 Pylint (3.3.7)

[refactor] 158-158: Too many return statements (7/6)

(R0911)


[refactor] 158-158: Either all return statements in a function should return an expression, or none of them should.

(R1710)

@ArindamRoy23 ArindamRoy23 disabled auto-merge June 10, 2025 17:29
@ArindamRoy23 ArindamRoy23 enabled auto-merge June 10, 2025 17:29
coderabbitai bot added a commit that referenced this pull request Jun 10, 2025
Docstrings generation was requested by @ArindamRoy23.

* #182 (comment)

The following files were modified:

* `echo/server/dembrane/conversation_health.py`
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Jun 10, 2025

Note

Generated docstrings for this pull request at #183

Docstrings generation was requested by @ArindamRoy23.

* #182 (comment)

The following files were modified:

* `echo/server/dembrane/conversation_health.py`

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

♻️ Duplicate comments (1)
echo/server/dembrane/conversation_health.py (1)

197-197: Make return statements consistent.

The function has mixed explicit and implicit None returns. For clarity, be consistent.

-        return None
+        return

Or make all returns explicit with None. Pick one style and stick with it throughout.

Also applies to: 203-203, 207-207, 235-235, 238-238, 244-244

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between af7182b and 819e9d6.

📒 Files selected for processing (1)
  • echo/server/dembrane/conversation_health.py (2 hunks)
🧰 Additional context used
🪛 Pylint (3.3.7)
echo/server/dembrane/conversation_health.py

[convention] 28-28: Line too long (104/100)

(C0301)


[convention] 42-42: Line too long (132/100)

(C0301)


[convention] 72-72: Line too long (119/100)

(C0301)


[convention] 89-89: Line too long (120/100)

(C0301)


[warning] 44-44: Catching too general exception Exception

(W0718)


[warning] 42-42: Use lazy % formatting in logging functions

(W1203)


[warning] 45-45: Use lazy % formatting in logging functions

(W1203)


[warning] 63-63: Catching too general exception Exception

(W0718)


[warning] 61-61: Use lazy % formatting in logging functions

(W1203)


[warning] 64-64: Use lazy % formatting in logging functions

(W1203)


[warning] 75-75: Use lazy % formatting in logging functions

(W1203)


[convention] 101-101: Trailing whitespace

(C0303)


[convention] 148-148: Trailing whitespace

(C0303)


[convention] 174-174: Trailing whitespace

(C0303)


[convention] 189-189: Line too long (226/100)

(C0301)


[convention] 222-222: Trailing whitespace

(C0303)


[convention] 229-229: Trailing whitespace

(C0303)


[convention] 233-233: Line too long (111/100)

(C0301)


[convention] 237-237: Line too long (102/100)

(C0301)


[convention] 239-239: Trailing whitespace

(C0303)


[warning] 110-110: Catching too general exception Exception

(W0718)


[warning] 103-103: Use lazy % formatting in logging functions

(W1203)


[warning] 108-108: Use lazy % formatting in logging functions

(W1203)


[warning] 111-111: Use lazy % formatting in logging functions

(W1203)


[warning] 131-131: Catching too general exception Exception

(W0718)


[warning] 127-127: Use lazy % formatting in logging functions

(W1203)


[warning] 132-132: Use lazy % formatting in logging functions

(W1203)


[warning] 159-159: Use lazy % formatting in logging functions

(W1203)


[warning] 181-181: Catching too general exception Exception

(W0718)


[warning] 177-177: Use lazy % formatting in logging functions

(W1203)


[warning] 180-180: Use lazy % formatting in logging functions

(W1203)


[warning] 182-182: Use lazy % formatting in logging functions

(W1203)


[warning] 228-228: Use lazy % formatting in logging functions

(W1203)


[refactor] 232-238: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)


[warning] 233-233: Use lazy % formatting in logging functions

(W1203)


[warning] 237-237: Use lazy % formatting in logging functions

(W1203)


[refactor] 185-185: Too many return statements (7/6)

(R0911)


[refactor] 185-185: Either all return statements in a function should return an expression, or none of them should.

(R1710)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: ci-check-server
🔇 Additional comments (2)
echo/server/dembrane/conversation_health.py (2)

100-100: LGTM! Language parameter correctly passed to RunPod.

This implements the core objective of the PR - passing language parameters to enable fine-tuned models.


23-244: LGTM! Solid refactoring that achieves the PR objectives.

The modularization of get_runpod_diarization into focused helper functions is clean and improves maintainability. The language parameter support for fine-tuned models is correctly implemented. Just address the logging format nits and return statement consistency for a perfect 100x implementation.

🧰 Tools
🪛 Pylint (3.3.7)

[convention] 28-28: Line too long (104/100)

(C0301)


[convention] 42-42: Line too long (132/100)

(C0301)


[convention] 72-72: Line too long (119/100)

(C0301)


[convention] 89-89: Line too long (120/100)

(C0301)


[convention] 101-101: Trailing whitespace

(C0303)


[convention] 148-148: Trailing whitespace

(C0303)


[convention] 174-174: Trailing whitespace

(C0303)


[convention] 189-189: Line too long (226/100)

(C0301)


[convention] 222-222: Trailing whitespace

(C0303)


[convention] 229-229: Trailing whitespace

(C0303)


[convention] 233-233: Line too long (111/100)

(C0301)


[convention] 237-237: Line too long (102/100)

(C0301)


[convention] 239-239: Trailing whitespace

(C0303)


[warning] 44-44: Catching too general exception Exception

(W0718)


[warning] 42-42: Use lazy % formatting in logging functions

(W1203)


[warning] 45-45: Use lazy % formatting in logging functions

(W1203)


[warning] 63-63: Catching too general exception Exception

(W0718)


[warning] 61-61: Use lazy % formatting in logging functions

(W1203)


[warning] 64-64: Use lazy % formatting in logging functions

(W1203)


[warning] 75-75: Use lazy % formatting in logging functions

(W1203)


[warning] 94-94: Use lazy % formatting in logging functions

(W1203)


[warning] 110-110: Catching too general exception Exception

(W0718)


[warning] 103-103: Use lazy % formatting in logging functions

(W1203)


[warning] 108-108: Use lazy % formatting in logging functions

(W1203)


[warning] 111-111: Use lazy % formatting in logging functions

(W1203)


[warning] 131-131: Catching too general exception Exception

(W0718)


[warning] 127-127: Use lazy % formatting in logging functions

(W1203)


[warning] 132-132: Use lazy % formatting in logging functions

(W1203)


[warning] 159-159: Use lazy % formatting in logging functions

(W1203)


[warning] 181-181: Catching too general exception Exception

(W0718)


[warning] 177-177: Use lazy % formatting in logging functions

(W1203)


[warning] 180-180: Use lazy % formatting in logging functions

(W1203)


[warning] 182-182: Use lazy % formatting in logging functions

(W1203)


[warning] 228-228: Use lazy % formatting in logging functions

(W1203)


[refactor] 232-238: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)


[warning] 233-233: Use lazy % formatting in logging functions

(W1203)


[warning] 237-237: Use lazy % formatting in logging functions

(W1203)


[refactor] 185-185: Too many return statements (7/6)

(R0911)


[refactor] 185-185: Either all return statements in a function should return an expression, or none of them should.

(R1710)


try:
cancel_endpoint = f"{base_url}/cancel/{job_id}"
logger.warning(f"Timeout reached. Cancelling diarization job {job_id} at {cancel_endpoint}")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Use lazy % formatting in logging functions.

-        logger.warning(f"Timeout reached. Cancelling diarization job {job_id} at {cancel_endpoint}")
+        logger.warning("Timeout reached. Cancelling diarization job %s at %s", job_id, cancel_endpoint)
-        logger.info(f"Cancelled diarization job {job_id} after timeout.")
+        logger.info("Cancelled diarization job %s after timeout.", job_id)
-        logger.error(f"Failed to cancel diarization job {job_id}: {e}")
+        logger.error("Failed to cancel diarization job %s: %s", job_id, e)

Also applies to: 180-180, 182-182

🧰 Tools
🪛 Pylint (3.3.7)

[warning] 177-177: Use lazy % formatting in logging functions

(W1203)

🤖 Prompt for AI Agents
In echo/server/dembrane/conversation_health.py at lines 177, 180, and 182,
replace the f-string formatting inside the logger.warning calls with lazy %
formatting by passing the message template and variables separately to the
logger. This change improves logging efficiency by deferring string
interpolation until it is necessary.

"""
try:
audio_url = get_signed_url(audio_file_uri)
logger.debug(f"Generated signed audio_url: {audio_url}")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Use lazy % formatting in logging functions.

-        logger.debug(f"Generated signed audio_url: {audio_url}")
+        logger.debug("Generated signed audio_url: %s", audio_url)
-        logger.error(f"Failed to generate signed URL for {audio_file_uri}: {e}")
+        logger.error("Failed to generate signed URL for %s: %s", audio_file_uri, e)

Also applies to: 64-64

🧰 Tools
🪛 Pylint (3.3.7)

[warning] 61-61: Use lazy % formatting in logging functions

(W1203)

🤖 Prompt for AI Agents
In echo/server/dembrane/conversation_health.py at lines 61 and 64, the logging
statements use f-string formatting which eagerly evaluates the message. Change
these to use lazy % formatting by passing the message template and variables
separately to the logger.debug function to improve performance and avoid
unnecessary string interpolation when the log level is not enabled.

data = {"input": {"audio": audio_url, "language": project_language}}

try:
logger.debug(f"Sending POST to {base_url}/run with data: {data}")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Use lazy % formatting in logging functions.

-        logger.debug(f"Sending POST to {base_url}/run with data: {data}")
+        logger.debug("Sending POST to %s/run with data: %s", base_url, data)
-        logger.info(f"Started diarization job {job_id}")
+        logger.info("Started diarization job %s", job_id)
-        logger.error(f"Failed to queue diarization job: {e}")
+        logger.error("Failed to queue diarization job: %s", e)

Also applies to: 108-108, 111-111

🧰 Tools
🪛 Pylint (3.3.7)

[warning] 103-103: Use lazy % formatting in logging functions

(W1203)

🤖 Prompt for AI Agents
In echo/server/dembrane/conversation_health.py at lines 103, 108, and 111, the
logging calls use f-string formatting which eagerly evaluates the message even
if the log level is higher than debug. Change these to use lazy % formatting by
passing the format string and arguments separately to the logger.debug function
to improve performance and follow best practices.

The JSON response containing job status information, or None if the request fails.
"""
try:
logger.debug(f"Polling job status at {job_status_link}")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Use lazy % formatting in logging functions.

-        logger.debug(f"Polling job status at {job_status_link}")
+        logger.debug("Polling job status at %s", job_status_link)
-        logger.error(f"Error polling diarization job status: {e}")
+        logger.error("Error polling diarization job status: %s", e)

Also applies to: 132-132

🧰 Tools
🪛 Pylint (3.3.7)

[warning] 127-127: Use lazy % formatting in logging functions

(W1203)

🤖 Prompt for AI Agents
In echo/server/dembrane/conversation_health.py at lines 127 and 132, replace the
f-string formatting inside logger.debug calls with lazy % formatting by passing
the message string with % placeholders and the variables as arguments to the
logging function. This change improves logging performance by deferring string
interpolation until it is needed.

response_data = _poll_job_status(job_status_link, headers)
if response_data:
status = response_data.get("status")
logger.debug(f"Job {job_id} status: {status}")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Use lazy % formatting in logging functions.

-            logger.debug(f"Job {job_id} status: {status}")
+            logger.debug("Job %s status: %s", job_id, status)
-                    logger.info(f"Diarization job {job_id} completed. Updating chunk {chunk_id} with results.")
+                    logger.info("Diarization job %s completed. Updating chunk %s with results.", job_id, chunk_id)
-                    logger.warning(f"Diarization job {job_id} completed but no output data received.")
+                    logger.warning("Diarization job %s completed but no output data received.", job_id)

Also applies to: 233-233, 237-237

🧰 Tools
🪛 Pylint (3.3.7)

[warning] 228-228: Use lazy % formatting in logging functions

(W1203)

🤖 Prompt for AI Agents
In echo/server/dembrane/conversation_health.py at lines 228, 233, and 237,
replace the f-string formatting inside logger.debug calls with lazy % formatting
by passing the message template and variables separately to defer string
interpolation until needed. This improves logging performance by avoiding
unnecessary string construction when the log level is not enabled.

)[0]
audio_file_uri = directus_item["path"]
project_language = directus_item["conversation_id"]["project_id"]["language"]
logger.debug(f"Starting diarization for chunk_id: {chunk_id}, path: {audio_file_uri}, project_language: {project_language}")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Use lazy % formatting in logging functions.

LGTM on the logic, but let's optimize the logging calls.

-        logger.debug(f"Starting diarization for chunk_id: {chunk_id}, path: {audio_file_uri}, project_language: {project_language}")
+        logger.debug("Starting diarization for chunk_id: %s, path: %s, project_language: %s", chunk_id, audio_file_uri, project_language)
-        logger.error(f"Failed to fetch audio_file_uri for chunk_id {chunk_id}: {e}")
+        logger.error("Failed to fetch audio_file_uri for chunk_id %s: %s", chunk_id, e)

Also applies to: 45-45

🧰 Tools
🪛 Pylint (3.3.7)

[convention] 42-42: Line too long (132/100)

(C0301)


[warning] 42-42: Use lazy % formatting in logging functions

(W1203)

🤖 Prompt for AI Agents
In echo/server/dembrane/conversation_health.py at lines 42 and 45, the logging
calls use f-string formatting which eagerly evaluates the string even if the log
level is higher than debug. Change these to use lazy % formatting by passing the
format string and arguments separately to logger.debug to optimize performance
and avoid unnecessary string interpolation when debug logging is disabled.

if DISABLE_MULTILINGUAL_DIARIZATION and project_language != "en":
logger.debug(f"Skipping diarization for chunk {chunk_id} because project language is {project_language}")
return None
logger.debug(f"Skipping diarization because project language is {project_language}")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Use lazy % formatting in logging.

-        logger.debug(f"Skipping diarization because project language is {project_language}")
+        logger.debug("Skipping diarization because project language is %s", project_language)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
logger.debug(f"Skipping diarization because project language is {project_language}")
logger.debug("Skipping diarization because project language is %s", project_language)
🧰 Tools
🪛 Pylint (3.3.7)

[warning] 75-75: Use lazy % formatting in logging functions

(W1203)

🤖 Prompt for AI Agents
In echo/server/dembrane/conversation_health.py at line 75, the logging statement
uses f-string formatting which eagerly evaluates the message. Change it to use
lazy % formatting by passing the message template and variables separately to
logger.debug, e.g., use logger.debug("Skipping diarization because project
language is %s", project_language) to defer string interpolation until needed.

"diarization": joined_diarization,
},
)
logger.debug(f"Updated chunk {chunk_id} with diarization results.")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Use lazy % formatting in logging.

-    logger.debug(f"Updated chunk {chunk_id} with diarization results.")
+    logger.debug("Updated chunk %s with diarization results.", chunk_id)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
logger.debug(f"Updated chunk {chunk_id} with diarization results.")
logger.debug("Updated chunk %s with diarization results.", chunk_id)
🧰 Tools
🪛 Pylint (3.3.7)

[warning] 159-159: Use lazy % formatting in logging functions

(W1203)

🤖 Prompt for AI Agents
In echo/server/dembrane/conversation_health.py at line 159, the logging
statement uses f-string formatting which eagerly evaluates the string. Change
this to use lazy % formatting by passing the message and variables separately to
logger.debug, e.g., use logger.debug("Updated chunk %s with diarization
results.", chunk_id) to defer string interpolation until it is needed.

Comment on lines +232 to +238
if dirz_response_data:
logger.info(f"Diarization job {job_id} completed. Updating chunk {chunk_id} with results.")
_update_chunk_with_results(chunk_id, dirz_response_data)
return
else:
logger.warning(f"Diarization job {job_id} completed but no output data received.")
return
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick (assertive)

Remove unnecessary else block after return.

LGTM on the logic flow, but we can flatten this a bit.

             if status == "COMPLETED":
                 dirz_response_data = response_data.get("output")
                 if dirz_response_data:
                     logger.info(f"Diarization job {job_id} completed. Updating chunk {chunk_id} with results.")
                     _update_chunk_with_results(chunk_id, dirz_response_data)
                     return
-                else:
-                    logger.warning(f"Diarization job {job_id} completed but no output data received.")
-                    return
+                logger.warning(f"Diarization job {job_id} completed but no output data received.")
+                return
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if dirz_response_data:
logger.info(f"Diarization job {job_id} completed. Updating chunk {chunk_id} with results.")
_update_chunk_with_results(chunk_id, dirz_response_data)
return
else:
logger.warning(f"Diarization job {job_id} completed but no output data received.")
return
if status == "COMPLETED":
dirz_response_data = response_data.get("output")
if dirz_response_data:
logger.info(f"Diarization job {job_id} completed. Updating chunk {chunk_id} with results.")
_update_chunk_with_results(chunk_id, dirz_response_data)
return
logger.warning(f"Diarization job {job_id} completed but no output data received.")
return
🧰 Tools
🪛 Pylint (3.3.7)

[convention] 233-233: Line too long (111/100)

(C0301)


[convention] 237-237: Line too long (102/100)

(C0301)


[refactor] 232-238: Unnecessary "else" after "return", remove the "else" and de-indent the code inside it

(R1705)


[warning] 233-233: Use lazy % formatting in logging functions

(W1203)


[warning] 237-237: Use lazy % formatting in logging functions

(W1203)

🤖 Prompt for AI Agents
In echo/server/dembrane/conversation_health.py around lines 232 to 238, remove
the else block following the return statement to flatten the code structure.
Since the if block ends with a return, the else is redundant; simply unindent
the code inside the else block to follow the if block directly.

@ArindamRoy23 ArindamRoy23 added this pull request to the merge queue Jun 12, 2025
Merged via the queue into main with commit 9d42c16 Jun 12, 2025
7 checks passed
@coderabbitai coderabbitai bot mentioned this pull request Sep 16, 2025
spashii pushed a commit that referenced this pull request Nov 18, 2025
* Refactor RunPod diarization process in conversation_health.py

- Introduced helper functions to streamline fetching chunk data, generating audio URLs, and submitting diarization jobs.
- Paaing language to runpod to run finetuned models
- Enhanced error handling and logging for improved observability during the diarization workflow.
- Updated the main get_runpod_diarization function to utilize new helper methods, improving code readability and maintainability.
- Implemented job status polling and results updating directly within the main function, ensuring better management of diarization tasks.

* 📝 Add docstrings to `feature/echo-295-train-dutch-pyannote-model` (#183)

Docstrings generation was requested by @ArindamRoy23.

* #182 (comment)

The following files were modified:

* `echo/server/dembrane/conversation_health.py`

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants