ECHO-551 Replace deprecated anthropic token counter and fix server lint errors #352

Merged
spashii merged 6 commits into main from hotfix-2025-10-29
Oct 30, 2025

Conversation

@ussaama
Contributor

@ussaama ussaama commented Oct 29, 2025

Summary by CodeRabbit

  • Bug Fixes

    • Improved async input validation to reject coroutine inputs early.
    • Safer thread-pool shutdown to avoid teardown errors.
  • Refactor

    • Standardized token counting across conversation endpoints.
    • Clarified and tightened report endpoint return types (now return report data structures).
  • Chores

    • Updated local service and dev environment configuration and small-model summary generation settings.

@linear

linear bot commented Oct 29, 2025

@coderabbitai
Contributor

coderabbitai bot commented Oct 29, 2025

Walkthrough

Replaced legacy token counting with LiteLLM token_counter and wired model config, hardened run_in_thread_pool runtime checks and kwargs binding, changed several helper return types to dict, small typing/type-ignore and formatting tweaks, and updated devcontainer service config entries.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Token counting & Anthropic<br>echo/server/dembrane/anthropic.py, echo/server/dembrane/api/conversation.py | Replaced count_tokens with litellm.utils.token_counter, imported and used LIGHTRAG_LITELLM_INFERENCE_MODEL, and reformatted docstring/call sites. |
| Project API signatures & call-site cleanup<br>echo/server/dembrane/api/project.py | Changed create_report return type to dict and added explicit -> dict on internal helpers; added type annotation to analysis helper and consolidated several run_in_thread_pool call styles. |
| Async utilities hardening<br>echo/server/dembrane/async_helpers.py | Strengthened run_in_thread_pool by rejecting async funcs/coroutine objects, binding kwargs via functools.partial, safer thread-pool shutdown check, and minor formatting. |
| Task typing adjustment<br>echo/server/dembrane/tasks.py | Added # type: ignore to NoContentFoundException except clause in task_merge_conversation_chunks. |
| Devcontainer / services<br>echo/.devcontainer/docker-compose.yml | Escaped MinIO env vars in entrypoint ($$MINIO_ROOT_USER $$MINIO_ROOT_PASSWORD), switched Redis image and volume mount path, and minor formatting cleanup. |
| Stateless summary LLM config<br>echo/server/dembrane/api/stateless.py | Replaced hardcoded model with SMALL_LITELLM_MODEL and passed api_key, api_base, and api_version config values into completion call. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–30 minutes

  • Verify token counting equivalence and correct model selection where token_counter is used (conversation.py).
  • Confirm callers of create_report and related helpers handle the new dict return values (project.py).
  • Review call sites of run_in_thread_pool to ensure no async functions or coroutine objects are passed and that kwargs binding behavior is acceptable (async_helpers.py).
  • Check devcontainer Redis and MinIO changes for local dev compatibility.
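
For reviewers checking the run_in_thread_pool call sites, the following is a minimal sketch of the hardening described in the walkthrough (reject async functions and coroutine objects, bind kwargs via functools.partial). It is not the code from async_helpers.py: the name run_in_thread_pool_sketch, the pool size, and the error messages are assumptions made for illustration, and the shutdown check is not covered.

```python
import asyncio
import functools
import inspect
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, TypeVar

T = TypeVar("T")

# Hypothetical module-level pool; the real helper's pool configuration is not shown here.
_EXECUTOR = ThreadPoolExecutor(max_workers=4)


async def run_in_thread_pool_sketch(func: Callable[..., T], *args: Any, **kwargs: Any) -> T:
    """Run a synchronous callable in a worker thread and await its result."""
    # A coroutine object means the caller passed f() instead of f.
    if inspect.iscoroutine(func):
        func.close()  # avoid a "coroutine was never awaited" warning
        raise TypeError("got a coroutine object; await it directly instead")
    # An async function would only create a coroutine in the worker thread.
    if inspect.iscoroutinefunction(func):
        raise TypeError("got an async function; await it directly instead")

    # loop.run_in_executor forwards positional arguments only, so keyword
    # arguments are bound up front with functools.partial.
    bound = functools.partial(func, *args, **kwargs)
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_EXECUTOR, bound)


def add(a: int, b: int = 0) -> int:
    """Stand-in for a blocking synchronous call."""
    return a + b


async def async_add(a: int, b: int) -> int:
    """Stand-in for an already-async function that must not be wrapped."""
    return a + b
```

Under these assumptions, `await run_in_thread_pool_sketch(add, 1, b=2)` runs `add` in a worker thread, while passing `async_add` or `async_add(1, 2)` fails fast with a TypeError instead of silently returning an un-awaited coroutine.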

Possibly related PRs

LGTM.

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Out of Scope Changes Check | ⚠️ Warning | Most changes align with ECHO-551's objectives, including the token counter replacement and lint error fixes through type annotations and validation improvements. However, the Docker compose changes (Redis version upgrade from 6.2.14 to valkey/valkey:8.0, Minio environment variable escaping) appear to be infrastructure updates unrelated to replacing the deprecated token counter or fixing Python server lint errors. These infrastructure changes seem introduced outside the stated objectives. | Consider isolating the Docker compose infrastructure changes into a separate PR focused on dependency upgrades and maintenance. This PR should remain focused on the token counter replacement and lint error fixes, which are solidly implemented across the Python codebase. The separation would improve clarity and make it easier to track infrastructure updates independently from code quality improvements. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 36.84% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The PR title "ECHO-551 Replace deprecated anthropic token counter and fix server lint errors" directly aligns with the primary changes in the changeset. The core modification in conversation.py replaces count_tokens with token_counter from litellm.utils, and numerous type annotations and validation improvements are added across multiple files to address linting concerns. The title is specific, concise, and accurately summarizes the main objectives without noise or vagueness. |
| Linked Issues Check | ✅ Passed | The changes effectively address the ECHO-551 requirement to replace the deprecated Anthropic token counter, most notably in conversation.py where count_tokens is replaced with token_counter and wired to use LIGHTRAG_LITELLM_INFERENCE_MODEL [conversation.py changes]. The PR also resolves linting errors through type annotations in project.py and async_helpers.py, improved validation in run_in_thread_pool, and configuration-driven model selection in stateless.py, all contributing to the lint error remediation objective. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 63e53bc and 6f12ca5.

📒 Files selected for processing (1)
  • echo/server/dembrane/api/stateless.py (3 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
echo/server/dembrane/**/*.py

📄 CodeRabbit inference engine (echo/.cursor/rules/async-threadpool.mdc)

echo/server/dembrane/**/*.py: Always wrap blocking I/O calls using run_in_thread_pool from dembrane.async_helpers
Wrap calls to directus.*, conversation_service.*, project_service.*, S3 helpers, and CPU-heavy utilities (e.g., token counting, summary generation) with run_in_thread_pool if they are synchronous
Do not wrap already-async functions or LightRAG calls (e.g., rag.aquery, rag.ainsert) with run_in_thread_pool
Prefer converting endpoints to async def and awaiting results

Files:

  • echo/server/dembrane/api/stateless.py
🧠 Learnings (1)
📓 Common learnings
Learnt from: ussaama
PR: Dembrane/echo#205
File: echo/frontend/src/lib/query.ts:1444-1506
Timestamp: 2025-07-10T12:48:20.683Z
Learning: ussaama prefers string concatenation over template literals for simple cases where readability is clearer, even when linting tools suggest template literals. Human readability takes precedence over strict linting rules in straightforward concatenation scenarios.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: ci-check-server
  • GitHub Check: ci-check-server
🔇 Additional comments (3)
echo/server/dembrane/api/stateless.py (3)

10-15: LGTM! Config-driven model selection ships.

Pulling model config from centralized constants is straight fire - way cleaner than hardcoded values scattered everywhere. This is how we scale.


274-274: LGTM! Breathing room added.

Extra blank line before the try block - clean visual separation. No functional changes, just making the code easier on the eyes.


58-69: Config params are dialed in, async hygiene is solid.

The single call site in echo/server/dembrane/api/conversation.py:545 already wraps generate_summary with run_in_thread_pool and awaits it properly. The blocking operation is correctly offloaded to the thread pool. No action needed—guidelines are satisfied.


@coderabbitai coderabbitai bot added the bug Something isn't working label Oct 29, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)
echo/server/dembrane/api/project.py (6)

204-216: Tighten types; simplify None checks in analysis-run helper.

Directus get_items returns a list; Optional is unnecessary. Drop the None branch and use a truthy check. Cleaner and avoids dead code paths.

Apply:

-        def _get_analysis_run() -> Optional[list[dict]]:
+        def _get_analysis_run() -> list[dict]:
@@
-        analysis_run: Optional[list[dict]] = await run_in_thread_pool(_get_analysis_run)
+        analysis_run: list[dict] = await run_in_thread_pool(_get_analysis_run)
@@
-        if analysis_run is None:
-            return None
-
-        if len(analysis_run) == 0:
+        if not analysis_run:
             return None

Also applies to: 218-226
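
The reason a single truthy check can replace both branches is that None and an empty list are both falsy in Python. A minimal sketch, assuming the helper returns the first run when one exists (the function name and the return shape are hypothetical, not taken from project.py):

```python
from typing import Optional


def first_run_or_none(analysis_runs: Optional[list[dict]]) -> Optional[dict]:
    """Return the first analysis run, or None when there is nothing to return."""
    # `not analysis_runs` is True for both None and [], so one branch is enough.
    if not analysis_runs:
        return None
    return analysis_runs[0]
```

If the Directus call really always returns a list, the Optional in the parameter type can be dropped as the suggestion proposes; the check itself stays the same.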


328-341: Error path: good offload; consider persisting error_message.

You return status="error" and error_code. Persisting a brief error_message (non-PII) helps triage.

Suggested tweak:

                 return client.create_item(
                     "project_report",
                     item_data={
                         "content": "",
                         "project_id": project_id,
                         "language": language,
                         "status": "error",
                         "error_code": "CONTEXT_TOO_LONG",
+                        "error_message": "Report context exceeded limit",
                     },
                 )["data"]

118-128: Blocking file I/O inside async fn — offload to thread pool.

open(...).write(...) will block the event loop. Wrap with run_in_thread_pool, per dembrane guidelines.

Based on coding guidelines.

-    with open(file_path, "w") as file:
-        for chunk in chunks:
-            try:
-                if chunk.transcript is not None:
-                    file.write(str(chunk.transcript) + "\n")
-            except Exception as e:
-                logger.error(f"Failed to write transcript for chunk {chunk.id}: {e}")
+    def _write_transcript(path: str, chunk_list):
+        with open(path, "w") as file:
+            for chunk in chunk_list:
+                try:
+                    if chunk.transcript is not None:
+                        file.write(str(chunk.transcript) + "\n")
+                except Exception as e:
+                    logger.error(f"Failed to write transcript for chunk {chunk.id}: {e}")
+    await run_in_thread_pool(_write_transcript, file_path, chunks)
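
A self-contained sketch of the same offloading pattern, for trying it outside the codebase. It uses asyncio.to_thread as a stand-in for the project's run_in_thread_pool, and plain strings instead of chunk objects; the names write_transcripts, save_transcripts and demo_transcripts are assumptions made for illustration.

```python
import asyncio
import os
import tempfile
from typing import Iterable, Optional


def write_transcripts(path: str, transcripts: Iterable[Optional[str]]) -> int:
    """Blocking helper: write each transcript that is not None on its own line."""
    written = 0
    with open(path, "w", encoding="utf-8") as file:
        for transcript in transcripts:
            if transcript is not None:
                file.write(str(transcript) + "\n")
                written += 1
    return written


async def save_transcripts(path: str, transcripts: list[Optional[str]]) -> int:
    # asyncio.to_thread stands in for run_in_thread_pool in this sketch;
    # the event loop stays free while the worker thread does the file I/O.
    return await asyncio.to_thread(write_transcripts, path, transcripts)


def demo_transcripts() -> tuple[int, str]:
    """Write three entries (one None) to a temp file and read the result back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "transcript.txt")
        count = asyncio.run(save_transcripts(path, ["hello", None, "world"]))
        with open(path, encoding="utf-8") as file:
            return count, file.read()
```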

178-184: Zip creation is blocking — move to thread pool.

zipfile.ZipFile work is CPU/FS bound; avoid blocking the event loop.

Based on coding guidelines.

-    with zipfile.ZipFile(zip_file_name, "w", zipfile.ZIP_DEFLATED) as zipf:
-        for filename in filenames:
-            if not filename:
-                continue
-            arcname = os.path.basename(filename)
-            zipf.write(filename, arcname)
+    def _create_zip(zip_file: str, files: List[str]) -> None:
+        with zipfile.ZipFile(zip_file, "w", zipfile.ZIP_DEFLATED) as zipf:
+            for filename in files:
+                if not filename:
+                    continue
+                arcname = os.path.basename(filename)
+                zipf.write(filename, arcname)
+    await run_in_thread_pool(_create_zip, zip_file_name, filenames)
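
The zip suggestion can be exercised the same way. This is a sketch under stated assumptions: asyncio.to_thread again stands in for run_in_thread_pool, and create_zip_async and demo_zip are hypothetical names used only for the demonstration.

```python
import asyncio
import os
import tempfile
import zipfile


def create_zip(zip_path: str, filenames: list[str]) -> None:
    """Blocking helper: add each named file to the archive under its base name."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zipf:
        for filename in filenames:
            if not filename:  # skip empty entries, as the original loop does
                continue
            zipf.write(filename, os.path.basename(filename))


async def create_zip_async(zip_path: str, filenames: list[str]) -> None:
    # Compression and file reads happen in a worker thread, not on the event loop.
    await asyncio.to_thread(create_zip, zip_path, filenames)


def demo_zip() -> list[str]:
    """Zip one temp file plus an empty entry and list the archive members."""
    with tempfile.TemporaryDirectory() as tmp:
        source = os.path.join(tmp, "a.txt")
        with open(source, "w", encoding="utf-8") as file:
            file.write("data")
        zip_path = os.path.join(tmp, "out.zip")
        asyncio.run(create_zip_async(zip_path, [source, ""]))
        with zipfile.ZipFile(zip_path) as zipf:
            return zipf.namelist()
```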

185-201: Prefer FileResponse over manual StreamingResponse.

Starlette’s FileResponse handles efficient, non-blocking file sending. Simpler and avoids manual file reads in the event loop.

-from fastapi.responses import StreamingResponse
+from fastapi.responses import FileResponse
@@
-    def iterfile() -> Generator[bytes, None, None]:
-        with open(zip_file_name, "rb") as file:
-            yield from file
-
-    response = StreamingResponse(iterfile(), media_type="application/zip")
-    response.headers["Content-Disposition"] = f"attachment; filename={zip_file_name}"
+    response = FileResponse(
+        zip_file_name,
+        media_type="application/zip",
+        filename=zip_file_name,
+    )

BackgroundTasks cleanup remains correct.


325-341: Heads-up: get_report_content_for_project does sync Directus + CPU token counts.

In report_utils.py, directus.get_items (sync) and token_counter (CPU) run inside an async fn. Those should be offloaded with run_in_thread_pool in that module to keep this endpoint scalable.

Want a follow-up PR diff for report_utils?

📜 Review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 7dc25af and b82f28b.

📒 Files selected for processing (1)
  • echo/server/dembrane/api/project.py (7 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
echo/server/dembrane/**/*.py

📄 CodeRabbit inference engine (echo/.cursor/rules/async-threadpool.mdc)

echo/server/dembrane/**/*.py: Always wrap blocking I/O calls using run_in_thread_pool from dembrane.async_helpers
Wrap calls to directus.*, conversation_service.*, project_service.*, S3 helpers, and CPU-heavy utilities (e.g., token counting, summary generation) with run_in_thread_pool if they are synchronous
Do not wrap already-async functions or LightRAG calls (e.g., rag.aquery, rag.ainsert) with run_in_thread_pool
Prefer converting endpoints to async def and awaiting results

Files:

  • echo/server/dembrane/api/project.py
🧬 Code graph analysis (1)
echo/server/dembrane/api/project.py (3)
echo/server/dembrane/async_helpers.py (1)
  • run_in_thread_pool (74-142)
echo/server/dembrane/service/project.py (1)
  • get_by_id_or_raise (22-54)
echo/server/dembrane/report_utils.py (2)
  • get_report_content_for_project (34-184)
  • ContextTooLongException (28-31)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: ci-check-server
  • GitHub Check: ci-check-server
🔇 Additional comments (3)
echo/server/dembrane/api/project.py (3)

250-251: Good: sync service call correctly offloaded.

Using run_in_thread_pool for project_service.get_by_id_or_raise matches our I/O policy. LGTM.


298-299: Good: same offload pattern for view creation gate.

Consistent with guidelines. LGTM.


346-359: Verify report status; confirm “archived” is intended.

Creating the report with status="archived" feels odd for fresh content. If the desired lifecycle starts as "draft" or "pending", adjust here.

Do you want initial status to be "draft" or "published" instead of "archived"?

Comment on lines +322 to 325
async def create_report(project_id: str, body: CreateReportRequestBodySchema) -> dict:
    language = body.language or "en"
    try:
        report_content_response = await get_report_content_for_project(project_id, language)
Contributor

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Blocker: create_report lacks auth and ownership checks.

This endpoint currently allows anyone to create a report for any project. Mirror the access guard used elsewhere (403 for non-owner, 404 if missing).

Apply:

-@ProjectRouter.post("/{project_id}/create-report")
-async def create_report(project_id: str, body: CreateReportRequestBodySchema) -> dict:
+@ProjectRouter.post("/{project_id}/create-report")
+async def create_report(
+    project_id: str,
+    body: CreateReportRequestBodySchema,
+    auth: DependencyDirectusSession,
+) -> dict:
     language = body.language or "en"
+    # Ownership / access guard
+    from dembrane.service import project_service
+    from dembrane.service.project import ProjectNotFoundException
+    try:
+        project = await run_in_thread_pool(project_service.get_by_id_or_raise, project_id)
+    except ProjectNotFoundException as e:
+        raise HTTPException(status_code=404, detail="Project not found") from e
+    if not auth.is_admin and project.get("directus_user_id", "") != auth.user_id:
+        raise HTTPException(status_code=403, detail="User does not have access to this project")

Also note: return type changed from None to dict — verify API consumers.


🏁 Script executed:

#!/bin/bash
# Find server/client code that calls or expects the create-report response shape
rg -n -C2 'create-report' --type=py --type=ts --type=tsx
rg -n -C2 '\bcreate_report\(' --type=py

Length of output: 412



🏁 Script executed:

#!/bin/bash
# Get full implementation of create_report
sed -n '320,360p' echo/server/dembrane/api/project.py

# Find other protected endpoints to see auth pattern
rg -n 'auth.*DependencyDirectusSession' --type=py -A5 echo/server/dembrane/api/

# Check project_service module structure
find . -name 'project_service.py' -o -name 'project.py' | grep service

# Check if create_report is called anywhere
rg -n 'create_report' --type=py -g '!node_modules'

Length of output: 12093


🏁 Script executed:

#!/bin/bash
# Also check what ProjectNotFoundException actually is
rg -n 'class ProjectNotFoundException' --type=py
rg -n 'ProjectNotFoundException' --type=py -B2 -A2

Length of output: 7113



🏁 Script executed:

#!/bin/bash
# Check if create_report has tests
rg -n 'def test.*create_report' --type=py
rg -n 'create_report' echo/server/tests/ --type=py

# Check git to see if function is newly added (look at git diff context)
git log --oneline -n 20 -- echo/server/dembrane/api/project.py 2>/dev/null || echo "Git history unavailable"

# Verify if there are ownership checks in tests for similar endpoints
rg -n 'is_admin.*auth' --type=py -B2 -A2 echo/server/tests/

Length of output: 722


Blocker: create_report lacks auth and ownership checks.

This endpoint allows unauthenticated requests to create reports for any project. Apply the same access guard pattern used in create_library and other protected endpoints:

 @ProjectRouter.post("/{project_id}/create-report")
-async def create_report(project_id: str, body: CreateReportRequestBodySchema) -> dict:
+async def create_report(
+    project_id: str,
+    body: CreateReportRequestBodySchema,
+    auth: DependencyDirectusSession,
+) -> dict:
     language = body.language or "en"
+    # Ownership / access guard
+    from dembrane.service import project_service
+    from dembrane.service.project import ProjectNotFoundException
+    try:
+        project = await run_in_thread_pool(project_service.get_by_id_or_raise, project_id)
+    except ProjectNotFoundException as e:
+        raise HTTPException(status_code=404, detail="Project not found") from e
+    if not auth.is_admin and project.get("directus_user_id", "") != auth.user_id:
+        raise HTTPException(status_code=403, detail="User does not have access to this project")
🤖 Prompt for AI Agents
In echo/server/dembrane/api/project.py around lines 322 to 325, the
create_report function currently performs no authentication or ownership checks;
update it to follow the same access-guard pattern used by create_library and
other protected endpoints by: 1) retrieving the current user (or auth context)
from the request/session/context at the start of the function, 2) returning 401
if unauthenticated, 3) loading the target project and verifying the user has
ownership or required permissions (or calling the existing guard helper used
elsewhere), returning 403 if the user is not allowed, and 4) only then calling
get_report_content_for_project and continuing; also import and reuse the same
guard/auth helper functions and error response types used by create_library to
keep behavior consistent.
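
The decision logic of the proposed guard can be isolated from FastAPI and tested on its own. The sketch below is an illustration only: AccessError and Session are hypothetical stand-ins for HTTPException and the DependencyDirectusSession object, and a None project stands in for the ProjectNotFoundException path. Only the directus_user_id, user_id and is_admin names come from the suggestion above.

```python
from dataclasses import dataclass
from typing import Optional


class AccessError(Exception):
    """Hypothetical error carrying the HTTP status the endpoint would return."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


@dataclass
class Session:
    """Hypothetical stand-in for the auth dependency."""

    user_id: str
    is_admin: bool = False


def ensure_project_access(project: Optional[dict], auth: Session) -> dict:
    """Return the project if the caller may use it, otherwise raise AccessError."""
    if project is None:
        raise AccessError(404, "Project not found")
    # Admins bypass the ownership check; everyone else must own the project.
    if not auth.is_admin and project.get("directus_user_id", "") != auth.user_id:
        raise AccessError(403, "User does not have access to this project")
    return project
```

Keeping the check in one function like this would let create_report and the other protected endpoints share it, which is the consistency the comment asks for.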

Comment on lines 343 to 345
    except Exception as e:
        raise e

Contributor

🧹 Nitpick | 🔵 Trivial

Remove redundant catch/rethrow; log and re-raise cleanly.

Catching Exception just to re-raise is noise. Either let it bubble or log then raise.

Apply:

-    except Exception as e:
-        raise e
+    except Exception:
+        logger.exception(f"create_report failed for project {project_id}")
+        raise
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-    except Exception as e:
-        raise e
+    except Exception:
+        logger.exception(f"create_report failed for project {project_id}")
+        raise
🤖 Prompt for AI Agents
In echo/server/dembrane/api/project.py around lines 343-345, the except block
catches Exception only to re-raise it which is redundant; either remove the
try/except so the exception naturally bubbles up, or if you need to record it,
replace the current except block with a logging statement that logs the
exception and then re-raise using plain "raise" (not "raise e") to preserve the
original traceback.
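
A small runnable sketch of the log-then-re-raise pattern, to show that the caller still sees the original exception with the failing frame in its traceback. build_report, create_report_sketch and frame_names are hypothetical names; the failing step is a placeholder, not the real report generation.

```python
import logging
import traceback

logger = logging.getLogger("create_report_sketch")


def build_report(project_id: str) -> dict:
    # Hypothetical failing step standing in for the real report generation.
    raise ValueError(f"no content for project {project_id}")


def create_report_sketch(project_id: str) -> dict:
    try:
        return build_report(project_id)
    except Exception:
        # logger.exception logs at ERROR level and appends the active traceback.
        logger.exception("create_report failed for project %s", project_id)
        raise  # bare raise re-raises the active exception


def frame_names(exc: BaseException) -> list[str]:
    """Function names in the exception's traceback, outermost first."""
    return [frame.name for frame in traceback.extract_tb(exc.__traceback__)]
```

The sketch passes the project id as a logging argument rather than through an f-string, so the message is only formatted if the record is actually emitted; either form works with logger.exception.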

@ussaama ussaama requested a review from spashii October 29, 2025 13:02
@spashii spashii merged commit 0f99d28 into main Oct 30, 2025
15 checks passed
@spashii spashii deleted the hotfix-2025-10-29 branch October 30, 2025 12:01
spashii added a commit that referenced this pull request Nov 18, 2025
Co-authored-by: Sameer Pashikanti <sameer@dembrane.com>

Labels

bug Something isn't working improvement


2 participants