ECHO-551 Replace deprecated anthropic token counter and fix server lint errors by ussaama · Pull Request #352 · Dembrane/echo

ussaama · 2025-10-29T12:28:33Z

Summary by CodeRabbit

Bug Fixes
- Improved async input validation to reject coroutine inputs early.
- Safer thread-pool shutdown to avoid teardown errors.
Refactor
- Standardized token counting across conversation endpoints.
- Clarified and tightened report endpoint return types (now return report data structures).
Chores
- Updated local service and dev environment configuration and small-model summary generation settings.


linear · 2025-10-29T12:28:36Z

ECHO-551 Replace deprecated anthropic token counter and fix server lint errors

coderabbitai · 2025-10-29T12:28:43Z

Walkthrough

Replaced legacy token counting with LiteLLM `token_counter` and wired in the model config; hardened `run_in_thread_pool` runtime checks and kwargs binding; changed several helper return types to `dict`; made small typing/type-ignore and formatting tweaks; and updated devcontainer service config entries.

Changes

| Cohort | File(s) | Summary |
| --- | --- | --- |
| Token counting & Anthropic | `echo/server/dembrane/anthropic.py`, `echo/server/dembrane/api/conversation.py` | Replaced `count_tokens` with `litellm.utils.token_counter`, imported and used `LIGHTRAG_LITELLM_INFERENCE_MODEL`, and reformatted docstring/call sites. |
| Project API signatures & call-site cleanup | `echo/server/dembrane/api/project.py` | Changed `create_report` return type to `dict` and added explicit `-> dict` on internal helpers; added type annotation to analysis helper and consolidated several `run_in_thread_pool` call styles. |
| Async utilities hardening | `echo/server/dembrane/async_helpers.py` | Strengthened `run_in_thread_pool` by rejecting async funcs/coroutine objects, binding `kwargs` via `functools.partial`, safer thread-pool shutdown check, and minor formatting. |
| Task typing adjustment | `echo/server/dembrane/tasks.py` | Added `# type: ignore` to `NoContentFoundException` except clause in `task_merge_conversation_chunks`. |
| Devcontainer / services | `echo/.devcontainer/docker-compose.yml` | Escaped MinIO env vars in entrypoint (`$$MINIO_ROOT_USER $$MINIO_ROOT_PASSWORD`), switched Redis image and volume mount path, and minor formatting cleanup. |
| Stateless summary LLM config | `echo/server/dembrane/api/stateless.py` | Replaced hardcoded model with `SMALL_LITELLM_MODEL` and passed `api_key`, `api_base`, and `api_version` config values into completion call. |
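
For reviewers unfamiliar with the replacement, here is a minimal sketch of counting tokens for a chat message list through `litellm.utils.token_counter`. The function name `count_conversation_tokens` and the injectable `counter` parameter are illustrative assumptions, not the code in `conversation.py`; in the PR the model name comes from `LIGHTRAG_LITELLM_INFERENCE_MODEL`.

```python
from typing import Callable, Optional


def count_conversation_tokens(
    messages: list[dict],
    model: str,
    counter: Optional[Callable[..., int]] = None,
) -> int:
    """Count prompt tokens for a list of chat messages.

    `counter` is a hypothetical injection point for tests; when it is
    omitted, LiteLLM's token_counter is imported lazily and used.
    """
    if counter is None:
        # Lazy import so this module still loads where litellm is absent.
        from litellm.utils import token_counter

        counter = token_counter
    return counter(model=model, messages=messages)
```

Counts depend on the tokenizer LiteLLM selects for the given model, so they may differ from the old Anthropic counter; that is the equivalence the review checklist asks to verify.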

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20–30 minutes

Verify token counting equivalence and correct model selection where token_counter is used (conversation.py).
Confirm callers of create_report and related helpers handle the new dict return values (project.py).
Review call sites of run_in_thread_pool to ensure no async functions or coroutine objects are passed and that kwargs binding behavior is acceptable (async_helpers.py).
Check devcontainer Redis and MinIO changes for local dev compatibility.
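
To make the `run_in_thread_pool` review item concrete, the hardening described in the walkthrough could look roughly like the sketch below. It is not the code in `async_helpers.py`: the name `run_in_thread_pool_sketch`, the module-level executor, and the error messages are assumptions, and the shutdown check is left out.

```python
import asyncio
import functools
import inspect
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, TypeVar

T = TypeVar("T")

# Hypothetical module-level pool; the real helper manages its own executor.
_executor = ThreadPoolExecutor(max_workers=4)


async def run_in_thread_pool_sketch(func: Callable[..., T], *args: Any, **kwargs: Any) -> T:
    """Run a synchronous callable in a worker thread and await its result."""
    if inspect.iscoroutine(func):
        func.close()  # avoid a "coroutine was never awaited" warning
        raise TypeError("expected a sync callable, got a coroutine object")
    if inspect.iscoroutinefunction(func):
        raise TypeError("expected a sync callable, got an async function; await it directly")

    loop = asyncio.get_running_loop()
    # run_in_executor forwards positional args only, so bind kwargs up front.
    bound = functools.partial(func, *args, **kwargs)
    return await loop.run_in_executor(_executor, bound)
```

Rejecting coroutine inputs early turns a silent bug (a coroutine object returned from the worker thread and never awaited) into an immediate `TypeError` at the call site.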

Possibly related PRs

ECHO-180 ECHO-224 Map all LLM calls in current codebase and convert them to use the new infra #148 — Continues migration to LiteLLM utilities and config constants used here.
get reply changes #178 — Also replaces legacy token counting with litellm.utils.token_counter; overlaps conversation token logic.
fix report and library #225 — Refactors create_report and project endpoints; directly relates to the create_report return-type changes.

LGTM.

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Out of Scope Changes Check | ⚠️ Warning | Most changes align with ECHO-551's objectives, including the token counter replacement and lint error fixes through type annotations and validation improvements. However, the Docker compose changes (Redis version upgrade from 6.2.14 to valkey/valkey:8.0, Minio environment variable escaping) appear to be infrastructure updates unrelated to replacing the deprecated token counter or fixing Python server lint errors. These infrastructure changes seem introduced outside the stated objectives. | Consider isolating the Docker compose infrastructure changes into a separate PR focused on dependency upgrades and maintenance. This PR should remain focused on the token counter replacement and lint error fixes, which are solidly implemented across the Python codebase. The separation would improve clarity and make it easier to track infrastructure updates independently from code quality improvements. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 36.84% which is insufficient. The required threshold is 80.00%. | You can run `@coderabbitai generate docstrings` to improve docstring coverage. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The PR title "ECHO-551 Replace deprecated anthropic token counter and fix server lint errors" directly aligns with the primary changes in the changeset. The core modification in conversation.py replaces count_tokens with token_counter from litellm.utils, and numerous type annotations and validation improvements are added across multiple files to address linting concerns. The title is specific, concise, and accurately summarizes the main objectives without noise or vagueness. |
| Linked Issues Check | ✅ Passed | The changes effectively address the ECHO-551 requirement to replace the deprecated Anthropic token counter, most notably in conversation.py where count_tokens is replaced with token_counter and wired to use LIGHTRAG_LITELLM_INFERENCE_MODEL [conversation.py changes]. The PR also resolves linting errors through type annotations in project.py and async_helpers.py, improved validation in run_in_thread_pool, and configuration-driven model selection in stateless.py, all contributing to the lint error remediation objective. |


📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: ASSERTIVE

Plan: Pro

Disabled knowledge base sources:

Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between 63e53bc and 6f12ca5.

📒 Files selected for processing (1)

echo/server/dembrane/api/stateless.py (3 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

echo/server/dembrane/**/*.py

📄 CodeRabbit inference engine (echo/.cursor/rules/async-threadpool.mdc)

echo/server/dembrane/**/*.py: Always wrap blocking I/O calls using run_in_thread_pool from dembrane.async_helpers
Wrap calls to directus.*, conversation_service.*, project_service.*, S3 helpers, and CPU-heavy utilities (e.g., token counting, summary generation) with run_in_thread_pool if they are synchronous
Do not wrap already-async functions or LightRAG calls (e.g., rag.aquery, rag.ainsert) with run_in_thread_pool
Prefer converting endpoints to async def and awaiting results

Files:

echo/server/dembrane/api/stateless.py

🧠 Learnings (1)

📓 Common learnings

Learnt from: ussaama
PR: Dembrane/echo#205
File: echo/frontend/src/lib/query.ts:1444-1506
Timestamp: 2025-07-10T12:48:20.683Z
Learning: ussaama prefers string concatenation over template literals for simple cases where readability is clearer, even when linting tools suggest template literals. Human readability takes precedence over strict linting rules in straightforward concatenation scenarios.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: ci-check-server
GitHub Check: ci-check-server

🔇 Additional comments (3)

echo/server/dembrane/api/stateless.py (3)

10-15: LGTM! Config-driven model selection ships.

Pulling model config from centralized constants is straight fire - way cleaner than hardcoded values scattered everywhere. This is how we scale.

274-274: LGTM! Breathing room added.

Extra blank line before the try block - clean visual separation. No functional changes, just making the code easier on the eyes.

58-69: Config params are dialed in, async hygiene is solid.

The single call site in echo/server/dembrane/api/conversation.py:545 already wraps generate_summary with run_in_thread_pool and awaits it properly. The blocking operation is correctly offloaded to the thread pool. No action needed—guidelines are satisfied.
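
As an illustration of the config-driven call praised above, one way to thread `api_key`, `api_base`, and `api_version` into a completion call is to build the keyword arguments in one place. This is a sketch under assumptions: `build_completion_kwargs` is a hypothetical helper, and the actual wiring in `stateless.py` may differ.

```python
from typing import Any, Optional


def build_completion_kwargs(
    model: str,
    messages: list[dict],
    api_key: Optional[str] = None,
    api_base: Optional[str] = None,
    api_version: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble keyword arguments for a LiteLLM-style completion call."""
    kwargs: dict[str, Any] = {"model": model, "messages": messages}
    optional = {"api_key": api_key, "api_base": api_base, "api_version": api_version}
    # Skip unset values so the provider's own defaults still apply.
    kwargs.update({k: v for k, v in optional.items() if v is not None})
    return kwargs
```

The result could then be passed as `litellm.completion(**kwargs)`, with the blocking call itself still offloaded through `run_in_thread_pool` as the comment above notes.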



coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (6)

echo/server/dembrane/api/project.py (6)

204-216: Tighten types; simplify None checks in analysis-run helper.

Directus get_items returns a list; Optional is unnecessary. Drop the None branch and use a truthy check. Cleaner and avoids dead code paths.

Apply:

-        def _get_analysis_run() -> Optional[list[dict]]:
+        def _get_analysis_run() -> list[dict]:
@@
-        analysis_run: Optional[list[dict]] = await run_in_thread_pool(_get_analysis_run)
+        analysis_run: list[dict] = await run_in_thread_pool(_get_analysis_run)
@@
-        if analysis_run is None:
-            return None
-
-        if len(analysis_run) == 0:
+        if not analysis_run:
             return None

Also applies to: 218-226

328-341: Error path: good offload; consider persisting error_message.

You return status="error" and error_code. Persisting a brief error_message (non-PII) helps triage.

Suggested tweak:

                 return client.create_item(
                     "project_report",
                     item_data={
                         "content": "",
                         "project_id": project_id,
                         "language": language,
                         "status": "error",
                         "error_code": "CONTEXT_TOO_LONG",
+                        "error_message": "Report context exceeded limit",
                     },
                 )["data"]

118-128: Blocking file I/O inside async fn — offload to thread pool.

open(...).write(...) will block the event loop. Wrap with run_in_thread_pool, per dembrane guidelines.

Based on coding guidelines.

-    with open(file_path, "w") as file:
-        for chunk in chunks:
-            try:
-                if chunk.transcript is not None:
-                    file.write(str(chunk.transcript) + "\n")
-            except Exception as e:
-                logger.error(f"Failed to write transcript for chunk {chunk.id}: {e}")
+    def _write_transcript(path: str, chunk_list):
+        with open(path, "w") as file:
+            for chunk in chunk_list:
+                try:
+                    if chunk.transcript is not None:
+                        file.write(str(chunk.transcript) + "\n")
+                except Exception as e:
+                    logger.error(f"Failed to write transcript for chunk {chunk.id}: {e}")
+    await run_in_thread_pool(_write_transcript, file_path, chunks)
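
The same offloading idea can be tried in isolation with the standalone sketch below. Here `asyncio.to_thread` stands in for `run_in_thread_pool`, and the function names and plain-string transcripts are assumptions made for the example, not the project's types.

```python
import asyncio
from pathlib import Path
from typing import Iterable, Optional


def write_transcripts(path: Path, transcripts: Iterable[Optional[str]]) -> int:
    """Blocking helper: write each non-None transcript on its own line."""
    written = 0
    with open(path, "w", encoding="utf-8") as fh:
        for text in transcripts:
            if text is None:
                continue
            fh.write(text + "\n")
            written += 1
    return written


async def save_transcripts(path: Path, transcripts: list[Optional[str]]) -> int:
    # asyncio.to_thread stands in for run_in_thread_pool (an assumption).
    return await asyncio.to_thread(write_transcripts, path, transcripts)
```

Keeping the blocking work in a plain function makes it easy to unit test without an event loop, while the async wrapper stays a one-liner.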

178-184: Zip creation is blocking — move to thread pool.

zipfile.ZipFile work is CPU/FS bound; avoid blocking the event loop.

Based on coding guidelines.

-    with zipfile.ZipFile(zip_file_name, "w", zipfile.ZIP_DEFLATED) as zipf:
-        for filename in filenames:
-            if not filename:
-                continue
-            arcname = os.path.basename(filename)
-            zipf.write(filename, arcname)
+    def _create_zip(zip_file: str, files: List[str]) -> None:
+        with zipfile.ZipFile(zip_file, "w", zipfile.ZIP_DEFLATED) as zipf:
+            for filename in files:
+                if not filename:
+                    continue
+                arcname = os.path.basename(filename)
+                zipf.write(filename, arcname)
+    await run_in_thread_pool(_create_zip, zip_file_name, filenames)
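
A runnable variant of the suggestion, again with `asyncio.to_thread` standing in for `run_in_thread_pool` and with hypothetical function names, might look like this:

```python
import asyncio
import os
import zipfile


def create_zip(zip_path: str, filenames: list[str]) -> list[str]:
    """Blocking helper: archive files under their base names, skipping blanks."""
    added: list[str] = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zipf:
        for filename in filenames:
            if not filename:
                continue
            arcname = os.path.basename(filename)
            zipf.write(filename, arcname)
            added.append(arcname)
    return added


async def create_zip_async(zip_path: str, filenames: list[str]) -> list[str]:
    # Compression and disk writes run in a worker thread, not the event loop.
    return await asyncio.to_thread(create_zip, zip_path, filenames)
```

Returning the archived names is an addition for the sake of testing; the suggested diff returns `None`.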

185-201: Prefer FileResponse over manual StreamingResponse.

Starlette’s FileResponse handles efficient, non-blocking file sending. Simpler and avoids manual file reads in the event loop.

-from fastapi.responses import StreamingResponse
+from fastapi.responses import FileResponse
@@
-    def iterfile() -> Generator[bytes, None, None]:
-        with open(zip_file_name, "rb") as file:
-            yield from file
-
-    response = StreamingResponse(iterfile(), media_type="application/zip")
-    response.headers["Content-Disposition"] = f"attachment; filename={zip_file_name}"
+    response = FileResponse(
+        zip_file_name,
+        media_type="application/zip",
+        filename=zip_file_name,
+    )

BackgroundTasks cleanup remains correct.

325-341: Heads-up: get_report_content_for_project does sync Directus + CPU token counts.

In report_utils.py, directus.get_items (sync) and token_counter (CPU) run inside an async fn. Those should be offloaded with run_in_thread_pool in that module to keep this endpoint scalable.

Want a follow-up PR diff for report_utils?

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 7dc25af and b82f28b.

📒 Files selected for processing (1)

echo/server/dembrane/api/project.py (7 hunks)

🧰 Additional context used

📓 Path-based instructions (1)

echo/server/dembrane/**/*.py

📄 CodeRabbit inference engine (echo/.cursor/rules/async-threadpool.mdc)

echo/server/dembrane/**/*.py: Always wrap blocking I/O calls using run_in_thread_pool from dembrane.async_helpers
Wrap calls to directus.*, conversation_service.*, project_service.*, S3 helpers, and CPU-heavy utilities (e.g., token counting, summary generation) with run_in_thread_pool if they are synchronous
Do not wrap already-async functions or LightRAG calls (e.g., rag.aquery, rag.ainsert) with run_in_thread_pool
Prefer converting endpoints to async def and awaiting results

Files:

echo/server/dembrane/api/project.py

🧬 Code graph analysis (1)

echo/server/dembrane/api/project.py (3)

echo/server/dembrane/async_helpers.py (1)

run_in_thread_pool (74-142)

echo/server/dembrane/service/project.py (1)

get_by_id_or_raise (22-54)

echo/server/dembrane/report_utils.py (2)

get_report_content_for_project (34-184)

ContextTooLongException (28-31)


🔇 Additional comments (3)

echo/server/dembrane/api/project.py (3)

250-251: Good: sync service call correctly offloaded.

Using run_in_thread_pool for project_service.get_by_id_or_raise matches our I/O policy. LGTM.

298-299: Good: same offload pattern for view creation gate.

Consistent with guidelines. LGTM.

346-359: Verify report status; confirm “archived” is intended.

Creating the report with status="archived" feels odd for fresh content. If the desired lifecycle starts as "draft" or "pending", adjust here.

Do you want initial status to be "draft" or "published" instead of "archived"?

coderabbitai · 2025-10-29T12:51:32Z

echo/server/dembrane/api/project.py

+async def create_report(project_id: str, body: CreateReportRequestBodySchema) -> dict:
    language = body.language or "en"
    try:
        report_content_response = await get_report_content_for_project(project_id, language)


⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Blocker: create_report lacks auth and ownership checks.

This endpoint currently allows anyone to create a report for any project. Mirror the access guard used elsewhere (403 for non-owner, 404 if missing).

Apply:

-@ProjectRouter.post("/{project_id}/create-report")
-async def create_report(project_id: str, body: CreateReportRequestBodySchema) -> dict:
+@ProjectRouter.post("/{project_id}/create-report")
+async def create_report(
+    project_id: str,
+    body: CreateReportRequestBodySchema,
+    auth: DependencyDirectusSession,
+) -> dict:
     language = body.language or "en"
+    # Ownership / access guard
+    from dembrane.service import project_service
+    from dembrane.service.project import ProjectNotFoundException
+    try:
+        project = await run_in_thread_pool(project_service.get_by_id_or_raise, project_id)
+    except ProjectNotFoundException as e:
+        raise HTTPException(status_code=404, detail="Project not found") from e
+    if not auth.is_admin and project.get("directus_user_id", "") != auth.user_id:
+        raise HTTPException(status_code=403, detail="User does not have access to this project")

Also note: return type changed from None to dict — verify API consumers.

🏁 Script executed:

#!/bin/bash
# Find server/client code that calls or expects the create-report response shape
rg -n -C2 'create-report' --type=py --type=ts --type=tsx
rg -n -C2 '\bcreate_report\(' --type=py

Length of output: 412

🏁 Script executed:

#!/bin/bash
# Get full implementation of create_report
sed -n '320,360p' echo/server/dembrane/api/project.py
# Find other protected endpoints to see auth pattern
rg -n 'auth.*DependencyDirectusSession' --type=py -A5 echo/server/dembrane/api/
# Check project_service module structure
find . -name 'project_service.py' -o -name 'project.py' | grep service
# Check if create_report is called anywhere
rg -n 'create_report' --type=py -g '!node_modules'

Length of output: 12093

🏁 Script executed:

#!/bin/bash
# Also check what ProjectNotFoundException actually is
rg -n 'class ProjectNotFoundException' --type=py
rg -n 'ProjectNotFoundException' --type=py -B2 -A2

Length of output: 7113

🏁 Script executed:

#!/bin/bash
# Check if create_report has tests
rg -n 'def test.*create_report' --type=py
rg -n 'create_report' echo/server/tests/ --type=py
# Check git to see if function is newly added (look at git diff context)
git log --oneline -n 20 -- echo/server/dembrane/api/project.py 2>/dev/null || echo "Git history unavailable"
# Verify if there are ownership checks in tests for similar endpoints
rg -n 'is_admin.*auth' --type=py -B2 -A2 echo/server/tests/

Length of output: 722

Blocker: create_report lacks auth and ownership checks.

This endpoint allows unauthenticated requests to create reports for any project. Apply the same access guard pattern used in create_library and other protected endpoints:

 @ProjectRouter.post("/{project_id}/create-report")
-async def create_report(project_id: str, body: CreateReportRequestBodySchema) -> dict:
+async def create_report(
+    project_id: str,
+    body: CreateReportRequestBodySchema,
+    auth: DependencyDirectusSession,
+) -> dict:
     language = body.language or "en"
+    # Ownership / access guard
+    from dembrane.service import project_service
+    from dembrane.service.project import ProjectNotFoundException
+    try:
+        project = await run_in_thread_pool(project_service.get_by_id_or_raise, project_id)
+    except ProjectNotFoundException as e:
+        raise HTTPException(status_code=404, detail="Project not found") from e
+    if not auth.is_admin and project.get("directus_user_id", "") != auth.user_id:
+        raise HTTPException(status_code=403, detail="User does not have access to this project")

🤖 Prompt for AI Agents

In echo/server/dembrane/api/project.py around lines 322 to 325, the create_report function currently performs no authentication or ownership checks; update it to follow the same access-guard pattern used by create_library and other protected endpoints by: 1) retrieving the current user (or auth context) from the request/session/context at the start of the function, 2) returning 401 if unauthenticated, 3) loading the target project and verifying the user has ownership or required permissions (or calling the existing guard helper used elsewhere), returning 403 if the user is not allowed, and 4) only then calling get_report_content_for_project and continuing; also import and reuse the same guard/auth helper functions and error response types used by create_library to keep behavior consistent.
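
The ownership rule in the proposed guard can be isolated from FastAPI and tested on its own. The sketch below is framework-free and hypothetical throughout: `Session` is a stand-in for the auth dependency, and `ProjectAccessDenied` is an invented exception that an endpoint would translate into a 403.

```python
from dataclasses import dataclass
from typing import Mapping


class ProjectAccessDenied(Exception):
    """Hypothetical error; an endpoint would map this to HTTP 403."""


@dataclass(frozen=True)
class Session:
    """Stand-in for the auth dependency; fields mirror those used in the diff."""

    user_id: str
    is_admin: bool = False


def assert_project_access(project: Mapping[str, object], auth: Session) -> None:
    """Raise unless the caller is an admin or owns the project."""
    if auth.is_admin:
        return
    if project.get("directus_user_id", "") != auth.user_id:
        raise ProjectAccessDenied("User does not have access to this project")
```

Pulling the check into one helper would let `create_report`, `create_library`, and similar endpoints share a single guard instead of repeating the comparison inline.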

coderabbitai · 2025-10-29T12:51:33Z

echo/server/dembrane/api/project.py

    except Exception as e:
        raise e



🧹 Nitpick | 🔵 Trivial

Remove redundant catch/rethrow; log and re-raise cleanly.

Catching Exception just to re-raise is noise. Either let it bubble or log then raise.

Apply:

-    except Exception as e:
-        raise e
+    except Exception:
+        logger.exception(f"create_report failed for project {project_id}")
+        raise

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

-    except Exception as e:
-        raise e
+    except Exception:
+        logger.exception(f"create_report failed for project {project_id}")
+        raise

🤖 Prompt for AI Agents

In echo/server/dembrane/api/project.py around lines 343-345, the except block catches Exception only to re-raise it which is redundant; either remove the try/except so the exception naturally bubbles up, or if you need to record it, replace the current except block with a logging statement that logs the exception and then re-raise using plain "raise" (not "raise e") to preserve the original traceback.
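
A self-contained version of the log-then-re-raise pattern is sketched below. The function names and the failing `risky` helper are invented for the example; only the `logger.exception` plus bare `raise` shape comes from the suggestion.

```python
import logging

logger = logging.getLogger("report_sketch")


def risky(project_id: str) -> dict:
    """Hypothetical stand-in for the report-building call; always fails."""
    raise ValueError(f"no content for {project_id}")


def create_report_sketch(project_id: str) -> dict:
    try:
        return risky(project_id)
    except Exception:
        # logger.exception logs at ERROR level and appends the active traceback.
        logger.exception("create_report failed for project %s", project_id)
        raise  # bare raise re-raises the exception currently being handled
```

If nothing needs to be logged at this layer, dropping the `try`/`except` entirely gives the same propagation with less code.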

## Summary by CodeRabbit

* **Bug Fixes**
  * Improved async input validation to reject coroutine inputs early.
  * Safer thread-pool shutdown to avoid teardown errors.
* **Refactor**
  * Standardized token counting across conversation endpoints.
  * Clarified and tightened report endpoint return types (now return report data structures).
* **Chores**
  * Updated local service and dev environment configuration and small-model summary generation settings.

---------

Co-authored-by: Sameer Pashikanti <sameer@dembrane.com>

ussaama and others added 3 commits October 29, 2025 09:23

- hotfix swap claude sonnet with token_counter function

fc334ad

Merge branch 'main' of https://github.com/Dembrane/echo into hotfix-2025-10-29

f4507fc

- revert NoContentFoundException in tasks py fle

7dc25af

coderabbitai bot added the bug Something isn't working label Oct 29, 2025

- revert get_latest_project_analysis_run function in project py and add types to it

b82f28b

vercel bot deployed to Preview October 29, 2025 12:44 View deployment

coderabbitai bot reviewed Oct 29, 2025


ussaama requested a review from spashii October 29, 2025 13:02

use minio and update summary model

63e53bc

vercel bot deployed to Preview October 30, 2025 11:35 View deployment

coderabbitai bot added the improvement label Oct 30, 2025

fix import lint error

6f12ca5

vercel bot deployed to Preview October 30, 2025 11:43 View deployment

spashii merged commit 0f99d28 into main Oct 30, 2025
15 checks passed

spashii deleted the hotfix-2025-10-29 branch October 30, 2025 12:01

spashii merged 6 commits into main from hotfix-2025-10-29
