ECHO-551 Replace deprecated anthropic token counter and fix server lint errors#352
ECHO-551 Replace deprecated anthropic token counter and fix server lint errors#352
Conversation
WalkthroughReplaced legacy token counting with LiteLLM token_counter and wired model config, hardened run_in_thread_pool runtime checks and kwargs binding, changed several helper return types to dict, small typing/type-ignore and formatting tweaks, and updated devcontainer service config entries. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20–30 minutes
Possibly related PRs
LGTM. Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: ASSERTIVE Plan: Pro Disabled knowledge base sources:
📒 Files selected for processing (1)
🧰 Additional context used📓 Path-based instructions (1)echo/server/dembrane/**/*.py📄 CodeRabbit inference engine (echo/.cursor/rules/async-threadpool.mdc)
Files:
🧠 Learnings (1)📓 Common learnings⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
🔇 Additional comments (3)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (6)
echo/server/dembrane/api/project.py (6)
204-216: Tighten types; simplify None checks in analysis-run helper.Directus get_items returns a list; Optional is unnecessary. Drop the None branch and use a truthy check. Cleaner and avoids dead code paths.
Apply:
- def _get_analysis_run() -> Optional[list[dict]]: + def _get_analysis_run() -> list[dict]: @@ - analysis_run: Optional[list[dict]] = await run_in_thread_pool(_get_analysis_run) + analysis_run: list[dict] = await run_in_thread_pool(_get_analysis_run) @@ - if analysis_run is None: - return None - - if len(analysis_run) == 0: + if not analysis_run: return NoneAlso applies to: 218-226
328-341: Error path: good offload; consider persisting error_message.You return status="error" and error_code. Persisting a brief error_message (non-PII) helps triage.
Suggested tweak:
return client.create_item( "project_report", item_data={ "content": "", "project_id": project_id, "language": language, "status": "error", "error_code": "CONTEXT_TOO_LONG", + "error_message": "Report context exceeded limit", }, )["data"]
118-128: Blocking file I/O inside async fn — offload to thread pool.open(...).write(...) will block the event loop. Wrap with run_in_thread_pool, per dembrane guidelines.
Based on coding guidelines.
- with open(file_path, "w") as file: - for chunk in chunks: - try: - if chunk.transcript is not None: - file.write(str(chunk.transcript) + "\n") - except Exception as e: - logger.error(f"Failed to write transcript for chunk {chunk.id}: {e}") + def _write_transcript(path: str, chunk_list): + with open(path, "w") as file: + for chunk in chunk_list: + try: + if chunk.transcript is not None: + file.write(str(chunk.transcript) + "\n") + except Exception as e: + logger.error(f"Failed to write transcript for chunk {chunk.id}: {e}") + await run_in_thread_pool(_write_transcript, file_path, chunks)
178-184: Zip creation is blocking — move to thread pool.zipfile.ZipFile work is CPU/FS bound; avoid blocking the event loop.
Based on coding guidelines.
- with zipfile.ZipFile(zip_file_name, "w", zipfile.ZIP_DEFLATED) as zipf: - for filename in filenames: - if not filename: - continue - arcname = os.path.basename(filename) - zipf.write(filename, arcname) + def _create_zip(zip_file: str, files: List[str]) -> None: + with zipfile.ZipFile(zip_file, "w", zipfile.ZIP_DEFLATED) as zipf: + for filename in files: + if not filename: + continue + arcname = os.path.basename(filename) + zipf.write(filename, arcname) + await run_in_thread_pool(_create_zip, zip_file_name, filenames)
185-201: Prefer FileResponse over manual StreamingResponse.Starlette’s FileResponse handles efficient, non-blocking file sending. Simpler and avoids manual file reads in the event loop.
-from fastapi.responses import StreamingResponse +from fastapi.responses import FileResponse @@ - def iterfile() -> Generator[bytes, None, None]: - with open(zip_file_name, "rb") as file: - yield from file - - response = StreamingResponse(iterfile(), media_type="application/zip") - response.headers["Content-Disposition"] = f"attachment; filename={zip_file_name}" + response = FileResponse( + zip_file_name, + media_type="application/zip", + filename=zip_file_name, + )BackgroundTasks cleanup remains correct.
325-341: Heads-up: get_report_content_for_project does sync Directus + CPU token counts.In report_utils.py, directus.get_items (sync) and token_counter (CPU) run inside an async fn. Those should be offloaded with run_in_thread_pool in that module to keep this endpoint scalable.
Want a follow-up PR diff for report_utils?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
echo/server/dembrane/api/project.py(7 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
echo/server/dembrane/**/*.py
📄 CodeRabbit inference engine (echo/.cursor/rules/async-threadpool.mdc)
echo/server/dembrane/**/*.py: Always wrap blocking I/O calls using run_in_thread_pool from dembrane.async_helpers
Wrap calls to directus., conversation_service., project_service.*, S3 helpers, and CPU-heavy utilities (e.g., token counting, summary generation) with run_in_thread_pool if they are synchronous
Do not wrap already-async functions or LightRAG calls (e.g., rag.aquery, rag.ainsert) with run_in_thread_pool
Prefer converting endpoints to async def and awaiting results
Files:
echo/server/dembrane/api/project.py
🧬 Code graph analysis (1)
echo/server/dembrane/api/project.py (3)
echo/server/dembrane/async_helpers.py (1)
run_in_thread_pool(74-142)echo/server/dembrane/service/project.py (1)
get_by_id_or_raise(22-54)echo/server/dembrane/report_utils.py (2)
get_report_content_for_project(34-184)ContextTooLongException(28-31)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: ci-check-server
- GitHub Check: ci-check-server
🔇 Additional comments (3)
echo/server/dembrane/api/project.py (3)
250-251: Good: sync service call correctly offloaded.Using run_in_thread_pool for project_service.get_by_id_or_raise matches our I/O policy. LGTM.
298-299: Good: same offload pattern for view creation gate.Consistent with guidelines. LGTM.
346-359: Verify report status; confirm “archived” is intended.Creating the report with status="archived" feels odd for fresh content. If the desired lifecycle starts as "draft" or "pending", adjust here.
Do you want initial status to be "draft" or "published" instead of "archived"?
| async def create_report(project_id: str, body: CreateReportRequestBodySchema) -> dict: | ||
| language = body.language or "en" | ||
| try: | ||
| report_content_response = await get_report_content_for_project(project_id, language) |
There was a problem hiding this comment.
🧩 Analysis chain
Blocker: create_report lacks auth and ownership checks.
This endpoint currently allows anyone to create a report for any project. Mirror the access guard used elsewhere (403 for non-owner, 404 if missing).
Apply:
-@ProjectRouter.post("/{project_id}/create-report")
-async def create_report(project_id: str, body: CreateReportRequestBodySchema) -> dict:
+@ProjectRouter.post("/{project_id}/create-report")
+async def create_report(
+ project_id: str,
+ body: CreateReportRequestBodySchema,
+ auth: DependencyDirectusSession,
+) -> dict:
language = body.language or "en"
+ # Ownership / access guard
+ from dembrane.service import project_service
+ from dembrane.service.project import ProjectNotFoundException
+ try:
+ project = await run_in_thread_pool(project_service.get_by_id_or_raise, project_id)
+ except ProjectNotFoundException as e:
+ raise HTTPException(status_code=404, detail="Project not found") from e
+ if not auth.is_admin and project.get("directus_user_id", "") != auth.user_id:
+ raise HTTPException(status_code=403, detail="User does not have access to this project")Also note: return type changed from None to dict — verify API consumers.
🏁 Script executed:
#!/bin/bash
# Find server/client code that calls or expects the create-report response shape
rg -n -C2 'create-report' --type=py --type=ts --type=tsx
rg -n -C2 '\bcreate_report\(' --type=pyLength of output: 412
🏁 Script executed:
#!/bin/bash
# Get full implementation of create_report
sed -n '320,360p' echo/server/dembrane/api/project.py
# Find other protected endpoints to see auth pattern
rg -n 'auth.*DependencyDirectusSession' --type=py -A5 echo/server/dembrane/api/
# Check project_service module structure
find . -name 'project_service.py' -o -name 'project.py' | grep service
# Check if create_report is called anywhere
rg -n 'create_report' --type=py -g '!node_modules'Length of output: 12093
🏁 Script executed:
#!/bin/bash
# Also check what ProjectNotFoundException actually is
rg -n 'class ProjectNotFoundException' --type=py
rg -n 'ProjectNotFoundException' --type=py -B2 -A2Length of output: 7113
🏁 Script executed:
#!/bin/bash
# Check if create_report has tests
rg -n 'def test.*create_report' --type=py
rg -n 'create_report' echo/server/tests/ --type=py
# Check git to see if function is newly added (look at git diff context)
git log --oneline -n 20 -- echo/server/dembrane/api/project.py 2>/dev/null || echo "Git history unavailable"
# Verify if there are ownership checks in tests for similar endpoints
rg -n 'is_admin.*auth' --type=py -B2 -A2 echo/server/tests/Length of output: 722
Blocker: create_report lacks auth and ownership checks.
This endpoint allows unauthenticated requests to create reports for any project. Apply the same access guard pattern used in create_library and other protected endpoints:
@ProjectRouter.post("/{project_id}/create-report")
-async def create_report(project_id: str, body: CreateReportRequestBodySchema) -> dict:
+async def create_report(
+ project_id: str,
+ body: CreateReportRequestBodySchema,
+ auth: DependencyDirectusSession,
+) -> dict:
language = body.language or "en"
+ # Ownership / access guard
+ from dembrane.service import project_service
+ from dembrane.service.project import ProjectNotFoundException
+ try:
+ project = await run_in_thread_pool(project_service.get_by_id_or_raise, project_id)
+ except ProjectNotFoundException as e:
+ raise HTTPException(status_code=404, detail="Project not found") from e
+ if not auth.is_admin and project.get("directus_user_id", "") != auth.user_id:
+ raise HTTPException(status_code=403, detail="User does not have access to this project")🤖 Prompt for AI Agents
In echo/server/dembrane/api/project.py around lines 322 to 325, the
create_report function currently performs no authentication or ownership checks;
update it to follow the same access-guard pattern used by create_library and
other protected endpoints by: 1) retrieving the current user (or auth context)
from the request/session/context at the start of the function, 2) returning 401
if unauthenticated, 3) loading the target project and verifying the user has
ownership or required permissions (or calling the existing guard helper used
elsewhere), returning 403 if the user is not allowed, and 4) only then calling
get_report_content_for_project and continuing; also import and reuse the same
guard/auth helper functions and error response types used by create_library to
keep behavior consistent.
| except Exception as e: | ||
| raise e | ||
|
|
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Remove redundant catch/rethrow; log and re-raise cleanly.
Catching Exception just to re-raise is noise. Either let it bubble or log then raise.
Apply:
- except Exception as e:
- raise e
+ except Exception:
+ logger.exception(f"create_report failed for project {project_id}")
+ raise📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| except Exception as e: | |
| raise e | |
| except Exception: | |
| logger.exception(f"create_report failed for project {project_id}") | |
| raise |
🤖 Prompt for AI Agents
In echo/server/dembrane/api/project.py around lines 343-345, the except block
catches Exception only to re-raise it which is redundant; either remove the
try/except so the exception naturally bubbles up, or if you need to record it,
replace the current except block with a logging statement that logs the
exception and then re-raise using plain "raise" (not "raise e") to preserve the
original traceback.
<!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **Bug Fixes** * Improved async input validation to reject coroutine inputs early. * Safer thread-pool shutdown to avoid teardown errors. * **Refactor** * Standardized token counting across conversation endpoints. * Clarified and tightened report endpoint return types (now return report data structures). * **Chores** * Updated local service and dev environment configuration and small-model summary generation settings. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Sameer Pashikanti <sameer@dembrane.com>
Summary by CodeRabbit
Bug Fixes
Refactor
Chores