
ECHO-180 ECHO-224 Map all LLM calls in current codebase and convert them to use the new infra#148

Merged
spashii merged 4 commits into main from
feature/echo-180-map-all-llm-calls-in-current-codebase-and-convert-them-to
May 15, 2025

Conversation

@ArindamRoy23
Contributor

@ArindamRoy23 ArindamRoy23 commented May 15, 2025

  • Embedding shifted to LiteLLM
  • Quote util shifted to LiteLLM
  • Minor fix to test quote utils

@spashii, requesting that you look into this and test thoroughly before completing the pull request.
All tests are passing.

Summary by CodeRabbit

  • New Features

    • Added support for configuring and using the LiteLLM API for language model and embedding operations via new environment variables.
  • Refactor

    • Switched from the previous OpenAI client to the LiteLLM library for all language model and embedding functionality, improving flexibility and configuration.
    • Updated report generation to use LiteLLM utilities with enhanced token management and prompt handling.
    • Adjusted task execution priorities to optimize processing queues and resource allocation.
    • Simplified API endpoint parameters by removing unused database dependencies.
  • Chores

    • Updated test imports to support new embedding and utility requirements.
    • Enhanced prompt templates across multiple languages to clarify formatting by disallowing `<br>` tags and requiring line breaks instead.
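The tiered environment-variable configuration described above can be sketched as follows. This is a hypothetical helper, not code from the PR: the `SMALL_LITELLM_*` variable names follow the pattern added to `config.py`, but the exact names and the helper itself are assumptions for illustration.

```python
import os

def litellm_kwargs(tier: str) -> dict:
    """Collect model and credential kwargs for one tier from the environment."""
    prefix = f"{tier.upper()}_LITELLM_"
    kwargs = {
        "model": os.environ[prefix + "MODEL"],
        "api_key": os.environ[prefix + "API_KEY"],
        "api_version": os.environ[prefix + "API_VERSION"],
        "api_base": os.environ[prefix + "API_BASE"],
    }
    # Fail fast on empty values, mirroring the assertions added in config.py.
    for name, value in kwargs.items():
        assert value, f"{prefix}{name.upper()} must not be empty"
    return kwargs

# Stand-in values for demonstration only:
os.environ.update({
    "SMALL_LITELLM_MODEL": "azure/gpt-4o-mini",
    "SMALL_LITELLM_API_KEY": "sk-test",
    "SMALL_LITELLM_API_VERSION": "2024-02-01",
    "SMALL_LITELLM_API_BASE": "https://example.openai.azure.com",
})
print(litellm_kwargs("small")["model"])
```

A helper like this would also reduce the repetition of the same four keyword arguments at every call site.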

@linear

linear bot commented May 15, 2025

@coderabbitai
Contributor

coderabbitai bot commented May 15, 2025

## Walkthrough

This PR swaps out the OpenAI client for the litellm library across embedding and LLM completion utilities, parameterizes all LLM API credentials and model selection via new environment variables, and updates configuration and test imports accordingly. It refactors report generation to use litellm completions and Directus API calls instead of DB queries and Anthropic client. Task actor priorities are adjusted. No changes to public APIs or control flow beyond these backend and config updates.

## Changes

| File(s)                                                    | Change Summary                                                                                                 |
|------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------|
| `echo/server/dembrane/config.py`                           | Added new env vars for SMALL, MEDIUM, LARGE LiteLLM tiers (model, API key, version, base) with assertions and debug logs; removed defaults for Whisper vars with assertions. |
| `echo/server/dembrane/embedding.py`                        | Replaced OpenAI embedding client with litellm embedding call; updated embedding dim from 1536 to 3072; adjusted response parsing. |
| `echo/server/dembrane/quote_utils.py`                      | Replaced OpenAI chat completion calls with litellm completion calls; parameterized model and API credentials; adjusted JSON parsing; removed unused imports. |
| `echo/server/dembrane/report_utils.py`                     | Refactored report content fetching to use Directus API and litellm token counting/completion; removed DB dependency and Anthropic client; added token limit logic and logging. |
| `echo/server/dembrane/api/project.py`                      | Removed DB parameter from `create_report` endpoint; updated internal call accordingly.                        |
| `echo/server/dembrane/tasks.py`                            | Added or adjusted dramatiq actor priorities; changed some task queues; replaced attribute access with dict access in one function; updated logging accordingly. |
| `echo/server/dembrane/reply_utils.py`                      | Added a FIXME comment before litellm completion call; no functional change.                                  |
| `echo/server/tests/test_quote_utils.py`                    | Added imports for `generate_uuid` and `EMBEDDING_DIM` to support updated test setups.                        |
| `echo/server/prompt_templates/system_report.*.jinja`       | Added instruction to all language variants forbidding `<br>` tags in output; instruct to use line breaks instead. |

## Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant quote_utils.py
    participant config.py
    participant litellm

    User->>quote_utils.py: Call LLM-related function
    quote_utils.py->>config.py: Fetch LLM config (model, API key, etc.)
    quote_utils.py->>litellm: Call completion/embedding API with config
    litellm-->>quote_utils.py: Return response
    quote_utils.py-->>User: Return processed result
```

## Assessment against linked issues

| Objective | Addressed | Explanation |
|-----------|-----------|-------------|
| Map all LLM calls in current codebase and convert them to use the new infra (ECHO-180) | | |
| Larger context length for report / use summaries first before feeding text (ECHO-224) | | |

## Possibly related PRs

## Suggested reviewers

  • ussaama
LGTM.
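The call flow in the sequence diagram above can be sketched in a few lines. litellm is mocked here so the sketch runs offline; the response shape mirrors the OpenAI-style dict that litellm completions return, and the config values are stand-ins for what `config.py` reads from the environment.

```python
from unittest import mock

litellm = mock.Mock()
litellm.completion.return_value = {"choices": [{"message": {"content": "ok"}}]}

CONFIG = {  # stand-ins for environment-derived values
    "model": "azure/gpt-4o",
    "api_key": "sk-test",
    "api_version": "2024-02-01",
    "api_base": "https://example.openai.azure.com",
}

def ask_llm(prompt: str) -> str:
    # Fetch config, call litellm, parse the first choice, per the diagram.
    response = litellm.completion(
        messages=[{"role": "user", "content": prompt}],
        **CONFIG,
    )
    return response["choices"][0]["message"]["content"]

print(ask_llm("hello"))
```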


@ArindamRoy23 ArindamRoy23 enabled auto-merge May 15, 2025 06:47
@coderabbitai coderabbitai bot requested a review from spashii May 15, 2025 06:47
@ArindamRoy23 ArindamRoy23 changed the title from "Shift to LiteLLM" to "ECHO-180 Map all LLM calls in current codebase and convert them to use the new infra" May 15, 2025
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bb74543 and fbd1288.

📒 Files selected for processing (4)
  • echo/server/dembrane/config.py (1 hunks)
  • echo/server/dembrane/embedding.py (2 hunks)
  • echo/server/dembrane/quote_utils.py (11 hunks)
  • echo/server/tests/test_quote_utils.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (2)
echo/server/tests/test_quote_utils.py (2)
echo/server/dembrane/utils.py (1)
  • generate_uuid (13-14)
echo/server/dembrane/database.py (4)
  • QuoteModel (447-490)
  • ProcessingStatusEnum (74-78)
  • ProjectAnalysisRunModel (168-200)
  • get_db (591-598)
echo/server/dembrane/quote_utils.py (3)
echo/server/dembrane/s3.py (1)
  • save_to_s3_from_url (86-116)
echo/server/dembrane/ner.py (1)
  • anonymize_sentence (25-48)
echo/server/dembrane/utils.py (2)
  • generate_uuid (13-14)
  • get_utc_timestamp (49-50)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: ci-build-servers (dbr-echo-server, ./echo/server, Dockerfile, dbr-echo-server)
🔇 Additional comments (7)
echo/server/dembrane/config.py (1)

176-190: Environment variables for LiteLLM config added correctly

These new environment variables align perfectly with the LiteLLM integration pattern. Clean implementation of config validation with assertions and consistent debug logging.
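The validate-then-log pattern the review praises can be sketched for a single variable. The variable name is taken from the PR; the surrounding scaffolding is illustrative only.

```python
import logging
import os

logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger("dembrane.config")

os.environ.setdefault("SMALL_LITELLM_MODEL", "azure/gpt-4o-mini")  # demo value

# Read once, assert presence, and debug-log the resolved value.
SMALL_LITELLM_MODEL = os.environ.get("SMALL_LITELLM_MODEL")
assert SMALL_LITELLM_MODEL is not None, "SMALL_LITELLM_MODEL is required"
logger.debug("SMALL_LITELLM_MODEL=%s", SMALL_LITELLM_MODEL)
```

Asserting at import time makes misconfiguration fail at startup rather than on the first LLM call.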

echo/server/tests/test_quote_utils.py (1)

10-17: Import additions look solid

Adding these imports enables proper test setup with UUID generation and correct embedding dimensionality - critical for maintaining test consistency after the LiteLLM migration.
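A sketch of the kind of fixture these imports enable. `generate_uuid` mirrors `dembrane/utils.py` per the code-graph analysis above; the quote dict's fields are assumptions for illustration, not the real `QuoteModel` schema.

```python
import uuid

def generate_uuid() -> str:
    # Mirrors dembrane/utils.py:generate_uuid (a thin uuid4 wrapper, assumed).
    return str(uuid.uuid4())

EMBEDDING_DIM = 3072  # matches the updated constant in embedding.py

def make_fake_quote() -> dict:
    """Build a throwaway quote record with a correctly sized embedding."""
    return {"id": generate_uuid(), "embedding": [0.0] * EMBEDDING_DIM}

quote = make_fake_quote()
print(len(quote["embedding"]))
```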

echo/server/dembrane/embedding.py (1)

5-7: Clean litellm import and OpenAI client removal

Nice swap of imports. Properly commented out the old OpenAI import instead of just deleting it - makes rollback easier if needed.
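The parsing change that accompanies the import swap can be sketched with litellm mocked out: `client.embeddings.create(...).data[0].embedding` becomes `litellm.embedding(...)` plus dict-style access. The response shape here is the OpenAI-compatible one; treat the exact access path as an assumption.

```python
from unittest import mock

litellm = mock.Mock()
litellm.embedding.return_value = {"data": [{"embedding": [0.1, 0.2, 0.3]}]}

def embed(text: str) -> list[float]:
    # New litellm path: call embedding, then parse the dict response.
    response = litellm.embedding(
        model="azure/text-embedding-3-large",
        input=[text],
    )
    return response["data"][0]["embedding"]

print(embed("hello"))
```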

echo/server/dembrane/quote_utils.py (4)

10-28: Well-structured imports for LiteLLM migration

Great job updating the imports to reflect the new LiteLLM dependency while maintaining backwards compatibility with commented-out imports. The config imports follow the standard pattern.


49-50: Properly commented out unused initializations

Clean approach to deprecate the old embeddings setup. This follows good practice of commenting out rather than just deleting for easier troubleshooting/rollback.


99-105: Solid implementation of litellm in llm_split_text function

The migration to litellm.completion is implemented correctly with all required parameters.


467-473: Consistent implementation of litellm across LLM interaction points

The LiteLLM migration is consistently applied across all completion endpoints. Code follows the same parameter pattern throughout, making future maintenance easier.

Also applies to: 606-612, 751-757, 777-783, 936-942, 972-978, 998-1004

```diff
-    return (
-        client.embeddings.create(input=[text], model="text-embedding-3-small").data[0].embedding
+    response = litellm.embedding(
+        model="text-embedding-3-small",
```
Member


still openai

@spashii spashii changed the title from "ECHO-180 Map all LLM calls in current codebase and convert them to use the new infra" to "ECHO-180 ECHO-224 Map all LLM calls in current codebase and convert them to use the new infra" May 15, 2025
@linear

linear bot commented May 15, 2025

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🔭 Outside diff range comments (1)
echo/server/dembrane/reply_utils.py (1)

245-253: ⚠️ Potential issue

FIXME comment without an implementation change.

The "FIXME: reply" comment suggests this section needs further modification, but no actual changes were made to the LLM call. Model "anthropic/claude-3-5-sonnet-20240620" is still hardcoded rather than using the new LiteLLM configuration variables.

This hardcoded model name should be replaced with the appropriate LiteLLM configuration variable to complete the migration. Suggest implementing similar to other modules:

```diff
-    # FIXME: reply
-    response = await litellm.acompletion(
-        model="anthropic/claude-3-5-sonnet-20240620",
-        messages=[
-            {"role": "user", "content": prompt},
-            {"role": "assistant", "content": ""},
-        ],
-        stream=True,
-    )
+    response = await litellm.acompletion(
+        model=LARGE_LITELLM_MODEL,
+        api_key=LARGE_LITELLM_API_KEY,
+        api_base=LARGE_LITELLM_API_BASE,
+        api_version=LARGE_LITELLM_API_VERSION,
+        messages=[
+            {"role": "user", "content": prompt},
+            {"role": "assistant", "content": ""},
+        ],
+        stream=True,
+    )
```
♻️ Duplicate comments (2)
echo/server/dembrane/embedding.py (1)

24-31: 🛠️ Refactor suggestion

Model name still hardcoded despite parameterization elsewhere.

The model name "azure/text-embedding-3-large" remains hardcoded while the rest of the system uses environment variables for model selection. This inconsistency needs attention.

```diff
-        response = litellm.embedding(
-            api_key=str(LIGHTRAG_LITELLM_EMBEDDING_API_KEY),
-            api_base=str(LIGHTRAG_LITELLM_EMBEDDING_API_BASE),
-            api_version=str(LIGHTRAG_LITELLM_EMBEDDING_API_VERSION),
-            model="azure/text-embedding-3-large",
-            input=[text],
-        )
+        response = litellm.embedding(
+            api_key=str(LIGHTRAG_LITELLM_EMBEDDING_API_KEY),
+            api_base=str(LIGHTRAG_LITELLM_EMBEDDING_API_BASE),
+            api_version=str(LIGHTRAG_LITELLM_EMBEDDING_API_VERSION),
+            model=str(LIGHTRAG_LITELLM_EMBEDDING_MODEL),
+            input=[text],
+        )
```
echo/server/dembrane/quote_utils.py (1)

1044-1050: 🧹 Nitpick (assertive)

Remove redundant FIXME comment

The code has already been migrated to litellm, so the FIXME comment is now redundant.

```diff
-    # FIXME: use litellm
     response = completion(
         model=SMALL_LITELLM_MODEL,
         messages=messages,
         api_key=SMALL_LITELLM_API_KEY,
         api_version=SMALL_LITELLM_API_VERSION,
         api_base=SMALL_LITELLM_API_BASE,
     )
```
📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fbd1288 and fef966f.

📒 Files selected for processing (12)
  • echo/server/dembrane/api/project.py (1 hunks)
  • echo/server/dembrane/config.py (1 hunks)
  • echo/server/dembrane/embedding.py (2 hunks)
  • echo/server/dembrane/quote_utils.py (10 hunks)
  • echo/server/dembrane/reply_utils.py (1 hunks)
  • echo/server/dembrane/report_utils.py (2 hunks)
  • echo/server/dembrane/tasks.py (13 hunks)
  • echo/server/prompt_templates/system_report.de.jinja (1 hunks)
  • echo/server/prompt_templates/system_report.en.jinja (1 hunks)
  • echo/server/prompt_templates/system_report.es.jinja (1 hunks)
  • echo/server/prompt_templates/system_report.fr.jinja (1 hunks)
  • echo/server/prompt_templates/system_report.nl.jinja (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (3)
echo/server/dembrane/api/project.py (1)
echo/server/dembrane/report_utils.py (1)
  • get_report_content_for_project (36-137)
echo/server/dembrane/report_utils.py (3)
echo/server/dembrane/prompts.py (1)
  • render_prompt (55-88)
echo/server/dembrane/api/conversation.py (1)
  • get_conversation_transcript (292-322)
echo/server/dembrane/api/dependency_auth.py (1)
  • DirectusSession (13-22)
echo/server/dembrane/tasks.py (2)
echo/server/dembrane/quote_utils.py (1)
  • generate_quotes (118-258)
echo/server/dembrane/database.py (1)
  • QuoteModel (447-490)
🪛 Ruff (0.8.2)
echo/server/dembrane/report_utils.py

5-5: `sqlalchemy.orm.Session` imported but unused (F401); remove the unused import.

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: ci-check-server
🔇 Additional comments (29)
echo/server/prompt_templates/system_report.fr.jinja (1)

51-52: Clean formatting instruction added for report generation.

Adding explicit guidance to avoid HTML break tags in favor of proper newlines - solid move for consistent output formatting across the LiteLLM integration pipeline. Formatting consistency is key for downstream processing.

echo/server/dembrane/config.py (3)

176-191: Structured environment var configuration for tiered LLM models.

The implementation follows the right patterns - strict assertions for required vars, explicit debug logging, and proper isolation of different tiers (SMALL, MEDIUM, LARGE). This parameterization approach is crucial for the LiteLLM migration.


192-222: Environment configs for MEDIUM and LARGE tiers follow consistent pattern.

The symmetrical implementation across all tiers shows solid design - each tier has the same four variables (MODEL, API_KEY, API_VERSION, API_BASE) with identical validation and logging. This will scale well as model selection evolves.


226-228: Updates to Whisper config settings for consistency.

Removed fallbacks in favor of explicit environment variable requirements - good alignment with the fail-fast pattern established for other LiteLLM configs.

echo/server/dembrane/embedding.py (1)

14-14: Updated embedding dimension constant.

Bumped from 1536 to 3072 to match the new model capabilities - critical update for downstream vector operations.

echo/server/prompt_templates/system_report.en.jinja (1)

52-53: LGTM! Smart optimization on formatting instructions.

The addition of the explicit instruction to avoid <br> tags and use newlines instead standardizes the LLM output format across all language variants. This is a clean optimization that ensures consistent plain text formatting in the generated reports.

echo/server/prompt_templates/system_report.nl.jinja (1)

51-52: LGTM! Consistent formatting standardization.

Dutch version mirrors the same formatting optimization found in the other language templates, ensuring consistent output handling across languages.

echo/server/prompt_templates/system_report.es.jinja (1)

51-52: LGTM! Perfect formatting constraint.

Spanish template updated with the same formatting instruction pattern as other language variants. This standardization simplifies downstream content processing. Ship it.

echo/server/prompt_templates/system_report.de.jinja (1)

51-52: LGTM! Formatting consistency achieved.

German template gets the same formatting instruction as all other language variants. This cross-language standardization is exactly what we need for clean output processing.

echo/server/dembrane/api/project.py (1)

377-380: LGTM! Smart API abstraction refactor.

Clean removal of DB dependency from endpoint. This change elegantly abstracts DB access away from API layer, moving it to the service layer via the directus client in get_report_content_for_project(). The function signature is now more focused and follows separation of concerns.

echo/server/dembrane/quote_utils.py (6)

10-10: Solid move to litellm for completions

The switch to litellm provides a unified interface for multiple LLM providers. This is a strong architectural decision that will make future provider switching frictionless.


20-29: Config constants pattern is on point

Using environment-based config constants rather than hardcoded model names and API credentials is 100% the right move. This parameterization makes the codebase more maintainable and deployable across environments.


94-100: LGTM! Clean implementation of litellm

The migration from OpenAI client to litellm is implemented correctly. Using the SMALL model tier for this simple text splitting operation is resource-efficient.
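
The call pattern under review presumably looks something like the following sketch. The function name, prompt, and wrapper are hypothetical; `litellm.completion` and the OpenAI-style response shape are real litellm API, but the call needs network access and valid credentials to actually run:

```python
try:
    from litellm import completion  # real litellm entry point
except ImportError:
    completion = None  # lets the sketch be imported without litellm installed

def split_into_sentences(text, model, api_key, api_base, api_version):
    """Ask the SMALL tier to split text, mirroring the per-call
    credential-passing style described in the review."""
    response = completion(
        model=model,
        api_key=api_key,
        api_base=api_base,
        api_version=api_version,
        messages=[
            {"role": "system", "content": "Split the text into sentences, one per line."},
            {"role": "user", "content": text},
        ],
    )
    # litellm returns an OpenAI-style response object.
    return response.choices[0].message.content
```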


462-469: Clean litellm implementation with JSON response handling

The replacement of OpenAI client with litellm is properly implemented including response format specification for structured JSON output. This standardizes our LLM interaction pattern.


473-488: Improved error handling and logging

The enhanced debug logging and more robust error handling for parsed responses is a great improvement. The explicit JSON parsing from response content is more reliable than depending on .parsed attribute.
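
A sketch of the "parse the content yourself" approach: the helper and its code-fence handling are illustrative, and `raw_content` stands in for `response.choices[0].message.content`:

```python
import json

def parse_json_content(raw_content):
    """Parse the model's JSON reply with explicit json.loads, tolerating an
    optional ```json ... ``` fence around the payload; return None on failure
    instead of relying on a provider-specific .parsed attribute."""
    text = raw_content.strip()
    if text.startswith("```"):
        # Strip the surrounding backticks and an optional "json" language tag.
        text = text.strip("`")
        if text.startswith("json"):
            text = text[len("json"):]
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None
```

Returning `None` (rather than raising) lets the caller log the bad payload and retry, which matches the more robust error handling the review describes.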


604-610: Consistent migration pattern across all LLM calls

The systematic replacement of all OpenAI client calls with litellm.completion follows the same pattern throughout the file. This consistency is key for maintainability and ensures behavior remains predictable across different function calls.

Also applies to: 749-755, 775-781, 934-940, 970-976, 996-1002, 1044-1050

echo/server/dembrane/report_utils.py (8)

1-4: Clean imports and litellm integration

Good work importing the necessary modules from litellm. The token_counter is a nice addition for managing context size.


7-14: Well-organized config imports

The imports for litellm configuration are properly structured and follow the same pattern as in quote_utils.py. This consistency is key.


20-25: Smart dynamic context length management

Good approach: the context length is set dynamically from the model's capabilities, with the 4.1 model check selecting a 700k-token budget versus 128k tokens for other models.
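
Sketched minimally (the substring check and the exact figures follow the review's description and are deployment assumptions, not published model limits):

```python
def max_report_context_length(model_name: str) -> int:
    # The 700k / 128k budgets and the "4.1" substring check follow the
    # review discussion; treat both as deployment-specific assumptions.
    if "4.1" in model_name:
        return 700_000
    return 128_000
```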


36-50: Refactored to use directus API instead of direct DB queries

The refactoring to use directus API instead of DB queries is a clean architectural improvement that decouples the report generation from database implementation details.


52-80: Token-aware conversation processing

The implementation now properly tracks token counts and limits context length, which is critical for reliable LLM operations. Breaking early when token limit is reached prevents wasted API calls.


81-99: Efficient transcript handling with token counting

This implementation intelligently accumulates transcripts while monitoring token count, ensuring we maximize context utilization without overflowing limits. The early break on limit exceeded is an optimization win.
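
The accumulate-until-budget-exhausted loop can be sketched as follows. The token counter is injected so the sketch runs without litellm; in the real code it would be `litellm.token_counter(model=..., text=...)`:

```python
def accumulate_transcripts(transcripts, max_tokens, count_tokens):
    """Add transcripts until the token budget would be exceeded, then break
    early -- the early exit avoids sending an oversized request."""
    context_parts = []
    used = 0
    for transcript in transcripts:
        cost = count_tokens(transcript)
        if used + cost > max_tokens:
            break  # budget exhausted; stop accumulating
        context_parts.append(transcript)
        used += cost
    return "\n".join(context_parts), used

def naive_counter(text):
    # Stand-in for litellm.token_counter; counts whitespace-separated words.
    return len(text.split())

context, used = accumulate_transcripts(["a b c", "d e", "f g h i"], 6, naive_counter)
```

With a budget of 6 "tokens", the third transcript (cost 4) would push the total to 9, so the loop stops after the first two.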


108-121: Clean litellm integration with simplified message format

The migration from anthropic client to litellm is well-implemented. Removing the prefill message simplifies the code while maintaining functionality.


123-134: Robust content extraction

The pattern matching to extract content from article tags with a fallback to the raw content is a resilient approach that handles different model output formats.
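
A sketch of that extraction-with-fallback (the exact regex in report_utils.py may differ; this is the behavior the review describes):

```python
import re

def extract_article(content: str) -> str:
    """Return the text inside <article>...</article> if present,
    otherwise fall back to the raw content."""
    match = re.search(r"<article>(.*?)</article>", content, re.DOTALL)
    if match:
        return match.group(1).strip()
    return content.strip()
```

The fallback matters because some models will emit the tags while others return plain text; either way the caller gets usable content.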

echo/server/dembrane/tasks.py (5)

117-117: Task prioritization improvement

Setting explicit priority=30 for these actors ensures they have consistent execution priority. This is an important optimization for the task queue.

Also applies to: 139-139, 164-164, 181-181


334-334: Standardized priority for core processing tasks

Setting priority=50 for these critical processing tasks gives them appropriate precedence in the queue while still allowing urgent tasks to take priority. This balanced approach improves system responsiveness.

Also applies to: 446-446, 463-463, 490-490, 507-507, 524-524, 583-583, 651-651
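
Assuming the broker drains by numeric priority with lower numbers first (consistent with the review's note that priority=50 tasks still yield to more urgent ones), the scheduling effect can be illustrated with a plain heap; the actor names other than task_create_project_library are hypothetical:

```python
import heapq

# (priority, actor_name) pairs; lower priority number drains first.
queue = []
for priority, actor in [
    (50, "task_process_conversation_chunk"),  # hypothetical name
    (30, "task_summarize_project"),           # hypothetical name
    (50, "task_create_project_library"),
]:
    heapq.heappush(queue, (priority, actor))

drain_order = [heapq.heappop(queue)[1] for _ in range(len(queue))]
```

The priority-30 actor is drained ahead of both priority-50 actors, which is the precedence effect the review attributes to the new explicit priorities.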


651-651: Queue optimization for project library creation

Moving task_create_project_library from "network" to "cpu" queue is a smart resource allocation improvement, as this task is more CPU-bound than network-bound.


402-404: Updated data access pattern

Changed from attribute-style access to dictionary-style access to match the directus API response format. This is consistent with the broader migration to API-based data retrieval.


407-407: Consistent dictionary-style access

The consistent use of dictionary-style access pattern throughout the function ensures compatibility with the directus API response structure.

Also applies to: 415-415, 423-423

Comment on lines 4 to +12
import backoff
import litellm

from dembrane.openai import client
from dembrane.config import (
# FIXME: update to use dembrane embeddings
LIGHTRAG_LITELLM_EMBEDDING_API_KEY,
LIGHTRAG_LITELLM_EMBEDDING_API_BASE,
LIGHTRAG_LITELLM_EMBEDDING_API_VERSION,
)
💡 Verification agent

🧩 Analysis chain

Inconsistent embedding configuration.

Using LIGHTRAG_LITELLM_EMBEDDING_* variables while the comment indicates these should be updated to use "dembrane embeddings". This appears to be a temporary implementation with a TODO marker.


🏁 Script executed:

#!/bin/bash
# Check if SMALL_LITELLM_* variables are used elsewhere for embeddings
rg "SMALL_LITELLM_" --type py | grep -v config.py

Length of output: 2457


🏁 Script executed:

#!/bin/bash
# List all embedding-related config variables to verify available dembrane-specific keys
rg "EMBEDDING" -n echo/server/dembrane/config.py

Length of output: 1402


Embeddings pipeline: wire up dembrane embedding config – LGTM pending fix

We’re still importing and using LIGHTRAG_LITELLM_EMBEDDING_* keys in
echo/server/dembrane/embedding.py (lines 4–12), but config.py only defines those. To complete the “FIXME: update to use dembrane embeddings”:

• In echo/server/dembrane/config.py (lines 329–353):
– Define new DEMBRANE_EMBEDDING_* variables alongside the existing LIGHTRAG_* ones, e.g.:
```python
DEMBRANE_EMBEDDING_MODEL = os.environ.get("DEMBRANE_EMBEDDING_MODEL")
DEMBRANE_EMBEDDING_API_BASE = os.environ.get("DEMBRANE_EMBEDDING_API_BASE")
DEMBRANE_EMBEDDING_API_KEY = os.environ.get("DEMBRANE_EMBEDDING_API_KEY")
DEMBRANE_EMBEDDING_API_VERSION = os.environ.get("DEMBRANE_EMBEDDING_API_VERSION")
```
• In echo/server/dembrane/embedding.py (lines 4–12):
– Replace the LIGHTRAG_LITELLM_* imports with the new DEMBRANE_EMBEDDING_* keys and drop the # FIXME:
```diff
- from dembrane.config import (
-     # FIXME: update to use dembrane embeddings
-     LIGHTRAG_LITELLM_EMBEDDING_API_KEY,
-     LIGHTRAG_LITELLM_EMBEDDING_API_BASE,
-     LIGHTRAG_LITELLM_EMBEDDING_API_VERSION,
- )
+ from dembrane.config import (
+     DEMBRANE_EMBEDDING_API_KEY,
+     DEMBRANE_EMBEDDING_API_BASE,
+     DEMBRANE_EMBEDDING_API_VERSION,
+ )
```

Once those variables are in place and referenced, the embedding module will be fully aligned with the dembrane standard.

🤖 Prompt for AI Agents
In echo/server/dembrane/embedding.py lines 4 to 12, replace the imported
LIGHTRAG_LITELLM_EMBEDDING_API_KEY, LIGHTRAG_LITELLM_EMBEDDING_API_BASE, and
LIGHTRAG_LITELLM_EMBEDDING_API_VERSION with the new DEMBRANE_EMBEDDING_API_KEY,
DEMBRANE_EMBEDDING_API_BASE, and DEMBRANE_EMBEDDING_API_VERSION variables
respectively, and remove the FIXME comment. Also, ensure these
DEMBRANE_EMBEDDING_* variables are defined in echo/server/dembrane/config.py
around lines 329 to 353 by adding environment variable retrievals for
DEMBRANE_EMBEDDING_MODEL, DEMBRANE_EMBEDDING_API_BASE,
DEMBRANE_EMBEDDING_API_KEY, and DEMBRANE_EMBEDDING_API_VERSION.


@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
echo/server/dembrane/report_utils.py (1)

4-5: Unused import removal - thanks for the cleanup!

The Session import from sqlalchemy.orm has been properly removed. Codebase hygiene++

📜 Review details

Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fef966f and ac09694.

📒 Files selected for processing (1)
  • echo/server/dembrane/report_utils.py (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
echo/server/dembrane/report_utils.py (3)
echo/server/dembrane/prompts.py (1)
  • render_prompt (55-88)
echo/server/dembrane/api/conversation.py (1)
  • get_conversation_transcript (292-322)
echo/server/dembrane/api/dependency_auth.py (1)
  • DirectusSession (13-22)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: ci-check-server
🔇 Additional comments (6)
echo/server/dembrane/report_utils.py (6)

1-15: Solid imports for LiteLLM migration 🚀

Cleanly swapped in LiteLLM dependencies and config imports. Nicely organized, no cruft.


17-24: Dynamically scaling the context window based on model version

Great optimization: leverage the larger context window (700k tokens) when using a 4.1 model while falling back to 128k for other models. This kind of model-aware scaling is exactly what the report pipeline needs.


35-46: Function signature cleanup + Directus migration LGTM

Smart move removing the db parameter since we're using Directus API now. Query structure looks clean.


48-50: Proper debugging is production-ready engineering

Those debug logs will make incident triage much faster.


51-97: Token management game is strong

Solid implementation of token counting with the LiteLLM library. Love how you're checking the token count against MAX_REPORT_CONTEXT_LENGTH before adding each conversation. Early breaking on token limit is a performance win.


122-133: Robust article tag extraction

Smart fallback handling when article tags aren't found. This handles edge cases like a pro.

Comment on lines +107 to 120
# Use litellm.completion instead of anthropic client
response = completion(
model=MEDIUM_LITELLM_MODEL,
api_key=MEDIUM_LITELLM_API_KEY,
api_version=MEDIUM_LITELLM_API_VERSION,
api_base=MEDIUM_LITELLM_API_BASE,
# max tokens needed for "anthropic"
# max_tokens=4096,
messages=[
{"role": "user", "content": prompt_message},
# prefill message
{"role": "assistant", "content": "<article>"},
# prefill message only for "anthropic"
# {"role": "assistant", "content": "<article>"},
],
)

🧹 Nitpick (assertive)

Clean LiteLLM client integration

Nice work converting from Anthropic to LiteLLM - kept the commented max_tokens and prefill stuff as documentation for future devs. 10/10.

Consider removing the comments if they're not needed after successful deployment.

🤖 Prompt for AI Agents
In echo/server/dembrane/report_utils.py around lines 107 to 120, the commented
lines related to max_tokens and prefill messages for the Anthropic client are no
longer needed after switching to LiteLLM. Remove these commented lines to clean
up the code and avoid confusion for future developers.

@spashii spashii disabled auto-merge May 15, 2025 11:59
@spashii spashii merged commit a94ae1e into main May 15, 2025
7 checks passed
@spashii spashii deleted the feature/echo-180-map-all-llm-calls-in-current-codebase-and-convert-them-to branch May 15, 2025 12:00
spashii added a commit that referenced this pull request Nov 18, 2025
…ength of Report (#148)

* Shift to LiteLLM

- Embedding shifted to LiteLLM
- Quote util shifted to LiteLLM
- Minor fix to test quote utils

* fix up bunch and also ECHO-224

* remove unused Session

---------

Co-authored-by: Sameer Pashikanti <sameer.pashikanti@gmail.com>