Conversation
- Add retry logic (up to 3 times) for LLM calls
- Add exception handling for LLM calls
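A minimal sketch of what those two checklist items could look like together, assuming a generic awaitable `call` (the helper name and backoff policy are illustrative, not from the PR):

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


async def call_llm_with_retry(call, max_attempts=3, base_delay=1.0):
    """Retry an async LLM call up to `max_attempts` times, backing off
    exponentially between attempts and re-raising the final failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await call()
        except Exception as e:
            logger.warning("LLM call failed (attempt %d/%d): %s", attempt, max_attempts, e)
            if attempt == max_attempts:
                raise  # exhausted retries: surface the error to the caller
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
```

The wrapper would sit around each `litellm.acompletion(...)` call site rather than inside it, so every LLM touchpoint gets the same policy.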
Note: Other AI code review bot(s) detected. CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough: Adds LLM-driven auto-selection and follow-up detection to backend chat flows with multilingual prompts, updates conversation citation/selection utilities, and removes several frontend processing indicators/warnings while triggering a chat-context refetch when auto-select adds context.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
LGTM. Pre-merge checks and finishing touches: ❌ Failed checks (2 warnings), ✅ Passed checks (3 passed).
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (16)
- echo/frontend/src/components/chat/Sources.tsx (1 hunks)
- echo/frontend/src/components/conversation/AutoSelectConversations.tsx (0 hunks)
- echo/frontend/src/components/conversation/ConversationAccordion.tsx (0 hunks)
- echo/frontend/src/routes/project/chat/ProjectChatRoute.tsx (1 hunks)
- echo/server/dembrane/api/chat.py (6 hunks)
- echo/server/dembrane/chat_utils.py (5 hunks)
- echo/server/prompt_templates/auto_select_conversations.de.jinja (1 hunks)
- echo/server/prompt_templates/auto_select_conversations.en.jinja (1 hunks)
- echo/server/prompt_templates/auto_select_conversations.es.jinja (1 hunks)
- echo/server/prompt_templates/auto_select_conversations.fr.jinja (1 hunks)
- echo/server/prompt_templates/auto_select_conversations.nl.jinja (1 hunks)
- echo/server/prompt_templates/is_followup_question.de.jinja (1 hunks)
- echo/server/prompt_templates/is_followup_question.en.jinja (1 hunks)
- echo/server/prompt_templates/is_followup_question.es.jinja (1 hunks)
- echo/server/prompt_templates/is_followup_question.fr.jinja (1 hunks)
- echo/server/prompt_templates/is_followup_question.nl.jinja (1 hunks)
💤 Files with no reviewable changes (2)
- echo/frontend/src/components/conversation/AutoSelectConversations.tsx
- echo/frontend/src/components/conversation/ConversationAccordion.tsx
🧰 Additional context used
🧬 Code graph analysis (3)
echo/server/dembrane/api/chat.py (4)
- echo/server/dembrane/prompts.py (1): `render_prompt` (58-91)
- echo/server/dembrane/chat_utils.py (2): `auto_select_conversations` (256-375), `create_system_messages_for_chat` (100-166)
- echo/server/dembrane/audio_lightrag/utils/lightrag_utils.py (1): `get_project_id` (363-365)
- echo/server/dembrane/database.py (2): `ConversationModel` (340-386), `ProjectChatMessageModel` (256-283)

echo/frontend/src/routes/project/chat/ProjectChatRoute.tsx (1)
- echo/frontend/src/config.ts (1): `ENABLE_CHAT_AUTO_SELECT` (31-32)

echo/server/dembrane/chat_utils.py (4)
- echo/server/dembrane/database.py (1): `ConversationModel` (340-386)
- echo/server/dembrane/api/conversation.py (1): `get_conversation_transcript` (398-429)
- echo/server/dembrane/api/dependency_auth.py (1): `DirectusSession` (13-22)
- echo/server/dembrane/prompts.py (1): `render_prompt` (58-91)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: ci-check-server
🔇 Additional comments (19)
echo/frontend/src/components/chat/Sources.tsx (1)
21-21: LGTM! 🚀 Clean UI text update that properly reflects the auto-select behavior. Ships it.
echo/server/prompt_templates/is_followup_question.de.jinja (1)
1-21: LGTM! 🔥 Solid German template. Structure is clean, the JSON-only output requirement is crystal clear, examples are on point. This will ship.
echo/server/prompt_templates/auto_select_conversations.nl.jinja (1)
1-35: LGTM! 💯 Dutch template is fire. Structure mirrors the other language variants perfectly. The JSON-only constraint is explicit, relevance instructions are comprehensive, and the fallback to an empty list is handled. Ships it.
echo/server/prompt_templates/auto_select_conversations.es.jinja (1)
1-35: LGTM! ⚡ Spanish template is clean. Consistent structure, clear JSON output format, comprehensive relevance guidance. This will ship without issues.
echo/frontend/src/routes/project/chat/ProjectChatRoute.tsx (1)
124-126: LGTM with a minor thought! 🤔 The conditional refetch logic is clean and properly gated. One thing to verify: since this fires in `onResponse` (before `onFinish`), make sure the backend has actually committed the auto-selected conversations to the context by this point. If there's any async delay on the server side, you might be refetching too early and getting stale data. Looks solid though; should ship.
echo/server/prompt_templates/is_followup_question.fr.jinja (1)
1-21: LGTM! 🇫🇷 French template is solid. Structure is consistent, JSON output is strictly enforced, examples are clear. This will ship.
echo/server/prompt_templates/is_followup_question.es.jinja (1)
1-21: LGTM! 🌮 Spanish template is clean. Consistent structure across all language variants, JSON-only output is explicit, examples are helpful. Ships it.
echo/server/prompt_templates/is_followup_question.en.jinja (1)
1-20: LGTM! 🎯 English base template is fire. Clear instructions, the JSON-only output constraint is explicit, examples are on point. This is the template all the others are based on and it's solid. Ships it.
echo/server/prompt_templates/auto_select_conversations.de.jinja (1)
22-25: Consider tightening inclusion criteria.
Saying “include indirect relevance and anything potentially useful” invites the model to grab way too much context; we need the 80 % cap to stick. Tune the copy (e.g., preference order or max slots) to weigh direct matches first or give a hard selection budget so we don’t blow the window.
echo/server/prompt_templates/auto_select_conversations.en.jinja (1)
22-25: Dial back the looseness.
Same story: “partial or indirect relevance” + “if it might help, include it” is basically carte blanche. Tighten instructions so the assistant keeps us under the 80 % allotment—e.g., cap selections or require justification before opting in marginal threads.
echo/server/prompt_templates/auto_select_conversations.fr.jinja (1)
21-25: Limit the sprawl.
The French text also leaves too much leeway ("partial or indirect relevance" + "include it if it could help"). Rephrase the prompt to prioritize direct relevance and impose an explicit cap, otherwise the model will select indiscriminately.
echo/server/dembrane/chat_utils.py (8)
292-368: Batch fan-out nukes context compliance.
You kick off every batch via `asyncio.gather` with the same `batch_num`. If even two batches sneak through, each LLM call sees only ~20 conversations—no global limit—so the merged union happily reassembles 200 IDs and torpedoes the "≤80% of window" mandate from ECHO-497. We need either (a) sequential selection with a running token budget, or (b) global arbitration (e.g., take top-N scores) before flattening results. Right now the first model that says "sure, everything's relevant" wins.
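Option (a), sequential selection under a running token budget, might look like this sketch (helper and parameter names are illustrative; candidates are assumed pre-sorted by relevance):

```python
def select_within_budget(candidates, token_counts, budget):
    """Greedily keep candidate conversation IDs while the running token
    total stays under `budget`; preserves the incoming relevance order."""
    selected, used = [], 0
    for conv_id in candidates:
        cost = token_counts.get(conv_id, 0)
        if used + cost > budget:
            continue  # skip anything that would blow the global budget
        selected.append(conv_id)
        used += cost
    return selected, used
```

Because the budget is tracked globally rather than per batch, the merged result can never exceed the context allotment, no matter how permissive any single LLM call is.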
319-335: Parallelism = N concurrent LLM calls.
`asyncio.gather` launches the whole fleet at once. If a project has 500 convos, we're now spamming 25 simultaneous LLM hits—no backpressure, no semaphore, nothing. That's a self-inflicted DDoS against Azure GPT5 and will blow rate limits. Gate the concurrency (semaphore / worker pool) or throttle via a streaming pipeline.
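A semaphore gate over `asyncio.gather` is only a few lines; a sketch (the `limit` default is illustrative):

```python
import asyncio


async def gather_with_limit(coros, limit=5):
    """Run coroutines concurrently, but never more than `limit` at a time."""
    sem = asyncio.Semaphore(limit)

    async def guarded(coro):
        async with sem:  # acquire a slot before awaiting the real work
            return await coro

    return await asyncio.gather(*(guarded(c) for c in coros))
```

Swapping this in for the bare `asyncio.gather` caps concurrent LLM calls regardless of project size, while keeping result ordering identical.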
463-513: Recursive split drops observability & ordering.
When you recurse, both halves reuse the parent `batch_num`, so logs become useless—you can't tell which subbatch bombed. Worse, you recombine raw IDs without dedup priorities, so order becomes "whatever recursion returns," not stable by created_at or similarity. Track subbatch identifiers and enforce deterministic ordering (e.g., original index) before merging.
524-548: LLM output contract unchecked.
You `json.loads` and trust the shape without schema validation, then silently drop bogus IDs. At minimum run pydantic / jsonschema to fail fast, otherwise the selection path can degrade quietly (empty lists) with zero upstream signal. Even better: surface failures so the caller retries or falls back to KG search.
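A fail-fast check can be done with the stdlib alone (pydantic or jsonschema would be stricter); note the `selected_ids` key is an assumption about the payload shape, not confirmed from the diff:

```python
import json


def parse_selection(raw):
    """Parse and validate the LLM selection payload; raise ValueError on
    any contract violation instead of silently degrading to an empty list."""
    data = json.loads(raw)  # raises json.JSONDecodeError on non-JSON output
    if not isinstance(data, dict) or "selected_ids" not in data:
        raise ValueError(f"missing 'selected_ids' in LLM output: {raw[:200]}")
    ids = data["selected_ids"]
    if not isinstance(ids, list) or not all(isinstance(i, str) for i in ids):
        raise ValueError(f"'selected_ids' must be a list of strings: {raw[:200]}")
    return ids
```

Raising here gives the caller a concrete signal to retry the batch or fall back to KG search, instead of quietly returning nothing.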
530-538: ID filter slips integers.
`isinstance(id, (int, str))` lets integers like `1` through even though all conv IDs are UUID strings (see `ConversationModel.id`). If the model emits `1`, validation keeps it, the later `id in valid_ids` check rejects it, and we silently drop it—masking the prompt or parsing error. Tighten to strings and log hard failures so we can fix upstream prompt drift instead of papering over it.
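A tightened filter might look like this sketch (`filter_conversation_ids` is a hypothetical helper, not code from the PR):

```python
import logging
import uuid

logger = logging.getLogger(__name__)


def filter_conversation_ids(candidate_ids, valid_ids):
    """Keep only IDs that are UUID strings and known to the project;
    log anything dropped so prompt drift stays visible upstream."""
    kept = []
    for cid in candidate_ids:
        if not isinstance(cid, str):
            logger.error("LLM emitted non-string conversation id: %r", cid)
            continue
        try:
            uuid.UUID(cid)  # reject strings that aren't well-formed UUIDs
        except ValueError:
            logger.error("LLM emitted malformed conversation id: %r", cid)
            continue
        if cid in valid_ids:
            kept.append(cid)
    return kept
```

The hard `logger.error` calls are the point: silent drops become observable events you can alert on.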
553-568: Context overflow path dead-ends.
If one batch hits `ContextWindowExceededError`, we return `{error: "context_exceeded"}` but the caller just tallies failed_batches and moves on—no retry with a smaller chunk, no escalation. That contradicts the issue requirement to keep context ≤80%: we're simply discarding coverage. Handle this like the earlier recursive split (or fall back to KG) instead of dropping everything.
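A recursive halving fallback, sketched with a stand-in exception for litellm's `ContextWindowExceededError` (names illustrative):

```python
class ContextOverflow(Exception):
    """Stand-in for litellm's ContextWindowExceededError."""


def select_with_split(batch, run_selection, min_size=1):
    """Run selection on a batch; on context overflow, split it in half
    and recurse, merging results in the original order."""
    try:
        return run_selection(batch)
    except ContextOverflow:
        if len(batch) <= min_size:
            return []  # a single item that alone overflows gets skipped
        mid = len(batch) // 2
        return (
            select_with_split(batch[:mid], run_selection, min_size)
            + select_with_split(batch[mid:], run_selection, min_size)
        )
```

Because the halves are concatenated left-to-right, merged output stays stable by original index, which also addresses the ordering complaint above.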
560-571: Unrecoverable errors get swallowed.
After retries you return an empty `selected_ids`, and upstream happily ships an empty recommendations list. We need to bubble an exception or at least propagate `error` up so the API can alert the user/UI to rerun auto-select rather than pretending nothing matched.
296-374: Project loop assumes single ID.
Docstring says `project_id_list` "contains a single project ID," but the code loops over arbitrary length and aggregates independently. If callers ever pass >1 (the API already supports multi-project queries), we're doing multiple simultaneous `gather`s with additive load. Either enforce len==1 or design the scheduling to handle true multi-project selection.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
echo/server/dembrane/api/chat.py (6 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
echo/server/dembrane/api/chat.py (6)
- echo/server/dembrane/prompts.py (1): `render_prompt` (58-91)
- echo/server/dembrane/chat_utils.py (2): `auto_select_conversations` (256-375), `create_system_messages_for_chat` (100-166)
- echo/server/dembrane/quote_utils.py (1): `count_tokens` (264-268)
- echo/server/dembrane/api/conversation.py (1): `get_conversation_token_count` (437-456)
- echo/server/dembrane/audio_lightrag/utils/lightrag_utils.py (1): `get_project_id` (363-365)
- echo/server/dembrane/database.py (2): `ConversationModel` (340-386), `ProjectChatMessageModel` (256-283)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: ci-check-server
🔇 Additional comments (5)
echo/server/dembrane/api/chat.py (5)
17-19: LGTM! Solid imports for the new follow-up detection and auto-select flow: `SMALL_LITELLM_*` config for lightweight calls, `render_prompt` for multilingual templates, `auto_select_conversations` for LLM-driven selection, `get_project_id` for project extraction. All checks out.
Also applies to: 26-26, 39-39, 45-45
503-531: LGTM. The follow-up detection gate is wired up correctly: you only check for follow-ups when locked conversations exist (line 524), call the detection function with the conversation history, and set the `should_reuse_locked` flag based on the result. Logging at lines 527 and 530 is clutch for debugging. No issues here.
560-632: Auto-select branch looks solid. The auto-select logic is clean: you call `auto_select_conversations` (line 565), extract the selected IDs from the nested result dict (lines 575-579), add them to `chat.used_conversations` with a duplicate check (lines 582-587), create a "dembrane" message to lock the newly selected conversations (lines 590-601), and then rebuild the context and formatted messages with the updated set (lines 604-621).

Token counting at lines 624-626 runs after building `formatted_messages`, and you throw a 400 with a clear error message ("Auto select returned too many conversations") if the context is too large. The error message here is more actionable than the follow-up branch message, which is good.

One subtle thing: line 585 checks `conversation not in chat.used_conversations` to avoid duplicates. This relies on SQLAlchemy's relationship membership check, which should work correctly. If you wanted to be extra paranoid, you could track IDs in a set, but the current implementation is fine.
634-690: References payload and streaming logic ship-ready. Lines 634-643 correctly build the references payload from only the newly added conversations (`conversations_added`), not all conversations in the context. This is the right behavior per your comment at line 636 and the AI summary. Each reference includes the conversation ID and participant name.

The streaming function (lines 647-690) emits the references payload as an "h:" frame (line 649), then streams the LLM response. The timeout adjustments at lines 661-662 (5-minute overall timeout, 3-minute stream timeout) are clutch for preventing hanging requests—this is a solid reliability improvement over having no explicit timeout.

Error handling at lines 673-684 is correct: you catch exceptions, log them, delete the in-flight user message from the DB, and yield a protocol-appropriate error payload. All good here.
691-784: Manual-select path LGTM. The manual-select flow (when auto-select is disabled) builds system messages from locked conversations (lines 692-694), filters the conversation history (lines 714-728), constructs the outgoing messages (lines 730-743), and streams the LLM response (lines 746-763).
The timeout adjustments at lines 753-754 (5-minute overall timeout, 3-minute stream timeout) match the auto-select path—good for consistency and reliability. Error handling at lines 764-774 is correct: you delete the in-flight user message and yield a protocol-appropriate error.
One observation: the manual-select path doesn't check token length before streaming. This could theoretically blow up if the user manually adds too many conversations, but this behavior existed before your changes, so it's not a regression. If you wanted to add a guard here, you could, but it's not critical.
```python
async def is_followup_question(
    conversation_history: List[Dict[str, str]], language: str = "en"
) -> bool:
    """
    Determine if the current question is a follow-up to previous messages.
    Uses a small LLM call to check semantic relationship.

    Returns:
        True if it's a follow-up question, False if it's a new independent question
    """
    if len(conversation_history) < 2:
        # No previous context, can't be a follow-up
        return False

    # Take last 4 messages for context (2 exchanges)
    recent_messages = conversation_history[-4:]

    # Format messages for the prompt
    previous_messages = [
        {"role": msg["role"], "content": msg["content"]} for msg in recent_messages[:-1]
    ]
    current_question = recent_messages[-1]["content"]

    prompt = render_prompt(
        "is_followup_question",
        language,
        {
            "previous_messages": previous_messages,
            "current_question": current_question,
        },
    )

    try:
        response = await litellm.acompletion(
            model=SMALL_LITELLM_MODEL,
            api_key=SMALL_LITELLM_API_KEY,
            api_base=SMALL_LITELLM_API_BASE,
            messages=[{"role": "user", "content": prompt}],
            temperature=0,  # Deterministic
            timeout=60,  # 1 minute timeout for quick decision
        )

        result_text = response.choices[0].message.content.strip()
        result = json.loads(result_text)
        is_followup = result.get("is_followup", False)

        logger.info(f"Follow-up detection: {is_followup} for query: {current_question[:50]}...")
        return is_followup
    except Exception as e:
        logger.warning(f"Follow-up detection failed: {e}. Defaulting to False (run auto-select)")
        return False
```
🧹 Nitpick | 🔵 Trivial
Ship it.
The follow-up detection logic is clean: grabs the last 4 messages, renders a prompt, hits the small LLM with temperature=0 for deterministic classification, parses the JSON response, and defaults to False on error. The 60-second timeout is generous but not problematic for a quick classification call.
One optional nitpick: lines 94-96 assume the LLM returns valid JSON. If the model hallucinates and emits prose instead of `{"is_followup": true}`, `json.loads` throws and you catch it (safe fallback). If you want to be extra paranoid, you could add explicit JSON validation or use structured outputs if the model supports it, but the current error handling is solid enough.
🤖 Prompt for AI Agents
In echo/server/dembrane/api/chat.py around lines 52 to 103, add defensive JSON
validation around the LLM response parsing: after getting
response.choices[0].message.content, attempt to safely parse JSON (e.g., use
json.JSONDecoder().raw_decode to find a JSON object at the start) and fall back
to treating it as non-followup if parsing fails; log the raw response when
parsing fails for debugging and keep the existing exception handler as a final
fallback.
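The defensive parse the prompt describes could use `json.JSONDecoder().raw_decode` to find the first JSON object inside a noisy reply (a sketch; `extract_followup_flag` is an illustrative name):

```python
import json


def extract_followup_flag(raw_text, default=False):
    """Pull the first JSON object out of an LLM reply that may contain
    extra prose, and read its 'is_followup' field; fall back to `default`."""
    decoder = json.JSONDecoder()
    idx = raw_text.find("{")
    while idx != -1:
        try:
            # raw_decode parses one JSON value starting at idx, ignoring trailing text
            obj, _ = decoder.raw_decode(raw_text, idx)
            if isinstance(obj, dict):
                return bool(obj.get("is_followup", default))
        except json.JSONDecodeError:
            pass
        idx = raw_text.find("{", idx + 1)
    return default
```

Logging `raw_text` on the fall-through path (omitted here for brevity) would give the debugging signal the prompt asks for.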
```python
if should_reuse_locked:
    # Reuse existing locked conversations for follow-up questions
    updated_conversation_id_list = locked_conversation_id_list

    system_messages = await create_system_messages_for_chat(
        updated_conversation_id_list, db, language, project_id
    )

    formatted_messages = []
    top_k = max(5, top_k - 10)
    query = filtered_messages[-1]["content"]
    conversation_history = filtered_messages
    rag_prompt = await get_lightrag_prompt_by_params(
        query=query,
        conversation_history=conversation_history,
        echo_conversation_ids=chat_context.conversation_id_list,
        echo_project_ids=[project_id],
        auto_select_bool=chat_context.auto_select_bool,
        get_transcripts=True,
        top_k=top_k,
    )

    if isinstance(system_messages, list):
        for msg in system_messages:
            formatted_messages.append({"role": "system", "content": msg["text"]})
        formatted_messages.extend(conversation_history)
    else:
        formatted_messages = [
            {"role": "system", "content": system_messages}
        ] + conversation_history

    # Check context length
    prompt_len = token_counter(
        model=LIGHTRAG_LITELLM_INFERENCE_MODEL, messages=formatted_messages
    )

    if prompt_len > MAX_CHAT_CONTEXT_LENGTH:
        raise HTTPException(
            status_code=400,
            detail="The conversation context with the new message exceeds the maximum context length.",
        )
```
🧹 Nitpick | 🔵 Trivial
Token guard is in, ship it.
The follow-up branch now has the token counter check (lines 551-559) that the past review flagged. You're counting tokens and throwing a 400 if the context blows past MAX_CHAT_CONTEXT_LENGTH. This plugs the regression where reused locked conversations could bypass the guard and cause LiteLLM to throw a generic streaming error.
One UX nitpick: the error message at line 558 is correct but generic. The original review suggested something more actionable like "Locked conversations exceed the maximum context length. Remove conversations or rerun auto-select." Your current message doesn't tell the user what to do. Consider making it more specific so users know they can remove conversations or rerun auto-select to fix the issue.
Also, you could fail-fast by checking token count before building formatted_messages, but the current order is fine—it's just a micro-optimization.
🤖 Prompt for AI Agents
In echo/server/dembrane/api/chat.py around lines 532 to 559, the token-length
check correctly prevents oversized contexts for reused locked conversations but
the HTTPException message is generic; update the raised exception to provide an
actionable message (e.g., "Locked conversations exceed the maximum context
length. Remove conversations or rerun auto-select.") so users know how to
resolve it; optionally, for micro-optimization, move the token-length estimate
to operate on a lightweight representation (or pre-check estimated lengths)
before building formatted_messages, but the primary fix is to replace the
generic detail string with the more specific, actionable guidance.
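The actionable guard could be factored like this sketch (stdlib only; `check_context_budget` is a hypothetical helper returning the detail string for the 400 response):

```python
def check_context_budget(prompt_len, max_len):
    """Return an actionable error message when the locked-conversation
    context exceeds the model budget, or None when it fits."""
    if prompt_len <= max_len:
        return None
    return (
        f"Locked conversations exceed the maximum context length "
        f"({prompt_len} > {max_len} tokens). "
        "Remove conversations or rerun auto-select."
    )
```

The caller would raise `HTTPException(status_code=400, detail=...)` with the returned string, so the user knows both what failed and how to fix it.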
```diff
 async def stream_response_async_autoselect() -> AsyncGenerator[str, None]:
-    conversation_references_yeild = f"h:{json.dumps(conversation_references)}\n"
+    # Send conversation references (selected conversations)
+    conversation_references_yeild = f"h:{json.dumps([conversation_references])}\n"
```
Bug: Chat Context Loss and Conversation Reference Issues
The post_chat endpoint has two issues: auto-selected conversations added to `chat.used_conversations` are never persisted, so they are lost after the request; and for follow-up questions that reuse locked conversations, the frontend never receives references to those conversations, hiding the active context from users.
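A sketch of the persistence half of the fix, assuming a SQLAlchemy-style session where a `commit()` is required before the session closes (`persist_auto_selected` and the stubbed types are hypothetical):

```python
def persist_auto_selected(chat, conversations, db):
    """Append newly auto-selected conversations to the chat's context and
    commit, so the additions survive beyond the current request."""
    added = []
    for conv in conversations:
        if conv not in chat.used_conversations:  # skip duplicates
            chat.used_conversations.append(conv)
            added.append(conv)
    if added:
        db.commit()  # without a commit, appended rows vanish when the session closes
    return added
```

Returning `added` also gives the caller exactly the set to surface in the references payload, covering the second half of the bug.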
Summary by CodeRabbit

New Features
- Automatic conversation selection for chat context with multilingual prompt support (EN/DE/ES/FR/NL).
- Smart follow-up detection to reuse existing context when appropriate.
- Chat context refreshes after responses and newly added conversations are surfaced to the user.

Changes
- Updated label to: "The following conversations were automatically added to the context."
- Streamlined chat UI by removing "Processing" indicators and in-progress warnings.
- Conversation selection labels now display more consistently.