Skip to content

feat(wren-ai-service): Implement Chat History Management with Max History Limit#1377

Merged
paopa merged 12 commits into
mainfrom
feat/last-10-question-in-the-thread
Mar 13, 2025
Merged

feat(wren-ai-service): Implement Chat History Management with Max History Limit#1377
paopa merged 12 commits into
mainfrom
feat/last-10-question-in-the-thread

Conversation

@paopa
Copy link
Copy Markdown
Contributor

@paopa paopa commented Mar 7, 2025

This PR introduces improvements to the chat history management system by implementing a maximum history limit and restructuring how chat histories are handled.

Key Updates:

  1. Added max_histories configuration setting (default: 10) to limit the number of stored chat histories
  2. Refactored history handling from single AskHistory to list-based histories approach
  3. Simplified AskHistory model by removing steps and keeping essential fields (sql, question)
  4. Updated prompt templates to better handle multiple history entries

Technical Details:

  • Added max_histories field to Settings class with a default value of 10
  • Modified service container to pass max_histories configuration
  • Updated pipeline interfaces (DataAssistance, IntentClassification, Retrieval, FollowUpSQLGeneration) to handle list of histories
  • Enhanced prompt templates to display previous queries and their corresponding SQL statements
  • Implemented history truncation in AskService to respect max_histories limit

Impact

  • Better memory management by limiting chat history size
  • Improved context handling for follow-up questions
  • More structured and maintainable history tracking

Testing

Please ensure to test:

  • Chat history limitation works as expected
  • Follow-up questions work correctly with multiple history entries
  • Prompt generation with various history scenarios

Summary by CodeRabbit

  • New Features

    • Introduced support for processing multiple historical queries, enhancing context management and response quality.
    • Added a configurable limit (default: 10) to control the number of histories considered.
  • Refactor

    • Updated query and prompt processing across the service to transition from single-entry history handling to multi-history support.
    • Improved handling of the histories attribute in the AskRequest class to ensure instance-specific lists.
  • Bug Fixes

    • Improved logic for handling historical query data, ensuring consistent processing across various components.
  • UI Improvements

    • Updated labels in user prompts for clarity, changing "Previous Queries:" and "User's previous SQLs:" to "User's query history:".

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 7, 2025

Walkthrough

This pull request updates the Wren AI service by introducing a new max_histories configuration parameter in the Settings class (default value 10) and integrating it into the service container. Across multiple pipeline modules and web services, the code replaces a single history parameter with a list of histories, and adjusts class definitions (e.g., in AskHistory and AskRequest) and method signatures to support multiple historical entries.

Changes

Files Change Summary
wren-ai-service/src/config.py, wren-ai-service/src/globals.py Added new max_histories parameter to Settings and integrated it into AskService instantiation.
wren-ai-service/src/pipelines/generation/{data_assistance.py, followup_sql_generation.py, intent_classification.py}, wren-ai-service/src/pipelines/retrieval/retrieval.py Updated method signatures and internal logic by replacing a single AskHistory with a list (histories) for handling multiple historical entries.
wren-ai-service/src/web/v1/services/ask.py Modified AskHistory (replacing steps with question) and AskRequest (switching from history to histories), and updated AskService initialization to include max_histories.
wren-ai-service/tests/pytest/services/mocks.py Updated IntentClassificationMock to accept a list of AskHistory instances in the run method.
wren-ai-service/src/pipelines/generation/sql_expansion.py, wren-ai-service/src/web/v1/routers/sql_expansions.py, wren-ai-service/src/web/v1/services/sql_expansion.py, wren-ai-service/src/web/v1/routers/ask.py Renamed parameters and made minor adjustments to improve clarity without changing functionality.

Sequence Diagram(s)

sequenceDiagram
    participant C as Client
    participant AS as AskService
    participant S as Settings
    participant P as Pipeline

    C->>AS: Send AskRequest with histories[]
    AS->>S: Retrieve max_histories
    AS->>AS: Slice histories list (limit to max_histories)
    AS->>P: Forward request with updated histories
    P->>AS: Return generated response
    AS->>C: Respond with final output
Loading

Possibly related PRs

  • chore(wren-ai-service): fix historical question query input #1064: The changes in the main PR, which involve adding a new max_histories parameter to the Settings class, are related to the modifications in the retrieved PR that adjust how the ask method in the AskService class processes historical questions, as both involve handling the histories concept within the service's configuration and functionality.
  • chore(wren-ai-service): improve text2sql #1208: The changes in the main PR, which involve adding a new configuration parameter max_histories to the Settings class, are related to the modifications in the retrieved PR that update the AskService instantiation to utilize this parameter, indicating a direct connection at the code level.
  • chore(wren-ai-service): improve text2sql process #1070: The changes in the main PR, which involve adding a new max_histories parameter to the Settings class, are related to the modifications in the retrieved PR that enhance the ask method in the AskService class by integrating historical question processing, as both involve handling the concept of histories in the context of user queries.

Poem

I’m a rabbit hopping through lines of code so fine,
With max_histories now set, the histories align.
From one to many, our service takes a joyful leap,
In pipelines and in classes, new memories we keep.
Celebrating change with a twitch of my ears 🐰!

✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@paopa paopa marked this pull request as ready for review March 10, 2025 09:10
@paopa paopa added module/ai-service ai-service related ci/ai-service ai-service related do-not-merge labels Mar 10, 2025
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
wren-ai-service/src/pipelines/generation/followup_sql_generation.py (1)

188-190: ⚠️ Potential issue

Test code needs to be updated to match new parameter type.

The dry run test code still uses a single history parameter instead of a list of histories. This should be updated to maintain consistency with the changes made to the class.

- history=AskHistory(sql="SELECT * FROM table", summary="Summary", steps=[]),
+ histories=[AskHistory(sql="SELECT * FROM table", summary="Summary", steps=[])],
🧹 Nitpick comments (1)
wren-ai-service/src/web/v1/services/ask.py (1)

209-209: Add a comment explaining the history truncation.

The truncation of histories to respect the maximum limit is implemented correctly, but lacks documentation explaining this important behavior.

-        histories = ask_request.histories[: self._max_histories]
+        # Truncate histories to respect the configured maximum limit
+        histories = ask_request.histories[: self._max_histories]
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3f0f7a6 and be68ef4.

📒 Files selected for processing (7)
  • wren-ai-service/src/config.py (1 hunks)
  • wren-ai-service/src/globals.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/data_assistance.py (5 hunks)
  • wren-ai-service/src/pipelines/generation/followup_sql_generation.py (4 hunks)
  • wren-ai-service/src/pipelines/generation/intent_classification.py (4 hunks)
  • wren-ai-service/src/pipelines/retrieval/retrieval.py (4 hunks)
  • wren-ai-service/src/web/v1/services/ask.py (9 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: pytest
  • GitHub Check: pytest
  • GitHub Check: Analyze (go)
🔇 Additional comments (26)
wren-ai-service/src/config.py (1)

39-39: LGTM! Configuration parameter for limiting chat history added.

The new max_histories parameter with a default value of 10 is well-placed in the generation config section and follows the established pattern for configuration parameters in this class.

wren-ai-service/src/globals.py (1)

120-120: LGTM! Configuration parameter correctly passed to service.

The change properly integrates the new max_histories parameter from settings into the AskService instantiation.

wren-ai-service/src/web/v1/services/ask.py (6)

18-18: LGTM! Simplified history model.

The AskHistory class has been streamlined by replacing complex steps with a simpler question attribute.


168-168: LGTM! Service constructor updated with max_histories parameter.

Properly added the max_histories parameter with a reasonable default value of 10.


181-181: LGTM! Initialized the instance variable properly.

The instance variable _max_histories is correctly initialized with the constructor parameter.


255-255: LGTM! Pipeline calls correctly updated to use histories.

All pipeline calls have been consistently updated to use the new histories parameter instead of the old history parameter.

Also applies to: 283-283, 320-320, 415-415


408-408: LGTM! Condition check updated for new history structure.

The condition has been properly updated to check for the existence of ask_request.histories instead of the old history attribute.


31-31:

Details

❓ Verification inconclusive

Empty list as default might cause unexpected behavior.

Using an empty list ([]) as a default value for a parameter can lead to unexpected behavior since lists are mutable objects. However, since this is a Pydantic model which handles defaults differently than regular Python functions, this may not be an issue, but it's worth noting.

Consider using None as the default and initializing an empty list in the code:

-    histories: Optional[list[AskHistory]] = []
+    histories: Optional[list[AskHistory]] = None

Then, in the code where you use it:

histories = ask_request.histories or []

You can check Pydantic's behavior with mutable defaults by running:


🏁 Script executed:

#!/bin/bash
python -c "from pydantic import BaseModel; class Test(BaseModel): a: list = []; t1 = Test(); t1.a.append(1); t2 = Test(); print(t2.a)"

Length of output: 359


Action Required: Revisit the Default Value for the histories Field

While Pydantic’s default handling may shield you from the usual pitfalls of mutable defaults (by deep-copying the default value on instantiation), the use of an empty list ([]) still obfuscates the intended behavior. For clarity and to avoid any potential issues, please consider one of the following alternatives:

  • Option 1: Change the default to None and then initialize to an empty list when using the field:

    -    histories: Optional[list[AskHistory]] = []
    +    histories: Optional[list[AskHistory]] = None

    And later in the code:

    histories = ask_request.histories or []
  • Option 2: Use Pydantic’s Field with a default_factory to guarantee a fresh list for each instance:

    from pydantic import Field
    
    histories: Optional[list[AskHistory]] = Field(default_factory=list)

A previous verification attempt using a one-liner Python command produced a syntax error. Please run a multi-line verification script (for example, using a heredoc) to confirm that each model instance indeed gets its own default list and that no unintended sharing occurs. This manual check is recommended to ensure that the change does not affect the existing behavior.

wren-ai-service/src/pipelines/generation/data_assistance.py (3)

56-56: LGTM! Updated parameter from single history to list of histories.

Parameter type correctly changed from Optional[AskHistory] to Optional[list[AskHistory]] to support multiple history entries.


143-143: LGTM! Run method signature updated consistently.

The run method signature has been correctly updated to use histories instead of history.


153-153: LGTM! Pipeline execution updated to use histories.

The pipeline execution correctly passes histories to the underlying pipe execution.

wren-ai-service/src/pipelines/retrieval/retrieval.py (4)

295-295: Parameter type updated to support multiple history entries.

The change from history: Optional[AskHistory] = None to histories: Optional[list[AskHistory]] = None aligns with the PR objective of implementing multiple chat history support. This is a necessary modification to handle a list of histories instead of a single history object.


306-307: Simplified history processing logic.

The code has been refactored to directly use the question attribute from each history item, which is more straightforward than the previous implementation that had to iterate through steps. This is a cleaner approach that makes the code more readable.


480-480: Updated run method parameter to support multiple histories.

The run method signature has been properly updated to reflect the new approach of using a list of histories.


489-489: Updated pipeline input to use histories collection.

The input dictionary properly passes the histories parameter to the pipeline execution.

wren-ai-service/src/pipelines/generation/intent_classification.py (5)

106-111: Enhanced template to display previous queries with context.

The template has been updated to properly iterate through multiple history entries, displaying both questions and their corresponding SQL queries. This provides better context for the intent classification.


229-230: Parameter type updated to support multiple history entries.

The function signature has been properly updated to accept a list of histories instead of a single history object.


236-236: Updated prompt builder to use the histories collection.

The prompt builder now correctly uses the histories parameter, consistent with the updated template.


323-323: Updated run method parameter to support multiple histories.

The parameter in the run method has been updated to match the new approach of using a list of histories.


332-332: Updated pipeline input to use histories collection.

The input dictionary properly passes the histories parameter to the pipeline execution.

wren-ai-service/src/pipelines/generation/followup_sql_generation.py (6)

51-55: Enhanced template to display multiple previous queries.

The template has been updated to iterate through the histories list, displaying both questions and SQL queries for each entry. This provides better context for SQL generation on follow-up questions.


74-74: Parameter type updated to support multiple history entries.

The function signature has been properly updated to accept a list of histories instead of a single history object.


81-81: Updated query summary extraction to work with multiple histories.

The code now correctly extracts questions from all history entries to build the previous query summaries.


87-87: Updated prompt builder to use the histories collection.

The prompt builder now correctly uses the histories parameter, consistent with the updated template.


155-155: Updated run method parameter to support multiple histories.

The run method signature has been properly updated to accept a list of histories instead of a single history object.


169-169: Updated pipeline input to use histories collection.

The input dictionary properly passes the histories parameter to the pipeline execution.

Comment thread wren-ai-service/src/pipelines/generation/data_assistance.py Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
wren-ai-service/tests/pytest/services/mocks.py (1)

32-32: Be careful with mutable default arguments in Python

The parameter has been changed from history: Optional[AskHistory] = None to histories: Optional[list[AskHistory]] = []. Using an empty list as a default parameter can lead to unexpected behavior if the list is modified within the function in future changes. Consider using None as the default value and initializing the list inside the function if needed.

-    histories: Optional[list[AskHistory]] = [],
+    histories: Optional[list[AskHistory]] = None,

Then handle the None case inside the function if needed.

wren-ai-service/src/pipelines/generation/data_assistance.py (3)

56-56: Be consistent with default parameter values

You're using [] as the default value here but None on line 143. Default values should be consistent across similar parameters to prevent confusion and potential bugs.

-    histories: Optional[list[AskHistory]] = [],
+    histories: Optional[list[AskHistory]] = None,

58-59: Consider limiting concatenated history length

With the introduction of a max_histories limit, the concatenated query string could still become quite long. Consider adding a mechanism to limit the total length or truncate older histories if needed to prevent hitting model token limits.


143-143: Use consistent parameter defaults

This default value (None) differs from the same parameter's default value on line 56 ([]). Maintain consistency to avoid confusion.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between be68ef4 and f9904d7.

📒 Files selected for processing (2)
  • wren-ai-service/src/pipelines/generation/data_assistance.py (5 hunks)
  • wren-ai-service/tests/pytest/services/mocks.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: pytest
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
🔇 Additional comments (2)
wren-ai-service/src/pipelines/generation/data_assistance.py (2)

58-59: Add null check to prevent potential TypeError.

The code assumes histories is not None when constructing previous_query_summaries. Since the parameter is declared as Optional, it could be None, which would result in a TypeError.

-    previous_query_summaries = [history.question for history in histories]
+    previous_query_summaries = [history.question for history in (histories or [])]
     query = "\n".join(previous_query_summaries) + "\n" + query

147-156: LGTM: Properly updated pipeline execution parameters

You've correctly updated the input parameter from history to histories in the pipeline execution call, maintaining consistency with the parameter changes.

@paopa paopa requested a review from cyyeh March 10, 2025 09:32
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
wren-ai-service/src/web/v1/services/ask.py (1)

666-671: Consistent code formatting throughout AskService.

The code has been reformatted for better readability while maintaining functionality. This improves code maintenance and readability.

Also applies to: 743-747

wren-ai-service/src/pipelines/generation/intent_classification.py (1)

123-123: Parameter type inconsistency in embedding function.

There's a naming inconsistency in the embedding function: the parameter is named history but represents a list of histories (as indicated by its type annotation and usage).

Consider renaming the parameter for consistency:

-async def embedding(query: str, embedder: Any, history: Optional[list[AskHistory]] = None) -> dict:
+async def embedding(query: str, embedder: Any, histories: Optional[list[AskHistory]] = None) -> dict:

And update its usage:

-    previous_query_summaries = (
-        [history_step.question for history_step in history] if history else []
-    )
+    previous_query_summaries = (
+        [history_step.question for history_step in histories] if histories else []
+    )

Also applies to: 126-126

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f9904d7 and fec3ff2.

📒 Files selected for processing (3)
  • wren-ai-service/src/pipelines/generation/intent_classification.py (5 hunks)
  • wren-ai-service/src/pipelines/retrieval/retrieval.py (5 hunks)
  • wren-ai-service/src/web/v1/services/ask.py (11 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • wren-ai-service/src/pipelines/retrieval/retrieval.py
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: pytest
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
🔇 Additional comments (9)
wren-ai-service/src/web/v1/services/ask.py (6)

18-18: AskHistory model simplified to focus on question content.

The AskHistory model now stores the question string instead of the previously detailed steps, which simplifies the model and focuses on the essential information needed for chat history management.


31-31: AskRequest refactored to support multiple chat histories.

The model now uses a list of AskHistory objects instead of a single history entry, which enables maintaining conversation context across multiple interactions.


168-168: Implementation of max_histories parameter for history management.

The AskService class now includes a configurable max_histories parameter with a default value of 10, which is stored as an instance variable. This is a good approach to prevent unbounded growth of chat history.

Also applies to: 181-181


207-208: Proper initialization and truncation of history list.

The code correctly initializes an empty list when histories is None and appropriately slices the list to respect the max_histories limit, ensuring memory management and preventing the accumulation of too many history entries.

Also applies to: 211-211


257-257: Pipeline parameters updated to use histories list.

The parameters passed to various pipeline components have been consistently updated to use the new histories parameter instead of the single history entry, maintaining consistency throughout the codebase.

Also applies to: 285-285, 322-322, 417-417


410-410: Conditional logic updated for the new history structure.

The conditional check has been updated to verify if ask_request.histories exists before determining the appropriate pipeline to use, ensuring proper flow control with the new data structure.

wren-ai-service/src/pipelines/generation/intent_classification.py (3)

106-110: Template updated to iterate through multiple history entries.

The prompt template has been enhanced to support multiple history entries, displaying both question and SQL for each history item in the list, which improves the context available to the language model.


229-229: Prompt function updated to support multiple histories.

The prompt function parameter has been renamed from history to histories and the implementation updated to pass the entire histories list to the prompt builder, maintaining consistency with the new data structure.

Also applies to: 236-236


323-323: Pipeline run method updated to support multiple histories.

The run method of the IntentClassification class now accepts a list of histories instead of a single history entry, and correctly passes this parameter to the pipeline execution, ensuring consistent implementation throughout the codebase.

Also applies to: 332-332

@paopa paopa force-pushed the feat/last-10-question-in-the-thread branch from fec3ff2 to 1dad68d Compare March 11, 2025 07:45
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🔭 Outside diff range comments (1)
wren-ai-service/src/pipelines/generation/followup_sql_generation.py (1)

189-190: ⚠️ Potential issue

Update example in __main__ section to use new parameter name.

The example in the __main__ section still uses the old history parameter name with a single AskHistory object. This should be updated to match the new interface that uses a list of histories.

-        history=AskHistory(sql="SELECT * FROM table", summary="Summary", steps=[]),
+        histories=[AskHistory(sql="SELECT * FROM table", question="Show me the table")],
♻️ Duplicate comments (1)
wren-ai-service/src/pipelines/generation/data_assistance.py (1)

56-59: ⚠️ Potential issue

Add null check to prevent potential TypeError.

The code assumes histories is not None when constructing previous_query_summaries. Since the parameter is declared as Optional, it could be None, which would result in a TypeError.

-    previous_query_summaries = [history.question for history in histories]
+    previous_query_summaries = [history.question for history in (histories or [])]
     query = "\n".join(previous_query_summaries) + "\n" + query
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fec3ff2 and 1dad68d.

📒 Files selected for processing (19)
  • wren-ai-service/src/pipelines/generation/data_assistance.py (5 hunks)
  • wren-ai-service/src/pipelines/generation/followup_sql_generation.py (4 hunks)
  • wren-ai-service/src/pipelines/generation/intent_classification.py (4 hunks)
  • wren-ai-service/src/pipelines/retrieval/retrieval.py (4 hunks)
  • wren-ai-service/src/web/v1/services/ask.py (7 hunks)
  • wren-ai-service/src/config.py (1 hunks)
  • wren-ai-service/src/globals.py (1 hunks)
  • wren-ai-service/src/web/v1/services/ask.py (3 hunks)
  • wren-ai-service/src/web/v1/services/ask.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/data_assistance.py (1 hunks)
  • wren-ai-service/tests/pytest/services/mocks.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/sql_expansion.py (1 hunks)
  • wren-ai-service/src/web/v1/routers/sql_expansions.py (2 hunks)
  • wren-ai-service/src/web/v1/services/sql_expansion.py (8 hunks)
  • wren-ai-service/src/pipelines/generation/intent_classification.py (1 hunks)
  • wren-ai-service/src/web/v1/routers/ask.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/data_assistance.py (1 hunks)
  • wren-ai-service/src/pipelines/retrieval/retrieval.py (2 hunks)
  • wren-ai-service/src/web/v1/services/ask.py (2 hunks)
✅ Files skipped from review due to trivial changes (1)
  • wren-ai-service/src/web/v1/routers/ask.py
🚧 Files skipped from review as they are similar to previous changes (8)
  • wren-ai-service/src/web/v1/services/ask.py
  • wren-ai-service/src/config.py
  • wren-ai-service/tests/pytest/services/mocks.py
  • wren-ai-service/src/globals.py
  • wren-ai-service/src/web/v1/services/ask.py
  • wren-ai-service/src/web/v1/services/ask.py
  • wren-ai-service/src/pipelines/retrieval/retrieval.py
  • wren-ai-service/src/pipelines/retrieval/retrieval.py
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: pytest
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
🔇 Additional comments (16)
wren-ai-service/src/web/v1/routers/sql_expansions.py (3)

91-92: Parameter name updated for better consistency

The parameter name has been updated from sql_expansion_request to request for better readability and consistency with similar methods in the codebase.


97-98: Reference updated to match new parameter name

The reference has been correctly updated to use the new parameter name.


106-107: Task parameter updated to match new parameter name

The parameter passed to the background task has been correctly updated to use the new parameter name.

wren-ai-service/src/web/v1/services/ask.py (3)

168-169: Added max_histories parameter to implement history limitation

This new parameter is a key part of the PR's objective to limit the number of chat histories stored. The default value of 10 aligns with the PR requirements.


181-182: Storing max_histories as instance variable

Correctly storing the parameter as an instance variable for later use in the class.


209-210: Implementing history limitation with configurable max_histories

This line replaces the previous hardcoded limit of 10 with the configurable _max_histories value, completing the implementation of the history limitation feature.

wren-ai-service/src/web/v1/services/sql_expansion.py (2)

100-101: Parameter name updated for better consistency

The parameter name has been updated from sql_expansion_request to request for better readability and consistency with similar methods in the codebase.


114-115: References updated to match new parameter name

All references to the former parameter name have been correctly updated to use the new parameter name, maintaining consistent functionality while improving code readability.

Also applies to: 128-129, 131-132, 140-141, 162-167, 184-185, 200-203, 210-211, 248-249

wren-ai-service/src/pipelines/generation/data_assistance.py (2)

143-153: LGTM: Successfully updated run method parameter to match interface changes.

The run method has been properly updated to accept histories instead of a single history object, consistent with the broader PR changes to support multiple history entries.


103-105: Formatting changes look good.

The minor reformatting of the queue initialization code doesn't affect functionality and maintains consistency with the project style.

Also applies to: 116-118

wren-ai-service/src/pipelines/generation/intent_classification.py (4)

106-110: LGTM: Improved template to display multiple history entries.

The template has been updated effectively to iterate through all history entries and display both the question and SQL for each, providing better context for the intent classification.


123-127: Well-implemented null check for histories.

Good job implementing the null check for histories before accessing its elements. This prevents potential TypeErrors and handles the optional parameter properly.


229-237: LGTM: Prompt function updated correctly.

The prompt function has been properly updated to use the new histories parameter, and the query_history is now passed correctly to the prompt builder.


323-333: LGTM: Run method signature and implementation updated consistently.

The run method has been properly updated to accept histories instead of a single history object, and the parameter is correctly passed to the pipeline execution.

wren-ai-service/src/pipelines/generation/followup_sql_generation.py (2)

51-55: LGTM: Template updated to display multiple history entries.

The template now correctly iterates through all history entries and displays both the question and SQL for each one, providing better context for follow-up SQL generation.


155-170: LGTM: Run method signature and implementation updated consistently.

The run method has been properly updated to accept histories instead of a single history object, and the parameter is correctly passed to the pipeline execution.

Comment thread wren-ai-service/src/pipelines/generation/sql_expansion.py
Comment thread wren-ai-service/src/pipelines/generation/followup_sql_generation.py
Comment thread wren-ai-service/src/pipelines/generation/intent_classification.py Outdated
Comment thread wren-ai-service/src/pipelines/generation/followup_sql_generation.py Outdated
Copy link
Copy Markdown
Member

@cyyeh cyyeh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks!

@paopa paopa force-pushed the feat/last-10-question-in-the-thread branch from 5fd48d0 to c002739 Compare March 11, 2025 09:13
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
wren-ai-service/src/pipelines/generation/followup_sql_generation.py (1)

189-189: 🛠️ Refactor suggestion

Update sample code to use new histories parameter.

The sample code at the end of the file still uses the old history parameter format, which should be updated to use the new histories list parameter.

-        history=AskHistory(sql="SELECT * FROM table", summary="Summary", steps=[]),
+        histories=[AskHistory(sql="SELECT * FROM table", question="Sample question")],
♻️ Duplicate comments (1)
wren-ai-service/src/pipelines/generation/followup_sql_generation.py (1)

81-81: ⚠️ Potential issue

Add null check to prevent potential TypeError.

The code assumes histories is not None when constructing previous_query_summaries. Since it's possible to have None as a parameter value, you should add a null check.

-    previous_query_summaries = [history.question for history in histories]
+    previous_query_summaries = [history.question for history in (histories or [])]
🧹 Nitpick comments (1)
wren-ai-service/src/pipelines/retrieval/retrieval.py (1)

309-311: Remove duplicate line of code.

Line 310 appears to be a duplicate of line 309, which might be a merge artifact.

        query = "\n".join(previous_query_summaries) + "\n" + query
        return prompt_builder.run(question=query, db_schemas=db_schemas)
-        query = "\n".join(previous_query_summaries) + "\n" + query
-        return prompt_builder.run(question=query, db_schemas=db_schemas)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5fd48d0 and c002739.

📒 Files selected for processing (22)
  • wren-ai-service/src/pipelines/generation/data_assistance.py (5 hunks)
  • wren-ai-service/src/pipelines/generation/followup_sql_generation.py (4 hunks)
  • wren-ai-service/src/pipelines/generation/intent_classification.py (4 hunks)
  • wren-ai-service/src/pipelines/retrieval/retrieval.py (4 hunks)
  • wren-ai-service/src/web/v1/services/ask.py (7 hunks)
  • wren-ai-service/src/config.py (1 hunks)
  • wren-ai-service/src/globals.py (1 hunks)
  • wren-ai-service/src/web/v1/services/ask.py (3 hunks)
  • wren-ai-service/src/web/v1/services/ask.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/data_assistance.py (1 hunks)
  • wren-ai-service/tests/pytest/services/mocks.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/sql_expansion.py (1 hunks)
  • wren-ai-service/src/web/v1/routers/sql_expansions.py (2 hunks)
  • wren-ai-service/src/web/v1/services/sql_expansion.py (8 hunks)
  • wren-ai-service/src/pipelines/generation/intent_classification.py (1 hunks)
  • wren-ai-service/src/web/v1/routers/ask.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/data_assistance.py (1 hunks)
  • wren-ai-service/src/pipelines/retrieval/retrieval.py (2 hunks)
  • wren-ai-service/src/web/v1/services/ask.py (2 hunks)
  • wren-ai-service/src/pipelines/generation/followup_sql_generation.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/intent_classification.py (1 hunks)
  • wren-ai-service/tests/pytest/services/mocks.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (18)
  • wren-ai-service/src/pipelines/generation/followup_sql_generation.py
  • wren-ai-service/src/pipelines/generation/intent_classification.py
  • wren-ai-service/src/web/v1/routers/ask.py
  • wren-ai-service/src/web/v1/routers/sql_expansions.py
  • wren-ai-service/src/pipelines/generation/sql_expansion.py
  • wren-ai-service/src/web/v1/services/ask.py
  • wren-ai-service/src/globals.py
  • wren-ai-service/tests/pytest/services/mocks.py
  • wren-ai-service/src/config.py
  • wren-ai-service/src/web/v1/services/ask.py
  • wren-ai-service/src/pipelines/generation/data_assistance.py
  • wren-ai-service/src/pipelines/generation/data_assistance.py
  • wren-ai-service/src/web/v1/services/ask.py
  • wren-ai-service/src/pipelines/generation/data_assistance.py
  • wren-ai-service/src/pipelines/retrieval/retrieval.py
  • wren-ai-service/src/web/v1/services/sql_expansion.py
  • wren-ai-service/src/pipelines/generation/intent_classification.py
  • wren-ai-service/tests/pytest/services/mocks.py
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: pytest
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
🔇 Additional comments (12)
wren-ai-service/src/pipelines/generation/intent_classification.py (4)

106-110: Template correctly updated for multiple history entries but consider addressing previous feedback.

The template has been properly modified to iterate over multiple history entries, showing both question and SQL for each entry. This aligns well with the PR objective of moving from a single history to a list-based approach.

Consider changing "User's previous SQLs:" to "User's query history:" as suggested in a previous review.


229-230: LGTM! Parameter type correctly updated to handle multiple histories.

The parameter has been properly renamed from history to histories and its type changed from Optional[AskHistory] to Optional[list[AskHistory]], which aligns with the PR's objective of using a list-based approach for chat histories.


236-237: LGTM! Query history assignment updated to match parameter type.

The query_history parameter now correctly uses the histories list rather than a single history object, which matches the new template implementation.


323-324: LGTM! Run method signature correctly updated for consistency.

The run method parameter has been properly renamed from history to histories to maintain consistency with the prompt method and align with the list-based approach.

wren-ai-service/src/web/v1/services/ask.py (4)

17-18: Model change to replace steps with question field looks good.

The AskHistory model has been simplified by replacing the steps attribute with a question attribute, which aligns with the PR's goal to simplify history tracking.


31-31: Good update to support multiple history entries.

Changing from a single history to a list of histories enables better management of chat history and supports the multi-history approach described in the PR objectives.


168-168: Configuration parameter addition for max histories is well implemented.

Adding the max_histories parameter to the constructor allows for configurable history limits, which is a key part of this PR's objectives.


406-420: Update to use histories parameter is well-integrated.

The conditional flow to check for history and use the appropriate pipeline has been correctly updated to use the new histories list approach.

wren-ai-service/src/pipelines/retrieval/retrieval.py (3)

118-130: ⚠️ Potential issue

Add null check to prevent potential TypeError.

The code assumes histories is not None when constructing previous_query_summaries. Since it's possible to have None as a parameter value, you should add a null check.

-        if histories:
-            previous_query_summaries = [
-                history.question for history in histories
-            ]
+        if histories:
+            previous_query_summaries = [
+                history.question for history in (histories or [])
+            ]

477-490: Update to use histories parameter is properly integrated.

The function signature has been appropriately updated to handle a list of AskHistory objects.


306-307: ⚠️ Potential issue

Add null check for histories in prompt function.

Similar to the embedding function, you should add a null check here to prevent potential errors when accessing elements of histories.

-        previous_query_summaries = [history.question for history in histories]
+        previous_query_summaries = [history.question for history in (histories or [])]

Likely an incorrect or invalid review comment.

wren-ai-service/src/pipelines/generation/followup_sql_generation.py (1)

51-55: Template update for multiple histories is well-implemented.

The template has been effectively updated to iterate through multiple history entries, displaying both the question and SQL for each entry.

@@ -205,6 +206,7 @@ async def ask(
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Use the configured max_histories value instead of hardcoding.

The code currently hardcodes the history limit to 10 instead of using the configured self._max_histories value.

-        histories = ask_request.histories[:10]
+        histories = ask_request.histories[:self._max_histories]
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
histories = ask_request.histories[:self._max_histories]

@paopa paopa merged commit 37b13ed into main Mar 13, 2025
@paopa paopa deleted the feat/last-10-question-in-the-thread branch March 13, 2025 02:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ci/ai-service ai-service related do-not-merge module/ai-service ai-service related

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants