Skip to content

feat(wren-ai-service): user guide and misleading streaming(ai-env-changed)#1015

Merged
cyyeh merged 41 commits into
mainfrom
feat/ai-service/user-guide
Apr 10, 2025
Merged

feat(wren-ai-service): user guide and misleading streaming(ai-env-changed)#1015
cyyeh merged 41 commits into
mainfrom
feat/ai-service/user-guide

Conversation

@cyyeh
Copy link
Copy Markdown
Member

@cyyeh cyyeh commented Dec 18, 2024

Summary by CodeRabbit

  • New Features

    • Added new assistance capabilities for misleading queries and user guidance.
    • Updated configuration options now include a documentation endpoint and an open-source indicator.
    • Enhanced intent classification to incorporate a dedicated user-guidance category for more tailored responses.
    • Improved streaming of assistance responses for a more interactive experience.
  • Tests

    • Expanded test coverage to ensure the new assistance features operate reliably.

@cyyeh cyyeh added module/ai-service ai-service related ci/ai-service ai-service related labels Dec 18, 2024
@cyyeh cyyeh force-pushed the feat/ai-service/user-guide branch 8 times, most recently from 3c8107f to dabf183 Compare December 25, 2024 09:30
@cyyeh cyyeh force-pushed the feat/ai-service/user-guide branch from dabf183 to 173fc85 Compare December 26, 2024 06:00
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Dec 26, 2024

Walkthrough

This pull request introduces new pipeline entries for misleading and user guide assistance across various configuration files and Python modules. It adds new settings (doc_endpoint and is_oss), integrates a documentation fetching utility into service container initialization, and updates intent classification to include a new "USER_GUIDE" category. Dedicated Python modules implement new pipelines, while test fixtures and web service response models are updated to accommodate these changes.

Changes

File(s) Summary of Changes
deployment/kustomizations/base/cm.yaml, docker/config.example.yaml, wren-ai-service/tools/config/config.example.yaml, wren-ai-service/tools/config/config.full.yaml, wren-ai-service/tests/data/config.test.yaml Added new pipeline entries (misleading_assistance, user_guide_assistance/data_assistance) and new settings (doc_endpoint, is_oss) to enhance configuration features.
wren-ai-service/docs/config_examples/config.*.yaml (Anthropic, Azure, Deepseek, Google AI Studio, Groq, LM Studio, Ollama) Added new pipeline entries misleading_assistance and user_guide_assistance, both using litellm_llm.default.
wren-ai-service/src/config.py Introduced new fields (doc_endpoint, is_oss) in the Settings class.
wren-ai-service/src/globals.py Integrated call to fetch_wren_ai_docs to supply documentation data to service components.
wren-ai-service/src/pipelines/generation/__init__.py Exported new classes: MisleadingAssistance and UserGuideAssistance.
wren-ai-service/src/pipelines/generation/intent_classification.py Updated intent classification: added a new "USER_GUIDE" category and incorporated the wren_ai_docs parameter into prompt processing.
wren-ai-service/src/pipelines/generation/misleading_assistance.py Introduced a new pipeline for handling misleading queries with streaming response support.
wren-ai-service/src/pipelines/generation/user_guide_assistance.py Added a new pipeline for user guide assistance with dedicated prompt formatting and streaming logic.
wren-ai-service/src/utils.py Added the fetch_wren_ai_docs function to retrieve and parse documentation based on settings.
wren-ai-service/src/web/v1/services/ask.py Updated response models and pipeline selection by adding a new general_type attribute and related logic.
wren-ai-service/tests/pytest/services/test_ask.py Enhanced the ask service fixture by adding the user_guide_assistance component to both the real and mock services.

Sequence Diagram(s)

sequenceDiagram
    participant S as Service Container
    participant C as create_service_container
    participant U as fetch_wren_ai_docs
    participant IC as IntentClassification
    participant UA as UserGuideAssistance

    S->>C: Initialize container with settings
    C->>U: Call fetch_wren_ai_docs(doc_endpoint, is_oss)
    U-->>C: Return documentation data
    C->>IC: Instantiate with wren_ai_docs
    C->>UA: Instantiate with wren_ai_docs
Loading
sequenceDiagram
    participant U as User
    participant A as Ask Service
    participant IC as IntentClassification Pipeline
    participant UA as UserGuideAssistance Pipeline

    U->>A: Send query
    A->>A: Determine intent (USER_GUIDE vs MISLEADING_QUERY)
    alt Intent is USER_GUIDE
      A->>UA: Route query to UserGuideAssistance
    else Intent is MISLEADING_QUERY
      A->>IC: Route query to IntentClassification
    end
    UA-->>A: Return result
    IC-->>A: Return result
    A->>U: Respond with processed output
Loading

Possibly related PRs

Suggested reviewers

  • paopa

Poem

I'm a bunny coding through the night,
Adding pipelines with pure delight.
Misleading and guiding, new paths we pave,
With docs and settings, our code we save.
Hopping through functions in a rhythmic beat,
Our service leaps ahead—oh so sweet!
🐇✨


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 57fa724 and 1d47cca.

📒 Files selected for processing (7)
  • wren-ai-service/docs/config_examples/config.anthropic.yaml (2 hunks)
  • wren-ai-service/docs/config_examples/config.azure.yaml (2 hunks)
  • wren-ai-service/docs/config_examples/config.deepseek.yaml (2 hunks)
  • wren-ai-service/docs/config_examples/config.google_ai_studio.yaml (2 hunks)
  • wren-ai-service/docs/config_examples/config.groq.yaml (2 hunks)
  • wren-ai-service/docs/config_examples/config.lm_studio.yaml (2 hunks)
  • wren-ai-service/docs/config_examples/config.ollama.yaml (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (6)
  • wren-ai-service/docs/config_examples/config.anthropic.yaml
  • wren-ai-service/docs/config_examples/config.ollama.yaml
  • wren-ai-service/docs/config_examples/config.deepseek.yaml
  • wren-ai-service/docs/config_examples/config.lm_studio.yaml
  • wren-ai-service/docs/config_examples/config.google_ai_studio.yaml
  • wren-ai-service/docs/config_examples/config.azure.yaml
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: pytest
🔇 Additional comments (2)
wren-ai-service/docs/config_examples/config.groq.yaml (2)

113-114: Validate "misleading_assistance" Pipeline Entry

The new pipeline entry for misleading_assistance is defined in the same pattern as the other pipelines—specifying only the llm field with the value litellm_llm.default. Please ensure that downstream processing or any integration tests reflect that this entry behaves as expected.


128-129: Validate "user_guide_assistance" Pipeline Entry

The addition of the user_guide_assistance pipeline, with its llm field set to litellm_llm.default, aligns well with the existing pipeline definitions. Make sure that any related documentation or service components (especially those supporting user guide functionality) are updated accordingly.


🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai plan to trigger planning for file edits and PR creation.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@cyyeh cyyeh force-pushed the feat/ai-service/user-guide branch 4 times, most recently from 3c81896 to 8256540 Compare December 27, 2024 09:04
@cyyeh cyyeh changed the title feat(wren-ai-service): user guide (ai-env-changed) feat(wren-ai-service): user guide and misleading streaming(ai-env-changed) Apr 10, 2025
@cyyeh cyyeh marked this pull request as ready for review April 10, 2025 04:00
@cyyeh cyyeh requested a review from imAsterSun April 10, 2025 04:00
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (9)
wren-ai-service/docs/config_examples/config.deepseek.yaml (1)

132-133: Misleading Assistance Pipeline in Deepseek Configuration
The new misleading_assistance pipeline entry is added with the default LLM configuration. Given that Deepseek configurations sometimes use their own models (as seen with other entries), please verify whether using litellm_llm.default is intentional here or if a Deepseek-specific model should be applied.

wren-ai-service/tests/data/config.test.yaml (1)

78-82: LGTM: New pipeline entries added, but remove trailing whitespace.

The new user_guide_assistance and data_assistance pipelines have been correctly added with the appropriate model, but there's trailing whitespace on line 82 that should be removed.

-  - name: data_assistance
-    llm: openai_llm.gpt-4o-mini
-    
+  - name: data_assistance
+    llm: openai_llm.gpt-4o-mini
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 82-82: trailing spaces

(trailing-spaces)

wren-ai-service/src/pipelines/generation/misleading_assistance.py (3)

49-68: Consider adding docstrings for clarity.
This prompt function constructs a prompt for handling a potentially misleading user query. Adding a short docstring would help future contributors understand its purpose and expected input/output at a glance.


99-108: Confirm no memory leaks in _streaming_callback.
Each new query_id spawns a fresh queue. If the user disconnects before reading, these queues could remain unconsumed. Consider a cleanup strategy for stale queues if not consumed within a certain time.


134-154: Confirm returned data shape in run.
The method returns the result from self._pipe.execute(...). Document the structure (e.g. dict with “replies” or similar) to clarify to consumers.

wren-ai-service/src/pipelines/generation/user_guide_assistance.py (2)

43-55: Consider adding docstrings for prompt construction.
The prompt function merges user input, language, and docs. Including a brief docstring specifying parameters and returns would help maintainability.


90-99: Evaluate potential queue growth.
Like in misleading_assistance, consider a cleanup policy if user queues are never drained. Otherwise, these queues could accumulate indefinitely in long-running services.

wren-ai-service/src/pipelines/generation/intent_classification.py (2)

28-28: Clarify classification instructions.
This added line mentions four conditions but lumps them into a single sentence. Consider providing a concise bullet list for readability.


337-338: Neat approach for flexible config injection.
Accepting wren_ai_docs via _configs is an elegant way to keep the pipeline modular. Just confirm that each consumer expects these docs.

Also applies to: 361-363, 390-390

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2c49905 and 4202441.

📒 Files selected for processing (21)
  • deployment/kustomizations/base/cm.yaml (3 hunks)
  • docker/config.example.yaml (3 hunks)
  • wren-ai-service/docs/config_examples/config.anthropic.yaml (1 hunks)
  • wren-ai-service/docs/config_examples/config.azure.yaml (1 hunks)
  • wren-ai-service/docs/config_examples/config.deepseek.yaml (1 hunks)
  • wren-ai-service/docs/config_examples/config.google_ai_studio.yaml (1 hunks)
  • wren-ai-service/docs/config_examples/config.groq.yaml (1 hunks)
  • wren-ai-service/docs/config_examples/config.lm_studio.yaml (1 hunks)
  • wren-ai-service/docs/config_examples/config.ollama.yaml (1 hunks)
  • wren-ai-service/src/config.py (1 hunks)
  • wren-ai-service/src/globals.py (3 hunks)
  • wren-ai-service/src/pipelines/generation/__init__.py (3 hunks)
  • wren-ai-service/src/pipelines/generation/intent_classification.py (10 hunks)
  • wren-ai-service/src/pipelines/generation/misleading_assistance.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1 hunks)
  • wren-ai-service/src/utils.py (2 hunks)
  • wren-ai-service/src/web/v1/services/ask.py (4 hunks)
  • wren-ai-service/tests/data/config.test.yaml (1 hunks)
  • wren-ai-service/tests/pytest/services/test_ask.py (2 hunks)
  • wren-ai-service/tools/config/config.example.yaml (3 hunks)
  • wren-ai-service/tools/config/config.full.yaml (3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (5)
wren-ai-service/src/globals.py (2)
wren-ai-service/src/utils.py (1)
  • fetch_wren_ai_docs (163-185)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1)
  • UserGuideAssistance (65-141)
wren-ai-service/tests/pytest/services/test_ask.py (1)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1)
  • UserGuideAssistance (65-141)
wren-ai-service/src/pipelines/generation/__init__.py (2)
wren-ai-service/src/pipelines/generation/misleading_assistance.py (2)
  • misleading_assistance (71-72)
  • MisleadingAssistance (78-154)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (2)
  • user_guide_assistance (58-59)
  • UserGuideAssistance (65-141)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (5)
wren-ai-service/src/core/pipeline.py (1)
  • BasicPipeline (15-21)
wren-ai-service/src/core/provider.py (1)
  • LLMProvider (6-15)
wren-ai-service/src/pipelines/generation/intent_classification.py (2)
  • prompt (262-284)
  • run (370-392)
wren-ai-service/src/pipelines/generation/data_assistance.py (5)
  • prompt (51-67)
  • run (135-154)
  • _streaming_callback (99-107)
  • get_streaming_results (109-132)
  • _get_streaming_results (110-111)
wren-ai-service/src/pipelines/common.py (1)
  • dry_run_pipeline (36-59)
wren-ai-service/src/web/v1/services/ask.py (3)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1)
  • run (125-141)
wren-ai-service/src/pipelines/generation/intent_classification.py (1)
  • run (370-392)
wren-ai-service/src/pipelines/generation/followup_sql_generation_reasoning.py (1)
  • run (173-196)
🪛 GitHub Actions: AI Service Test
wren-ai-service/tests/pytest/services/test_ask.py

[error] 39-39: TypeError: IntentClassification.init() missing 1 required positional argument: 'wren_ai_docs'


[error] 233-233: AssertionError: assert 'failed' == 'finished'


[error] 623-623: KeyError: 'misleading_assistance'

🪛 YAMLlint (1.35.1)
wren-ai-service/tests/data/config.test.yaml

[error] 82-82: trailing spaces

(trailing-spaces)

🔇 Additional comments (40)
wren-ai-service/docs/config_examples/config.anthropic.yaml (1)

104-105: New Pipeline Entry for Misleading Assistance Added
The introduction of the misleading_assistance pipeline entry with llm: litellm_llm.default is correctly implemented. Please verify that this new entry is aligned with the overall intent of the PR. Additionally, since the PR title mentions a second component (user guide assistance), confirm whether a corresponding user_guide_assistance entry should be added here or is handled elsewhere.

wren-ai-service/docs/config_examples/config.azure.yaml (1)

107-108: New Misleading Assistance Pipeline Entry in Azure Configuration
The newly added misleading_assistance entry appears properly formatted and positioned within the pipes list. Please verify that its placement and the use of litellm_llm.default are intentional for this Azure-specific configuration. Consistency with other environments is critical.

wren-ai-service/docs/config_examples/config.groq.yaml (1)

113-114: Addition of Misleading Assistance Pipeline in GROQ Config
The update adds the misleading_assistance pipeline with the appropriate LLM configuration. This change looks consistent with the other configuration files. As with the other files, please ensure that if a user_guide_assistance component is expected per PR objectives, it is added where appropriate.

wren-ai-service/docs/config_examples/config.google_ai_studio.yaml (1)

118-119: Misleading Assistance Pipeline Added to Google AI Studio Config
The new pipeline entry for misleading_assistance using litellm_llm.default is correctly introduced. Please double-check that this entry meets the intended configuration requirements for the Google AI Studio environment. Also, confirm if a similar entry for user_guide_assistance should be integrated following the PR objectives.

wren-ai-service/docs/config_examples/config.lm_studio.yaml (1)

112-113: New misleading_assistance pipeline properly integrated.

The addition of the misleading_assistance pipeline with litellm_llm.default as the LLM provider is correctly configured and follows the same pattern as other pipeline entries in this configuration file.

wren-ai-service/docs/config_examples/config.ollama.yaml (1)

110-111: New misleading_assistance pipeline properly integrated.

The addition of the misleading_assistance pipeline with litellm_llm.default as the LLM provider is correctly configured and follows the same pattern as other pipeline entries in this configuration file.

wren-ai-service/tools/config/config.example.yaml (3)

122-123: New misleading_assistance pipeline properly integrated.

The addition of the misleading_assistance pipeline with litellm_llm.default as the LLM provider is correctly configured and follows the same pattern as other pipeline entries in this configuration file.


137-138: New user_guide_assistance pipeline properly integrated.

The addition of the user_guide_assistance pipeline with litellm_llm.default as the LLM provider is correctly configured and follows the same pattern as other pipeline entries in this configuration file.


166-167: New documentation settings added correctly.

The new settings doc_endpoint and is_oss are properly defined with appropriate values. These settings will be used by the fetch_wren_ai_docs function as seen in other files.

wren-ai-service/tools/config/config.full.yaml (3)

122-123: New misleading_assistance pipeline properly integrated.

The addition of the misleading_assistance pipeline with litellm_llm.default as the LLM provider is correctly configured and follows the same pattern as other pipeline entries in this configuration file.


137-138: New user_guide_assistance pipeline properly integrated.

The addition of the user_guide_assistance pipeline with litellm_llm.default as the LLM provider is correctly configured and follows the same pattern as other pipeline entries in this configuration file.


166-167: New documentation settings added correctly.

The new settings doc_endpoint and is_oss are properly defined with appropriate values. These settings will be used by the fetch_wren_ai_docs function as seen in other files.

wren-ai-service/src/globals.py (5)

10-10: Import for new documentation fetching utility added correctly.

The import for fetch_wren_ai_docs from src.utils is properly added to support the new documentation-related functionality.


48-49: Documentation fetching implemented correctly.

The code correctly calls the fetch_wren_ai_docs function with the appropriate settings parameters to retrieve the documentation needed for the new assistance pipelines.


86-89: Documentation data properly passed to IntentClassification component.

The wren_ai_docs variable is correctly passed to the IntentClassification pipeline component, enabling it to access the documentation data for user intent analysis.


90-92: MisleadingAssistance component properly integrated.

The initialization of the MisleadingAssistance pipeline component is correctly implemented using the appropriate pipe components configuration.


96-99: UserGuideAssistance component properly integrated with documentation data.

The UserGuideAssistance pipeline component is correctly initialized with both the pipe components configuration and the wren_ai_docs data needed for providing documentation-based assistance.

wren-ai-service/src/config.py (1)

57-60: LGTM: User guide configuration added correctly.

The new user guide configuration section with is_oss and doc_endpoint fields is well documented and follows the established pattern in the codebase. The default values look appropriate.

docker/config.example.yaml (3)

109-111: LGTM: New misleading_assistance pipeline added.

The new pipeline entry has been correctly configured to use the litellm_llm.default model.


128-129: LGTM: New user_guide_assistance pipeline added.

The new pipeline entry has been correctly configured to use the litellm_llm.default model.


153-154: LGTM: New configuration settings added.

The doc_endpoint and is_oss settings are correctly added to the example configuration file with appropriate values.

wren-ai-service/tests/data/config.test.yaml (1)

88-89: LGTM: New settings configuration added.

The doc_endpoint and is_oss settings are correctly added to the test configuration file with appropriate values.

wren-ai-service/src/pipelines/generation/__init__.py (4)

7-7: LGTM: Added import for MisleadingAssistance.

The import for the new MisleadingAssistance class has been correctly added in alphabetical order.


20-20: LGTM: Added import for UserGuideAssistance.

The import for the new UserGuideAssistance class has been correctly added in alphabetical order.


38-39: LGTM: Added UserGuideAssistance to all list.

The UserGuideAssistance class has been correctly added to the __all__ list.


42-42: LGTM: Added MisleadingAssistance to all list.

The MisleadingAssistance class has been correctly added to the __all__ list.

deployment/kustomizations/base/cm.yaml (3)

157-159: LGTM! Added pipeline component for misleading assistance.

The new pipeline component misleading_assistance has been properly configured to use the litellm_llm.default model.


176-178: LGTM! Added pipeline component for user guide assistance.

The new pipeline component user_guide_assistance has been properly configured to use the litellm_llm.default model.


200-201: LGTM! Added new settings for documentation endpoint and OSS flag.

The settings doc_endpoint and is_oss have been properly added to the configuration. These will be used by the fetch_wren_ai_docs function to determine which documentation to fetch.

wren-ai-service/src/web/v1/services/ask.py (3)

102-104: LGTM! Added general_type field to response models.

The general_type field correctly categorizes the type of assistance provided in the response.

Also applies to: 109-111


307-327: LGTM! Added handling for the MISLEADING_QUERY intent.

The code now properly handles the MISLEADING_QUERY intent by creating an asynchronous task to run the misleading_assistance pipeline.


354-371: LGTM! Added handling for the USER_GUIDE intent.

The code now properly handles the USER_GUIDE intent by creating an asynchronous task to run the user_guide_assistance pipeline.

wren-ai-service/src/pipelines/generation/misleading_assistance.py (2)

70-73: Validate the existence of prompt data before usage.
In misleading_assistance, ensure prompt.get("prompt") always returns a valid string. If it’s None, the generator call might raise an exception.

Do we have upstream checks guaranteeing "prompt" is populated?


109-133: Assess the infinite loop risk.
The while True: loop in get_streaming_results terminates on either a <DONE> token or a 120s timeout. While this works, verify that large or slow streams won’t surpass 120 seconds, leading to unintended breaks.

wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1)

58-60: Check for null or empty prompt string.
In user_guide_assistance, verify the prompt dictionary actually contains "prompt". Otherwise, the generator call may fail.

wren-ai-service/src/pipelines/generation/intent_classification.py (5)

59-72: Ensure the “GENERAL” category covers only non-schema questions.
The updated definition for GENERAL might overlap with MISLEADING_QUERY. Double-check the boundary conditions so the pipeline’s logic remains consistent.


73-84: Good expansion for USER_GUIDE category.
Introducing “USER_GUIDE” clarifies queries focused on Wren AI usage. This is a well-structured addition that helps separate schema-related from guide-related queries.


106-107: Consistent naming for JSON output.
The expanded "results" field includes "USER_GUIDE". Ensure all downstream parsers handle this new value without errors.


264-264: Validate wren_ai_docs usage.
You’ve introduced wren_ai_docs to the prompt and pass it to docs=wren_ai_docs. Verify these docs are relevant in all flows and that no sensitive internal data is unintentionally exposed.

Also applies to: 283-283


315-315: Allow for new classification results.
The expanded results literal supports the new “USER_GUIDE” category. This is a clean approach that helps ensure type consistency with Pydantic.

Comment thread wren-ai-service/tests/pytest/services/test_ask.py Outdated
Comment thread wren-ai-service/tests/pytest/services/test_ask.py
Comment thread wren-ai-service/src/utils.py
Comment thread wren-ai-service/src/web/v1/services/ask.py
Comment thread wren-ai-service/src/pipelines/generation/user_guide_assistance.py Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🧹 Nitpick comments (18)
wren-ai-service/docs/config_examples/config.ollama.yaml (1)

110-111: New Pipeline Entry for Misleading Assistance:
A new pipeline entry named misleading_assistance (using litellm_llm.default) has been added. Please verify that its structure (with only the llm field) is sufficient for the intended assistance functionality. Also, the PR title suggests that a “user guide” related feature is expected—confirm if a corresponding user_guide_assistance entry should also be introduced.

wren-ai-service/docs/config_examples/config.groq.yaml (1)

113-114: New Pipeline Entry for Misleading Assistance:
The file now includes a misleading_assistance pipeline entry with litellm_llm.default. Ensure this addition is consistent with similar entries in other configuration examples, and check if additional similar pipelines (e.g. for user guide assistance) are expected according to the PR objectives.

wren-ai-service/docs/config_examples/config.lm_studio.yaml (1)

112-113: New Misleading Assistance Pipeline Entry:
A new pipeline named misleading_assistance using litellm_llm.default has been added. Please confirm that this entry meets the design requirements for assistance features and consider whether a separate user_guide_assistance pipeline is also required.

wren-ai-service/docs/config_examples/config.anthropic.yaml (1)

104-105: Introduced Misleading Assistance Pipeline:
The newly introduced misleading_assistance pipeline using litellm_llm.default aligns with the ongoing assistance enhancements. Confirm that this configuration fits into the overall strategy for handling misleading queries and that no additional assistance pipelines (e.g. user_guide_assistance) are missed.

wren-ai-service/docs/config_examples/config.azure.yaml (1)

107-108: New Misleading Assistance Pipeline Entry:
The file now includes a misleading_assistance entry with litellm_llm.default. Ensure that this entry is consistent with its counterparts in other configuration files and supports the intended misleading streaming functionality described in the PR.

wren-ai-service/tests/data/config.test.yaml (1)

82-82: Remove trailing whitespace.

There's trailing whitespace at the end of line 82 that should be removed.

-  - name: data_assistance
-    llm: openai_llm.gpt-4o-mini
-    
+  - name: data_assistance
+    llm: openai_llm.gpt-4o-mini
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 82-82: trailing spaces

(trailing-spaces)

wren-ai-service/tests/pytest/services/test_ask.py (1)

176-177: Good mock addition but check if more implementation is needed.

You've added user_guide_assistance to the mock service, but the empty string might not be sufficient if tests need to interact with it meaningfully. Consider creating a proper mock class similar to other mocks in the file.

-"user_guide_assistance": "",
+"user_guide_assistance": UserGuideAssistanceMock(),

Then add a UserGuideAssistanceMock class to the mocks.py file:

class UserGuideAssistanceMock:
    async def run(self, query: str, language: str, query_id: Optional[str] = None):
        return {"response": "Mock user guide assistance response"}
        
    async def get_streaming_results(self, query_id):
        yield "Mock streaming response"
wren-ai-service/src/web/v1/services/ask.py (1)

109-111: Field usage in AskResultResponse.
While _AskResultResponse and AskResultResponse share similar fields, ensure that combining them doesn’t cause confusion. The exclude=True directive might help keep the public fields uncluttered.

wren-ai-service/src/pipelines/generation/user_guide_assistance.py (3)

17-28: Maintain clarity in system instructions.
The system prompt is self-explanatory. Consider referencing any disclaimers in the doc if user queries fall outside the coverage of your user guide.


100-123: Streaming results logic.
Enforcing a 120-second timeout is good, but consider logging or returning partial results if the user queue is empty or delayed.


144-153: Dry run example.
Demonstrates pipeline usage with a sample query. Consider adding a short docstring explaining how to execute this module.

wren-ai-service/src/pipelines/generation/misleading_assistance.py (7)

15-15: Use module-specific logger.
Consider using logging.getLogger(__name__) to match the module name and improve log traceability.


18-33: Consider externalizing the system prompt.
Externalizing lengthy prompt text (e.g., in a config file or constants module) can enhance maintainability, especially if it's reused or localized.


49-68: Prevent potential token overflow when merging history.
If histories can become very large, consider bounding or summarizing past questions to prevent exceeding token limits.


70-73: Add error handling for missing or invalid prompt fields.
A try-except block or validations on prompt["prompt"] can help prevent runtime errors if prompt is malformed.


78-94: Add docstring to clarify pipeline usage.
A brief docstring explaining initialization parameters and _components would enhance readability.


99-108: Handle null or empty chunk content gracefully.
Consider verifying chunk.content isn't empty or None and logging errors for unexpected streaming issues.


109-133: Log streaming timeouts.
When asyncio.wait_for times out, logging a warning or error can help diagnose stream disconnections or long response times.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2c49905 and 4202441.

📒 Files selected for processing (21)
  • deployment/kustomizations/base/cm.yaml (3 hunks)
  • docker/config.example.yaml (3 hunks)
  • wren-ai-service/docs/config_examples/config.anthropic.yaml (1 hunks)
  • wren-ai-service/docs/config_examples/config.azure.yaml (1 hunks)
  • wren-ai-service/docs/config_examples/config.deepseek.yaml (1 hunks)
  • wren-ai-service/docs/config_examples/config.google_ai_studio.yaml (1 hunks)
  • wren-ai-service/docs/config_examples/config.groq.yaml (1 hunks)
  • wren-ai-service/docs/config_examples/config.lm_studio.yaml (1 hunks)
  • wren-ai-service/docs/config_examples/config.ollama.yaml (1 hunks)
  • wren-ai-service/src/config.py (1 hunks)
  • wren-ai-service/src/globals.py (3 hunks)
  • wren-ai-service/src/pipelines/generation/__init__.py (3 hunks)
  • wren-ai-service/src/pipelines/generation/intent_classification.py (10 hunks)
  • wren-ai-service/src/pipelines/generation/misleading_assistance.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1 hunks)
  • wren-ai-service/src/utils.py (2 hunks)
  • wren-ai-service/src/web/v1/services/ask.py (4 hunks)
  • wren-ai-service/tests/data/config.test.yaml (1 hunks)
  • wren-ai-service/tests/pytest/services/test_ask.py (2 hunks)
  • wren-ai-service/tools/config/config.example.yaml (3 hunks)
  • wren-ai-service/tools/config/config.full.yaml (3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (6)
wren-ai-service/src/pipelines/generation/__init__.py (2)
wren-ai-service/src/pipelines/generation/misleading_assistance.py (2)
  • misleading_assistance (71-72)
  • MisleadingAssistance (78-154)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (2)
  • user_guide_assistance (58-59)
  • UserGuideAssistance (65-141)
wren-ai-service/src/globals.py (4)
wren-ai-service/src/utils.py (1)
  • fetch_wren_ai_docs (163-185)
wren-ai-service/src/pipelines/generation/misleading_assistance.py (1)
  • MisleadingAssistance (78-154)
wren-ai-service/src/pipelines/generation/data_assistance.py (1)
  • DataAssistance (78-154)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1)
  • UserGuideAssistance (65-141)
wren-ai-service/tests/pytest/services/test_ask.py (1)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1)
  • UserGuideAssistance (65-141)
wren-ai-service/src/web/v1/services/ask.py (3)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1)
  • run (125-141)
wren-ai-service/src/pipelines/generation/intent_classification.py (1)
  • run (370-392)
wren-ai-service/src/pipelines/generation/followup_sql_generation_reasoning.py (1)
  • run (173-196)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (5)
wren-ai-service/src/core/pipeline.py (1)
  • BasicPipeline (15-21)
wren-ai-service/src/core/provider.py (1)
  • LLMProvider (6-15)
wren-ai-service/src/pipelines/generation/intent_classification.py (2)
  • prompt (262-284)
  • run (370-392)
wren-ai-service/src/pipelines/generation/data_assistance.py (5)
  • prompt (51-67)
  • run (135-154)
  • _streaming_callback (99-107)
  • get_streaming_results (109-132)
  • _get_streaming_results (110-111)
wren-ai-service/src/pipelines/common.py (1)
  • dry_run_pipeline (36-59)
wren-ai-service/src/pipelines/generation/misleading_assistance.py (3)
wren-ai-service/src/core/pipeline.py (1)
  • BasicPipeline (15-21)
wren-ai-service/src/core/provider.py (1)
  • LLMProvider (6-15)
wren-ai-service/src/web/v1/services/ask.py (2)
  • ask (204-639)
  • AskHistory (16-18)
🪛 YAMLlint (1.35.1)
wren-ai-service/tests/data/config.test.yaml

[error] 82-82: trailing spaces

(trailing-spaces)

🪛 GitHub Actions: AI Service Test
wren-ai-service/tests/pytest/services/test_ask.py

[error] 39-39: TypeError: IntentClassification.init() missing 1 required positional argument: 'wren_ai_docs'


[error] 233-233: AssertionError: assert 'failed' == 'finished'

wren-ai-service/src/web/v1/services/ask.py

[error] 623-623: KeyError: 'misleading_assistance'


[error] 103-103: SQL is not executable: No project found


[warning] 155-155: SQL pairs file not found: sql_pairs.json


[warning] 291-291: Calling QdrantDocumentStore.write_documents() with empty list


[error] 51-51: Failed to parse MDL: unexpected character: line 1 column 1 (char 0)


[error] 58-58: An error occurred during relationship recommendation generation: Pipeline error


[error] 139-139: Relationship Recommendation Resource with ID 'non_existent_id' not found.


[error] 51-51: An error occurred during semantics description generation: Test exception


[warning] 155-155: SQL pairs file not found: sql_pairs.json


[warning] 291-291: Calling QdrantDocumentStore.write_documents() with empty list


[warning] 155-155: SQL pairs file not found: sql_pairs.json


[warning] 155-155: SQL pairs file not found: sql_pairs.json


[warning] 291-291: Calling QdrantDocumentStore.write_documents() with empty list


[warning] 155-155: SQL pairs file not found: sql_pairs.json

🔇 Additional comments (42)
wren-ai-service/docs/config_examples/config.ollama.yaml (1)

147-166: Settings Block Review:
The settings block in this file remains unchanged. According to the PR objectives and high-level summary, new parameters like doc_endpoint (set to https://docs.getwren.ai) and is_oss (set to true) should be introduced. Please verify whether these settings are omitted by design in the docs example or if they need to be added here.

wren-ai-service/docs/config_examples/config.azure.yaml (1)

159-178: Verify Settings Consistency:
The current settings block does not include the new parameters (doc_endpoint and is_oss) described in the PR objectives/summary. Please confirm if these settings are only meant for production configurations (e.g. in deployment/kustomizations/base/cm.yaml, docker/config.example.yaml, or wren-ai-service/src/config.py) or if they should also be reflected in the docs example.

wren-ai-service/docs/config_examples/config.deepseek.yaml (1)

132-133: LGTM: Clean misleading_assistance pipeline implementation

The addition of the misleading_assistance pipeline is properly configured with the litellm_llm.default model. This matches the PR objectives for addressing misleading queries.

wren-ai-service/tools/config/config.full.yaml (3)

122-123: LGTM: Consistent misleading_assistance implementation

The misleading_assistance pipeline implementation aligns with the configuration in other files, using the appropriate LLM model.


137-138: LGTM: user_guide_assistance pipeline addition

The user_guide_assistance pipeline is properly added with the correct LLM model reference, supporting the new documentation capabilities mentioned in the PR objectives.


166-167: LGTM: Settings for documentation endpoint and OSS status

The addition of doc_endpoint and is_oss settings provides the necessary configuration for documentation fetching functionality, aligning with the PR objectives.

wren-ai-service/src/pipelines/generation/intent_classification.py (5)

28-28: LGTM: Expanded intent classification categories

The addition of "USER_GUIDE" as a fourth classification category is consistently implemented across the system prompt, output format, and result model class.

Also applies to: 106-106, 315-315


59-84: LGTM: Well-defined intent categories

The updated GENERAL category and new USER_GUIDE category are clearly defined with appropriate characteristics and examples, making the intent classification more robust and capable of handling documentation requests.


141-144: LGTM: User guide integration in prompt template

The prompt template now properly includes the USER GUIDE section, enabling the model to consider documentation content when classifying user intents.


264-265: LGTM: Documentation parameter in prompt function

The prompt function correctly accepts and utilizes the wren_ai_docs parameter, passing it to the prompt builder.

Also applies to: 283-283


337-338: LGTM: wren_ai_docs integration in IntentClassification class

The IntentClassification class has been properly updated to accept and store the wren_ai_docs parameter, integrating documentation content into the intent classification pipeline.

Also applies to: 361-363

wren-ai-service/docs/config_examples/config.google_ai_studio.yaml (1)

118-119: LGTM: Consistent misleading_assistance implementation

The addition of the misleading_assistance pipeline matches the pattern established in other configuration files, maintaining consistency across different model providers.

wren-ai-service/src/config.py (1)

57-59: Looking good: New user guide configuration parameters added.

The new configuration parameters for the user guide feature are well-structured and properly typed with appropriate defaults.

wren-ai-service/src/pipelines/generation/__init__.py (2)

7-7: Well implemented: New pipeline imports for assistance features.

The imports for MisleadingAssistance and UserGuideAssistance classes are properly added.

Also applies to: 20-20


38-38: Correctly exposed: New pipeline components added to all.

The new pipeline components have been properly exported in the __all__ list, maintaining the module's public interface.

Also applies to: 42-42

deployment/kustomizations/base/cm.yaml (2)

157-158: Complete configuration: New pipeline entries registered.

The misleading_assistance and user_guide_assistance pipeline configurations are properly defined, following the same pattern as other pipeline entries.

Also applies to: 176-177


200-201: Good configuration: User guide settings added.

The doc_endpoint and is_oss settings are properly configured, matching the defaults from the Settings class.

wren-ai-service/tests/data/config.test.yaml (2)

78-81: Well tested: Pipeline entries added to test configuration.

The test configuration has been updated with the new pipeline entries to ensure proper testing of the new functionality.


88-89: Test settings updated correctly: Added user guide configuration.

The doc_endpoint and is_oss settings have been properly added to the test configuration, aligning with the main configuration.

docker/config.example.yaml (3)

109-110: New pipeline addition looks good.

The misleading_assistance pipeline has been properly added and configured with the litellm_llm.default model.


128-129: New pipeline addition looks good.

The user_guide_assistance pipeline has been properly added and configured with the litellm_llm.default model.


153-154:

Details

✅ Verification successful

Verify that the documentation endpoint is correct and accessible.

The new settings for doc_endpoint and is_oss have been added correctly, but it's important to ensure the endpoint is accessible and returns the expected format.


🏁 Script executed:

#!/bin/bash
# Check if the documentation endpoint is accessible
curl -s -o /dev/null -w "%{http_code}" https://docs.getwren.ai/oss/llms.md

Length of output: 80


Documentation endpoint verified successfully
The documentation endpoint at https://docs.getwren.ai (tested via https://docs.getwren.ai/oss/llms.md) returned an HTTP 200 response, indicating it is accessible and functioning as expected. No changes are needed for these settings.

wren-ai-service/tools/config/config.example.yaml (3)

122-123: New pipeline addition looks good.

The misleading_assistance pipeline has been properly added and configured with the litellm_llm.default model.


137-138: New pipeline addition looks good.

The user_guide_assistance pipeline has been properly added and configured with the litellm_llm.default model.


166-167: Settings additions look good.

The new settings for doc_endpoint and is_oss have been added correctly.

wren-ai-service/src/globals.py (3)

10-10: Import statement looks good.
Importing fetch_wren_ai_docs at the top level is a clear and readable approach to keep all dependencies visible.


96-99: Passing documentation to UserGuideAssistance.
This is consistent with the design of your new pipeline. Confirm that providing the same wren_ai_docs instance here doesn’t lead to duplication or memory overhead issues.


88-89:

Details

❓ Verification inconclusive

Pass wren_ai_docs carefully.
You're passing documentation data to the IntentClassification pipeline. Confirm that the pipeline properly uses this data and handles any edge cases if the docs are empty.


🏁 Script executed:

#!/bin/bash
# Verify that the `IntentClassification` class references `wren_ai_docs` safely
rg 'wren_ai_docs' -A 10

Length of output: 12191


Attention: Verify Handling of Empty Documentation in the IntentClassification Pipeline

It appears that wren_ai_docs is passed from globals.py directly into the pipelines (including IntentClassification in wren-ai-service/src/pipelines/generation/intent_classification.py). Please confirm that the downstream components—particularly the prompt_builder.run call within the IntentClassification pipeline—gracefully handle the scenario where wren_ai_docs is empty. If the documentation data could ever be empty, ensure that appropriate fallback behavior or error checking is implemented to prevent runtime issues.

wren-ai-service/src/web/v1/services/ask.py (5)

102-104: Introduction of general_type in _AskResultResponse.
This optional field helps categorize general assistance queries (MISLEADING_QUERY, DATA_ASSISTANCE, USER_GUIDE). The addition is clear and consistent.

🧰 Tools
🪛 GitHub Actions: AI Service Test

[error] 103-103: SQL is not executable: No project found


326-326: Populating general_type="MISLEADING_QUERY".
This properly flags the assistance pipeline behind the scenes. No issues found here.


350-351: Assigning general_type="DATA_ASSISTANCE".
Consistent with the approach for misleading queries. Good alignment with the pipeline naming.


353-354: Returning results early.
Exiting the function right after setting metadata means no further processing. This is intentional for “GENERAL” type queries, so it looks fine.


682-687: Determining SQL reasoning pipeline.
The fallback to “followup_sql_generation_reasoning” or “sql_generation_reasoning” based on is_followup is consistent with your architecture.

wren-ai-service/src/pipelines/generation/user_guide_assistance.py (5)

1-5: Imports and typing are correct.
The standard library, logging, and async usage look good. The imports from the hamilton framework and haystack library are well organized.


30-39: User prompt template.
The usage of Jinja-like placeholders and loops is straightforward. Good practice to handle multi-doc scenarios.


43-55: prompt function usage.
This function integrates well with the PromptBuilder. Looks consistent with other pipeline patterns.


124-142: Async run method orchestrating pipeline.
Your pipeline steps are well-structured. The usage of AsyncDriver plus the Hamilton node references is consistent. Looks good overall.


65-99:

Details

✅ Verification successful

Constructor sets up LLM with streaming callback.
Using llm_provider.get_generator with a _streaming_callback is consistent with other pipelines. Watch for concurrency issues if multiple queries share the same queue keys.


🏁 Script executed:

#!/bin/bash
# Confirm that query_id uniqueness is enforced throughout the codebase
rg 'query_id'

Length of output: 56680


LLM Streaming Callback and Query ID Concurrency Verified

  • The _streaming_callback implementation in wren-ai-service/src/pipelines/generation/user_guide_assistance.py is consistent with similar patterns in other pipelines.
  • A repository-wide search confirms that query IDs are generated as unique UUIDs and are enforced as unique both in database migrations and throughout the service endpoints.
  • This uniform enforcement minimizes the risk of concurrency issues due to shared queue keys, assuming the upstream components consistently supply unique query IDs.
wren-ai-service/src/pipelines/generation/misleading_assistance.py (4)

1-4: All import statements appear necessary and correctly used.


35-46: Well-structured template prompt.
No issues found with logic or formatting usage.


135-154: Pipeline execution flow looks correct.
No functional issues spotted. The approach to passing parameters into execute is clear and consistent.


157-167: Dry run pipeline section is appropriate.
Good for quick local testing. No concerns identified.

Comment thread wren-ai-service/src/utils.py
Comment thread wren-ai-service/tests/pytest/services/test_ask.py
Comment thread wren-ai-service/src/globals.py
Comment thread wren-ai-service/src/globals.py
Comment thread wren-ai-service/src/web/v1/services/ask.py
Comment thread wren-ai-service/src/web/v1/services/ask.py
Comment thread wren-ai-service/src/web/v1/services/ask.py
Comment thread wren-ai-service/src/pipelines/generation/user_guide_assistance.py
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (9)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (9)

17-28: Consider limiting the scope of system instructions.
This multi-line string contains broad instructions. If the codebase supports more modular prompts, consider extracting or condensing the instructions to focus specifically on user guide tasks while delegating generic instructions (e.g., language adaptation) elsewhere. This separation can simplify maintenance and testing.


30-39: Add more context to user prompt template.
The template outlines the user question and docs, but you might consider including user context (e.g., previous user messages if relevant) or clarifying placeholders. This helps the model provide more accurate responses in multi-turn settings.


42-55: Specify a docstring for the prompt function.
While the function signature is clear, adding a docstring explaining the transformation from inputs to the prompt helps future maintainers understand how queries and documentation are formatted.


57-59: Add error handling for failed LLM calls.
If generator(prompt=...) raises an exception (e.g., network failure, timeout), the function will fail silently. Consider using try/except to gracefully handle errors and possibly return a structured error response.


65-89: Document the constructor arguments.
Adding docstrings for __init__ clarifies how llm_provider and wren_ai_docs are intended to be used and what extra configuration should be provided in **kwargs.


90-99: Consider queue size limit to avoid potential memory issues.
Prolonged streaming or a large volume of data can lead to unbounded growth in _user_queues. While this might be acceptable in certain deployments, you might consider a maximum queue size or backpressure mechanism to prevent potential memory exhaustion.


100-123: Validate presence of streaming callback content.
The loop correctly yields chunks until <DONE> is encountered. However, you could add an additional check if chunk content is unexpectedly empty or invalid. That ensures more robust error handling.


125-142: Provide a return type annotation for clarity.
Although it returns await self._pipe.execute(...), the signature does not specify a return type. Updating it to reflect the actual structure (e.g., -> dict or -> Any) helps future maintainers and static analysis tools.


144-153: Consider removing or guarding the main block in production code.
The dry_run_pipeline main block is helpful for local testing or demonstrations. In production, ensure it's not inadvertently invoked or consider moving this example usage to a dedicated script or test.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b8314c4 and fb0c533.

📒 Files selected for processing (3)
  • wren-ai-service/src/globals.py (3 hunks)
  • wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1 hunks)
  • wren-ai-service/src/utils.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • wren-ai-service/src/utils.py
  • wren-ai-service/src/globals.py

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
wren-ai-service/src/utils.py (1)

163-189: Handle potential ValueError when splitting docs.

This block is missing a safety check or error handling for the case where no newline exists in the doc string, which can raise a ValueError. Similar concerns have been raised and remain relevant.

    for doc in docs:
        if doc:
-           path, content = doc.split("\n")
+           try:
+               path, content = doc.split("\n", 1)
+           except ValueError:
+               logger.warning(f"Malformed doc format: {doc[:100]}...")
+               continue
            results.append(
                {
                    "path": f'{doc_endpoint_base}/{path.replace(".md", "")}',
                    "content": content,
                }
            )
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1)

57-60: Ensure robust error handling in asynchronous generation.

The pipeline currently awaits the generator but doesn't robustly handle failures like timeouts or network errors from the generator. This was previously flagged and still warrants consideration.

🧹 Nitpick comments (2)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (2)

17-28: Refine grammar and style in system prompt.

Some instructions could be rewritten for improved clarity, e.g. “please respond to the user that you couldn’t find a relevant answer…”. Consider updating the text to be more polished.

-You need to understand the user question and the user guide, and then answer the user question.
+Please read the user question and the user guide carefully, and formulate a concise, accurate response.

-3. If you think you cannot answer the user question given the user guide, please kindly respond user that you don't find relevant answer in the user guide.
+3. If there is insufficient information in the user guide to answer the question, kindly inform the user that no relevant answer was found.

125-142: Add explicit return type annotation for clarity.

Although the method returns a dictionary from _pipe.execute(), adding a return type annotation helps maintain clarity.

-    async def run(
+    async def run(
+        self,
+        query: str,
+        language: str,
+        query_id: Optional[str] = None,
+    ) -> dict:
         ...
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b8314c4 and fb0c533.

📒 Files selected for processing (3)
  • wren-ai-service/src/globals.py (3 hunks)
  • wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1 hunks)
  • wren-ai-service/src/utils.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • wren-ai-service/src/globals.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (3)
wren-ai-service/src/core/pipeline.py (1)
  • BasicPipeline (15-21)
wren-ai-service/src/core/provider.py (1)
  • LLMProvider (6-15)
wren-ai-service/src/pipelines/common.py (1)
  • dry_run_pipeline (36-59)
🔇 Additional comments (2)
wren-ai-service/src/utils.py (2)

6-6: No additional feedback on the new import.


161-162: No meaningful changes in blank lines.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
wren-ai-service/src/web/v1/services/ask.py (3)

354-373: Add validation for user query before calling user_guide_assistance.

The USER_GUIDE intent handler doesn't validate the user's query before calling the pipeline, which could lead to processing empty or malformed inputs.


687-690: Add validation for selected pipeline in get_ask_streaming_result.

The method dynamically selects a pipeline based on the general_type but doesn't verify if the pipeline exists before trying to access it.


307-318: Add exception handling for asynchronous tasks.

The code creates asynchronous tasks for different pipelines but doesn't implement any error handling. If the tasks fail, exceptions will be unhandled and tasks may be orphaned.

Also applies to: 354-362

🧹 Nitpick comments (1)
wren-ai-service/src/web/v1/services/ask.py (1)

673-686: Consider using a dictionary mapping for pipeline selection.

The current if-elif structure for selecting the pipeline is becoming harder to maintain as more general types are added. A dictionary mapping would be cleaner and more maintainable.

Refactor the pipeline selection code:

- _pipeline_name = ""
- if self._ask_results.get(query_id).type == "GENERAL":
-    if self._ask_results.get(query_id).general_type == "USER_GUIDE":
-        _pipeline_name = "user_guide_assistance"
-    elif self._ask_results.get(query_id).general_type == "DATA_ASSISTANCE":
-        _pipeline_name = "data_assistance"
-    elif self._ask_results.get(query_id).general_type == "MISLEADING_QUERY":
-        _pipeline_name = "misleading_assistance"
- elif self._ask_results.get(query_id).status == "planning":
-    if self._ask_results.get(query_id).is_followup:
-        _pipeline_name = "followup_sql_generation_reasoning"
-    else:
-        _pipeline_name = "sql_generation_reasoning"
+ result = self._ask_results.get(query_id)
+ pipeline_mapping = {
+     "GENERAL": {
+         "USER_GUIDE": "user_guide_assistance",
+         "DATA_ASSISTANCE": "data_assistance", 
+         "MISLEADING_QUERY": "misleading_assistance"
+     },
+     "planning": {
+         True: "followup_sql_generation_reasoning",  # is_followup = True
+         False: "sql_generation_reasoning"  # is_followup = False
+     }
+ }
+ 
+ _pipeline_name = ""
+ if result.type == "GENERAL" and result.general_type in pipeline_mapping["GENERAL"]:
+     _pipeline_name = pipeline_mapping["GENERAL"][result.general_type]
+ elif result.status == "planning":
+     _pipeline_name = pipeline_mapping["planning"][result.is_followup]
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fb0c533 and 3a710a5.

📒 Files selected for processing (2)
  • wren-ai-service/src/web/v1/services/ask.py (4 hunks)
  • wren-ai-service/tests/pytest/services/test_ask.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • wren-ai-service/tests/pytest/services/test_ask.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
wren-ai-service/src/web/v1/services/ask.py (3)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1)
  • run (125-141)
wren-ai-service/src/pipelines/generation/intent_classification.py (1)
  • run (370-392)
wren-ai-service/src/pipelines/generation/sql_generation_reasoning.py (1)
  • run (162-183)
⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: pytest

Comment thread wren-ai-service/src/web/v1/services/ask.py
Copy link
Copy Markdown
Contributor

@imAsterSun imAsterSun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants