chore(wren-ai-service): minor updates#1389

Merged: paopa merged 2 commits into main from chore/ai-service/minor-updates on Mar 12, 2025

@cyyeh (Member) commented Mar 12, 2025

  • remove redundant pipeline file
  • add allow_intent_classification: true and allow_sql_generation_reasoning: true to config.yaml
  • expose historical_question_retrieval_similarity_threshold in config.py

Summary by CodeRabbit

  • New Features
    • Enabled new configuration options for intent classification and SQL generation reasoning.
    • Introduced a configurable historical question retrieval threshold to enhance document filtering.
  • Refactor
    • Streamlined historical question processing by removing legacy components.
  • Documentation
    • Updated configuration examples across environments to reflect the new settings.

@cyyeh added the module/ai-service and ci/ai-service labels Mar 12, 2025

@coderabbitai (Bot, Contributor) commented Mar 12, 2025

Walkthrough

This pull request adds two new boolean configuration settings—allow_intent_classification: true and allow_sql_generation_reasoning: true—across multiple configuration files (deployment, docker, docs, and tools). Additionally, it introduces a new attribute historical_question_retrieval_similarity_threshold with a default value of 0.9 in the Python service configuration and updates related functions. The changes also remove an obsolete historical question pipeline implementation and update the historical question retrieval pipeline to accept a dynamic similarity threshold.

Changes

File(s) | Change Summary
--- | ---
deployment/kustomizations/base/cm.yaml, docker/config.example.yaml, wren-ai-service/tools/config/config.example.yaml, wren-ai-service/tools/config/config.full.yaml | Added new settings allow_intent_classification: true and allow_sql_generation_reasoning: true in the configuration files.
wren-ai-service/docs/config_examples/config.azure.yaml, .../config.deepseek.yaml, .../config.google_ai_studio.yaml, .../config.groq.yaml, .../config.ollama.yaml | Introduced the same new settings (allow_intent_classification and allow_sql_generation_reasoning) in documentation example configuration files.
wren-ai-service/src/config.py | Added a new attribute historical_question_retrieval_similarity_threshold: float with a default value of 0.9 to the Settings class.
wren-ai-service/src/globals.py | Integrated the new historical_question_retrieval_similarity_threshold parameter during the instantiation of the HistoricalQuestionRetrieval service.
wren-ai-service/src/pipelines/retrieval/historical_question.py | Removed the legacy historical question pipeline implementation including multiple classes and functions.
wren-ai-service/src/pipelines/retrieval/historical_question_retrieval.py | Updated function signatures and the class initializer to accept a dynamic historical_question_retrieval_similarity_threshold.

Sequence Diagram(s)

sequenceDiagram
    participant Config as Settings
    participant Service as HistoricalQuestionRetrieval
    participant Filter as ScoreFilter

    Config->>Service: Provide historical_question_retrieval_similarity_threshold (default 0.9)
    Service->>Filter: Invoke score filtering with dynamic threshold
    Filter-->>Service: Return filtered documents

Suggested Reviewers

  • paopa

Poem

I'm a little bunny, hopping with code today,
New settings sprout like carrots along the way.
Intent and SQL now dance in every file,
With thresholds set gently, making workflows smile.
I nibble on changes with joy and delight—
Happy code hops to a future so bright!

@coderabbitai (Bot, Contributor) left a comment

Actionable comments posted: 0

🧹 Nitpick comments (4)
wren-ai-service/docs/config_examples/config.deepseek.yaml (1)

153-153: Remove Trailing Spaces
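Line 153 contains trailing spaces. YAML parsers generally ignore whitespace after a plain scalar such as true, so parsing is unlikely to be affected, but yamllint reports it as an error. Please remove them to keep the file lint-clean.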

Apply this diff:

-      allow_sql_generation_reasoning: true  
+      allow_sql_generation_reasoning: true
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 153-153: trailing spaces

(trailing-spaces)
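
A minimal sketch of how trailing whitespace like this could be found and stripped before committing, assuming the file is handled as plain text. The helper names find_trailing_whitespace and strip_trailing_whitespace are hypothetical; they are not part of the repository or of yamllint.

```python
from typing import List


def find_trailing_whitespace(text: str) -> List[int]:
    """Return 1-based line numbers that end in spaces or tabs."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line != line.rstrip(" \t")
    ]


def strip_trailing_whitespace(text: str) -> str:
    """Remove trailing spaces and tabs from every line, keeping line order."""
    cleaned = "\n".join(line.rstrip(" \t") for line in text.splitlines())
    # Preserve a final newline if the input had one.
    return cleaned + "\n" if text.endswith("\n") else cleaned


# Illustrative fragment modelled on the flagged setting (two trailing spaces).
sample = (
    "settings:\n"
    "  allow_intent_classification: true\n"
    "  allow_sql_generation_reasoning: true  \n"
)
```

Running find_trailing_whitespace on the file contents before and after the cleanup gives a quick check that the lint error is gone.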

wren-ai-service/docs/config_examples/config.groq.yaml (1)

134-135: New Settings with Formatting Cleanup
The new flags allow_intent_classification: true and allow_sql_generation_reasoning: true have been added as expected. However, line 135 includes trailing spaces that should be removed to maintain proper YAML formatting.

Apply this diff:

-  allow_sql_generation_reasoning: true  
+  allow_sql_generation_reasoning: true
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 135-135: trailing spaces

(trailing-spaces)

wren-ai-service/docs/config_examples/config.google_ai_studio.yaml (1)

141-142: Google AI Studio Config: New Settings and Trailing Space Removal
The configuration now includes allow_intent_classification: true and allow_sql_generation_reasoning: true as required. Please remove the trailing spaces on line 142 to ensure consistency with YAML format standards.

Apply this diff:

-  allow_sql_generation_reasoning: true  
+  allow_sql_generation_reasoning: true
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 142-142: trailing spaces

(trailing-spaces)

wren-ai-service/docs/config_examples/config.azure.yaml (1)

142-143: Remove trailing spaces in configuration setting.

The configuration settings look good, but there are trailing spaces after allow_sql_generation_reasoning: true that should be removed for consistency.

-  allow_sql_generation_reasoning: true  
+  allow_sql_generation_reasoning: true
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 143-143: trailing spaces

(trailing-spaces)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d11477a and 9f25a25.

📒 Files selected for processing (13)
  • deployment/kustomizations/base/cm.yaml (1 hunks)
  • docker/config.example.yaml (1 hunks)
  • wren-ai-service/docs/config_examples/config.azure.yaml (1 hunks)
  • wren-ai-service/docs/config_examples/config.deepseek.yaml (1 hunks)
  • wren-ai-service/docs/config_examples/config.google_ai_studio.yaml (1 hunks)
  • wren-ai-service/docs/config_examples/config.groq.yaml (1 hunks)
  • wren-ai-service/docs/config_examples/config.ollama.yaml (1 hunks)
  • wren-ai-service/src/config.py (1 hunks)
  • wren-ai-service/src/globals.py (1 hunks)
  • wren-ai-service/src/pipelines/retrieval/historical_question.py (0 hunks)
  • wren-ai-service/src/pipelines/retrieval/historical_question_retrieval.py (4 hunks)
  • wren-ai-service/tools/config/config.example.yaml (1 hunks)
  • wren-ai-service/tools/config/config.full.yaml (1 hunks)
💤 Files with no reviewable changes (1)
  • wren-ai-service/src/pipelines/retrieval/historical_question.py
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: pytest
  • GitHub Check: Analyze (go)
  • GitHub Check: pytest
🔇 Additional comments (12)
docker/config.example.yaml (1)

139-140: New Configuration Settings Added Correctly
The new settings allow_intent_classification: true and allow_sql_generation_reasoning: true are added under the settings section as intended. These additions align well with the PR objectives by enabling specific features for intent classification and SQL generation reasoning.

wren-ai-service/docs/config_examples/config.ollama.yaml (1)

131-132: Intent Classification and SQL Generation Reasoning Settings Added
The configuration now includes the new flags allow_intent_classification: true and allow_sql_generation_reasoning: true in the settings section. This change is consistent with similar updates in related configuration files and improves feature toggling as intended.

wren-ai-service/docs/config_examples/config.deepseek.yaml (1)

152-153: New Configuration Options for DeepSeek
The settings allow_intent_classification: true and allow_sql_generation_reasoning: true are added to enhance configurability for intent classification and SQL generation reasoning. This update is correctly placed under the settings section and is consistent with similar changes across the codebase.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 153-153: trailing spaces

(trailing-spaces)

deployment/kustomizations/base/cm.yaml (1)

187-188: Looks good - configuration options added successfully.

These new configuration settings enable important features in the Wren AI service - intent classification and SQL generation reasoning capabilities.

wren-ai-service/tools/config/config.example.yaml (1)

157-158: Configuration options properly added to example config.

The new settings align with the PR objective to enable intent classification and SQL generation reasoning features.

wren-ai-service/tools/config/config.full.yaml (1)

158-159: Configuration options properly added to full config.

The new settings are consistently added across all configuration files, ensuring uniform feature availability throughout the system.

wren-ai-service/src/config.py (1)

33-33: Well-placed new configuration parameter for similarity threshold.

The addition of historical_question_retrieval_similarity_threshold with a default value of 0.9 successfully exposes this parameter in the configuration, making it more accessible as intended in the PR objectives. This enhances the flexibility of the document retrieval process.
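
A rough sketch of what exposing the threshold as a setting could look like. Only the field name and the 0.9 default come from this PR; the dataclass stand-in, the load_settings helper, and the environment variable name are assumptions made for illustration and may differ from how the real Settings class is built.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Settings:
    # Field name and default are taken from the PR summary.
    historical_question_retrieval_similarity_threshold: float = 0.9


# Hypothetical variable name, used here only to show an override path.
ENV_KEY = "HISTORICAL_QUESTION_RETRIEVAL_SIMILARITY_THRESHOLD"


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings, letting an environment value override the default."""
    env = os.environ if env is None else env
    raw = env.get(ENV_KEY)
    if raw is None:
        return Settings()
    return Settings(historical_question_retrieval_similarity_threshold=float(raw))
```

With no override present the default of 0.9 applies, which matches the previous hardcoded behaviour described in the review.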

wren-ai-service/src/globals.py (1)

89-89: LGTM! Good parameterization of the similarity threshold.

This change correctly passes the similarity threshold configuration from settings to the HistoricalQuestionRetrieval class, enabling dynamic configuration instead of using a hardcoded value.

wren-ai-service/src/pipelines/retrieval/historical_question_retrieval.py (4)

105-114: Good improvement to the filtered_documents function.

The function now accepts the similarity threshold as a parameter instead of using a hardcoded value, making it more configurable. This allows adjusting retrieval sensitivity at runtime through configuration.
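
As an illustration of the idea, here is a small sketch of score filtering with a caller-supplied threshold. The Document stand-in, the filter_by_score name, the inclusive comparison, and the descending sort are all assumptions; the pipeline's actual filtered_documents function and ScoreFilter component may behave differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Document:
    # Stand-in for a retrieved document; not the service's real document type.
    content: str
    score: float


def filter_by_score(documents: List[Document], threshold: float = 0.9) -> List[Document]:
    """Keep documents scoring at least `threshold`, highest score first."""
    kept = [doc for doc in documents if doc.score >= threshold]
    return sorted(kept, key=lambda doc: doc.score, reverse=True)


docs = [Document("a", 0.95), Document("b", 0.5), Document("c", 0.9)]
strict = filter_by_score(docs)        # uses the 0.9 default
relaxed = filter_by_score(docs, 0.4)  # a lower threshold admits more matches
```

Lowering the threshold trades precision for recall: more historical questions are treated as matches, at the cost of weaker similarity.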


137-137: LGTM! Appropriate default value.

Adding the parameter with a sensible default of 0.9 maintains backward compatibility while enabling configuration.


159-161: Good separation of configuration from components.

Storing the threshold in a separate _configs dictionary is a clean approach that keeps configuration values separate from component instances.


176-176: LGTM! Properly passing configuration to pipeline execution.

The configuration is correctly unpacked into the pipeline inputs, ensuring the threshold is available during execution.
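
The pattern described in the last two comments can be sketched roughly as follows. The class name, the _components dictionary, the embedder argument, and the build_inputs method are hypothetical; only the _configs dictionary, the parameter name, and the 0.9 default are mentioned in this review.

```python
from typing import Any, Callable, Dict


class HistoricalQuestionRetrievalSketch:
    """Illustrative only: mirrors the shape described in the review."""

    def __init__(
        self,
        embedder: Callable[[str], Any],
        historical_question_retrieval_similarity_threshold: float = 0.9,
    ) -> None:
        # Component instances are kept in one dictionary ...
        self._components: Dict[str, Any] = {"embedder": embedder}
        # ... and plain configuration values in another.
        self._configs: Dict[str, Any] = {
            "historical_question_retrieval_similarity_threshold": (
                historical_question_retrieval_similarity_threshold
            ),
        }

    def build_inputs(self, query: str) -> Dict[str, Any]:
        """Merge the query, components, and configs into one input mapping."""
        return {"query": query, **self._components, **self._configs}


# The threshold would normally come from settings rather than a literal.
pipeline = HistoricalQuestionRetrievalSketch(
    embedder=len,  # placeholder callable standing in for a real embedder
    historical_question_retrieval_similarity_threshold=0.8,
)
inputs = pipeline.build_inputs("top customers")
```

Keeping the two dictionaries separate means a new tunable value only needs an entry in _configs, without touching how components are wired.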

@cyyeh cyyeh requested a review from paopa March 12, 2025 01:55
@paopa paopa merged commit dda00b2 into main Mar 12, 2025
@paopa paopa deleted the chore/ai-service/minor-updates branch March 12, 2025 09:41
pull Bot pushed a commit to nagyist/WrenAI that referenced this pull request May 4, 2026
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>