chore(wren-ai-service): minor updates by cyyeh · Pull Request #1389 · Canner/WrenAI

cyyeh · 2025-03-12T01:10:36Z

- remove redundant pipeline file
- add allow_intent_classification: true and allow_sql_generation_reasoning: true to config.yaml
- expose historical_question_retrieval_similarity_threshold in config.py

Summary by CodeRabbit

New Features
- Enabled new configuration options for intent classification and SQL generation reasoning.
- Introduced a configurable historical question retrieval threshold to enhance document filtering.
Refactor
- Streamlined historical question processing by removing legacy components.
Documentation
- Updated configuration examples across environments to reflect the new settings.

coderabbitai · 2025-03-12T01:10:44Z

Walkthrough

This pull request adds two new boolean configuration settings—allow_intent_classification: true and allow_sql_generation_reasoning: true—across multiple configuration files (deployment, docker, docs, and tools). Additionally, it introduces a new attribute historical_question_retrieval_similarity_threshold with a default value of 0.9 in the Python service configuration and updates related functions. The changes also remove an obsolete historical question pipeline implementation and update the historical question retrieval pipeline to accept a dynamic similarity threshold.

Changes

| File(s) | Change Summary |
| --- | --- |
| `deployment/kustomizations/base/cm.yaml`, `docker/config.example.yaml`, `wren-ai-service/tools/config/config.example.yaml`, `wren-ai-service/tools/config/config.full.yaml` | Added new settings `allow_intent_classification: true` and `allow_sql_generation_reasoning: true` in the configuration files. |
| `wren-ai-service/docs/config_examples/config.azure.yaml`, `.../config.deepseek.yaml`, `.../config.google_ai_studio.yaml`, `.../config.groq.yaml`, `.../config.ollama.yaml` | Introduced the same new settings (`allow_intent_classification` and `allow_sql_generation_reasoning`) in documentation example configuration files. |
| `wren-ai-service/src/config.py` | Added a new attribute `historical_question_retrieval_similarity_threshold: float` with a default value of `0.9` to the `Settings` class. |
| `wren-ai-service/src/globals.py` | Integrated the new `historical_question_retrieval_similarity_threshold` parameter during the instantiation of the `HistoricalQuestionRetrieval` service. |
| `wren-ai-service/src/pipelines/retrieval/historical_question.py` | Removed the legacy historical question pipeline implementation, including multiple classes and functions. |
| `wren-ai-service/src/pipelines/retrieval/historical_question_retrieval.py` | Updated function signatures and the class initializer to accept a dynamic `historical_question_retrieval_similarity_threshold`. |

Sequence Diagram(s)

sequenceDiagram
    participant Config as Settings
    participant Service as HistoricalQuestionRetrieval
    participant Filter as ScoreFilter

    Config->>Service: Provide historical_question_retrieval_similarity_threshold (default 0.9)
    Service->>Filter: Invoke score filtering with dynamic threshold
    Filter-->>Service: Return filtered documents

Possibly related PRs

- chore(wren-ai-service): minor updates #1219: Introduces configuration settings for SQL generation reasoning similar to those added in this PR.
- chore(wren-ai-service): update config.yaml examples #1277: Adds allow_intent_classification and allow_sql_generation_reasoning settings across different config files, aligning with this update.
- chore(wren-ai-service): minor updates #1269: Focuses on integrating the allow_intent_classification setting at the code level, which is directly related to the changes here.

Suggested Reviewers

paopa

Poem

I'm a little bunny, hopping with code today,
New settings sprout like carrots along the way.
Intent and SQL now dance in every file,
With thresholds set gently, making workflows smile.
I nibble on changes with joy and delight—
Happy code hops to a future so bright!


coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (4)

wren-ai-service/docs/config_examples/config.deepseek.yaml (1)
153-153: Remove Trailing Spaces
Line 153 contains trailing spaces, which yamllint reports as a trailing-spaces error. Please remove them to keep the file lint-clean.

Apply this diff:
-      allow_sql_generation_reasoning: true  
+      allow_sql_generation_reasoning: true
🧰 Tools

🪛 YAMLlint (1.35.1)

[error] 153-153: trailing spaces

(trailing-spaces)

wren-ai-service/docs/config_examples/config.groq.yaml (1)
134-135: New Settings with Formatting Cleanup
The new flags allow_intent_classification: true and allow_sql_generation_reasoning: true have been added as expected. However, line 135 includes trailing spaces that should be removed to maintain proper YAML formatting.

Apply this diff:
-  allow_sql_generation_reasoning: true  
+  allow_sql_generation_reasoning: true
🧰 Tools

🪛 YAMLlint (1.35.1)

[error] 135-135: trailing spaces

(trailing-spaces)

wren-ai-service/docs/config_examples/config.google_ai_studio.yaml (1)
141-142: Google AI Studio Config: New Settings and Trailing Space Removal
The configuration now includes allow_intent_classification: true and allow_sql_generation_reasoning: true as required. Please remove the trailing spaces on line 142 to ensure consistency with YAML format standards.

Apply this diff:
-  allow_sql_generation_reasoning: true  
+  allow_sql_generation_reasoning: true
🧰 Tools

🪛 YAMLlint (1.35.1)

[error] 142-142: trailing spaces

(trailing-spaces)

wren-ai-service/docs/config_examples/config.azure.yaml (1)
142-143: Remove trailing spaces in configuration setting.

The configuration settings look good, but there are trailing spaces after allow_sql_generation_reasoning: true that should be removed for consistency.
-  allow_sql_generation_reasoning: true  
+  allow_sql_generation_reasoning: true
🧰 Tools

🪛 YAMLlint (1.35.1)

[error] 143-143: trailing spaces

(trailing-spaces)
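
The four nitpicks above all flag the same lint finding, so a small check could catch it before review. The following is a minimal sketch in plain Python, not part of the PR and not how yamllint itself works; the function names `find_trailing_whitespace` and `strip_trailing_whitespace` are hypothetical.

```python
def find_trailing_whitespace(text: str) -> list[int]:
    """Return 1-based line numbers whose content ends in a space or tab."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line != line.rstrip(" \t")
    ]


def strip_trailing_whitespace(text: str) -> str:
    """Remove trailing spaces and tabs from every line, keeping a final newline."""
    cleaned = "\n".join(line.rstrip(" \t") for line in text.splitlines())
    # Preserve the final newline if the input had one.
    return cleaned + ("\n" if text.endswith("\n") else "")


# Illustrative fragment modelled on the flagged setting (two trailing spaces).
sample = (
    "settings:\n"
    "  allow_intent_classification: true\n"
    "  allow_sql_generation_reasoning: true  \n"
)
flagged = find_trailing_whitespace(sample)
cleaned = strip_trailing_whitespace(sample)
```

Here `flagged` lists the offending line numbers within the fragment, and `cleaned` is the same text with the trailing spaces removed.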

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d11477a and 9f25a25.

📒 Files selected for processing (13)

deployment/kustomizations/base/cm.yaml (1 hunks)
docker/config.example.yaml (1 hunks)
wren-ai-service/docs/config_examples/config.azure.yaml (1 hunks)
wren-ai-service/docs/config_examples/config.deepseek.yaml (1 hunks)
wren-ai-service/docs/config_examples/config.google_ai_studio.yaml (1 hunks)
wren-ai-service/docs/config_examples/config.groq.yaml (1 hunks)
wren-ai-service/docs/config_examples/config.ollama.yaml (1 hunks)
wren-ai-service/src/config.py (1 hunks)
wren-ai-service/src/globals.py (1 hunks)
wren-ai-service/src/pipelines/retrieval/historical_question.py (0 hunks)
wren-ai-service/src/pipelines/retrieval/historical_question_retrieval.py (4 hunks)
wren-ai-service/tools/config/config.example.yaml (1 hunks)
wren-ai-service/tools/config/config.full.yaml (1 hunks)

💤 Files with no reviewable changes (1)

wren-ai-service/src/pipelines/retrieval/historical_question.py

🧰 Additional context used

🪛 YAMLlint (1.35.1)

wren-ai-service/docs/config_examples/config.groq.yaml

[error] 135-135: trailing spaces

(trailing-spaces)

wren-ai-service/docs/config_examples/config.google_ai_studio.yaml

[error] 142-142: trailing spaces

(trailing-spaces)

wren-ai-service/docs/config_examples/config.azure.yaml

[error] 143-143: trailing spaces

(trailing-spaces)

wren-ai-service/docs/config_examples/config.deepseek.yaml

[error] 153-153: trailing spaces

(trailing-spaces)

⏰ Context from checks skipped due to timeout of 90000ms (3)

GitHub Check: pytest
GitHub Check: Analyze (go)
GitHub Check: pytest

🔇 Additional comments (12)

docker/config.example.yaml (1)

139-140: New Configuration Settings Added Correctly
The new settings allow_intent_classification: true and allow_sql_generation_reasoning: true are added under the settings section as intended. These additions align well with the PR objectives by enabling specific features for intent classification and SQL generation reasoning.

wren-ai-service/docs/config_examples/config.ollama.yaml (1)

131-132: Intent Classification and SQL Generation Reasoning Settings Added
The configuration now includes the new flags allow_intent_classification: true and allow_sql_generation_reasoning: true in the settings section. This change is consistent with similar updates in related configuration files and improves feature toggling as intended.

wren-ai-service/docs/config_examples/config.deepseek.yaml (1)

152-153: New Configuration Options for DeepSeek
The settings allow_intent_classification: true and allow_sql_generation_reasoning: true are added to enhance configurability for intent classification and SQL generation reasoning. This update is correctly placed under the settings section and is consistent with similar changes across the codebase.

🧰 Tools

🪛 YAMLlint (1.35.1)

[error] 153-153: trailing spaces

(trailing-spaces)

deployment/kustomizations/base/cm.yaml (1)

187-188: Looks good - configuration options added successfully.

These new configuration settings enable important features in the Wren AI service - intent classification and SQL generation reasoning capabilities.

wren-ai-service/tools/config/config.example.yaml (1)

157-158: Configuration options properly added to example config.

The new settings align with the PR objective to enable intent classification and SQL generation reasoning features.

wren-ai-service/tools/config/config.full.yaml (1)

158-159: Configuration options properly added to full config.

The new settings are consistently added across all configuration files, ensuring uniform feature availability throughout the system.

wren-ai-service/src/config.py (1)

33-33: Well-placed new configuration parameter for similarity threshold.

The addition of historical_question_retrieval_similarity_threshold with a default value of 0.9 successfully exposes this parameter in the configuration, making it more accessible as intended in the PR objectives. This enhances the flexibility of the document retrieval process.

wren-ai-service/src/globals.py (1)

89-89: LGTM! Good parameterization of the similarity threshold.

This change correctly passes the similarity threshold configuration from settings to the HistoricalQuestionRetrieval class, enabling dynamic configuration instead of using a hardcoded value.
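
As a rough illustration of the wiring described in the two comments above, the sketch below passes a settings field into a service constructor. Only the field name and the 0.9 default come from the PR summary. The dataclass-based `Settings`, the simplified `HistoricalQuestionRetrieval` constructor, the `similarity_threshold` attribute, and the `build_retrieval` helper are assumptions made for this example, and the actual classes in the repository may look different.

```python
from dataclasses import dataclass


@dataclass
class Settings:
    # Hypothetical stand-in for the service's Settings class; only the
    # field name and its 0.9 default are taken from the PR summary.
    historical_question_retrieval_similarity_threshold: float = 0.9


class HistoricalQuestionRetrieval:
    # Simplified stand-in: the real class is assumed to take further
    # arguments (providers, stores, and so on) that are omitted here.
    def __init__(
        self,
        historical_question_retrieval_similarity_threshold: float = 0.9,
    ) -> None:
        self.similarity_threshold = (
            historical_question_retrieval_similarity_threshold
        )


def build_retrieval(settings: Settings) -> HistoricalQuestionRetrieval:
    # Hypothetical helper: forwards the configured value instead of
    # relying on a hardcoded constant inside the pipeline.
    return HistoricalQuestionRetrieval(
        historical_question_retrieval_similarity_threshold=(
            settings.historical_question_retrieval_similarity_threshold
        )
    )


default_service = build_retrieval(Settings())
stricter_service = build_retrieval(
    Settings(historical_question_retrieval_similarity_threshold=0.95)
)
```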

wren-ai-service/src/pipelines/retrieval/historical_question_retrieval.py (4)

105-114: Good improvement to the filtered_documents function.

The function now accepts the similarity threshold as a parameter instead of using a hardcoded value, making it more configurable. This allows adjusting retrieval sensitivity at runtime through configuration.
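
A threshold-parameterized filter of this kind might look like the sketch below. It is an illustration only: the `Document` dataclass, the parameter name `similarity_threshold`, the `>=` comparison, and the descending sort are assumptions, and the PR's actual `filtered_documents` may differ in all of these.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Document:
    # Hypothetical minimal document type with a similarity score.
    content: str
    score: Optional[float] = None


def filtered_documents(
    documents: list[Document],
    similarity_threshold: float = 0.9,
) -> list[Document]:
    """Keep documents scoring at or above the threshold, best first."""
    kept = [
        doc
        for doc in documents
        if doc.score is not None and doc.score >= similarity_threshold
    ]
    return sorted(kept, key=lambda doc: doc.score, reverse=True)


docs = [
    Document("total revenue by month", 0.95),
    Document("revenue per customer", 0.80),
    Document("unscored entry"),
]
strict = filtered_documents(docs)        # uses the 0.9 default
relaxed = filtered_documents(docs, 0.7)  # lower threshold admits more matches
```

Lowering the threshold trades precision for recall: more historical questions are treated as matches, at the risk of reusing an answer for a question that is only loosely similar.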

137-137: LGTM! Appropriate default value.

Adding the parameter with a sensible default of 0.9 maintains backward compatibility while enabling configuration.

159-161: Good separation of configuration from components.

Storing the threshold in a separate _configs dictionary is a clean approach that keeps configuration values separate from component instances.

176-176: LGTM! Properly passing configuration to pipeline execution.

The configuration is correctly unpacked into the pipeline inputs, ensuring the threshold is available during execution.
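
The pattern praised in the last two comments, keeping configuration values in a `_configs` dictionary and unpacking them into the pipeline inputs, can be sketched as follows. The `run_pipeline` function, the `Retrieval` class, and the placeholder component are hypothetical stand-ins; only the `_configs` name and the threshold key come from the review.

```python
def run_pipeline(inputs: dict) -> dict:
    # Hypothetical stand-in for the pipeline executor: it simply echoes
    # the inputs it would hand to the pipeline's functions.
    return inputs


class Retrieval:
    def __init__(self, similarity_threshold: float = 0.9) -> None:
        # Component instances and plain configuration values are kept
        # in separate dictionaries.
        self._components = {"embedder": "embedder-placeholder"}
        self._configs = {
            "historical_question_retrieval_similarity_threshold": (
                similarity_threshold
            )
        }

    def run(self, query: str) -> dict:
        # Both dictionaries are unpacked into one flat inputs mapping,
        # so the threshold is available by name during execution.
        return run_pipeline(
            {"query": query, **self._components, **self._configs}
        )


result = Retrieval(similarity_threshold=0.8).run("monthly revenue")
```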

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

cyyeh added 2 commits March 12, 2025 08:49

expose some settings

06ca914

expose historical_question_retrieval_similarity_threshold as config

9f25a25

cyyeh added the module/ai-service and ci/ai-service labels Mar 12, 2025

coderabbitai Bot reviewed Mar 12, 2025

cyyeh requested a review from paopa March 12, 2025 01:55

paopa approved these changes Mar 12, 2025

paopa merged commit dda00b2 into main Mar 12, 2025

paopa deleted the chore/ai-service/minor-updates branch March 12, 2025 09:41

coderabbitai Bot mentioned this pull request Mar 18, 2025

chore(wren-ai-service): improve sql pairs and instructions #1422

Merged

coderabbitai Bot mentioned this pull request Mar 26, 2025

fix(wren-ai-service): fix sql expansion latency and retrieval issue #1469

Merged

pull Bot pushed a commit to nagyist/WrenAI that referenced this pull request May 4, 2026

feat(knowledge): introduce the generic SQL knowledge (Canner#1389)

3dec4b3

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
