chore(wren-ai-service): minor-updates #1253

Merged
paopa merged 9 commits into main from chore/ai-service/minor-updates on Feb 3, 2025

Conversation

@cyyeh cyyeh commented Feb 3, 2025

  • refine sql generation reasoning prompt
  • refine text to sql prompt
  • refine intent classification prompt
  • add allow_sql_generation_reasoning to config.yaml, default: True
  • support o3-mini
  • use litellm_embedder by default

Summary by CodeRabbit

  • New Features

    • Introduced a new generation model option in the user selection, broadening creative output choices.
    • Enabled configurable SQL generation reasoning with a dedicated section for detailed step plans.
  • Enhancements

    • Updated the embedding service configuration to ensure consistent performance.
    • Increased the default limit for SQL query results, providing more comprehensive data retrieval.
    • Improved SQL post-processing and intent classification logic for more accurate responses.

@cyyeh cyyeh added the module/ai-service and ci/ai-service labels Feb 3, 2025
@cyyeh cyyeh requested a review from paopa February 3, 2025 06:20

coderabbitai Bot commented Feb 3, 2025

Walkthrough

This update introduces a new model, o3-mini-2025-01-31, across multiple configuration files while switching the embedder provider from openai_embedder to litellm_embedder and updating relevant API key names and embedder references. The SQL query limit in the demo script has been raised, and additional flexibility for SQL generation reasoning has been added via new configuration options and optional parameters in the pipelines and AskService. Minor logic refinements and typographical corrections in intent classification are also included, along with updates to model selection in the launcher.

Changes

| File(s) | Change Summary |
| --- | --- |
| deployment/kustomizations/base/cm.yaml, docker/config.example.yaml, wren-ai-service/tools/config/config.example.yaml, wren-ai-service/tools/config/config.full.yaml | Added new model o3-mini-2025-01-31; updated embedder provider from openai_embedder to litellm_embedder, adjusted API key names (to EMBEDDER_OPENAI_API_KEY), and replaced embedder references across multiple pipeline stages. |
| wren-ai-service/demo/run_sql.py | Increased SQL query limit from 10 to 50 in the get_data_from_wren_engine function call. |
| wren-ai-service/src/config.py, wren-ai-service/src/globals.py | Added new configuration option allow_sql_generation_reasoning and updated the service container to accept and pass this parameter. |
| wren-ai-service/src/pipelines/generation/sql_generation.py, wren-ai-service/src/pipelines/generation/sql_generation_reasoning.py | Made the sql_generation_reasoning parameter optional (allowing None) and updated the SQL prompt template to conditionally include a "REASONING PLAN" section with stricter formatting requirements. |
| wren-ai-service/src/pipelines/generation/utils/sql.py | Introduced a new class SQLBreakdownGenPostProcessor with an asynchronous run method and enhanced the SQLGenPostProcessor to better handle varied reply formats and provide improved error handling. |
| wren-ai-service/src/web/v1/services/ask.py | Modified the AskService by adding the allow_sql_generation_reasoning parameter to its initializer and adjusting the ask method to conditionally process SQL generation reasoning. |
| wren-launcher/commands/launch.go, wren-launcher/utils/docker.go | Expanded the model selection list to include "o3-mini" and updated the mapping in generationModelToModelName accordingly. |
| wren-ai-service/src/pipelines/generation/intent_classification.py | Refined logic to include time/date formats only when relevant and corrected minor typographical errors. |

Sequence Diagram(s)

sequenceDiagram
    participant U as User
    participant AS as AskService
    participant PG as Pipeline Generator

    U->>AS: Submit query
    AS->>AS: Check allow_sql_generation_reasoning flag
    alt Reasoning Enabled
        AS->>PG: Trigger SQL generation with reasoning plan
        PG-->>AS: Return SQL result with reasoning
    else
        AS->>PG: Trigger SQL generation without reasoning
        PG-->>AS: Return SQL result
    end
    AS->>U: Return response
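
The branch in the diagram can be sketched in Python roughly as follows. This is a minimal, synchronous illustration rather than the actual AskService code: `AskServiceSketch`, `fake_reasoning`, and `fake_sql` are hypothetical stand-ins, and only the `allow_sql_generation_reasoning` flag and the optional `sql_generation_reasoning` value come from the PR.

```python
from typing import Callable, Optional


class AskServiceSketch:
    """Hypothetical stand-in for AskService; only the flag name comes from the PR."""

    def __init__(
        self,
        generate_reasoning: Callable[[str], str],
        generate_sql: Callable[[str, Optional[str]], str],
        allow_sql_generation_reasoning: bool = True,
    ) -> None:
        self._generate_reasoning = generate_reasoning
        self._generate_sql = generate_sql
        self._allow_sql_generation_reasoning = allow_sql_generation_reasoning

    def ask(self, query: str) -> dict:
        # The reasoning plan stays None when the feature is disabled.
        sql_generation_reasoning: Optional[str] = None
        if self._allow_sql_generation_reasoning:
            sql_generation_reasoning = self._generate_reasoning(query)
        sql = self._generate_sql(query, sql_generation_reasoning)
        return {"sql": sql, "reasoning": sql_generation_reasoning}


def fake_reasoning(query: str) -> str:
    # Placeholder for the reasoning pipeline (assumption for illustration).
    return f"1. Identify the tables needed to answer: {query}"


def fake_sql(query: str, reasoning: Optional[str]) -> str:
    # Placeholder for the SQL generation pipeline (assumption for illustration).
    return "SELECT 1"


with_plan = AskServiceSketch(fake_reasoning, fake_sql).ask("total orders")
without_plan = AskServiceSketch(
    fake_reasoning, fake_sql, allow_sql_generation_reasoning=False
).ask("total orders")
```

With the flag disabled, the reasoning step is never called and the SQL generator receives `None` for the plan.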

Possibly related PRs

Suggested reviewers

  • paopa

Poem

In a meadow of code, I hop and please,
Bringing new models and changes with ease.
Embedder switched, queries now soar,
SQL reasoning whispers, and pipelines explore.
With each hop, our project blooms more!

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🔭 Outside diff range comments (2)
wren-ai-service/tools/config/config.example.yaml (1)

59-59: Missing Configuration Option: allow_sql_generation_reasoning
According to the PR objectives, a new configuration option (allow_sql_generation_reasoning with a default value of True) should be added to the configuration. It is currently missing from this file. Please add the option in the appropriate section (for example, under a new settings block or within the llm configuration if that is where it logically belongs).

deployment/kustomizations/base/cm.yaml (1)

190-200: Missing Configuration Option: allow_sql_generation_reasoning in Settings
The PR objectives mention the addition of a new configuration option, allow_sql_generation_reasoning (defaulting to True), yet it is absent from the settings section in this ConfigMap’s config.yaml. Please consider adding this option under the settings block to ensure consistency with the PR requirements.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 200-200: no new line character at the end of file

(new-line-at-end-of-file)

🧹 Nitpick comments (3)
wren-ai-service/src/web/v1/services/ask.py (1)

278-303: Consider adding a comment explaining the SQL generation reasoning flow.

The response handling for SQL generation reasoning is well-implemented. Consider adding a brief comment explaining when and why SQL generation reasoning might be skipped to improve maintainability.

Add this comment before line 273:

+# Skip SQL generation reasoning if disabled in config or if results are already available
 if (
     not self._is_stopped(query_id)
     and not api_results
     and self._allow_sql_generation_reasoning
 ):
wren-ai-service/tools/config/config.example.yaml (1)

36-36: YAML Indentation Note
Static analysis reports a potential indentation issue on line 36. Although the list item under models: is generally indented by two spaces, please double-check that this indentation meets your project’s YAML style guidelines or adjust your YAMLlint configuration if needed.

🧰 Tools
🪛 YAMLlint (1.35.1)

[warning] 36-36: wrong indentation: expected 0 but found 2

(indentation)

deployment/kustomizations/base/cm.yaml (1)

171-171: Trailing Spaces Detected
Line 171 contains trailing spaces after litellm_embedder.text-embedding-3-large. Removing these ensures better adherence to YAML style standards. For example, apply the following diff:

-        embedder: litellm_embedder.text-embedding-3-large 
+        embedder: litellm_embedder.text-embedding-3-large
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 171-171: trailing spaces

(trailing-spaces)
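
The two YAMLlint findings reported for this file (trailing spaces and a missing newline at end of file) can also be checked with a few lines of Python. The helper names below are hypothetical, and this is only a sketch of the two checks, not a replacement for YAMLlint.

```python
def trailing_space_lines(text: str) -> list[int]:
    """Return 1-based numbers of lines that end with a space or tab."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line != line.rstrip(" \t")
    ]


def ends_with_newline(text: str) -> bool:
    """Return True when the text is empty or ends with a newline character."""
    return text == "" or text.endswith("\n")


# First line has a trailing space; the text has no final newline.
sample = (
    "embedder: litellm_embedder.text-embedding-3-large \n"
    "embedder: litellm_embedder.text-embedding-3-large"
)
flagged = trailing_space_lines(sample)
has_final_newline = ends_with_newline(sample)
```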

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b3543c9 and 566e372.

📒 Files selected for processing (12)
  • deployment/kustomizations/base/cm.yaml (3 hunks)
  • docker/.env.example (1 hunks)
  • docker/config.example.yaml (3 hunks)
  • wren-ai-service/demo/run_sql.py (1 hunks)
  • wren-ai-service/src/config.py (1 hunks)
  • wren-ai-service/src/globals.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/sql_generation.py (3 hunks)
  • wren-ai-service/src/pipelines/generation/sql_generation_reasoning.py (1 hunks)
  • wren-ai-service/src/pipelines/generation/utils/sql.py (0 hunks)
  • wren-ai-service/src/web/v1/services/ask.py (3 hunks)
  • wren-ai-service/tools/config/config.example.yaml (4 hunks)
  • wren-ai-service/tools/config/config.full.yaml (4 hunks)
💤 Files with no reviewable changes (1)
  • wren-ai-service/src/pipelines/generation/utils/sql.py
🧰 Additional context used
🪛 YAMLlint (1.35.1)
docker/config.example.yaml

[error] 123-123: trailing spaces

(trailing-spaces)

deployment/kustomizations/base/cm.yaml

[error] 171-171: trailing spaces

(trailing-spaces)

wren-ai-service/tools/config/config.example.yaml

[warning] 36-36: wrong indentation: expected 0 but found 2

(indentation)


[error] 141-141: trailing spaces

(trailing-spaces)

⏰ Context from checks skipped due to timeout of 90000ms (1)
  • GitHub Check: pytest
🔇 Additional comments (16)
wren-ai-service/src/config.py (1)

34-36: LGTM! Well-structured configuration addition.

The new configuration option is appropriately placed in the generation config section with a clear default value.

wren-ai-service/src/pipelines/generation/sql_generation_reasoning.py (1)

28-29: LGTM! Enhanced reasoning plan structure.

The new requirements improve the clarity of the reasoning plan by:

  1. Ensuring SQL is not included in the reasoning plan
  2. Enforcing a numbered step format with explicit reasoning
wren-ai-service/src/pipelines/generation/sql_generation.py (2)

49-52: LGTM! Clean template update.

The conditional rendering of the REASONING PLAN section is well-implemented and maintains backward compatibility.


64-64: LGTM! Proper type hints for optional parameter.

The sql_generation_reasoning parameter is correctly typed as str | None with a default value of None in both prompt() and run() methods.

Also applies to: 143-143
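
To illustrate the optional parameter, here is a minimal sketch of a prompt builder that adds a "REASONING PLAN" section only when a plan is supplied. The function `build_sql_prompt` and the section layout are assumptions for illustration; the actual pipeline renders a prompt template rather than concatenating strings.

```python
from typing import Optional


def build_sql_prompt(
    query: str,
    sql_generation_reasoning: Optional[str] = None,  # same meaning as str | None
) -> str:
    """Build a prompt, adding the reasoning section only when a plan is given."""
    sections = [f"### QUESTION ###\n{query}"]
    if sql_generation_reasoning:
        sections.append(f"### REASONING PLAN ###\n{sql_generation_reasoning}")
    return "\n\n".join(sections)


plain = build_sql_prompt("How many orders were placed?")
planned = build_sql_prompt(
    "How many orders were placed?",
    sql_generation_reasoning="1. Count rows in the orders table.",
)
```

Because the section is appended only when a plan exists, callers that never pass a plan get the same prompt as before.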

wren-ai-service/src/globals.py (1)

124-125: LGTM! Proper configuration integration.

The allow_sql_generation_reasoning parameter is correctly passed from settings to AskService initialization.
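
The wiring described here can be sketched as follows. `SettingsSketch`, `AskServiceStub`, and `create_ask_service` are hypothetical names, and a plain dataclass stands in for whatever settings class the project uses; only the option name and its default of True come from the PR.

```python
from dataclasses import dataclass


@dataclass
class SettingsSketch:
    # Option name and default taken from the PR description.
    allow_sql_generation_reasoning: bool = True


class AskServiceStub:
    """Hypothetical service that simply stores the flag it receives."""

    def __init__(self, allow_sql_generation_reasoning: bool = True) -> None:
        self.allow_sql_generation_reasoning = allow_sql_generation_reasoning


def create_ask_service(settings: SettingsSketch) -> AskServiceStub:
    # Forward the flag from settings to the service.
    return AskServiceStub(
        allow_sql_generation_reasoning=settings.allow_sql_generation_reasoning
    )


default_service = create_ask_service(SettingsSketch())
disabled_service = create_ask_service(
    SettingsSketch(allow_sql_generation_reasoning=False)
)
```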

wren-ai-service/src/web/v1/services/ask.py (2)

102-114: LGTM! Clean constructor changes.

The addition of the allow_sql_generation_reasoning parameter with a default value of True maintains backward compatibility while adding configuration flexibility.


147-147: LGTM! Well-structured control flow changes.

The addition of the sql_generation_reasoning variable and the modified condition properly integrate the new configuration option into the existing control flow.

Also applies to: 273-277

wren-ai-service/demo/run_sql.py (1)

69-69: Verify the increased query limit impact.

The query limit has been increased from 10 to 50 records. While this provides more comprehensive results, ensure that:

  1. The increased data volume aligns with the UI's rendering capabilities
  2. Memory usage remains within acceptable bounds
  3. Response times stay reasonable
✅ Verification successful

Verified: The increased limit is consistent with overall configuration limits.

  • The updated limit of 50 in wren-ai-service/demo/run_sql.py is modest compared to other demo and production settings (e.g., 100 in demo/utils.py and 500 in demo/app.py).
  • No conflicting configuration or evidence of performance/memory issues was found in the codebase.
  • It is advisable to test the UI and performance in practice, but current configurations indicate that retrieving 50 records is in line with the intended design.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if this limit aligns with other configuration limits
rg -A 1 "limit|max_" | grep -v "max_tokens\|max_completion_tokens"

Length of output: 56989

docker/.env.example (1)

29-29: Verify the development version naming convention.

The version has been changed to a development version with a future date (February 2025). While this might be intentional for testing, please ensure this aligns with your versioning strategy.

docker/config.example.yaml (1)

23-30: LGTM! New model configuration looks good.

The new model o3-mini-2025-01-31 is properly configured with appropriate parameters including reasoning effort settings.

wren-ai-service/tools/config/config.full.yaml (2)

5-30: LGTM! Model configurations are consistent.

The model configurations, including the new o3-mini-2025-01-31, match those in config.example.yaml.


34-39: Verify embedder configuration consistency.

The embedder configuration matches config.example.yaml. Please ensure all environment variables and endpoints are properly set up across different deployment environments.

✅ Verification successful

Embedder configuration consistency verified.

  • The config.full.yaml file matches config.example.yaml, with identical values for provider, model, api_base, api_key_name, and timeout.
  • Other configuration examples (ollama, groq, deepseek, google ai studio, azure) intentionally differ to suit their respective environments.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Compare embedder configurations across files
fd -e yaml -e yml | xargs rg "provider: litellm_embedder" -A 5

Length of output: 5262

wren-ai-service/tools/config/config.example.yaml (2)

23-30: New Model Addition: o3-mini-2025-01-31
The new model entry is added correctly with appropriate keys and values. Please verify that the parameters (e.g. using max_completion_tokens and reasoning_effort) align with the expected API behavior.


33-40: Embedder Provider Update
The embedder section has been updated to use litellm_embedder with the new API key name (EMBEDDER_OPENAI_API_KEY) and timeout. This is consistent with the intended provider change.

🧰 Tools
🪛 YAMLlint (1.35.1)

[warning] 36-36: wrong indentation: expected 0 but found 2

(indentation)

deployment/kustomizations/base/cm.yaml (2)

72-79: New Model Addition in ConfigMap
The new model o3-mini-2025-01-31 is integrated into the llm models list with the necessary properties. Ensure that this new entry is fully supported by downstream services.


82-89: Embedder Configuration Update
The embedder configuration now uses litellm_embedder and updates the API key to EMBEDDER_OPENAI_API_KEY. This update is consistent with the changes in the tools config.

Comment thread docker/config.example.yaml

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
wren-ai-service/src/pipelines/generation/intent_classification.py (2)

241-258: Consider enhancing error handling in post_process.

The current error handling catches all exceptions and returns a default TEXT_TO_SQL intent. This could mask specific issues and make debugging harder.

Consider this improvement:

     except Exception:
+        logger.exception("Failed to parse intent classification results")
+        if not classify_intent.get("replies"):
+            raise ValueError("No replies from intent classifier")
+        if not classify_intent.get("replies")[0]:
+            raise ValueError("Empty reply from intent classifier")
         return {
             "intent": "TEXT_TO_SQL",
             "rephrased_question": "",
             "reasoning": "",
             "db_schemas": construct_db_schemas,
         }
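
One way to keep the fallback while making failures visible is to log the exception before returning the default intent. The sketch below rests on several assumptions: `post_process_sketch` is a hypothetical name, the reply is assumed to be a JSON object with `intent`, `rephrased_question`, and `reasoning` keys, the `GENERAL` intent value in the usage lines is illustrative, and the standard `json` module is used regardless of what the pipeline actually uses.

```python
import json
import logging

logger = logging.getLogger("intent_classification_sketch")


def post_process_sketch(classify_intent: dict, db_schemas: list) -> dict:
    """Parse the first reply; log and fall back to TEXT_TO_SQL on any failure."""
    try:
        reply = json.loads(classify_intent["replies"][0])
        return {
            "intent": reply["intent"],
            "rephrased_question": reply.get("rephrased_question", ""),
            "reasoning": reply.get("reasoning", ""),
            "db_schemas": db_schemas,
        }
    except Exception:
        # Record the traceback so a silent fallback can still be diagnosed.
        logger.exception("Failed to parse intent classification results")
        return {
            "intent": "TEXT_TO_SQL",
            "rephrased_question": "",
            "reasoning": "",
            "db_schemas": db_schemas,
        }


parsed = post_process_sketch(
    {"replies": ['{"intent": "GENERAL", "reasoning": "greeting"}']}, []
)
fallback = post_process_sketch({"replies": []}, ["schema"])
```

Unlike the suggested diff, this version never raises from inside the `except` block, so callers keep receiving a usable default while the log retains the cause.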

269-277: Document the model configuration structure.

The INTENT_CLASSIFICAION_MODEL_KWARGS configuration would benefit from documentation explaining the JSON schema structure and its purpose.

Consider adding an explanatory comment:

 INTENT_CLASSIFICAION_MODEL_KWARGS = {
+    # Configuration for the LLM to ensure structured JSON output
+    # following the IntentClassificationResult schema
     "response_format": {
         "type": "json_schema",
         "json_schema": {
             "name": "intent_classification",
             "schema": IntentClassificationResult.model_json_schema(),
         },
     }
 }
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8771880 and a999d90.

📒 Files selected for processing (1)
  • wren-ai-service/src/pipelines/generation/intent_classification.py (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: validate-pull-request-title
  • GitHub Check: pytest
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
🔇 Additional comments (1)
wren-ai-service/src/pipelines/generation/intent_classification.py (1)

34-34: Verify handling of time/date formats in user queries.

The change improves the logic by only adding time/date formats when relevant, preventing unnecessary format additions in non-temporal queries.

Let's verify the handling of time/date formats in user queries:

✅ Verification successful

Time/date conditional handling verified.
The changes in the intent classification pipeline correctly restrict time/date formatting to queries that include temporal expressions, thereby preventing unnecessary modifications to non-temporal questions. The test cases (e.g., in wren-ai-service/tests/pytest/services/test_semantics_description.py) and multiple similar implementations across the repository confirm that the updated logic behaves as expected.

  • wren-ai-service/src/pipelines/generation/intent_classification.py: Verifies that time/date formatting is only applied when relevant.
  • Test cases: Indicate proper handling of temporal queries.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Search for test cases and examples involving time/date formats
# to ensure proper handling of temporal queries.

# Test 1: Look for test cases involving time/date queries
rg -l "test.*time|test.*date" --type py

# Test 2: Look for example queries with temporal expressions
rg -i "last.*|next.*|previous.*|today|tomorrow|yesterday" --type py

# Test 3: Search for datetime parsing or formatting logic
rg -i "datetime|strftime|strptime|timestamp" --type py

Length of output: 14300

@paopa paopa left a comment

LGTM

Labels

ci/ai-service ai-service related module/ai-service ai-service related

2 participants