feat(wren-ai-service): Add SQL Correction Service and API Endpoints by paopa · Pull Request #1420 · Canner/WrenAI

paopa · 2025-03-18T03:20:03Z

This PR introduces a new SQL Correction Service that helps users fix invalid SQL queries through a dedicated API endpoint.

Key changes:

- Added new SqlCorrectionService with background task processing
- Implemented /sql-corrections endpoints for POST and GET operations
- Integrated service with existing container and router infrastructure
- Added error handling and caching mechanisms

The service accepts invalid SQL queries with error messages and returns corrected SQL statements through an asynchronous process. Users can track the correction status using the provided event ID.

Endpoints:

- POST /sql-corrections: Submit invalid SQL for correction
- GET /sql-corrections/{event_id}: Check correction status and results
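
From the client's side, the submit-then-poll flow might look like the sketch below. It is not the project's client code: the `fetch` callable is a hypothetical stand-in for an HTTP GET to /sql-corrections/{event_id}, and the `finished` status name is an assumption (only `failed` is named in this PR's review).

```python
import time
from typing import Callable, Dict

# Statuses after which polling stops. "failed" is mentioned in this PR's
# review; "finished" is an assumed name for the success state.
TERMINAL_STATUSES = {"finished", "failed"}


def wait_for_correction(
    fetch: Callable[[str], Dict],
    event_id: str,
    interval_s: float = 1.0,
    max_attempts: int = 30,
) -> Dict:
    """Poll `fetch(event_id)` until the event reaches a terminal status.

    `fetch` is a hypothetical stand-in for GET /sql-corrections/{event_id}
    and is expected to return a dict with at least a "status" key.
    """
    for _ in range(max_attempts):
        event = fetch(event_id)
        if event.get("status") in TERMINAL_STATUSES:
            return event
        time.sleep(interval_s)
    raise TimeoutError(f"event {event_id} did not finish in {max_attempts} polls")
```

Bounding the number of attempts keeps a caller from waiting forever on an event that has expired from the cache.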

Summary by CodeRabbit

New Features
- Introduced a new SQL correction capability that lets users submit SQL queries with errors for automatic correction.
- Users now receive a unique event ID upon submission, enabling them to track the status and results of their correction requests through dedicated endpoints.
- Added a new service for managing SQL correction requests, enhancing error management and event tracking.

coderabbitai · 2025-03-18T03:20:10Z

Walkthrough

This changeset introduces a new SQL correction service into the application. A new sql_correction_service is added to the ServiceContainer and instantiated in the create_service_container function with a dedicated pipeline. Additionally, the FastAPI routing is extended by adding a new router for SQL corrections, complete with POST and GET endpoints. A new service class, SqlCorrectionService, is implemented to process correction requests asynchronously, incorporating caching, error handling, and event tracking. Minor error instantiation adjustments were also made in the instructions service.
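
The asynchronous flow described in this walkthrough can be sketched roughly as below. This is a simplified stand-in, not the code in the PR: `TinyCorrectionService`, the `fix_sql` callable, the plain dict used as the event store, and the status names `correcting` and `finished` are all assumptions made for illustration.

```python
import asyncio
import uuid
from typing import Awaitable, Callable, Dict


class TinyCorrectionService:
    """Hypothetical miniature of a submit-then-track correction service."""

    def __init__(self, fix_sql: Callable[[str, str], Awaitable[str]]):
        self._fix_sql = fix_sql             # stand-in for the real pipeline
        self._events: Dict[str, dict] = {}  # stand-in for the TTL cache

    def new_event(self) -> str:
        """Register an event and return its ID immediately."""
        event_id = str(uuid.uuid4())
        self._events[event_id] = {"status": "correcting"}
        return event_id

    async def correct(self, event_id: str, sql: str, error: str) -> None:
        """Background work: run the correction and record the outcome."""
        try:
            corrected = await self._fix_sql(sql, error)
            self._events[event_id] = {"status": "finished", "sql": corrected}
        except Exception as exc:  # record the failure instead of raising
            self._events[event_id] = {
                "status": "failed",
                "error": {"code": "OTHERS", "message": str(exc)},
            }

    def get(self, event_id: str) -> dict:
        """Unknown or expired IDs fall back to a failed event."""
        fallback = {
            "status": "failed",
            "error": {"code": "OTHERS", "message": f"{event_id} is not found"},
        }
        return self._events.get(event_id, fallback)
```

In a FastAPI router, `correct` would typically be scheduled as a background task right after `new_event` returns, so the POST response carries only the event ID.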

Changes

| Files | Change Summary |
| --- | --- |
| `wren-ai-service/src/globals.py`, `wren-ai-service/src/web/v1/services/__init__.py` | Added `sql_correction_service` attribute to `ServiceContainer` and updated `create_service_container` to instantiate it; exported `SqlCorrectionService` in the services module. |
| `wren-ai-service/src/web/v1/routers/__init__.py`, `wren-ai-service/src/web/v1/routers/sql_corrections.py` | Integrated the new `sql_corrections` router into the main routing and added endpoints (POST for initiating corrections and GET for status retrieval). |
| `wren-ai-service/src/web/v1/services/sql_corrections.py` | Introduced `SqlCorrectionService` with asynchronous correction processing, error handling, caching, and event tracking; added nested models: `CorrectionRequest`, `Error`, and `Event`. |
| `wren-ai-service/src/web/v1/services/instructions.py` | Updated error instantiation in the error handling methods (switched from `self.Event.Error` to `self.Error`). |

Possibly related PRs

- fix(wren-ai-service): column pruning step in retrieval pipeline and correct code for sql generation in evaluation #1225: Related because it also changes the SQL generation and retrieval pipelines, notably by introducing sql_generation_reasoning, while this PR adds sql_correction_service to manage SQL correction tasks.
- feat(wren-ai-service): Add invalid SQL tracking to AskResultResponse #1356: Related because it adds an invalid_sql attribute to AskResultResponse to track invalid SQL queries, while this PR adds sql_correction_service to handle SQL corrections; both extend SQL handling and error reporting in the service.

Suggested labels

module/ui

Suggested reviewers

cyyeh

Poem

In a burrow of code, I hop with delight,
New SQL corrections shine so bright.
Endpoints spring to life with async might,
Caching each event like stars in the night.
With errors handled neat and dreams in flight,
I, the coding bunny, celebrate this upgrade right! 🐰✨

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (4)

wren-ai-service/src/web/v1/routers/sql_corrections.py (2)

76-97: Ensure pipeline availability and consider concurrency.

The code references service_container.sql_correction_service and background tasks without verifying that the sql_correction pipeline exists. If it’s misconfigured or missing, a KeyError might occur.

Consider whether unlimited concurrent SQL correction tasks might strain resources. You could implement a queue or concurrency limit to prevent potential load spikes.

Would you like me to provide a script to search for pipeline instantiation references in the codebase to ensure that "sql_correction" is always defined?
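
One common way to cap concurrent work in asyncio code is a semaphore. The sketch below is an assumption about how such a limit could look, not something present in the PR; `run_limited` and `max_concurrent` are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_limited(
    jobs: List[Callable[[], Awaitable[str]]],
    max_concurrent: int = 4,
) -> List[str]:
    """Run all jobs, but never more than `max_concurrent` at the same time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(job: Callable[[], Awaitable[str]]) -> str:
        # Each job waits here until one of the `max_concurrent` slots is free.
        async with semaphore:
            return await job()

    # gather preserves the order of the input jobs in its result list.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

A semaphore only delays excess work; if requests can arrive faster than they are processed for long periods, a bounded queue that rejects new submissions may be the better fit.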

107-113: Consider returning a 404 for unknown events.

Currently, if the event_id is unknown, the service returns a default failed status. From an API design perspective, returning HTTP 404 might be more intuitive to indicate that the event does not exist.

wren-ai-service/src/web/v1/services/sql_corrections.py (2)

27-35: Reevaluate cache size and TTL.

maxsize=1_000_000 and ttl=120 can quickly consume memory if there are frequent requests with minimal cache rotation. Consider adjusting the limits or setting up monitoring to ensure stability under high load.
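
As a rough way to reason about this, the number of live entries in a TTL cache cannot exceed either maxsize or the number of requests that arrive within one TTL window. The helper below is a back-of-the-envelope sketch; the request rate and per-entry size are made-up inputs, not measurements from this service.

```python
from typing import Tuple


def estimate_cache_footprint(
    requests_per_s: float,
    ttl_s: int = 120,
    maxsize: int = 1_000_000,
    bytes_per_entry: int = 2_048,
) -> Tuple[int, int]:
    """Return (live_entries, approx_bytes) for a TTL cache at steady state.

    Assumes one new entry per request and that entries expire after `ttl_s`.
    `bytes_per_entry` is a guess; measure real Event objects before relying on it.
    """
    live_entries = min(maxsize, int(requests_per_s * ttl_s))
    return live_entries, live_entries * bytes_per_entry
```

At 50 requests per second, for example, roughly 6,000 entries are alive at once, far below the 1,000,000 cap, so under these assumptions the TTL rather than maxsize is the effective bound.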

36-50: Consider enriching log details on exceptions.

_handle_exception currently logs only the message. It might be helpful to include more details (e.g., stack traces) to accelerate troubleshooting in production environments.
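
A minimal sketch of what logging the stack trace could look like with the standard logging module follows. The function name `handle_exception`, the logger name, and the returned payload shape are illustrative assumptions, not the PR's _handle_exception.

```python
import logging

logger = logging.getLogger("sql-correction-sketch")


def handle_exception(event_id: str, exc: Exception) -> dict:
    """Log the failure with its traceback and return a failed-event payload."""
    # Passing the exception as exc_info attaches its traceback to the record.
    logger.error("sql correction %s failed: %s", event_id, exc, exc_info=exc)
    return {
        "status": "failed",
        "error": {"code": "OTHERS", "message": str(exc)},
    }
```

Inside an `except` block, `logger.exception(...)` achieves the same result without passing the exception explicitly.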

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4d4792e and 1a7906a.

📒 Files selected for processing (6)

wren-ai-service/src/globals.py (2 hunks)
wren-ai-service/src/web/v1/routers/__init__.py (2 hunks)
wren-ai-service/src/web/v1/routers/sql_corrections.py (1 hunks)
wren-ai-service/src/web/v1/services/__init__.py (2 hunks)
wren-ai-service/src/web/v1/services/instructions.py (2 hunks)
wren-ai-service/src/web/v1/services/sql_corrections.py (1 hunks)

🧰 Additional context used

🧬 Code Definitions (2)

wren-ai-service/src/globals.py (3)

wren-ai-service/src/pipelines/generation/sql_correction.py (1)

SQLCorrection (99:143)

wren-ai-service/src/web/v1/services/sql_corrections.py (2)

wren-ai-service/src/web/v1/routers/sql_corrections.py (2)

correct (77:96)
get (108:113)

⏰ Context from checks skipped due to timeout of 90000ms (4)

GitHub Check: pytest
GitHub Check: pytest
GitHub Check: Analyze (javascript-typescript)
GitHub Check: Analyze (go)

🔇 Additional comments (12)

wren-ai-service/src/web/v1/services/__init__.py (2)

72-72: Implementation looks good

The import for the new SqlCorrectionService is properly added, following the established pattern in the codebase.

87-87: Correctly exported the new service

The SqlCorrectionService is appropriately added to the __all__ list, making it available for import by other modules.

wren-ai-service/src/web/v1/routers/__init__.py (2)

14-14: Import looks good

The sql_corrections module is properly imported, consistent with the pattern for other routers.

34-34: Router integration is correct

The sql_corrections router is properly integrated into the main router, maintaining the established pattern in the codebase.

wren-ai-service/src/web/v1/services/instructions.py (2)

54-54: Error instantiation refactoring looks good

Changed from self.Event.Error to self.Error, simplifying the error instantiation approach and likely aligning with patterns used in the new SqlCorrectionService.

154-154: Consistent error handling approach

Updated to use self.Error directly, matching the change made in the _handle_exception method and maintaining consistency throughout the service.

wren-ai-service/src/globals.py (2)

30-30: Service container attribute looks good

The SqlCorrectionService is properly added to the ServiceContainer class, maintaining consistent structure with other services.

261-269: Service instantiation looks good

The SqlCorrectionService is properly instantiated with the correct pipeline configuration, following the established pattern in the codebase. The service reuses the existing sql_correction pipeline that's already used elsewhere in the application (e.g., in sql_expansion_service), which is a good practice for code reuse.

wren-ai-service/src/web/v1/routers/sql_corrections.py (2)

19-63: Good documentation for the new endpoints.

The detailed docstring provides clarity regarding the endpoint usage and workflow, which is helpful for both developers and API consumers.

66-70: Validate optional fields.

project_id is optional, which is fine. However, consider validating or normalizing the project ID if it's a critical piece of metadata. This ensures that downstream services won't fail due to an unexpected format or missing value.

wren-ai-service/src/web/v1/services/sql_corrections.py (2)

15-25: Clear event modeling.

The Error and Event models neatly encapsulate all necessary properties, facilitating straightforward data handling. This approach makes it easy to track the status of each correction request and attach relevant error information.

107-120: Consistent approach for unknown events.

The fallback to a failed event is consistent with your design. However, if you decide to return a 404 in the router, ensure that the same logic is reflected here by throwing a custom exception or returning a sentinel value that the router can interpret.
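
The custom-exception route suggested here could be sketched as follows. `EventNotFound`, `get_event`, and `handle_get` are hypothetical names, and the plain dict stands in for the service's cache; the second function only mimics what a router would do.

```python
from typing import Dict, Tuple


class EventNotFound(KeyError):
    """Hypothetical exception for IDs that are unknown or already expired."""


def get_event(events: Dict[str, dict], event_id: str) -> dict:
    """Service side: raise instead of fabricating a failed event."""
    try:
        return events[event_id]
    except KeyError:
        raise EventNotFound(event_id) from None


def handle_get(events: Dict[str, dict], event_id: str) -> Tuple[int, dict]:
    """Router side: translate the exception into an HTTP-style status code."""
    try:
        return 200, get_event(events, event_id)
    except EventNotFound:
        return 404, {"detail": f"event {event_id} not found"}
```

In a FastAPI router the `except` branch would usually raise `HTTPException(status_code=404)` rather than return a tuple.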

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (9)

wren-ai-service/src/globals.py (2)

30-30: Add a concise docstring or comment for “sql_correction_service.”
Currently, there is no description for this attribute. Including a docstring or comment clarifying its purpose would improve maintainability and readability.

261-269: Consider consolidating repeated pipeline instances.
A sql_correction pipeline is added here, but a similar pipeline reference appears in other services (e.g., ask_service at line 116). Reusing or consolidating this pipeline might avoid duplication and reduce overhead.

wren-ai-service/src/web/v1/routers/sql_corrections.py (2)

76-97: Background task usage is correct; consider adding error response.
While the primary logic is in the background task, you may optionally provide an immediate check that the pipeline key exists (like you do in the service). If missing, you might return a 400 or informative response. Currently, the code always defers checking to the background task.

107-113: Return 404 or similar for missing events.
Currently, if an event is missing, the code returns a failed status with OTHERS error. Consider using HTTP status 404 to reflect that the resource (event) does not exist.

wren-ai-service/src/web/v1/services/sql_corrections.py (5)

16-19: Error model is sufficient for general failures.
Provides a helpful structure (code, message). If you anticipate more error codes, consider enumerating them.
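
If more error codes are expected, one way to enumerate them is a string-valued Enum. Only OTHERS is named in this review; the other two members below are hypothetical placeholders, as is the `to_error` helper.

```python
from enum import Enum


class ErrorCode(str, Enum):
    OTHERS = "OTHERS"                      # the only code named in this review
    INVALID_REQUEST = "INVALID_REQUEST"    # hypothetical
    PIPELINE_MISSING = "PIPELINE_MISSING"  # hypothetical


def to_error(code: ErrorCode, message: str) -> dict:
    """Build an error payload with the (code, message) structure."""
    return {"code": code.value, "message": message}
```

Because the members are also strings, they serialize to JSON as their plain values, and an unknown code such as `ErrorCode("TYPO")` raises `ValueError` instead of passing through silently.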

27-35: Cache-based storage is straightforward but be mindful of concurrency.
TTLCache is not inherently thread-safe, which matters if it is accessed from multiple threads. If concurrency grows, you may need a lock or a thread-safe cache variant.
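
The lock-based option can be sketched with the standard library alone. `LockedStore` is a hypothetical wrapper, and a plain dict stands in for the TTL cache so the example stays self-contained; the locking pattern is the point.

```python
import threading
from typing import Any, Dict, Optional


class LockedStore:
    """Dict-like store whose reads and writes are serialized by one lock."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}  # stand-in for the TTL cache
        self._lock = threading.Lock()

    def __setitem__(self, key: str, value: Any) -> None:
        with self._lock:
            self._data[key] = value

    def get(self, key: str, default: Optional[Any] = None) -> Any:
        with self._lock:
            return self._data.get(key, default)

    def __len__(self) -> int:
        with self._lock:
            return len(self._data)
```

A lock only protects threads inside one process. If the service runs as several worker processes, each process holds its own cache, so an event stored by one worker is invisible to the others unless a shared store is used.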

36-50: Exception handling logs the error but consider advanced logging strategies.
When an exception occurs, you set the event to “failed” and store the error. This is good. For large-scale debugging, consider structured logging or linking a correlation ID.

57-106: Check for missing pipeline keys before usage.
You do self._pipelines["sql_correction"] directly. If sql_correction is unexpectedly absent, you raise a KeyError. You have a fallback in _handle_exception, but a pre-check with a friendlier error might improve resilience.
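
A pre-check of this kind might look like the sketch below. `PipelineNotConfigured` and `require_pipeline` are hypothetical names introduced for illustration; they are not part of the PR.

```python
from typing import Any, Mapping


class PipelineNotConfigured(RuntimeError):
    """Hypothetical error raised when a required pipeline is absent."""


def require_pipeline(pipelines: Mapping[str, Any], name: str) -> Any:
    """Return pipelines[name], or fail with a message that lists what exists."""
    pipeline = pipelines.get(name)
    if pipeline is None:
        available = ", ".join(sorted(pipelines)) or "none"
        raise PipelineNotConfigured(
            f"pipeline '{name}' is not configured (available: {available})"
        )
    return pipeline
```

Compared with a bare KeyError, the message names both the missing key and the configured ones, which tends to shorten the search for a misconfiguration.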

107-120: Graceful handling of missing or expired events.
Returns a “failed” event with message. This is user-friendly, but consider also returning an HTTP 404 in the router if an event is absent.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1a7906a and caa3f87.

📒 Files selected for processing (6)

wren-ai-service/src/globals.py (2 hunks)
wren-ai-service/src/web/v1/routers/__init__.py (2 hunks)
wren-ai-service/src/web/v1/routers/sql_corrections.py (1 hunks)
wren-ai-service/src/web/v1/services/__init__.py (2 hunks)
wren-ai-service/src/web/v1/services/instructions.py (2 hunks)
wren-ai-service/src/web/v1/services/sql_corrections.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (3)

wren-ai-service/src/web/v1/services/instructions.py
wren-ai-service/src/web/v1/services/__init__.py
wren-ai-service/src/web/v1/routers/__init__.py

🧰 Additional context used

🧬 Code Definitions (3)

wren-ai-service/src/web/v1/routers/sql_corrections.py (1)

wren-ai-service/src/web/v1/services/sql_corrections.py (4)

SqlCorrectionService (15:122)
correct (59:105)
Event (20:25)
CorrectionRequest (51:55)

wren-ai-service/src/globals.py (1)

wren-ai-service/src/pipelines/generation/sql_correction.py (1)

SQLCorrection (99:143)

wren-ai-service/src/web/v1/services/sql_corrections.py (1)

wren-ai-service/src/web/v1/routers/sql_corrections.py (2)

correct (77:96)
get (108:113)

⏰ Context from checks skipped due to timeout of 90000ms (3)

GitHub Check: pytest
GitHub Check: Analyze (javascript-typescript)
GitHub Check: Analyze (go)

🔇 Additional comments (9)

wren-ai-service/src/web/v1/routers/sql_corrections.py (5)

1-14: Imports and module setup look good.
All necessary imports for FastAPI, Pydantic, and service dependencies are properly organized. No issues noted here.

16-63: Docstring thoroughly documents the router’s purpose.
The router-level docstring is comprehensive and outlines usage, endpoints, and request/response models. Nicely done for maintainability.

66-70: Validate nullable fields in PostRequest.
project_id is optional, but consider confirming that an empty string vs. null is handled appropriately downstream.
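
One way to make the empty-string and null cases behave identically downstream is to normalize the value once at the boundary. The helper below is a hypothetical sketch; whether blank IDs should really be treated as missing is a design decision the PR does not state.

```python
from typing import Optional


def normalize_project_id(project_id: Optional[str]) -> Optional[str]:
    """Collapse None, "" and whitespace-only values into None."""
    if project_id is None:
        return None
    stripped = project_id.strip()
    return stripped or None
```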

72-74: PostResponse data structure is minimal but sufficient.
Only event_id is returned. This is clear for asynchronous processes. No issues found.

99-105: GetResponse model is straightforward.
It closely parallels the event’s structure in SqlCorrectionService. No issues found.

wren-ai-service/src/web/v1/services/sql_corrections.py (4)

1-4: Imports and logging setup are fine.
No extraneous imports or unused references detected here.

20-26: Event model design is clear.
Covers typical states for asynchronous tasks, including an optional trace_id for debugging. Good approach.

51-56: CorrectionRequest structure is well-defined.
Captures essentials (sql, error, project_id). This is adequate for passing correction data.

121-123: __setitem__ usage is consistent with your caching approach.
Directly assigning the Event object is fine; just keep concurrency considerations in mind if usage increases.

cyyeh

TODO: we'll improve this PR by adding sql functions, instructions etc.

paopa added the module/ai-service and ci/ai-service labels Mar 18, 2025

github-actions Bot added the wren-ai-service label Mar 18, 2025

coderabbitai Bot reviewed Mar 18, 2025

Comment thread wren-ai-service/src/web/v1/services/sql_corrections.py

paopa added 9 commits March 18, 2025 17:25

feat: impl svc for sql correction

ea1503f

feat: add service into global container

c3f7b15

feat: impl sql correction router

46482d5

chore: modify the endpoint spec

24e8961

chore: refactor the code for service

665be7d

fix: invoking the error type

79a8078

feat: remove document class to avoid PydanticSchemaGenerationError

ea2f419

feat: modify the interface spec for sql correction router and svc

54ef445

feat: simplify the output spec

caa3f87

paopa force-pushed the feat/sql-correcting-endpoint branch from 383d804 to caa3f87 Compare March 18, 2025 09:25

coderabbitai Bot reviewed Mar 18, 2025

cyyeh approved these changes Mar 19, 2025

cyyeh merged commit 450f94b into main Mar 19, 2025

cyyeh deleted the feat/sql-correcting-endpoint branch March 19, 2025 05:51

coderabbitai Bot mentioned this pull request May 6, 2025

feat(wren-ai-service): fix by ai endpoint (ai-env-changed) #1628

Merged

coderabbitai Bot mentioned this pull request Jun 30, 2025

chore(wren-ai-service): add retrieved_tables to sql correction api #1804

Merged
