feat(wren-ai-service): Add SQL Correction Service and API Endpoints #1420

Merged
cyyeh merged 9 commits into main from feat/sql-correcting-endpoint on Mar 19, 2025

@paopa paopa commented Mar 18, 2025

This PR introduces a new SQL Correction Service that helps users fix invalid SQL queries through a dedicated API endpoint.

Key changes:

  • Added new SqlCorrectionService with background task processing
  • Implemented /sql-corrections endpoints for POST and GET operations
  • Integrated service with existing container and router infrastructure
  • Added error handling and caching mechanisms

The service accepts invalid SQL queries with error messages and returns corrected SQL statements through an asynchronous process. Users can track the correction status using the provided event ID.

Endpoints:

  • POST /sql-corrections: Submit invalid SQL for correction
  • GET /sql-corrections/{event_id}: Check correction status and results
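
The submit-then-poll flow behind these two endpoints can be sketched without any web framework. This is a minimal illustration only: the `CorrectionStore` class, the status names, the field names, and the placeholder "correction" logic are assumptions for the example, not the PR's actual implementation.

```python
import asyncio
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class Event:
    # Field and status names are illustrative, not the PR's actual schema.
    event_id: str
    status: str = "correcting"  # "correcting" | "finished" | "failed"
    response: Optional[str] = None
    error: Optional[str] = None


class CorrectionStore:
    """In-memory stand-in for the service and its event cache."""

    def __init__(self) -> None:
        self._events = {}
        self._tasks = set()  # keep references so tasks are not garbage collected

    def submit(self, sql: str, error: str) -> str:
        """Mimics POST /sql-corrections: register an event, start work, return its ID."""
        event_id = str(uuid.uuid4())
        self._events[event_id] = Event(event_id)
        task = asyncio.get_running_loop().create_task(
            self._correct(event_id, sql, error)
        )
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        return event_id

    async def _correct(self, event_id: str, sql: str, error: str) -> None:
        event = self._events[event_id]
        try:
            await asyncio.sleep(0)  # placeholder for the real correction pipeline
            event.response = sql.replace("SELEC ", "SELECT ")  # hypothetical fix
            event.status = "finished"
        except Exception as exc:
            event.status = "failed"
            event.error = str(exc)

    def get(self, event_id: str) -> Optional[Event]:
        """Mimics GET /sql-corrections/{event_id}: return the current event state."""
        return self._events.get(event_id)


async def demo() -> Event:
    store = CorrectionStore()
    event_id = store.submit("SELEC * FROM orders", "syntax error at or near SELEC")
    # The caller polls with the event ID until the background task completes.
    while store.get(event_id).status == "correcting":
        await asyncio.sleep(0.01)
    return store.get(event_id)
```

The caller gets the event ID back immediately and learns the outcome only by polling, which is the contract the two endpoints describe.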

Summary by CodeRabbit

  • New Features
    • Introduced a new SQL correction capability that lets users submit SQL queries with errors for automatic correction.
    • Users now receive a unique event ID upon submission, enabling them to track the status and results of their correction requests through dedicated endpoints.
    • Added a new service for managing SQL correction requests, enhancing error management and event tracking.

@paopa added the module/ai-service and ci/ai-service labels Mar 18, 2025

coderabbitai Bot commented Mar 18, 2025

Walkthrough

This changeset introduces a new SQL correction service into the application. A new sql_correction_service is added to the ServiceContainer and instantiated in the create_service_container function with a dedicated pipeline. Additionally, the FastAPI routing is extended by adding a new router for SQL corrections, complete with POST and GET endpoints. A new service class, SqlCorrectionService, is implemented to process correction requests asynchronously, incorporating caching, error handling, and event tracking. Minor error instantiation adjustments were also made in the instructions service.

Changes

| Files | Change Summary |
| --- | --- |
| wren-ai-service/src/globals.py, wren-ai-service/src/web/v1/services/__init__.py | Added sql_correction_service attribute to ServiceContainer and updated create_service_container to instantiate it; exported SqlCorrectionService in the services module. |
| wren-ai-service/src/web/v1/routers/__init__.py, wren-ai-service/src/web/v1/routers/sql_corrections.py | Integrated the new sql_corrections router into the main routing and added endpoints (POST for initiating corrections and GET for status retrieval). |
| wren-ai-service/src/web/v1/services/sql_corrections.py | Introduced SqlCorrectionService with asynchronous correction processing, error handling, caching, and event tracking; added nested models: CorrectionRequest, Error, and Event. |
| wren-ai-service/src/web/v1/services/instructions.py | Updated error instantiation in the error handling methods (switched from self.Event.Error to self.Error). |

Suggested labels

module/ui

Suggested reviewers

  • cyyeh

Poem

In a burrow of code, I hop with delight,
New SQL corrections shine so bright.
Endpoints spring to life with async might,
Caching each event like stars in the night.
With errors handled neat and dreams in flight,
I, the coding bunny, celebrate this upgrade right! 🐰✨

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (4)
wren-ai-service/src/web/v1/routers/sql_corrections.py (2)

76-97: Ensure pipeline availability and consider concurrency.

  1. The code references service_container.sql_correction_service and background tasks without verifying that the sql_correction pipeline exists. If it’s misconfigured or missing, a KeyError might occur.
  2. Consider whether unlimited concurrent SQL correction tasks might strain resources. You could implement a queue or concurrency limit to prevent potential load spikes.
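
One way to cap concurrent correction tasks is an `asyncio.Semaphore`. The sketch below is an assumption about how such a limit could look; `BoundedRunner`, `fake_correction`, and the limit of 3 are hypothetical and not part of the PR.

```python
import asyncio


class BoundedRunner:
    """Runs coroutines while allowing at most `limit` of them at once."""

    def __init__(self, limit: int) -> None:
        self._semaphore = asyncio.Semaphore(limit)
        self.active = 0
        self.peak = 0  # highest concurrency observed, for demonstration only

    async def run(self, job):
        # Callers beyond the limit wait here until a slot frees up.
        async with self._semaphore:
            self.active += 1
            self.peak = max(self.peak, self.active)
            try:
                return await job()
            finally:
                self.active -= 1


async def fake_correction() -> str:
    # Stand-in for a call into the sql_correction pipeline.
    await asyncio.sleep(0.01)
    return "ok"


async def demo(limit: int = 3, jobs: int = 10):
    runner = BoundedRunner(limit)
    results = await asyncio.gather(
        *(runner.run(fake_correction) for _ in range(jobs))
    )
    return results, runner.peak
```

Requests above the limit are delayed rather than rejected; whether that or an explicit queue with back-pressure fits better depends on the expected load.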

Would you like me to provide a script to search for pipeline instantiation references in the codebase to ensure that "sql_correction" is always defined?


107-113: Consider returning a 404 for unknown events.

Currently, if the event_id is unknown, the service returns a default failed status. From an API design perspective, returning HTTP 404 might be more intuitive to indicate that the event does not exist.

wren-ai-service/src/web/v1/services/sql_corrections.py (2)

27-35: Reevaluate cache size and TTL.

maxsize=1_000_000 and ttl=120 can quickly consume memory if there are frequent requests with minimal cache rotation. Consider adjusting the limits or setting up monitoring to ensure stability under high load.
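
To make the trade-off concrete, here is a standard-library sketch of a store bounded by both size and TTL. It is a stand-in written for illustration, not the cache class the service uses, and it assumes `maxsize` is at least 1; the injectable `clock` parameter exists only so the behaviour can be checked without waiting.

```python
import time
from collections import OrderedDict


class SmallTTLStore:
    """Tiny size- and TTL-bounded store (illustrative stand-in)."""

    def __init__(self, maxsize: int, ttl: float, clock=time.monotonic) -> None:
        self._maxsize = maxsize
        self._ttl = ttl
        self._clock = clock
        self._data = OrderedDict()  # key -> (expires_at, value)

    def _purge(self) -> None:
        now = self._clock()
        # The TTL is constant, so insertion order equals expiry order.
        while self._data:
            key, (expires_at, _) = next(iter(self._data.items()))
            if expires_at > now:
                break
            del self._data[key]

    def __setitem__(self, key, value) -> None:
        self._purge()
        self._data.pop(key, None)  # re-inserting moves the key to the end
        while len(self._data) >= self._maxsize:
            self._data.popitem(last=False)  # evict the oldest entry
        self._data[key] = (self._clock() + self._ttl, value)

    def get(self, key, default=None):
        self._purge()
        item = self._data.get(key)
        return default if item is None else item[1]

    def __len__(self) -> int:
        self._purge()
        return len(self._data)
```

With a small `maxsize`, memory stays bounded regardless of request rate, at the cost of evicting events that clients may still be polling for.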


36-50: Consider enriching log details on exceptions.

_handle_exception currently logs only the message. It might be helpful to include more details (e.g., stack traces) to accelerate troubleshooting in production environments.
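
A sketch of what richer logging could look like with the standard `logging` module. The `handle_exception` function, the logger name, and the returned dictionary shape are assumptions for illustration; only the `exc_info` mechanism is standard-library behaviour.

```python
import io
import logging

logger = logging.getLogger("sql_correction_sketch")


def handle_exception(exc: Exception, event_id: str) -> dict:
    # exc_info attaches the traceback to the log record, so the stack trace
    # is written to the log output alongside the message.
    logger.error(
        "sql correction failed (event_id=%s): %s", event_id, exc, exc_info=exc
    )
    return {"status": "failed", "error": {"code": "OTHERS", "message": str(exc)}}


def demo() -> str:
    """Triggers a KeyError and returns what was written to the log."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    logger.addHandler(handler)
    logger.propagate = False  # keep demo output out of the root logger
    try:
        try:
            {}["sql_correction"]  # simulate a missing pipeline key
        except KeyError as exc:
            handle_exception(exc, "evt-1")
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()
```

Inside an `except` block, `logger.exception(...)` has the same effect as `logger.error(..., exc_info=True)`.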

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4d4792e and 1a7906a.

📒 Files selected for processing (6)
  • wren-ai-service/src/globals.py (2 hunks)
  • wren-ai-service/src/web/v1/routers/__init__.py (2 hunks)
  • wren-ai-service/src/web/v1/routers/sql_corrections.py (1 hunks)
  • wren-ai-service/src/web/v1/services/__init__.py (2 hunks)
  • wren-ai-service/src/web/v1/services/instructions.py (2 hunks)
  • wren-ai-service/src/web/v1/services/sql_corrections.py (1 hunks)
🧰 Additional context used
🧬 Code Definitions (2)
wren-ai-service/src/globals.py (3)
wren-ai-service/src/pipelines/generation/sql_correction.py (1) (1)
  • SQLCorrection (99:143)
wren-ai-service/src/web/v1/services/sql_corrections.py (2)
wren-ai-service/src/web/v1/routers/sql_corrections.py (2) (2)
  • correct (77:96)
  • get (108:113)
⏰ Context from checks skipped due to timeout of 90000ms (4)
  • GitHub Check: pytest
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
🔇 Additional comments (12)
wren-ai-service/src/web/v1/services/__init__.py (2)

72-72: Implementation looks good

The import for the new SqlCorrectionService is properly added, following the established pattern in the codebase.


87-87: Correctly exported the new service

The SqlCorrectionService is appropriately added to the __all__ list, making it available for import by other modules.

wren-ai-service/src/web/v1/routers/__init__.py (2)

14-14: Import looks good

The sql_corrections module is properly imported, consistent with the pattern for other routers.


34-34: Router integration is correct

The sql_corrections router is properly integrated into the main router, maintaining the established pattern in the codebase.

wren-ai-service/src/web/v1/services/instructions.py (2)

54-54: Error instantiation refactoring looks good

Changed from self.Event.Error to self.Error, simplifying the error instantiation approach and likely aligning with patterns used in the new SqlCorrectionService.


154-154: Consistent error handling approach

Updated to use self.Error directly, matching the change made in the _handle_exception method and maintaining consistency throughout the service.

wren-ai-service/src/globals.py (2)

30-30: Service container attribute looks good

The SqlCorrectionService is properly added to the ServiceContainer class, maintaining consistent structure with other services.


261-269: Service instantiation looks good

The SqlCorrectionService is properly instantiated with the correct pipeline configuration, following the established pattern in the codebase. The service reuses the existing sql_correction pipeline that's already used elsewhere in the application (e.g., in sql_expansion_service), which is a good practice for code reuse.

wren-ai-service/src/web/v1/routers/sql_corrections.py (2)

19-63: Good documentation for the new endpoints.

The detailed docstring provides clarity regarding the endpoint usage and workflow, which is helpful for both developers and API consumers.


66-70: Validate optional fields.

project_id is optional, which is fine. However, consider validating or normalizing the project ID if it's a critical piece of metadata. This ensures that downstream services won't fail due to an unexpected format or missing value.

wren-ai-service/src/web/v1/services/sql_corrections.py (2)

15-25: Clear event modeling.

The Error and Event models neatly encapsulate all necessary properties, facilitating straightforward data handling. This approach makes it easy to track the status of each correction request and attach relevant error information.


107-120: Consistent approach for unknown events.

The fallback to a failed event is consistent with your design. However, if you decide to return a 404 in the router, ensure that the same logic is reflected here by throwing a custom exception or returning a sentinel value that the router can interpret.
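
The custom-exception variant could look roughly like this. It is a framework-free sketch: `EventNotFound`, `Service`, and `get_endpoint` are hypothetical names, and the handler returns a `(status_code, body)` tuple instead of a real HTTP response.

```python
class EventNotFound(Exception):
    """Raised when no event exists for the given ID (hypothetical name)."""


class Service:
    """Minimal stand-in for the service's event lookup."""

    def __init__(self) -> None:
        self._cache = {}

    def __setitem__(self, event_id: str, event: dict) -> None:
        self._cache[event_id] = event

    def __getitem__(self, event_id: str) -> dict:
        try:
            return self._cache[event_id]
        except KeyError:
            # Signal "unknown or expired" explicitly instead of
            # fabricating a failed event.
            raise EventNotFound(event_id) from None


def get_endpoint(service: Service, event_id: str) -> tuple:
    """Stand-in for the GET handler: returns (status_code, body)."""
    try:
        return 200, service[event_id]
    except EventNotFound:
        return 404, {"detail": f"event {event_id} not found"}
```

In a FastAPI router, the 404 branch would typically raise `HTTPException(status_code=404, ...)` rather than return a tuple.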

Comment thread wren-ai-service/src/web/v1/services/sql_corrections.py
@paopa paopa force-pushed the feat/sql-correcting-endpoint branch from 383d804 to caa3f87 Compare March 18, 2025 09:25

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (9)
wren-ai-service/src/globals.py (2)

30-30: Add a concise docstring or comment for “sql_correction_service.”
Currently, there is no description for this attribute. Including a docstring or comment clarifying its purpose would improve maintainability and readability.


261-269: Consider consolidating repeated pipeline instances.
A sql_correction pipeline is added here, but a similar pipeline reference appears in other services (e.g., ask_service at line 116). Reusing or consolidating this pipeline might avoid duplication and reduce overhead.

wren-ai-service/src/web/v1/routers/sql_corrections.py (2)

76-97: Background task usage is correct; consider adding error response.
While the primary logic is in the background task, you may optionally provide an immediate check that the pipeline key exists (like you do in the service). If missing, you might return a 400 or informative response. Currently, the code always defers checking to the background task.


107-113: Return 404 or similar for missing events.
Currently, if an event is missing, the code returns a failed status with an OTHERS error code. Consider using HTTP status 404 to reflect that the resource (event) does not exist.

wren-ai-service/src/web/v1/services/sql_corrections.py (5)

16-19: Error model is sufficient for general failures.
Provides a helpful structure (code, message). If you anticipate more error codes, consider enumerating them.
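
Enumerating the codes might look like the sketch below. Only `OTHERS` appears in this review; `INVALID_SQL`, `TIMEOUT`, `CorrectionError`, and `to_error` are hypothetical and shown purely to illustrate the shape.

```python
from dataclasses import dataclass
from enum import Enum


class ErrorCode(str, Enum):
    OTHERS = "OTHERS"
    INVALID_SQL = "INVALID_SQL"  # hypothetical code
    TIMEOUT = "TIMEOUT"  # hypothetical code


@dataclass(frozen=True)
class CorrectionError:
    code: ErrorCode
    message: str


def to_error(code: str, message: str) -> CorrectionError:
    """Maps a raw code string to the enum, falling back to OTHERS."""
    try:
        parsed = ErrorCode(code)
    except ValueError:
        parsed = ErrorCode.OTHERS
    return CorrectionError(parsed, message)
```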


27-35: Cache-based storage is straightforward but be mindful of concurrency.
TTLCache is not inherently threadsafe if used in a multi-worker environment. If concurrency grows, you may need a lock or a thread-safe cache variant.
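
If the cache is ever touched from multiple threads, a lock around each access is one option. The sketch below uses a plain dict as a stand-in for the cache; `LockedStore` is a hypothetical name. Note that a lock only coordinates threads within one process; separate worker processes each hold their own in-memory cache regardless.

```python
import threading


class LockedStore:
    """Dict-backed store whose operations are serialized by a lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._data = {}

    def __setitem__(self, key, value) -> None:
        with self._lock:
            self._data[key] = value

    def get(self, key, default=None):
        with self._lock:
            return self._data.get(key, default)

    def update(self, key, fn, default=None):
        """Read-modify-write performed under a single lock acquisition."""
        with self._lock:
            self._data[key] = fn(self._data.get(key, default))
            return self._data[key]


def demo(threads: int = 8, increments: int = 1000) -> int:
    store = LockedStore()

    def worker() -> None:
        for _ in range(increments):
            store.update("count", lambda v: v + 1, default=0)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return store.get("count")
```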


36-50: Exception handling logs the error but consider advanced logging strategies.
When an exception occurs, you set the event to “failed” and store the error. This is good. For large-scale debugging, consider structured logging or linking a correlation ID.


57-106: Check for missing pipeline keys before usage.
You do self._pipelines["sql_correction"] directly. If sql_correction is unexpectedly absent, you raise a KeyError. You have a fallback in _handle_exception, but a pre-check with a friendlier error might improve resilience.
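
A pre-check with a friendlier error could be as small as the helper below. `PipelineNotConfigured` and `require_pipeline` are hypothetical names; the sketch assumes the pipelines are held in a plain dict keyed by name.

```python
class PipelineNotConfigured(RuntimeError):
    """Hypothetical error type for a missing pipeline."""


def require_pipeline(pipelines: dict, name: str):
    """Returns the named pipeline or raises an error listing what is available."""
    try:
        return pipelines[name]
    except KeyError:
        available = ", ".join(sorted(pipelines)) or "none"
        raise PipelineNotConfigured(
            f"pipeline '{name}' is not configured (available: {available})"
        ) from None
```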


107-120: Graceful handling of missing or expired events.
Returns a “failed” event with message. This is user-friendly, but consider also returning an HTTP 404 in the router if an event is absent.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1a7906a and caa3f87.

📒 Files selected for processing (6)
  • wren-ai-service/src/globals.py (2 hunks)
  • wren-ai-service/src/web/v1/routers/__init__.py (2 hunks)
  • wren-ai-service/src/web/v1/routers/sql_corrections.py (1 hunks)
  • wren-ai-service/src/web/v1/services/__init__.py (2 hunks)
  • wren-ai-service/src/web/v1/services/instructions.py (2 hunks)
  • wren-ai-service/src/web/v1/services/sql_corrections.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • wren-ai-service/src/web/v1/services/instructions.py
  • wren-ai-service/src/web/v1/services/__init__.py
  • wren-ai-service/src/web/v1/routers/__init__.py
🧰 Additional context used
🧬 Code Definitions (3)
wren-ai-service/src/web/v1/routers/sql_corrections.py (1)
wren-ai-service/src/web/v1/services/sql_corrections.py (4) (4)
  • SqlCorrectionService (15:122)
  • correct (59:105)
  • Event (20:25)
  • CorrectionRequest (51:55)
wren-ai-service/src/globals.py (1)
wren-ai-service/src/pipelines/generation/sql_correction.py (1) (1)
  • SQLCorrection (99:143)
wren-ai-service/src/web/v1/services/sql_corrections.py (1)
wren-ai-service/src/web/v1/routers/sql_corrections.py (2) (2)
  • correct (77:96)
  • get (108:113)
⏰ Context from checks skipped due to timeout of 90000ms (3)
  • GitHub Check: pytest
  • GitHub Check: Analyze (javascript-typescript)
  • GitHub Check: Analyze (go)
🔇 Additional comments (9)
wren-ai-service/src/web/v1/routers/sql_corrections.py (5)

1-14: Imports and module setup look good.
All necessary imports for FastAPI, Pydantic, and service dependencies are properly organized. No issues noted here.


16-63: Docstring thoroughly documents the router’s purpose.
The router-level docstring is comprehensive and outlines usage, endpoints, and request/response models. Nicely done for maintainability.


66-70: Validate nullable fields in PostRequest.
project_id is optional, but consider confirming that an empty string vs. null is handled appropriately downstream.
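
One possible normalization, assuming that a blank or whitespace-only `project_id` should be treated the same as a missing one (that policy is an assumption, not something the PR states):

```python
from typing import Optional


def normalize_project_id(value: Optional[str]) -> Optional[str]:
    """Collapses None, empty, and whitespace-only IDs to None."""
    if value is None:
        return None
    stripped = value.strip()
    return stripped or None
```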


72-74: PostResponse data structure is minimal but sufficient.
Only event_id is returned. This is clear for asynchronous processes. No issues found.


99-105: GetResponse model is straightforward.
It closely parallels the event’s structure in SqlCorrectionService. No issues found.

wren-ai-service/src/web/v1/services/sql_corrections.py (4)

1-4: Imports and logging setup are fine.
No extraneous imports or unused references detected here.


20-26: Event model design is clear.
Covers typical states for asynchronous tasks, including an optional trace_id for debugging. Good approach.


51-56: CorrectionRequest structure is well-defined.
Captures essentials (sql, error, project_id). This is adequate for passing correction data.


121-123: __setitem__ usage is consistent with your caching approach.
Directly assigning the Event object is fine; just keep concurrency considerations in mind if usage increases.

@cyyeh cyyeh left a comment

TODO: we'll improve this PR by adding sql functions, instructions etc.

Labels

ci/ai-service, module/ai-service, wren-ai-service

2 participants