feat(wren-ai-service): user guide and misleading streaming (ai-env-changed) #1015
Walkthrough
This pull request introduces new pipeline entries for misleading and user guide assistance across various configuration files and Python modules. It adds new settings (
Changes
Sequence Diagram(s)
sequenceDiagram
participant S as Service Container
participant C as create_service_container
participant U as fetch_wren_ai_docs
participant IC as IntentClassification
participant UA as UserGuideAssistance
S->>C: Initialize container with settings
C->>U: Call fetch_wren_ai_docs(doc_endpoint, is_oss)
U-->>C: Return documentation data
C->>IC: Instantiate with wren_ai_docs
C->>UA: Instantiate with wren_ai_docs
sequenceDiagram
participant U as User
participant A as Ask Service
participant IC as IntentClassification Pipeline
participant UA as UserGuideAssistance Pipeline
U->>A: Send query
A->>A: Determine intent (USER_GUIDE vs MISLEADING_QUERY)
alt Intent is USER_GUIDE
A->>UA: Route query to UserGuideAssistance
else Intent is MISLEADING_QUERY
A->>IC: Route query to IntentClassification
end
UA-->>A: Return result
IC-->>A: Return result
A->>U: Respond with processed output
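The routing step in the second diagram can be sketched as a small dispatch table. This is only an illustration under stated assumptions: the two handler coroutines are hypothetical stand-ins for the real pipelines, and `MISLEADING_QUERY` is mapped to a misleading-assistance handler, following the later comments on `ask.py` in this thread rather than the diagram's arrow to IntentClassification.

```python
import asyncio
from typing import Awaitable, Callable, Dict


# Hypothetical stand-ins for the real pipelines. Only the intent labels
# USER_GUIDE and MISLEADING_QUERY are taken from the review thread.
async def user_guide_assistance(query: str) -> str:
    return f"user-guide answer for: {query}"


async def misleading_assistance(query: str) -> str:
    return f"clarification for: {query}"


# Dispatch table: intent label -> coroutine that handles the query.
HANDLERS: Dict[str, Callable[[str], Awaitable[str]]] = {
    "USER_GUIDE": user_guide_assistance,
    "MISLEADING_QUERY": misleading_assistance,
}


async def route(intent: str, query: str) -> str:
    """Send the query to the handler registered for the classified intent."""
    handler = HANDLERS.get(intent)
    if handler is None:
        raise ValueError(f"no assistance pipeline registered for intent {intent!r}")
    return await handler(query)


if __name__ == "__main__":
    print(asyncio.run(route("USER_GUIDE", "How do I add a model?")))
```

A table keeps the set of supported intents in one place, so an unregistered intent fails loudly instead of falling through silently.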
Actionable comments posted: 5
🧹 Nitpick comments (9)
wren-ai-service/docs/config_examples/config.deepseek.yaml (1)
132-133: Misleading Assistance Pipeline in Deepseek Configuration
The new `misleading_assistance` pipeline entry is added with the default LLM configuration. Given that Deepseek configurations sometimes use their own models (as seen with other entries), please verify whether using `litellm_llm.default` is intentional here or if a Deepseek-specific model should be applied.
wren-ai-service/tests/data/config.test.yaml (1)
78-82: LGTM: New pipeline entries added, but remove trailing whitespace.
The new `user_guide_assistance` and `data_assistance` pipelines have been correctly added with the appropriate model, but there's trailing whitespace on line 82 that should be removed.
    -  - name: data_assistance
    -    llm: openai_llm.gpt-4o-mini
    -
    +  - name: data_assistance
    +    llm: openai_llm.gpt-4o-mini
🧰 Tools
🪛 YAMLlint (1.35.1)
[error] 82-82: trailing spaces
(trailing-spaces)
wren-ai-service/src/pipelines/generation/misleading_assistance.py (3)
49-68: Consider adding docstrings for clarity.
This `prompt` function constructs a prompt for handling a potentially misleading user query. Adding a short docstring would help future contributors understand its purpose and expected input/output at a glance.
99-108: Confirm no memory leaks in `_streaming_callback`.
Each new `query_id` spawns a fresh queue. If the user disconnects before reading, these queues could remain unconsumed. Consider a cleanup strategy for stale queues if not consumed within a certain time.
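One possible cleanup strategy is to record when each queue was last touched and sweep out entries older than a time-to-live. The sketch below is not the pipeline's actual code: the `StreamQueues` class, its method names, and the 120-second default are assumptions made for illustration.

```python
import asyncio
import time
from typing import Dict, List, Optional, Tuple


class StreamQueues:
    """Per-query queues with a last-touched timestamp (hypothetical helper)."""

    def __init__(self, ttl_seconds: float = 120.0) -> None:
        self._ttl = ttl_seconds
        self._queues: Dict[str, Tuple[asyncio.Queue, float]] = {}

    def touch(self, query_id: str, now: Optional[float] = None) -> asyncio.Queue:
        """Return the queue for query_id, creating it if needed, and refresh its timestamp."""
        now = time.monotonic() if now is None else now
        entry = self._queues.get(query_id)
        queue = entry[0] if entry is not None else asyncio.Queue()
        self._queues[query_id] = (queue, now)
        return queue

    def sweep(self, now: Optional[float] = None) -> List[str]:
        """Drop queues not touched within the TTL and return their ids."""
        now = time.monotonic() if now is None else now
        stale = [qid for qid, (_, seen) in self._queues.items() if now - seen > self._ttl]
        for qid in stale:
            del self._queues[qid]
        return stale

    def active(self) -> List[str]:
        """Ids of the queues currently held."""
        return list(self._queues)
```

`sweep` would need to be called periodically, for example from a background task. Passing `now` explicitly keeps the helper easy to test without sleeping.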
134-154: Confirm returned data shape in `run`.
The method returns the result from `self._pipe.execute(...)`. Document the structure (e.g. dict with “replies” or similar) to clarify to consumers.
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (2)
43-55: Consider adding docstrings for prompt construction.
The `prompt` function merges user input, language, and docs. Including a brief docstring specifying parameters and returns would help maintainability.
90-99: Evaluate potential queue growth.
Like in `misleading_assistance`, consider a cleanup policy if user queues are never drained. Otherwise, these queues could accumulate indefinitely in long-running services.
wren-ai-service/src/pipelines/generation/intent_classification.py (2)
28-28: Clarify classification instructions.
This added line mentions four conditions but lumps them into a single sentence. Consider providing a concise bullet list for readability.
337-338: Neat approach for flexible config injection.
Accepting `wren_ai_docs` via `_configs` is an elegant way to keep the pipeline modular. Just confirm that each consumer expects these docs.
Also applies to: 361-363, 390-390
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (21)
deployment/kustomizations/base/cm.yaml (3 hunks)
docker/config.example.yaml (3 hunks)
wren-ai-service/docs/config_examples/config.anthropic.yaml (1 hunks)
wren-ai-service/docs/config_examples/config.azure.yaml (1 hunks)
wren-ai-service/docs/config_examples/config.deepseek.yaml (1 hunks)
wren-ai-service/docs/config_examples/config.google_ai_studio.yaml (1 hunks)
wren-ai-service/docs/config_examples/config.groq.yaml (1 hunks)
wren-ai-service/docs/config_examples/config.lm_studio.yaml (1 hunks)
wren-ai-service/docs/config_examples/config.ollama.yaml (1 hunks)
wren-ai-service/src/config.py (1 hunks)
wren-ai-service/src/globals.py (3 hunks)
wren-ai-service/src/pipelines/generation/__init__.py (3 hunks)
wren-ai-service/src/pipelines/generation/intent_classification.py (10 hunks)
wren-ai-service/src/pipelines/generation/misleading_assistance.py (1 hunks)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1 hunks)
wren-ai-service/src/utils.py (2 hunks)
wren-ai-service/src/web/v1/services/ask.py (4 hunks)
wren-ai-service/tests/data/config.test.yaml (1 hunks)
wren-ai-service/tests/pytest/services/test_ask.py (2 hunks)
wren-ai-service/tools/config/config.example.yaml (3 hunks)
wren-ai-service/tools/config/config.full.yaml (3 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (5)
wren-ai-service/src/globals.py (2)
  wren-ai-service/src/utils.py (1)
    fetch_wren_ai_docs (163-185)
  wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1)
    UserGuideAssistance (65-141)
wren-ai-service/tests/pytest/services/test_ask.py (1)
  wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1)
    UserGuideAssistance (65-141)
wren-ai-service/src/pipelines/generation/__init__.py (2)
  wren-ai-service/src/pipelines/generation/misleading_assistance.py (2)
    misleading_assistance (71-72)
    MisleadingAssistance (78-154)
  wren-ai-service/src/pipelines/generation/user_guide_assistance.py (2)
    user_guide_assistance (58-59)
    UserGuideAssistance (65-141)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (5)
  wren-ai-service/src/core/pipeline.py (1)
    BasicPipeline (15-21)
  wren-ai-service/src/core/provider.py (1)
    LLMProvider (6-15)
  wren-ai-service/src/pipelines/generation/intent_classification.py (2)
    prompt (262-284)
    run (370-392)
  wren-ai-service/src/pipelines/generation/data_assistance.py (5)
    prompt (51-67)
    run (135-154)
    _streaming_callback (99-107)
    get_streaming_results (109-132)
    _get_streaming_results (110-111)
  wren-ai-service/src/pipelines/common.py (1)
    dry_run_pipeline (36-59)
wren-ai-service/src/web/v1/services/ask.py (3)
  wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1)
    run (125-141)
  wren-ai-service/src/pipelines/generation/intent_classification.py (1)
    run (370-392)
  wren-ai-service/src/pipelines/generation/followup_sql_generation_reasoning.py (1)
    run (173-196)
🪛 GitHub Actions: AI Service Test
wren-ai-service/tests/pytest/services/test_ask.py
[error] 39-39: TypeError: IntentClassification.__init__() missing 1 required positional argument: 'wren_ai_docs'
[error] 233-233: AssertionError: assert 'failed' == 'finished'
[error] 623-623: KeyError: 'misleading_assistance'
🪛 YAMLlint (1.35.1)
wren-ai-service/tests/data/config.test.yaml
[error] 82-82: trailing spaces
(trailing-spaces)
🔇 Additional comments (40)
wren-ai-service/docs/config_examples/config.anthropic.yaml (1)
104-105: New Pipeline Entry for Misleading Assistance Added
The introduction of the `misleading_assistance` pipeline entry with `llm: litellm_llm.default` is correctly implemented. Please verify that this new entry is aligned with the overall intent of the PR. Additionally, since the PR title mentions a second component (user guide assistance), confirm whether a corresponding `user_guide_assistance` entry should be added here or is handled elsewhere.
wren-ai-service/docs/config_examples/config.azure.yaml (1)
107-108: New Misleading Assistance Pipeline Entry in Azure Configuration
The newly added `misleading_assistance` entry appears properly formatted and positioned within the `pipes` list. Please verify that its placement and the use of `litellm_llm.default` are intentional for this Azure-specific configuration. Consistency with other environments is critical.
wren-ai-service/docs/config_examples/config.groq.yaml (1)
113-114: Addition of Misleading Assistance Pipeline in GROQ Config
The update adds the `misleading_assistance` pipeline with the appropriate LLM configuration. This change looks consistent with the other configuration files. As with the other files, please ensure that if a `user_guide_assistance` component is expected per PR objectives, it is added where appropriate.
wren-ai-service/docs/config_examples/config.google_ai_studio.yaml (1)
118-119: Misleading Assistance Pipeline Added to Google AI Studio Config
The new pipeline entry for `misleading_assistance` using `litellm_llm.default` is correctly introduced. Please double-check that this entry meets the intended configuration requirements for the Google AI Studio environment. Also, confirm if a similar entry for `user_guide_assistance` should be integrated following the PR objectives.
wren-ai-service/docs/config_examples/config.lm_studio.yaml (1)
112-113: New misleading_assistance pipeline properly integrated.
The addition of the `misleading_assistance` pipeline with `litellm_llm.default` as the LLM provider is correctly configured and follows the same pattern as other pipeline entries in this configuration file.
wren-ai-service/docs/config_examples/config.ollama.yaml (1)
110-111: New misleading_assistance pipeline properly integrated.
The addition of the `misleading_assistance` pipeline with `litellm_llm.default` as the LLM provider is correctly configured and follows the same pattern as other pipeline entries in this configuration file.
wren-ai-service/tools/config/config.example.yaml (3)
122-123: New misleading_assistance pipeline properly integrated.
The addition of the `misleading_assistance` pipeline with `litellm_llm.default` as the LLM provider is correctly configured and follows the same pattern as other pipeline entries in this configuration file.
137-138: New user_guide_assistance pipeline properly integrated.
The addition of the `user_guide_assistance` pipeline with `litellm_llm.default` as the LLM provider is correctly configured and follows the same pattern as other pipeline entries in this configuration file.
166-167: New documentation settings added correctly.
The new settings `doc_endpoint` and `is_oss` are properly defined with appropriate values. These settings will be used by the `fetch_wren_ai_docs` function as seen in other files.
wren-ai-service/tools/config/config.full.yaml (3)
122-123: New misleading_assistance pipeline properly integrated.
The addition of the `misleading_assistance` pipeline with `litellm_llm.default` as the LLM provider is correctly configured and follows the same pattern as other pipeline entries in this configuration file.
137-138: New user_guide_assistance pipeline properly integrated.
The addition of the `user_guide_assistance` pipeline with `litellm_llm.default` as the LLM provider is correctly configured and follows the same pattern as other pipeline entries in this configuration file.
166-167: New documentation settings added correctly.
The new settings `doc_endpoint` and `is_oss` are properly defined with appropriate values. These settings will be used by the `fetch_wren_ai_docs` function as seen in other files.
wren-ai-service/src/globals.py (5)
10-10: Import for new documentation fetching utility added correctly.
The import for `fetch_wren_ai_docs` from `src.utils` is properly added to support the new documentation-related functionality.
48-49: Documentation fetching implemented correctly.
The code correctly calls the `fetch_wren_ai_docs` function with the appropriate settings parameters to retrieve the documentation needed for the new assistance pipelines.
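How `doc_endpoint` and `is_oss` might combine into a documentation URL can be sketched as follows. This is a guess at the shape, not the real `fetch_wren_ai_docs`: the helper name `build_docs_url` is hypothetical, the `/oss/llms.md` path is taken from the verification script quoted later in this thread, and the non-OSS path is not shown anywhere here, so the caller has to supply it.

```python
from typing import Optional


def build_docs_url(
    doc_endpoint: str,
    is_oss: bool,
    non_oss_path: Optional[str] = None,  # assumption: not specified in this thread
) -> str:
    """Build the URL of the docs index file for the selected edition."""
    base = doc_endpoint.rstrip("/")
    if is_oss:
        # Path observed in the review's curl check against docs.getwren.ai.
        return f"{base}/oss/llms.md"
    if non_oss_path is None:
        raise ValueError("non_oss_path is required when is_oss is False")
    return f"{base}/{non_oss_path.strip('/')}/llms.md"


if __name__ == "__main__":
    print(build_docs_url("https://docs.getwren.ai", is_oss=True))
```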
86-89: Documentation data properly passed to IntentClassification component.
The `wren_ai_docs` variable is correctly passed to the `IntentClassification` pipeline component, enabling it to access the documentation data for user intent analysis.
90-92: MisleadingAssistance component properly integrated.
The initialization of the `MisleadingAssistance` pipeline component is correctly implemented using the appropriate pipe components configuration.
96-99: UserGuideAssistance component properly integrated with documentation data.
The `UserGuideAssistance` pipeline component is correctly initialized with both the pipe components configuration and the `wren_ai_docs` data needed for providing documentation-based assistance.
wren-ai-service/src/config.py (1)
57-60: LGTM: User guide configuration added correctly.
The new user guide configuration section with `is_oss` and `doc_endpoint` fields is well documented and follows the established pattern in the codebase. The default values look appropriate.
docker/config.example.yaml (3)
109-111: LGTM: New misleading_assistance pipeline added.
The new pipeline entry has been correctly configured to use the `litellm_llm.default` model.
128-129: LGTM: New user_guide_assistance pipeline added.
The new pipeline entry has been correctly configured to use the `litellm_llm.default` model.
153-154: LGTM: New configuration settings added.
The `doc_endpoint` and `is_oss` settings are correctly added to the example configuration file with appropriate values.
wren-ai-service/tests/data/config.test.yaml (1)
88-89: LGTM: New settings configuration added.
The `doc_endpoint` and `is_oss` settings are correctly added to the test configuration file with appropriate values.
wren-ai-service/src/pipelines/generation/__init__.py (4)
7-7: LGTM: Added import for MisleadingAssistance.
The import for the new `MisleadingAssistance` class has been correctly added in alphabetical order.
20-20: LGTM: Added import for UserGuideAssistance.
The import for the new `UserGuideAssistance` class has been correctly added in alphabetical order.
38-39: LGTM: Added UserGuideAssistance to `__all__` list.
The `UserGuideAssistance` class has been correctly added to the `__all__` list.
42-42: LGTM: Added MisleadingAssistance to `__all__` list.
The `MisleadingAssistance` class has been correctly added to the `__all__` list.
deployment/kustomizations/base/cm.yaml (3)
157-159: LGTM! Added pipeline component for misleading assistance.
The new pipeline component `misleading_assistance` has been properly configured to use the `litellm_llm.default` model.
176-178: LGTM! Added pipeline component for user guide assistance.
The new pipeline component `user_guide_assistance` has been properly configured to use the `litellm_llm.default` model.
200-201: LGTM! Added new settings for documentation endpoint and OSS flag.
The settings `doc_endpoint` and `is_oss` have been properly added to the configuration. These will be used by the `fetch_wren_ai_docs` function to determine which documentation to fetch.
wren-ai-service/src/web/v1/services/ask.py (3)
102-104: LGTM! Added `general_type` field to response models.
The `general_type` field correctly categorizes the type of assistance provided in the response.
Also applies to: 109-111
307-327: LGTM! Added handling for the `MISLEADING_QUERY` intent.
The code now properly handles the `MISLEADING_QUERY` intent by creating an asynchronous task to run the `misleading_assistance` pipeline.
354-371: LGTM! Added handling for the `USER_GUIDE` intent.
The code now properly handles the `USER_GUIDE` intent by creating an asynchronous task to run the `user_guide_assistance` pipeline.
wren-ai-service/src/pipelines/generation/misleading_assistance.py (2)
70-73: Validate the existence of prompt data before usage.
In `misleading_assistance`, ensure `prompt.get("prompt")` always returns a valid string. If it’s `None`, the generator call might raise an exception.
Do we have upstream checks guaranteeing `"prompt"` is populated?
109-133: Assess the infinite loop risk.
The `while True:` loop in `get_streaming_results` terminates on either a `<DONE>` token or a 120s timeout. While this works, verify that large or slow streams won’t surpass 120 seconds, leading to unintended breaks.
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1)
58-60: Check for null or empty prompt string.
In `user_guide_assistance`, verify the prompt dictionary actually contains `"prompt"`. Otherwise, the generator call may fail.
wren-ai-service/src/pipelines/generation/intent_classification.py (5)
59-72: Ensure the “GENERAL” category covers only non-schema questions.
The updated definition for GENERAL might overlap with MISLEADING_QUERY. Double-check the boundary conditions so the pipeline’s logic remains consistent.
73-84: Good expansion for USER_GUIDE category.
Introducing “USER_GUIDE” clarifies queries focused on Wren AI usage. This is a well-structured addition that helps separate schema-related from guide-related queries.
106-107: Consistent naming for JSON output.
The expanded `"results"` field includes `"USER_GUIDE"`. Ensure all downstream parsers handle this new value without errors.
264-264: Validate wren_ai_docs usage.
You’ve introduced `wren_ai_docs` to the prompt and pass it to `docs=wren_ai_docs`. Verify these docs are relevant in all flows and that no sensitive internal data is unintentionally exposed.
Also applies to: 283-283
315-315: Allow for new classification results.
The expanded `results` literal supports the new “USER_GUIDE” category. This is a clean approach that helps ensure type consistency with Pydantic.
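The idea of constraining `results` to a fixed set of labels can be illustrated with the standard library alone. This sketch does not reproduce the project's Pydantic model. `GENERAL`, `MISLEADING_QUERY`, and `USER_GUIDE` appear in this thread; the fourth label, `TEXT_TO_SQL`, is not named here and is an assumption.

```python
from typing import Literal, get_args

# Four classification labels; TEXT_TO_SQL is assumed, the rest come from the review.
Intent = Literal["MISLEADING_QUERY", "TEXT_TO_SQL", "GENERAL", "USER_GUIDE"]


def parse_intent(value: str) -> str:
    """Return value unchanged if it is one of the allowed labels, else raise."""
    allowed = get_args(Intent)
    if value not in allowed:
        raise ValueError(f"unknown intent {value!r}; expected one of {allowed}")
    return value


if __name__ == "__main__":
    print(parse_intent("USER_GUIDE"))
```

A downstream parser that validates against the same `Literal` fails fast when a new label is added in one place but not the other.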
Actionable comments posted: 8
🧹 Nitpick comments (18)
wren-ai-service/docs/config_examples/config.ollama.yaml (1)
110-111: New Pipeline Entry for Misleading Assistance:
A new pipeline entry named `misleading_assistance` (using `litellm_llm.default`) has been added. Please verify that its structure (with only the `llm` field) is sufficient for the intended assistance functionality. Also, the PR title suggests that a “user guide” related feature is expected; confirm if a corresponding `user_guide_assistance` entry should also be introduced.
wren-ai-service/docs/config_examples/config.groq.yaml (1)
113-114: New Pipeline Entry for Misleading Assistance:
The file now includes a `misleading_assistance` pipeline entry with `litellm_llm.default`. Ensure this addition is consistent with similar entries in other configuration examples, and check if additional similar pipelines (e.g. for user guide assistance) are expected according to the PR objectives.
wren-ai-service/docs/config_examples/config.lm_studio.yaml (1)
112-113: New Misleading Assistance Pipeline Entry:
A new pipeline named `misleading_assistance` using `litellm_llm.default` has been added. Please confirm that this entry meets the design requirements for assistance features and consider whether a separate `user_guide_assistance` pipeline is also required.
wren-ai-service/docs/config_examples/config.anthropic.yaml (1)
104-105: Introduced Misleading Assistance Pipeline:
The newly introduced `misleading_assistance` pipeline using `litellm_llm.default` aligns with the ongoing assistance enhancements. Confirm that this configuration fits into the overall strategy for handling misleading queries and that no additional assistance pipelines (e.g. `user_guide_assistance`) are missed.
wren-ai-service/docs/config_examples/config.azure.yaml (1)
107-108: New Misleading Assistance Pipeline Entry:
The file now includes a `misleading_assistance` entry with `litellm_llm.default`. Ensure that this entry is consistent with its counterparts in other configuration files and supports the intended misleading streaming functionality described in the PR.
wren-ai-service/tests/data/config.test.yaml (1)
82-82: Remove trailing whitespace.
There's trailing whitespace at the end of line 82 that should be removed.
    -  - name: data_assistance
    -    llm: openai_llm.gpt-4o-mini
    -
    +  - name: data_assistance
    +    llm: openai_llm.gpt-4o-mini
🧰 Tools
🪛 YAMLlint (1.35.1)
[error] 82-82: trailing spaces
(trailing-spaces)
wren-ai-service/tests/pytest/services/test_ask.py (1)
176-177: Good mock addition but check if more implementation is needed.
You've added `user_guide_assistance` to the mock service, but the empty string might not be sufficient if tests need to interact with it meaningfully. Consider creating a proper mock class similar to other mocks in the file.
    -"user_guide_assistance": "",
    +"user_guide_assistance": UserGuideAssistanceMock(),
Then add a `UserGuideAssistanceMock` class to the mocks.py file:
    from typing import Optional

    class UserGuideAssistanceMock:
        # Mirrors the pipeline's async interface with canned responses.
        async def run(self, query: str, language: str, query_id: Optional[str] = None):
            return {"response": "Mock user guide assistance response"}

        async def get_streaming_results(self, query_id):
            yield "Mock streaming response"
wren-ai-service/src/web/v1/services/ask.py (1)
109-111: Field usage in `AskResultResponse`.
While `_AskResultResponse` and `AskResultResponse` share similar fields, ensure that combining them doesn’t cause confusion. The `exclude=True` directive might help keep the public fields uncluttered.
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (3)
17-28: Maintain clarity in system instructions.
The system prompt is self-explanatory. Consider referencing any disclaimers in the doc if user queries fall outside the coverage of your user guide.
100-123: Streaming results logic.
Enforcing a 120-second timeout is good, but consider logging or returning partial results if the user queue is empty or delayed.
144-153: Dry run example.
Demonstrates pipeline usage with a sample query. Consider adding a short docstring explaining how to execute this module.
wren-ai-service/src/pipelines/generation/misleading_assistance.py (7)
15-15: Use module-specific logger.
Consider using `logging.getLogger(__name__)` to match the module name and improve log traceability.
18-33: Consider externalizing the system prompt.
Externalizing lengthy prompt text (e.g., in a config file or constants module) can enhance maintainability, especially if it's reused or localized.
49-68: Prevent potential token overflow when merging history.
If `histories` can become very large, consider bounding or summarizing past questions to prevent exceeding token limits.
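A minimal sketch of bounding the history, assuming each past question is available as a plain string. The helper name `bound_histories`, the item cap, and the character budget are all hypothetical, and a character count is only a rough proxy for tokens.

```python
from typing import List, Sequence


def bound_histories(
    histories: Sequence[str], max_items: int = 5, max_chars: int = 2000
) -> List[str]:
    """Keep the most recent questions within an item cap and a character budget."""
    recent = histories[-max_items:] if max_items > 0 else []
    kept: List[str] = []
    used = 0
    # Walk from newest to oldest so the newest questions survive the budget.
    for question in reversed(recent):
        if used + len(question) > max_chars:
            break
        kept.append(question)
        used += len(question)
    kept.reverse()  # restore chronological order
    return kept
```

Summarizing older questions instead of dropping them is the other option the comment mentions. It preserves more context at the cost of an extra model call.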
70-73: Add error handling for missing or invalid prompt fields.
A try-except block or validations on `prompt["prompt"]` can help prevent runtime errors if `prompt` is malformed.
78-94: Add docstring to clarify pipeline usage.
A brief docstring explaining initialization parameters and `_components` would enhance readability.
99-108: Handle null or empty chunk content gracefully.
Consider verifying `chunk.content` isn't empty or `None` and logging errors for unexpected streaming issues.
109-133: Log streaming timeouts.
When `asyncio.wait_for` times out, logging a warning or error can help diagnose stream disconnections or long response times.
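The streaming suggestions above (skip empty chunks, log timeouts, avoid leftover queues) can be combined in one consumer loop. This is a sketch rather than the pipeline's `get_streaming_results`: the function names and the dict of queues are assumptions. The `<DONE>` sentinel and the 120-second default come from the earlier comment on the same lines.

```python
import asyncio
import logging
from typing import AsyncIterator, Dict, List, Tuple

logger = logging.getLogger(__name__)
DONE = "<DONE>"


async def stream_results(
    queues: Dict[str, asyncio.Queue], query_id: str, timeout: float = 120.0
) -> AsyncIterator[str]:
    """Yield chunks for query_id until the sentinel arrives or a read times out."""
    queue = queues.setdefault(query_id, asyncio.Queue())
    try:
        while True:
            try:
                chunk = await asyncio.wait_for(queue.get(), timeout=timeout)
            except asyncio.TimeoutError:
                logger.warning("stream %s timed out after %.2fs", query_id, timeout)
                break
            if chunk == DONE:
                break
            if chunk:  # skip None or empty chunks
                yield chunk
    finally:
        # Drop the queue so it cannot accumulate after the stream ends.
        queues.pop(query_id, None)


async def collect(chunks: List[str], timeout: float = 0.05) -> Tuple[List[str], dict]:
    """Demo helper: preload a queue, drain it through stream_results."""
    queues: Dict[str, asyncio.Queue] = {"demo": asyncio.Queue()}
    for chunk in chunks:
        queues["demo"].put_nowait(chunk)
    out = [c async for c in stream_results(queues, "demo", timeout)]
    return out, queues
```

Removing the queue in `finally` covers the normal and timeout paths. A queue whose consumer never starts would still need a separate sweep.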
🧰 Additional context used
🧬 Code Graph Analysis (6)
wren-ai-service/src/pipelines/generation/__init__.py (2)
  wren-ai-service/src/pipelines/generation/misleading_assistance.py (2)
    misleading_assistance (71-72)
    MisleadingAssistance (78-154)
  wren-ai-service/src/pipelines/generation/user_guide_assistance.py (2)
    user_guide_assistance (58-59)
    UserGuideAssistance (65-141)
wren-ai-service/src/globals.py (4)
  wren-ai-service/src/utils.py (1)
    fetch_wren_ai_docs (163-185)
  wren-ai-service/src/pipelines/generation/misleading_assistance.py (1)
    MisleadingAssistance (78-154)
  wren-ai-service/src/pipelines/generation/data_assistance.py (1)
    DataAssistance (78-154)
  wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1)
    UserGuideAssistance (65-141)
wren-ai-service/tests/pytest/services/test_ask.py (1)
  wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1)
    UserGuideAssistance (65-141)
wren-ai-service/src/web/v1/services/ask.py (3)
  wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1)
    run (125-141)
  wren-ai-service/src/pipelines/generation/intent_classification.py (1)
    run (370-392)
  wren-ai-service/src/pipelines/generation/followup_sql_generation_reasoning.py (1)
    run (173-196)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (5)
  wren-ai-service/src/core/pipeline.py (1)
    BasicPipeline (15-21)
  wren-ai-service/src/core/provider.py (1)
    LLMProvider (6-15)
  wren-ai-service/src/pipelines/generation/intent_classification.py (2)
    prompt (262-284)
    run (370-392)
  wren-ai-service/src/pipelines/generation/data_assistance.py (5)
    prompt (51-67)
    run (135-154)
    _streaming_callback (99-107)
    get_streaming_results (109-132)
    _get_streaming_results (110-111)
  wren-ai-service/src/pipelines/common.py (1)
    dry_run_pipeline (36-59)
wren-ai-service/src/pipelines/generation/misleading_assistance.py (3)
  wren-ai-service/src/core/pipeline.py (1)
    BasicPipeline (15-21)
  wren-ai-service/src/core/provider.py (1)
    LLMProvider (6-15)
  wren-ai-service/src/web/v1/services/ask.py (2)
    ask (204-639)
    AskHistory (16-18)
🪛 YAMLlint (1.35.1)
wren-ai-service/tests/data/config.test.yaml
[error] 82-82: trailing spaces
(trailing-spaces)
🪛 GitHub Actions: AI Service Test
wren-ai-service/tests/pytest/services/test_ask.py
[error] 39-39: TypeError: IntentClassification.__init__() missing 1 required positional argument: 'wren_ai_docs'
[error] 233-233: AssertionError: assert 'failed' == 'finished'
wren-ai-service/src/web/v1/services/ask.py
[error] 623-623: KeyError: 'misleading_assistance'
[error] 103-103: SQL is not executable: No project found
[warning] 155-155: SQL pairs file not found: sql_pairs.json
[warning] 291-291: Calling QdrantDocumentStore.write_documents() with empty list
[error] 51-51: Failed to parse MDL: unexpected character: line 1 column 1 (char 0)
[error] 58-58: An error occurred during relationship recommendation generation: Pipeline error
[error] 139-139: Relationship Recommendation Resource with ID 'non_existent_id' not found.
[error] 51-51: An error occurred during semantics description generation: Test exception
[warning] 155-155: SQL pairs file not found: sql_pairs.json
[warning] 291-291: Calling QdrantDocumentStore.write_documents() with empty list
[warning] 155-155: SQL pairs file not found: sql_pairs.json
[warning] 155-155: SQL pairs file not found: sql_pairs.json
[warning] 291-291: Calling QdrantDocumentStore.write_documents() with empty list
[warning] 155-155: SQL pairs file not found: sql_pairs.json
🔇 Additional comments (42)
wren-ai-service/docs/config_examples/config.ollama.yaml (1)
147-166: Settings Block Review:
Thesettingsblock in this file remains unchanged. According to the PR objectives and high-level summary, new parameters likedoc_endpoint(set tohttps://docs.getwren.ai) andis_oss(set totrue) should be introduced. Please verify whether these settings are omitted by design in the docs example or if they need to be added here.wren-ai-service/docs/config_examples/config.azure.yaml (1)
159-178: Verify Settings Consistency:
The currentsettingsblock does not include the new parameters (doc_endpointandis_oss) described in the PR objectives/summary. Please confirm if these settings are only meant for production configurations (e.g. indeployment/kustomizations/base/cm.yaml,docker/config.example.yaml, orwren-ai-service/src/config.py) or if they should also be reflected in the docs example.wren-ai-service/docs/config_examples/config.deepseek.yaml (1)
132-133: LGTM: Clean misleading_assistance pipeline implementationThe addition of the
misleading_assistancepipeline is properly configured with thelitellm_llm.defaultmodel. This matches the PR objectives for addressing misleading queries.wren-ai-service/tools/config/config.full.yaml (3)
122-123: LGTM: Consistent misleading_assistance implementationThe misleading_assistance pipeline implementation aligns with the configuration in other files, using the appropriate LLM model.
137-138: LGTM: user_guide_assistance pipeline additionThe user_guide_assistance pipeline is properly added with the correct LLM model reference, supporting the new documentation capabilities mentioned in the PR objectives.
166-167: LGTM: Settings for documentation endpoint and OSS statusThe addition of
doc_endpointandis_osssettings provides the necessary configuration for documentation fetching functionality, aligning with the PR objectives.wren-ai-service/src/pipelines/generation/intent_classification.py (5)
28-28: LGTM: Expanded intent classification categoriesThe addition of "USER_GUIDE" as a fourth classification category is consistently implemented across the system prompt, output format, and result model class.
Also applies to: 106-106, 315-315
59-84: LGTM: Well-defined intent categoriesThe updated GENERAL category and new USER_GUIDE category are clearly defined with appropriate characteristics and examples, making the intent classification more robust and capable of handling documentation requests.
141-144: LGTM: User guide integration in prompt templateThe prompt template now properly includes the USER GUIDE section, enabling the model to consider documentation content when classifying user intents.
264-265: LGTM: Documentation parameter in prompt functionThe prompt function correctly accepts and utilizes the wren_ai_docs parameter, passing it to the prompt builder.
Also applies to: 283-283
337-338: LGTM: wren_ai_docs integration in IntentClassification classThe IntentClassification class has been properly updated to accept and store the wren_ai_docs parameter, integrating documentation content into the intent classification pipeline.
Also applies to: 361-363
wren-ai-service/docs/config_examples/config.google_ai_studio.yaml (1)
118-119: LGTM: Consistent misleading_assistance implementationThe addition of the misleading_assistance pipeline matches the pattern established in other configuration files, maintaining consistency across different model providers.
wren-ai-service/src/config.py (1)
57-59: Looking good: New user guide configuration parameters added.The new configuration parameters for the user guide feature are well-structured and properly typed with appropriate defaults.
wren-ai-service/src/pipelines/generation/__init__.py (2)
7-7: Well implemented: New pipeline imports for assistance features.The imports for
MisleadingAssistanceandUserGuideAssistanceclasses are properly added.Also applies to: 20-20
38-38: Correctly exposed: New pipeline components added to all.The new pipeline components have been properly exported in the
__all__list, maintaining the module's public interface.Also applies to: 42-42
deployment/kustomizations/base/cm.yaml (2)
157-158: Complete configuration: New pipeline entries registered.
The misleading_assistance and user_guide_assistance pipeline configurations are properly defined, following the same pattern as other pipeline entries.
Also applies to: 176-177
200-201: Good configuration: User guide settings added.
The doc_endpoint and is_oss settings are properly configured, matching the defaults from the Settings class.
wren-ai-service/tests/data/config.test.yaml (2)
78-81: Well tested: Pipeline entries added to test configuration.
The test configuration has been updated with the new pipeline entries to ensure proper testing of the new functionality.
88-89: Test settings updated correctly: Added user guide configuration.
The doc_endpoint and is_oss settings have been properly added to the test configuration, aligning with the main configuration.
docker/config.example.yaml (3)
109-110: New pipeline addition looks good.
The misleading_assistance pipeline has been properly added and configured with the litellm_llm.default model.
128-129: New pipeline addition looks good.
The user_guide_assistance pipeline has been properly added and configured with the litellm_llm.default model.
153-154: ✅ Verification successful
Verify that the documentation endpoint is correct and accessible.
The new settings for doc_endpoint and is_oss have been added correctly, but it's important to ensure the endpoint is accessible and returns the expected format.
🏁 Script executed:
    #!/bin/bash
    # Check if the documentation endpoint is accessible
    curl -s -o /dev/null -w "%{http_code}" https://docs.getwren.ai/oss/llms.md
Length of output: 80
Documentation endpoint verified successfully
The documentation endpoint at https://docs.getwren.ai (tested via https://docs.getwren.ai/oss/llms.md) returned an HTTP 200 response, indicating it is accessible and functioning as expected. No changes are needed for these settings.
wren-ai-service/tools/config/config.example.yaml (3)
122-123: New pipeline addition looks good.
The misleading_assistance pipeline has been properly added and configured with the litellm_llm.default model.
137-138: New pipeline addition looks good.
The user_guide_assistance pipeline has been properly added and configured with the litellm_llm.default model.
166-167: Settings additions look good.
The new settings for doc_endpoint and is_oss have been added correctly.
wren-ai-service/src/globals.py (3)
10-10: Import statement looks good.
Importing fetch_wren_ai_docs at the top level is a clear and readable approach to keep all dependencies visible.
96-99: Passing documentation to UserGuideAssistance.
This is consistent with the design of your new pipeline. Confirm that providing the same wren_ai_docs instance here doesn’t lead to duplication or memory overhead issues.
88-89: ❓ Verification inconclusive
Pass wren_ai_docs carefully.
You're passing documentation data to the IntentClassification pipeline. Confirm that the pipeline properly uses this data and handles any edge cases if the docs are empty.
🏁 Script executed:
    #!/bin/bash
    # Verify that the `IntentClassification` class references `wren_ai_docs` safely
    rg 'wren_ai_docs' -A 10
Length of output: 12191
Attention: Verify Handling of Empty Documentation in the IntentClassification Pipeline
It appears that wren_ai_docs is passed from globals.py directly into the pipelines (including IntentClassification in wren-ai-service/src/pipelines/generation/intent_classification.py). Please confirm that the downstream components, particularly the prompt_builder.run call within the IntentClassification pipeline, gracefully handle the scenario where wren_ai_docs is empty. If the documentation data could ever be empty, ensure that appropriate fallback behavior or error checking is implemented to prevent runtime issues.
wren-ai-service/src/web/v1/services/ask.py (5)
102-104: Introduction of general_type in _AskResultResponse.
This optional field helps categorize general assistance queries (MISLEADING_QUERY, DATA_ASSISTANCE, USER_GUIDE). The addition is clear and consistent.
🧰 Tools
🪛 GitHub Actions: AI Service Test
[error] 103-103: SQL is not executable: No project found
326-326: Populating general_type="MISLEADING_QUERY".
This properly flags the assistance pipeline behind the scenes. No issues found here.
350-351: Assigning general_type="DATA_ASSISTANCE".
Consistent with the approach for misleading queries. Good alignment with the pipeline naming.
353-354: Returning results early.
Exiting the function right after setting metadata means no further processing. This is intentional for “GENERAL” type queries, so it looks fine.
682-687: Determining SQL reasoning pipeline.
The fallback to “followup_sql_generation_reasoning” or “sql_generation_reasoning” based on is_followup is consistent with your architecture.
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (5)
1-5: Imports and typing are correct.
The standard library, logging, and async usage look good. The imports from the hamilton framework and haystack library are well organized.
30-39: User prompt template.
The usage of Jinja-like placeholders and loops is straightforward. Good practice to handle multi-doc scenarios.
43-55: prompt function usage.
This function integrates well with the PromptBuilder. Looks consistent with other pipeline patterns.
124-142: Async run method orchestrating pipeline.
Your pipeline steps are well-structured. The usage of AsyncDriver plus the Hamilton node references is consistent. Looks good overall.
65-99: ✅ Verification successful
Constructor sets up LLM with streaming callback.
Using llm_provider.get_generator with a _streaming_callback is consistent with other pipelines. Watch for concurrency issues if multiple queries share the same queue keys.
🏁 Script executed:
    #!/bin/bash
    # Confirm that query_id uniqueness is enforced throughout the codebase
    rg 'query_id'
Length of output: 56680
LLM Streaming Callback and Query ID Concurrency Verified
- The _streaming_callback implementation in wren-ai-service/src/pipelines/generation/user_guide_assistance.py is consistent with similar patterns in other pipelines.
- A repository-wide search confirms that query IDs are generated as unique UUIDs and are enforced as unique both in database migrations and throughout the service endpoints.
- This uniform enforcement minimizes the risk of concurrency issues due to shared queue keys, assuming the upstream components consistently supply unique query IDs.
wren-ai-service/src/pipelines/generation/misleading_assistance.py (4)
1-4: All import statements appear necessary and correctly used.
35-46: Well-structured template prompt.
No issues found with logic or formatting usage.
135-154: Pipeline execution flow looks correct.
No functional issues spotted. The approach to passing parameters into execute is clear and consistent.
157-167: Dry run pipeline section is appropriate.
Good for quick local testing. No concerns identified.
Actionable comments posted: 0
🧹 Nitpick comments (9)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (9)
17-28: Consider limiting the scope of system instructions.
This multi-line string contains broad instructions. If the codebase supports more modular prompts, consider extracting or condensing the instructions to focus specifically on user guide tasks while delegating generic instructions (e.g., language adaptation) elsewhere. This separation can simplify maintenance and testing.
30-39: Add more context to user prompt template.
The template outlines the user question and docs, but you might consider including user context (e.g., previous user messages if relevant) or clarifying placeholders. This helps the model provide more accurate responses in multi-turn settings.
42-55: Specify a docstring for the prompt function.
While the function signature is clear, adding a docstring explaining the transformation from inputs to the prompt helps future maintainers understand how queries and documentation are formatted.
57-59: Add error handling for failed LLM calls.
If generator(prompt=...) raises an exception (e.g., network failure, timeout), the function will fail silently. Consider using try/except to gracefully handle errors and possibly return a structured error response.
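One way to act on this suggestion, as a minimal sketch rather than the project's actual code: wrap the awaited generator call in a timeout and a try/except, and return a structured error. The helper name `safe_generate`, the `timeout_s` parameter, and the shape of the error dictionary are assumptions made for illustration; only the `generator(prompt=...)` call pattern comes from the review.

```python
import asyncio
import logging
from typing import Awaitable, Callable

logger = logging.getLogger(__name__)


async def safe_generate(
    generator: Callable[..., Awaitable[dict]],  # assumed: async callable taking prompt/query_id
    prompt: str,
    query_id: str,
    timeout_s: float = 30.0,  # hypothetical default, not taken from the PR
) -> dict:
    """Await the generator, converting failures into a structured result."""
    try:
        return await asyncio.wait_for(
            generator(prompt=prompt, query_id=query_id), timeout=timeout_s
        )
    except asyncio.TimeoutError:
        logger.warning("generation timed out for query %s", query_id)
        return {"replies": [], "error": "timeout"}
    except Exception as exc:  # network failures, provider errors, etc.
        logger.exception("generation failed for query %s", query_id)
        return {"replies": [], "error": str(exc)}
```

Callers can then branch on the presence of an `error` key instead of relying on an exception reaching the service layer.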
65-89: Document the constructor arguments.
Adding docstrings for __init__ clarifies how llm_provider and wren_ai_docs are intended to be used and what extra configuration should be provided in **kwargs.
90-99: Consider queue size limit to avoid potential memory issues.
Prolonged streaming or a large volume of data can lead to unbounded growth in _user_queues. While this might be acceptable in certain deployments, you might consider a maximum queue size or backpressure mechanism to prevent potential memory exhaustion.
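A rough sketch of the bounded-queue idea, assuming the per-query queues are `asyncio.Queue` objects keyed by query ID. The class name `BoundedUserQueues`, the `maxsize` default, and the drop-on-full policy are illustrative assumptions, not the pipeline's real implementation.

```python
import asyncio
from typing import Dict


class BoundedUserQueues:
    """Per-query chunk queues with a hard cap on buffered items."""

    def __init__(self, maxsize: int = 256) -> None:  # hypothetical cap
        self._maxsize = maxsize
        self._queues: Dict[str, asyncio.Queue] = {}

    def _get(self, query_id: str) -> asyncio.Queue:
        # Lazily create one bounded queue per query ID.
        if query_id not in self._queues:
            self._queues[query_id] = asyncio.Queue(maxsize=self._maxsize)
        return self._queues[query_id]

    def put_nowait_or_drop(self, query_id: str, chunk: str) -> bool:
        """Enqueue a chunk; return False (dropping it) when the queue is full."""
        try:
            self._get(query_id).put_nowait(chunk)
            return True
        except asyncio.QueueFull:
            return False

    def discard(self, query_id: str) -> None:
        """Release the queue once streaming for this query has finished."""
        self._queues.pop(query_id, None)
```

Dropping chunks is only one possible policy. Awaiting `queue.put(...)` instead would apply backpressure to the producer, which may suit a streaming callback better if it can be made asynchronous.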
100-123: Validate presence of streaming callback content.
The loop correctly yields chunks until <DONE> is encountered. However, you could add an additional check if chunk content is unexpectedly empty or invalid. That ensures more robust error handling.
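The extra check could look roughly like the following sketch. It assumes chunks arrive on an `asyncio.Queue` and that the literal string `<DONE>` ends the stream, as the comment describes. The function name `stream_chunks` and the `timeout_s` guard are hypothetical additions.

```python
import asyncio
from typing import AsyncIterator

DONE = "<DONE>"  # end-of-stream sentinel mentioned in the review


async def stream_chunks(
    queue: asyncio.Queue, timeout_s: float = 120.0
) -> AsyncIterator[str]:
    """Yield non-empty string chunks until the sentinel arrives or the wait times out."""
    while True:
        try:
            chunk = await asyncio.wait_for(queue.get(), timeout=timeout_s)
        except asyncio.TimeoutError:
            break  # producer went quiet; stop instead of waiting forever
        if chunk == DONE:
            break
        if not isinstance(chunk, str) or not chunk:
            continue  # skip empty or malformed chunks rather than emitting them
        yield chunk
```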
125-142: Provide a return type annotation for clarity.
Although it returns await self._pipe.execute(...), the signature does not specify a return type. Updating it to reflect the actual structure (e.g., -> dict or -> Any) helps future maintainers and static analysis tools.
144-153: Consider removing or guarding the main block in production code.
The dry_run_pipeline main block is helpful for local testing or demonstrations. In production, ensure it's not inadvertently invoked or consider moving this example usage to a dedicated script or test.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
wren-ai-service/src/globals.py (3 hunks)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1 hunks)
wren-ai-service/src/utils.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- wren-ai-service/src/utils.py
- wren-ai-service/src/globals.py
Actionable comments posted: 0
♻️ Duplicate comments (2)
wren-ai-service/src/utils.py (1)
163-189: Handle potential ValueError when splitting docs.
This block is missing a safety check or error handling for the case where no newline exists in the doc string, which can raise a ValueError. Similar concerns have been raised and remain relevant.
     for doc in docs:
         if doc:
    -        path, content = doc.split("\n")
    +        try:
    +            path, content = doc.split("\n", 1)
    +        except ValueError:
    +            logger.warning(f"Malformed doc format: {doc[:100]}...")
    +            continue
             results.append(
                 {
                     "path": f'{doc_endpoint_base}/{path.replace(".md", "")}',
                     "content": content,
                 }
             )
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1)
57-60: Ensure robust error handling in asynchronous generation.
The pipeline currently awaits the generator but doesn't robustly handle failures like timeouts or network errors from the generator. This was previously flagged and still warrants consideration.
🧹 Nitpick comments (2)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (2)
17-28: Refine grammar and style in system prompt.
Some instructions could be rewritten for improved clarity, e.g. “please respond to the user that you couldn’t find a relevant answer…”. Consider updating the text to be more polished.
    -You need to understand the user question and the user guide, and then answer the user question.
    +Please read the user question and the user guide carefully, and formulate a concise, accurate response.
    -3. If you think you cannot answer the user question given the user guide, please kindly respond user that you don't find relevant answer in the user guide.
    +3. If there is insufficient information in the user guide to answer the question, kindly inform the user that no relevant answer was found.
125-142: Add explicit return type annotation for clarity.
Although the method returns a dictionary from _pipe.execute(), adding a return type annotation helps maintain clarity.
    -    async def run(
    +    async def run(
    +        self,
    +        query: str,
    +        language: str,
    +        query_id: Optional[str] = None,
    +    ) -> dict:
             ...
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
wren-ai-service/src/globals.py (3 hunks)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1 hunks)
wren-ai-service/src/utils.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- wren-ai-service/src/globals.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (3)
wren-ai-service/src/core/pipeline.py (1)
BasicPipeline (15-21)
wren-ai-service/src/core/provider.py (1)
LLMProvider (6-15)
wren-ai-service/src/pipelines/common.py (1)
dry_run_pipeline (36-59)
🔇 Additional comments (2)
wren-ai-service/src/utils.py (2)
6-6: No additional feedback on the new import.
161-162: No meaningful changes in blank lines.
Actionable comments posted: 1
♻️ Duplicate comments (3)
wren-ai-service/src/web/v1/services/ask.py (3)
354-373: Add validation for user query before calling user_guide_assistance.
The USER_GUIDE intent handler doesn't validate the user's query before calling the pipeline, which could lead to processing empty or malformed inputs.
687-690: Add validation for selected pipeline in get_ask_streaming_result.
The method dynamically selects a pipeline based on the general_type but doesn't verify if the pipeline exists before trying to access it.
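A small sketch of such a check, assuming the service keeps its pipelines in a dictionary keyed by name. The helper `resolve_pipeline` is hypothetical and only illustrates looking up the name defensively before streaming from it.

```python
import logging
from typing import Any, Mapping, Optional

logger = logging.getLogger(__name__)


def resolve_pipeline(pipelines: Mapping[str, Any], name: str) -> Optional[Any]:
    """Return the named pipeline, or None when the name is empty or unregistered."""
    if not name:
        return None  # no pipeline was selected for this query state
    pipeline = pipelines.get(name)
    if pipeline is None:
        logger.warning("pipeline %r is not registered", name)
    return pipeline
```

The caller can then return early when the result is None instead of raising a KeyError during streaming.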
307-318: Add exception handling for asynchronous tasks.
The code creates asynchronous tasks for different pipelines but doesn't implement any error handling. If the tasks fail, exceptions will be unhandled and tasks may be orphaned.
Also applies to: 354-362
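For the task-handling concern, one common pattern is to attach a done callback that retrieves and logs the exception, so a failed background task is never silently dropped. This is a generic asyncio sketch; the names `spawn` and `_log_failure` are illustrative and do not appear in ask.py.

```python
import asyncio
import logging
from typing import Any, Coroutine

logger = logging.getLogger(__name__)


def _log_failure(task: asyncio.Task) -> None:
    """Done callback: surface exceptions raised inside a background task."""
    if task.cancelled():
        return
    exc = task.exception()  # also marks the exception as retrieved
    if exc is not None:
        logger.error("background task %s failed: %r", task.get_name(), exc)


def spawn(coro: Coroutine[Any, Any, Any], name: str) -> asyncio.Task:
    """Create a named task whose failure is logged instead of being lost."""
    task = asyncio.create_task(coro, name=name)
    task.add_done_callback(_log_failure)
    return task
```

`spawn` must be called from inside a running event loop, the same requirement as `asyncio.create_task`. Keeping a reference to the returned task also prevents it from being garbage-collected before it finishes.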
🧹 Nitpick comments (1)
wren-ai-service/src/web/v1/services/ask.py (1)
673-686: Consider using a dictionary mapping for pipeline selection.
The current if-elif structure for selecting the pipeline is becoming harder to maintain as more general types are added. A dictionary mapping would be cleaner and more maintainable.
Refactor the pipeline selection code:
    -        _pipeline_name = ""
    -        if self._ask_results.get(query_id).type == "GENERAL":
    -            if self._ask_results.get(query_id).general_type == "USER_GUIDE":
    -                _pipeline_name = "user_guide_assistance"
    -            elif self._ask_results.get(query_id).general_type == "DATA_ASSISTANCE":
    -                _pipeline_name = "data_assistance"
    -            elif self._ask_results.get(query_id).general_type == "MISLEADING_QUERY":
    -                _pipeline_name = "misleading_assistance"
    -        elif self._ask_results.get(query_id).status == "planning":
    -            if self._ask_results.get(query_id).is_followup:
    -                _pipeline_name = "followup_sql_generation_reasoning"
    -            else:
    -                _pipeline_name = "sql_generation_reasoning"
    +        result = self._ask_results.get(query_id)
    +        pipeline_mapping = {
    +            "GENERAL": {
    +                "USER_GUIDE": "user_guide_assistance",
    +                "DATA_ASSISTANCE": "data_assistance",
    +                "MISLEADING_QUERY": "misleading_assistance"
    +            },
    +            "planning": {
    +                True: "followup_sql_generation_reasoning",  # is_followup = True
    +                False: "sql_generation_reasoning"  # is_followup = False
    +            }
    +        }
    +
    +        _pipeline_name = ""
    +        if result.type == "GENERAL" and result.general_type in pipeline_mapping["GENERAL"]:
    +            _pipeline_name = pipeline_mapping["GENERAL"][result.general_type]
    +        elif result.status == "planning":
    +            _pipeline_name = pipeline_mapping["planning"][result.is_followup]
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
wren-ai-service/src/web/v1/services/ask.py (4 hunks)
wren-ai-service/tests/pytest/services/test_ask.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- wren-ai-service/tests/pytest/services/test_ask.py
🧰 Additional context used
🧬 Code Graph Analysis (1)
wren-ai-service/src/web/v1/services/ask.py (3)
wren-ai-service/src/pipelines/generation/user_guide_assistance.py (1)
run (125-141)
wren-ai-service/src/pipelines/generation/intent_classification.py (1)
run (370-392)
wren-ai-service/src/pipelines/generation/sql_generation_reasoning.py (1)
run (162-183)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: pytest
Summary by CodeRabbit
New Features
Tests