feat: Add ALTK Agent with tool validation and comprehensive tests#10587
Conversation
- Added agent-lifecycle-toolkit~=0.4.1 dependency to pyproject.toml - Implemented ALTKBaseAgent with comprehensive error handling and tool validation - Added ALTKToolWrappers for SPARC integration and tool execution safety - Created ALTK Agent component with proper LangChain integration - Added comprehensive test suite covering tool validation, conversation context, and edge cases - Fixed docstring formatting to comply with ruff linting standards
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the WalkthroughAdds a modular ALTK agent framework: base agent, tool wrapper pipeline (pre/post validation), SPARC-based tool spec conversion, conversation normalization, and comprehensive unit tests; also tightens the Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant ALTK as ALTKAgentComponent
participant TPM as ToolPipelineManager
participant PreW as PreToolValidationWrapper
participant Valid as ValidatedTool
participant PostW as PostToolProcessingWrapper
participant SPARC
participant LLM
User->>ALTK: run_agent(input)
rect rgba(220,235,255,0.35)
ALTK->>ALTK: configure_tool_pipeline()
ALTK->>TPM: add_wrapper(PreW), add_wrapper(PostW)
end
rect rgba(220,255,230,0.35)
ALTK->>ALTK: build_conversation_context()
ALTK->>ALTK: get_user_query()
ALTK->>LLM: send conversation + tool specs
end
LLM-->>ALTK: chooses tool_call
ALTK->>TPM: process_tools(tools)
TPM->>PreW: wrap_tool(original)
PreW->>Valid: produce ValidatedTool
TPM->>PostW: wrap_tool(ValidatedTool)
rect rgba(255,235,220,0.35)
Valid->>SPARC: validate tool call (conversation + spec)
alt SPARC approves
SPARC-->>Valid: ok
Valid->>Valid: execute wrapped tool
else SPARC rejects
SPARC-->>Valid: rejection
Valid-->>ALTK: formatted rejection
end
end
Valid-->>PostW: raw result
PostW->>PostW: optional post-processing / codegen
PostW-->>ALTK: processed result
ALTK-->>User: final message
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Areas to focus on:
Possibly related PRs
Suggested labels
Suggested reviewers
Pre-merge checks and finishing touches❌ Failed checks (2 warnings, 1 inconclusive)
✅ Passed checks (4 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (7)
pyproject.toml(1 hunks)src/backend/tests/unit/components/models_and_agents/test_altk_agent_logic.py(1 hunks)src/backend/tests/unit/components/models_and_agents/test_altk_agent_tool_conversion.py(1 hunks)src/backend/tests/unit/components/models_and_agents/test_conversation_context_ordering.py(1 hunks)src/lfx/src/lfx/base/agents/altk_base_agent.py(1 hunks)src/lfx/src/lfx/base/agents/altk_tool_wrappers.py(1 hunks)src/lfx/src/lfx/components/models_and_agents/altk_agent.py(3 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (22)
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 9/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 4/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 13/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 7/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 2/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 5/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 10/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 1/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 3/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 6/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 12/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 11/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 8/13
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 2
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 4
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 5
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 1
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 3
- GitHub Check: Run Backend Tests / Integration Tests - Python 3.10
- GitHub Check: Run Backend Tests / LFX Tests - Python 3.10
- GitHub Check: Test Starter Templates
- GitHub Check: Update Starter Projects
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (4)
src/lfx/src/lfx/base/agents/altk_base_agent.py (2)
119-130: Consider mixed args/kwargs handling.If both
argsandkwargsare provided, the current implementation only usesargs(line 122), ignoringkwargsentirely. This could be unexpected behavior for callers.Consider clarifying the precedence or merging both:
def _execute_tool(self, *args, **kwargs) -> str: """Execute the wrapped tool with compatibility across LC versions.""" # BaseTool.run() expects tool_input as first argument if args: # Use first arg as tool_input, pass remaining args tool_input = args[0] - return self.wrapped_tool.run(tool_input, *args[1:]) + # If kwargs provided, merge them into tool_input if it's a dict + if kwargs and isinstance(tool_input, dict): + tool_input = {**tool_input, **kwargs} + return self.wrapped_tool.run(tool_input, *args[1:]) if kwargs: # Use kwargs dict as tool_input return self.wrapped_tool.run(kwargs) # No arguments - pass empty dict as tool_input return self.wrapped_tool.run({})
308-319: Simplify and strengthen chat_history type detection.The current logic has overlapping and fragile type checks. Lines 316-317 check for
"text" in m.datawhich could raiseTypeErrorifm.datais not a dict. Line 318 checksisinstance(m, Message)right after, which overlaps with the previous check.Consider consolidating the checks:
if hasattr(self, "chat_history") and self.chat_history: if ( hasattr(self.chat_history, "to_data") and callable(self.chat_history.to_data) and self.chat_history.__class__.__name__ == "Data" ): input_dict["chat_history"] = data_to_messages(self.chat_history) - # Handle both lfx.schema.message.Message and langflow.schema.message.Message types - if all(hasattr(m, "to_data") and callable(m.to_data) and "text" in m.data for m in self.chat_history): - input_dict["chat_history"] = data_to_messages(self.chat_history) - if all(isinstance(m, Message) for m in self.chat_history): + # Handle both lfx.schema.message.Message and langflow.schema.message.Message types + elif isinstance(self.chat_history, list) and all(isinstance(m, Message) for m in self.chat_history): input_dict["chat_history"] = data_to_messages([m.to_data() for m in self.chat_history])This removes the risky
"text" in m.datacheck and useselifto avoid redundant checks.src/lfx/src/lfx/base/agents/altk_tool_wrappers.py (2)
135-142: Consider caching SPARC component creation.A new
SPARCReflectionComponentis created on every_runcall (lines 137-140). If the tool is invoked multiple times with the same LLM configuration, this could be inefficient.Consider initializing it once if the LLM client remains constant:
def _run(self, *args, **kwargs) -> str: """Execute the tool with validation.""" - self.sparc_component = SPARCReflectionComponent( - config=ComponentConfig(llm_client=self._get_altk_llm_object()), - track=Track.FAST_TRACK, # Use fast track for performance - execution_mode=SPARCExecutionMode.SYNC, # Use SYNC to avoid event loop conflicts - ) + if not self.sparc_component: + self.sparc_component = SPARCReflectionComponent( + config=ComponentConfig(llm_client=self._get_altk_llm_object()), + track=Track.FAST_TRACK, # Use fast track for performance + execution_mode=SPARCExecutionMode.SYNC, # Use SYNC to avoid event loop conflicts + ) return self._validate_and_run(*args, **kwargs)
468-512: Consider more robust error message detection.Line 473 uses a heuristic check for emoji characters (
"❌"or"•") to detect SPARC rejection messages. This could produce false positives if legitimate tool responses contain these characters.Consider a more structured approach, such as checking for a specific error prefix or using a flag passed from the validation layer:
def process_tool_response(self, tool_response: str, **_kwargs) -> str: logger.info("Calling process_tool_response of PostToolProcessor") tool_response_str = self._get_tool_response_str(tool_response) - # First check if this looks like an error message with bullet points (SPARC rejection) - if "❌" in tool_response_str or "•" in tool_response_str: + # Check if this looks like a SPARC rejection message + if tool_response_str.startswith("Tool call validation failed:"): logger.info("Detected error message with special characters, skipping JSON parsing") return tool_response_strThis approach is more specific and less likely to misidentify legitimate content.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/backend/tests/unit/components/models_and_agents/test_altk_agent_tool_conversion.py(1 hunks)src/lfx/src/lfx/base/agents/altk_base_agent.py(1 hunks)src/lfx/src/lfx/base/agents/altk_tool_wrappers.py(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-06-12T15:25:01.072Z
Learnt from: edwinjosechittilappilly
Repo: langflow-ai/langflow PR: 8504
File: src/backend/base/langflow/initial_setup/starter_projects/Image Sentiment Analysis.json:391-393
Timestamp: 2025-06-12T15:25:01.072Z
Learning: The repository owner prefers CodeRabbit not to review or comment on JSON files because they are autogenerated.
Applied to files:
src/lfx/src/lfx/base/agents/altk_tool_wrappers.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 3/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 11/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 9/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 13/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 12/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 6/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 7/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 10/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 1/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 4/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 8/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 5/13
- GitHub Check: Run Frontend Tests / Playwright Tests - Shard 2/13
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 3
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 4
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 2
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 5
- GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 1
- GitHub Check: Run Backend Tests / Integration Tests - Python 3.10
- GitHub Check: Test Starter Templates
- GitHub Check: Run Ruff Check and Format
🔇 Additional comments (20)
src/lfx/src/lfx/base/agents/altk_base_agent.py (6)
41-77: LGTM! Well-structured message normalization.The function correctly handles both string and list content formats with appropriate fallbacks for edge cases.
83-101: LGTM! Clean abstraction for tool wrappers.The base class provides a clear contract for tool wrapper implementations.
132-158: Verify API key presence before access.Lines 144 and 150 call
.get_secret_value()on API keys without checking if they're set. If the API key is None or not properly configured, this could raise anAttributeError.Consider adding defensive checks:
if isinstance(llm_object, ChatAnthropic): model_name = f"anthropic/{llm_object.model}" - api_key = llm_object.anthropic_api_key.get_secret_value() + api_key = llm_object.anthropic_api_key.get_secret_value() if llm_object.anthropic_api_key else None + if not api_key: + logger.warning("Anthropic API key is not configured") + return None llm_client_type = "litellm.output_val" if use_output_val else "litellm"Similar pattern for OpenAI at line 150.
161-187: LGTM! Clean pipeline management.The reverse order application (line 184) ensures wrappers are applied in the expected decorator pattern where the first added wrapper becomes the outermost layer.
210-243: LGTM! Comprehensive conversation context building.The method handles multiple input formats with appropriate validation and clear error messages. The type checking at lines 229-234 provides good user feedback.
320-331: Verify content structure before filtering.Lines 323-324 assume all items in
lc_message.contentare dictionaries with agetmethod. If the list contains non-dict items, this will raise anAttributeError.Consider adding a type guard:
if hasattr(lc_message, "content") and isinstance(lc_message.content, list): # ! Because the input has to be a string, we must pass the images in the chat_history - - image_dicts = [item for item in lc_message.content if item.get("type") == "image"] - lc_message.content = [item for item in lc_message.content if item.get("type") != "image"] + + image_dicts = [item for item in lc_message.content if isinstance(item, dict) and item.get("type") == "image"] + lc_message.content = [item for item in lc_message.content if not isinstance(item, dict) or item.get("type") != "image"]src/lfx/src/lfx/base/agents/altk_tool_wrappers.py (8)
27-98: LGTM! Comprehensive type conversion.The function handles all major JSON Schema type constructs including unions, intersections, and arrays with recursive conversion. The fallback to "string" at line 96 is a safe default.
101-133: LGTM! Well-structured validated tool wrapper.The class properly uses
Field(default_factory=...)for mutable defaults, preventing the common Python pitfall of shared mutable default arguments.
144-207: LGTM! Robust validation with appropriate fallbacks.The method includes comprehensive error handling (lines 204-207) that falls back to direct tool execution if SPARC validation fails, ensuring resilience.
209-232: LGTM! Proper Pydantic v2 compatibility.The method correctly handles both Pydantic v1 (
__fields__) and v2 (model_fields) as noted in the addressed past review comment. The fallback logic (lines 228-230) ensures robustness.
234-257: LGTM! User-friendly error formatting.The method provides clear, actionable feedback with emoji indicators and suggested corrections when available.
264-304: LGTM! Clean wrapper implementation.The wrapper correctly handles already-wrapped tools (lines 284-290) and provides clear warning messages when required dependencies are missing (line 295).
306-407: LGTM! Comprehensive tool spec conversion.The method includes:
- Proper unwrapping of nested wrappers with depth protection (line 347)
- Pydantic v2 compatibility for required field detection (lines 375-378)
- Robust error handling with minimal spec fallback (lines 384-403)
- Clear warning when no specs are generated (line 406)
515-556: LGTM! Clean post-processing wrapper.The wrapper follows the same pattern as
PreToolValidationWrapperwith proper checks for required dependencies and handling of already-wrapped tools.src/backend/tests/unit/components/models_and_agents/test_altk_agent_tool_conversion.py (6)
7-53: LGTM! Well-structured test fixtures.The mock tools cover key scenarios: basic tools, no-parameter tools, and schema-based tools with various parameter types.
56-71: LGTM! Comprehensive test for basic tool conversion.The test verifies that LangChain's automatic parameter extraction from method signatures works correctly with the conversion function.
88-146: LGTM! Thorough test for list parameter conversion.The test validates both the Pydantic schema structure and the converted OpenAI function spec, ensuring proper handling of optional array parameters. The logging issue noted in past reviews has been fixed (line 128).
149-170: LGTM! Good test for batch conversion.The test ensures the conversion function correctly handles multiple tools in a single call and maintains their individual specifications.
172-197: LGTM! Good error handling test.The test validates that the conversion function gracefully degrades to a minimal spec when schema access fails, ensuring robustness.
200-248: LGTM! Comprehensive test for complex schemas.The test covers:
- Required vs optional field detection (lines 244-248)
- Union type handling for nullable fields (lines 234-236)
- Array type conversion (lines 239-242)
- Proper type mapping for all JSON Schema types
This provides excellent coverage for real-world tool schemas.
Summary by CodeRabbit
New Features
Tests
Chores