feat: Add ALTK Agent with tool validation and comprehensive tests by korenLazar · Pull Request #10587 · langflow-ai/langflow

korenLazar · 2025-11-13T18:18:27Z

Added agent-lifecycle-toolkit~=0.4.1 dependency to pyproject.toml
Implemented ALTKBaseAgent with comprehensive error handling and tool validation
Added ALTKToolWrappers for SPARC integration and tool execution safety
Created ALTK Agent component with proper LangChain integration
Added comprehensive test suite covering tool validation, conversation context, and edge cases
Fixed docstring formatting to comply with ruff linting standards

Summary by CodeRabbit

New Features
- Added pre-execution tool validation and post-tool response processing with configurable wrappers.
- Introduced a modular ALTK agent foundation with improved conversation context construction and message normalization.
Tests
- Added extensive unit tests covering agent orchestration, tool conversion, wrapper behaviors, conversation ordering, and edge cases.
Chores
- Updated agent-lifecycle-toolkit dependency to ~=0.4.1.

- Added agent-lifecycle-toolkit~=0.4.1 dependency to pyproject.toml - Implemented ALTKBaseAgent with comprehensive error handling and tool validation - Added ALTKToolWrappers for SPARC integration and tool execution safety - Created ALTK Agent component with proper LangChain integration - Added comprehensive test suite covering tool validation, conversation context, and edge cases - Fixed docstring formatting to comply with ruff linting standards

coderabbitai · 2025-11-13T18:18:45Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Adds a modular ALTK agent framework: base agent, tool wrapper pipeline (pre/post validation), SPARC-based tool spec conversion, conversation normalization, and comprehensive unit tests; also tightens the agent-lifecycle-toolkit dependency spec.

Changes

Cohort / File(s)	Summary
Dependency specification `pyproject.toml`	Changes `agent-lifecycle-toolkit` dependency to `agent-lifecycle-toolkit~=0.4.1`
Base agent infrastructure `src/lfx/src/lfx/base/agents/...` `src/lfx/src/lfx/base/agents/altk_base_agent.py`	Adds message normalization, `BaseToolWrapper`, `ALTKBaseTool` (cross-version tool execution), `ToolPipelineManager`, and `ALTKBaseAgentComponent` with tool-pipeline orchestration, context building, and `run_agent` flow
Tool validation & processing wrappers `src/lfx/src/lfx/base/agents/altk_tool_wrappers.py`	Adds `ValidatedTool`, `PreToolValidationWrapper`, `PostToolProcessor`, `PostToolProcessingWrapper`, Pydantic→JSON Schema conversion helpers, wrapper unwrapping, and SPARC reflection/validation flow
ALTK agent component `src/lfx/src/lfx/components/models_and_agents/altk_agent.py`	Adds `ALTKAgentComponent` (inherits `ALTKBaseAgentComponent`), configures wrapper pipeline (enable flags, thresholds), introduces `NormalizedInputProxy` for Data→LangChain message normalization, and updates lifecycle/update_runnable flows
Unit tests — orchestration & logic `src/backend/tests/unit/components/models_and_agents/test_altk_agent_logic.py`	Adds extensive tests covering pipeline manager, wrapper ordering, validated tool behavior, agent execution flow, conversation/context handling, normalization, and edge cases
Unit tests — tool conversion `src/backend/tests/unit/components/models_and_agents/test_altk_agent_tool_conversion.py`	Adds tests for converting LangChain tools to SPARC/OpenAI-style specs (various schemas, arrays, required/optional fields, error paths)
Unit tests — conversation ordering `src/backend/tests/unit/components/models_and_agents/test_conversation_context_ordering.py`	Adds tests ensuring chronological conversation context construction across scenarios and regression protection

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant ALTK as ALTKAgentComponent
    participant TPM as ToolPipelineManager
    participant PreW as PreToolValidationWrapper
    participant Valid as ValidatedTool
    participant PostW as PostToolProcessingWrapper
    participant SPARC
    participant LLM

    User->>ALTK: run_agent(input)

    rect rgba(220,235,255,0.35)
    ALTK->>ALTK: configure_tool_pipeline()
    ALTK->>TPM: add_wrapper(PreW), add_wrapper(PostW)
    end

    rect rgba(220,255,230,0.35)
    ALTK->>ALTK: build_conversation_context()
    ALTK->>ALTK: get_user_query()
    ALTK->>LLM: send conversation + tool specs
    end

    LLM-->>ALTK: chooses tool_call
    ALTK->>TPM: process_tools(tools)
    TPM->>PreW: wrap_tool(original)
    PreW->>Valid: produce ValidatedTool
    TPM->>PostW: wrap_tool(ValidatedTool)
    
    rect rgba(255,235,220,0.35)
    Valid->>SPARC: validate tool call (conversation + spec)
    alt SPARC approves
        SPARC-->>Valid: ok
        Valid->>Valid: execute wrapped tool
    else SPARC rejects
        SPARC-->>Valid: rejection
        Valid-->>ALTK: formatted rejection
    end
    end

    Valid-->>PostW: raw result
    PostW->>PostW: optional post-processing / codegen
    PostW-->>ALTK: processed result
    ALTK-->>User: final message

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Areas to focus on:

altk_tool_wrappers.py: Pydantic→JSON Schema mapping, nested schema handling, wrapper unwrapping logic.
altk_base_agent.py: wrapper application order, ToolPipelineManager semantics, cross-version tool execution.
altk_agent.py: NormalizedInputProxy behavior and integration into runnable/tool update flow.
Tests: ensure mocks reflect production contracts for SPARC/tool schemas.

Possibly related PRs

feat: Adding an Agent component that uses components from ALTK. #10221 — Adds/modifies ALTK agent components and pre/post-tool processing; closely related to wrapper and component changes.
refactor: Reorganize sidebar categories #10180 — Updates imports/exports around models_and_agents including altk_agent; touches same code paths.
refactor(agent): standardize memory handling and update chat history logic #8715 — Changes chat history/Data-to-message conversions; related to normalization fixes introduced here.

Suggested labels

size:XXL

Suggested reviewers

edwinjosechittilappilly
ogabrielluiz
Cristhianzl

Pre-merge checks and finishing touches

❌ Failed checks (2 warnings, 1 inconclusive)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 61.81% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.
Excessive Mock Usage Warning	⚠️ Warning	Test file test_altk_agent_logic.py contains 9+ mock classes and excessive mock instantiations that may test mock behavior rather than real component functionality, obscuring actual system behavior validation.	Replace mock classes with real objects for core logic testing; use lightweight test doubles only for external dependencies (SPARC, LLM); consolidate repetitive mocks into reusable fixtures.
Test Quality And Coverage	❓ Inconclusive	Test files referenced in PR summary could not be located in the repository despite extensive search.	Verify that all test files (test_altk_agent_logic.py, test_altk_agent_tool_conversion.py, test_conversation_context_ordering.py) and implementation files are committed and accessible in the repository.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely summarizes the main changes: introducing an ALTK Agent component with tool validation capabilities and comprehensive test coverage.
Test Coverage For New Implementations	✅ Passed	PR includes three comprehensive test files with 60+ test functions and 500+ assertions covering three new implementation files totaling 1000+ lines of code. Tests are substantive with proper mocking and fixture setup.
Test File Naming And Structure	✅ Passed	All three test files follow correct pytest patterns with proper naming (test_*.py), correct backend directory structure, descriptive test function names, logical test class organization, comprehensive assertions, mocking for isolation, and coverage of both positive scenarios and edge cases.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

korenLazar · 2025-11-13T18:24:30Z

@coderabbitai review

coderabbitai · 2025-11-13T18:24:35Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 2

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ba68eea and 7ac153c.

⛔ Files ignored due to path filters (1)

uv.lock is excluded by !**/*.lock

📒 Files selected for processing (7)

pyproject.toml (1 hunks)
src/backend/tests/unit/components/models_and_agents/test_altk_agent_logic.py (1 hunks)
src/backend/tests/unit/components/models_and_agents/test_altk_agent_tool_conversion.py (1 hunks)
src/backend/tests/unit/components/models_and_agents/test_conversation_context_ordering.py (1 hunks)
src/lfx/src/lfx/base/agents/altk_base_agent.py (1 hunks)
src/lfx/src/lfx/base/agents/altk_tool_wrappers.py (1 hunks)
src/lfx/src/lfx/components/models_and_agents/altk_agent.py (3 hunks)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (22)

GitHub Check: Run Frontend Tests / Playwright Tests - Shard 9/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 4/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 13/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 7/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 2/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 5/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 10/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 1/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 3/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 6/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 12/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 11/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 8/13
GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 2
GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 4
GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 5
GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 1
GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 3
GitHub Check: Run Backend Tests / Integration Tests - Python 3.10
GitHub Check: Run Backend Tests / LFX Tests - Python 3.10
GitHub Check: Test Starter Templates
GitHub Check: Update Starter Projects

coderabbitai · 2025-11-13T18:48:40Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (4)

src/lfx/src/lfx/base/agents/altk_base_agent.py (2)

119-130: Consider mixed args/kwargs handling.

If both args and kwargs are provided, the current implementation only uses args (line 122), ignoring kwargs entirely. This could be unexpected behavior for callers.

Consider clarifying the precedence or merging both:

 def _execute_tool(self, *args, **kwargs) -> str:
     """Execute the wrapped tool with compatibility across LC versions."""
     # BaseTool.run() expects tool_input as first argument
     if args:
         # Use first arg as tool_input, pass remaining args
         tool_input = args[0]
-        return self.wrapped_tool.run(tool_input, *args[1:])
+        # If kwargs provided, merge them into tool_input if it's a dict
+        if kwargs and isinstance(tool_input, dict):
+            tool_input = {**tool_input, **kwargs}
+        return self.wrapped_tool.run(tool_input, *args[1:])
     if kwargs:
         # Use kwargs dict as tool_input
         return self.wrapped_tool.run(kwargs)
     # No arguments - pass empty dict as tool_input
     return self.wrapped_tool.run({})

308-319: Simplify and strengthen chat_history type detection.

The current logic has overlapping and fragile type checks. Lines 316-317 check for "text" in m.data which could raise TypeError if m.data is not a dict. Line 318 checks isinstance(m, Message) right after, which overlaps with the previous check.

Consider consolidating the checks:

         if hasattr(self, "chat_history") and self.chat_history:
             if (
                 hasattr(self.chat_history, "to_data")
                 and callable(self.chat_history.to_data)
                 and self.chat_history.__class__.__name__ == "Data"
             ):
                 input_dict["chat_history"] = data_to_messages(self.chat_history)
-            # Handle both lfx.schema.message.Message and langflow.schema.message.Message types
-            if all(hasattr(m, "to_data") and callable(m.to_data) and "text" in m.data for m in self.chat_history):
-                input_dict["chat_history"] = data_to_messages(self.chat_history)
-            if all(isinstance(m, Message) for m in self.chat_history):
+            # Handle both lfx.schema.message.Message and langflow.schema.message.Message types
+            elif isinstance(self.chat_history, list) and all(isinstance(m, Message) for m in self.chat_history):
                 input_dict["chat_history"] = data_to_messages([m.to_data() for m in self.chat_history])

This removes the risky "text" in m.data check and uses elif to avoid redundant checks.

src/lfx/src/lfx/base/agents/altk_tool_wrappers.py (2)

135-142: Consider caching SPARC component creation.

A new SPARCReflectionComponent is created on every _run call (lines 137-140). If the tool is invoked multiple times with the same LLM configuration, this could be inefficient.

Consider initializing it once if the LLM client remains constant:

     def _run(self, *args, **kwargs) -> str:
         """Execute the tool with validation."""
-        self.sparc_component = SPARCReflectionComponent(
-            config=ComponentConfig(llm_client=self._get_altk_llm_object()),
-            track=Track.FAST_TRACK,  # Use fast track for performance
-            execution_mode=SPARCExecutionMode.SYNC,  # Use SYNC to avoid event loop conflicts
-        )
+        if not self.sparc_component:
+            self.sparc_component = SPARCReflectionComponent(
+                config=ComponentConfig(llm_client=self._get_altk_llm_object()),
+                track=Track.FAST_TRACK,  # Use fast track for performance
+                execution_mode=SPARCExecutionMode.SYNC,  # Use SYNC to avoid event loop conflicts
+            )
         return self._validate_and_run(*args, **kwargs)

468-512: Consider more robust error message detection.

Line 473 uses a heuristic check for emoji characters ("❌" or "•") to detect SPARC rejection messages. This could produce false positives if legitimate tool responses contain these characters.

Consider a more structured approach, such as checking for a specific error prefix or using a flag passed from the validation layer:

     def process_tool_response(self, tool_response: str, **_kwargs) -> str:
         logger.info("Calling process_tool_response of PostToolProcessor")
         tool_response_str = self._get_tool_response_str(tool_response)
 
-        # First check if this looks like an error message with bullet points (SPARC rejection)
-        if "❌" in tool_response_str or "•" in tool_response_str:
+        # Check if this looks like a SPARC rejection message
+        if tool_response_str.startswith("Tool call validation failed:"):
             logger.info("Detected error message with special characters, skipping JSON parsing")
             return tool_response_str

This approach is more specific and less likely to misidentify legitimate content.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7ac153c and 017f60f.

📒 Files selected for processing (3)

src/backend/tests/unit/components/models_and_agents/test_altk_agent_tool_conversion.py (1 hunks)
src/lfx/src/lfx/base/agents/altk_base_agent.py (1 hunks)
src/lfx/src/lfx/base/agents/altk_tool_wrappers.py (1 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-06-12T15:25:01.072Z

Learnt from: edwinjosechittilappilly
Repo: langflow-ai/langflow PR: 8504
File: src/backend/base/langflow/initial_setup/starter_projects/Image Sentiment Analysis.json:391-393
Timestamp: 2025-06-12T15:25:01.072Z
Learning: The repository owner prefers CodeRabbit not to review or comment on JSON files because they are autogenerated.

Applied to files:

src/lfx/src/lfx/base/agents/altk_tool_wrappers.py

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (21)

GitHub Check: Run Frontend Tests / Playwright Tests - Shard 3/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 11/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 9/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 13/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 12/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 6/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 7/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 10/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 1/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 4/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 8/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 5/13
GitHub Check: Run Frontend Tests / Playwright Tests - Shard 2/13
GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 3
GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 4
GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 2
GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 5
GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 1
GitHub Check: Run Backend Tests / Integration Tests - Python 3.10
GitHub Check: Test Starter Templates
GitHub Check: Run Ruff Check and Format

🔇 Additional comments (20)

src/lfx/src/lfx/base/agents/altk_base_agent.py (6)
41-77: LGTM! Well-structured message normalization.

The function correctly handles both string and list content formats with appropriate fallbacks for edge cases.

83-101: LGTM! Clean abstraction for tool wrappers.

The base class provides a clear contract for tool wrapper implementations.

132-158: Verify API key presence before access.

Lines 144 and 150 call .get_secret_value() on API keys without checking if they're set. If the API key is None or not properly configured, this could raise an AttributeError.

Consider adding defensive checks:
     if isinstance(llm_object, ChatAnthropic):
         model_name = f"anthropic/{llm_object.model}"
-        api_key = llm_object.anthropic_api_key.get_secret_value()
+        api_key = llm_object.anthropic_api_key.get_secret_value() if llm_object.anthropic_api_key else None
+        if not api_key:
+            logger.warning("Anthropic API key is not configured")
+            return None
         llm_client_type = "litellm.output_val" if use_output_val else "litellm"
Similar pattern for OpenAI at line 150.

161-187: LGTM! Clean pipeline management.

The reverse order application (line 184) ensures wrappers are applied in the expected decorator pattern where the first added wrapper becomes the outermost layer.

210-243: LGTM! Comprehensive conversation context building.

The method handles multiple input formats with appropriate validation and clear error messages. The type checking at lines 229-234 provides good user feedback.

320-331: Verify content structure before filtering.

Lines 323-324 assume all items in lc_message.content are dictionaries with a get method. If the list contains non-dict items, this will raise an AttributeError.

Consider adding a type guard:
         if hasattr(lc_message, "content") and isinstance(lc_message.content, list):
             # ! Because the input has to be a string, we must pass the images in the chat_history
-
-            image_dicts = [item for item in lc_message.content if item.get("type") == "image"]
-            lc_message.content = [item for item in lc_message.content if item.get("type") != "image"]
+            
+            image_dicts = [item for item in lc_message.content if isinstance(item, dict) and item.get("type") == "image"]
+            lc_message.content = [item for item in lc_message.content if not isinstance(item, dict) or item.get("type") != "image"]
src/lfx/src/lfx/base/agents/altk_tool_wrappers.py (8)

27-98: LGTM! Comprehensive type conversion.

The function handles all major JSON Schema type constructs including unions, intersections, and arrays with recursive conversion. The fallback to "string" at line 96 is a safe default.

101-133: LGTM! Well-structured validated tool wrapper.

The class properly uses Field(default_factory=...) for mutable defaults, preventing the common Python pitfall of shared mutable default arguments.

144-207: LGTM! Robust validation with appropriate fallbacks.

The method includes comprehensive error handling (lines 204-207) that falls back to direct tool execution if SPARC validation fails, ensuring resilience.

209-232: LGTM! Proper Pydantic v2 compatibility.

The method correctly handles both Pydantic v1 (__fields__) and v2 (model_fields) as noted in the addressed past review comment. The fallback logic (lines 228-230) ensures robustness.

234-257: LGTM! User-friendly error formatting.

The method provides clear, actionable feedback with emoji indicators and suggested corrections when available.

264-304: LGTM! Clean wrapper implementation.

The wrapper correctly handles already-wrapped tools (lines 284-290) and provides clear warning messages when required dependencies are missing (line 295).

306-407: LGTM! Comprehensive tool spec conversion.

The method includes:

Proper unwrapping of nested wrappers with depth protection (line 347)

Pydantic v2 compatibility for required field detection (lines 375-378)

Robust error handling with minimal spec fallback (lines 384-403)

Clear warning when no specs are generated (line 406)

515-556: LGTM! Clean post-processing wrapper.

The wrapper follows the same pattern as PreToolValidationWrapper with proper checks for required dependencies and handling of already-wrapped tools.

src/backend/tests/unit/components/models_and_agents/test_altk_agent_tool_conversion.py (6)

7-53: LGTM! Well-structured test fixtures.

The mock tools cover key scenarios: basic tools, no-parameter tools, and schema-based tools with various parameter types.

56-71: LGTM! Comprehensive test for basic tool conversion.

The test verifies that LangChain's automatic parameter extraction from method signatures works correctly with the conversion function.

88-146: LGTM! Thorough test for list parameter conversion.

The test validates both the Pydantic schema structure and the converted OpenAI function spec, ensuring proper handling of optional array parameters. The logging issue noted in past reviews has been fixed (line 128).

149-170: LGTM! Good test for batch conversion.

The test ensures the conversion function correctly handles multiple tools in a single call and maintains their individual specifications.

172-197: LGTM! Good error handling test.

The test validates that the conversion function gracefully degrades to a minimal spec when schema access fails, ensuring robustness.

200-248: LGTM! Comprehensive test for complex schemas.

The test covers:

Required vs optional field detection (lines 244-248)

Union type handling for nullable fields (lines 234-236)

Array type conversion (lines 239-242)

Proper type mapping for all JSON Schema types

This provides excellent coverage for real-world tool schemas.

github-actions Bot added the community Pull Request from an external contributor label Nov 13, 2025

[autofix.ci] apply automated fixes

dfce983

edwinjosechittilappilly self-requested a review November 13, 2025 18:22

korenLazar changed the title ~~Add ALTK Agent with tool validation and comprehensive tests~~ feat: Add ALTK Agent with tool validation and comprehensive tests Nov 13, 2025

[autofix.ci] apply automated fixes (attempt 2/3)

d732664