Tool-Calling Agent Pattern: Direct Function Calling #156

@yasha-dev1

Description

Overview

The Tool-Calling Agent pattern leverages LLM function calling APIs (Claude, OpenAI, Gemini) to enable agents to invoke tools via structured outputs. This is the simplest and fastest agent pattern, outputting only tool calls without intermediate reasoning text. It's approximately 40% faster than ReAct agents and is the recommended default for most use cases.

How It Works

The agent uses the LLM's native function calling capability:

  1. LLM receives user query + tool schemas
  2. LLM outputs structured tool call(s) via function calling API
  3. Tool executes and returns result
  4. LLM receives tool result and generates response
  5. Repeat if needed (multi-step)

Example trace:

User: "What's the weather in Paris and convert 20°C to Fahrenheit?"

LLM → [tool_call: get_weather(city="Paris")]
Tool → {"temp_celsius": 20, "condition": "sunny"}

LLM → [tool_call: celsius_to_fahrenheit(celsius=20)]
Tool → {"fahrenheit": 68}

LLM → "The weather in Paris is 20°C (68°F) and sunny."

Note that there are no explicit "Thought" steps: the LLM outputs tool calls directly.
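
Stripped of the framework, this loop maps directly onto the provider's messages API. A minimal sketch using the Anthropic Python SDK, with the tool execution stubbed out (the schema and stub values are illustrative, not part of PyWorkflow):

import anthropic

client = anthropic.Anthropic()

tools = [{
    "name": "get_weather",
    "description": "Get current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

messages = [{"role": "user", "content": "What's the weather in Paris?"}]

while True:
    response = client.messages.create(
        model="claude-sonnet-4-5-20250929",
        max_tokens=1024,
        tools=tools,
        messages=messages,
    )
    if response.stop_reason != "tool_use":
        break  # end_turn: the final answer is in response.content

    # Echo the assistant turn, then answer every tool_use block it contains.
    messages.append({"role": "assistant", "content": response.content})
    tool_results = [
        {
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": '{"temp_celsius": 20, "condition": "sunny"}',  # stubbed
        }
        for block in response.content
        if block.type == "tool_use"
    ]
    messages.append({"role": "user", "content": tool_results})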

Proposed PyWorkflow Implementation

from pyworkflow import workflow, step, agent
from pyworkflow.agents import ToolCallingAgent, Tool

# Define tools as PyWorkflow steps
@step()
async def get_weather(city: str) -> dict:
    """Get current weather for a city."""
    return {"city": city, "temp": 20, "condition": "sunny"}

@step()
async def send_email(to: str, subject: str, body: str) -> bool:
    """Send an email."""
    # `email_service` is a placeholder for the application's mail client
    await email_service.send(to, subject, body)
    return True

# Create tool-calling agent (DEFAULT pattern)
@agent(
    model="claude-sonnet-4-5-20250929",
    tools=[get_weather, send_email],
    # pattern="tool_calling" is the default, can be omitted
)
async def assistant_agent(query: str):
    """
    General-purpose assistant using function calling.
    Fastest and most cost-effective agent pattern.
    """
    pass  # Agent loop handled by framework

# Use the agent
result = await assistant_agent.run(
    "Check weather in London and email the forecast to user@example.com"
)

# Access execution trace
print(result.answer)
print(result.tool_calls)  # List of tool invocations
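
The shape of the returned result object isn't specified in this proposal; one plausible sketch (names are assumptions, not existing PyWorkflow API):

from dataclasses import dataclass, field
from typing import Any

@dataclass
class AgentResult:
    """Hypothetical return type for agent.run()."""
    answer: str                                                      # final LLM response text
    tool_calls: list[dict[str, Any]] = field(default_factory=list)  # one entry per invocation
    run_id: str | None = None                                        # links back to the event log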

Parallel Tool Calling

Claude 3.7 Sonnet and later, and GPT-4-class OpenAI models, support parallel tool calls:

@agent(
    model="claude-3-7-sonnet-20250219",
    tools=[get_weather, get_stock_price],
    enable_parallel_tools=True,  # dispatch independent tool calls concurrently
)
async def parallel_agent(query: str):
    pass

# Single query triggers multiple tools in parallel
result = await parallel_agent.run(
    "What's the weather in Paris, London, and Tokyo?"
)
# → get_weather("Paris") + get_weather("London") + get_weather("Tokyo") execute concurrently
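
Inside the framework, a parallel turn would presumably fan the tool_use blocks out with asyncio. A minimal sketch of that dispatch (execute_tool and the tool-call block shape are assumptions):

import asyncio

async def run_tool_calls_concurrently(tool_calls, execute_tool):
    """Run all tool calls from a single LLM turn concurrently."""
    results = await asyncio.gather(
        *(execute_tool(tc.name, tc.input) for tc in tool_calls),
        return_exceptions=True,  # one failing tool shouldn't cancel the others
    )
    return [
        {
            "tool_use_id": tc.id,
            "result": None if isinstance(res, Exception) else res,
            "error": str(res) if isinstance(res, Exception) else None,
        }
        for tc, res in zip(tool_calls, results)
    ]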

Event Types

Tool-calling agents record these events:

  1. AGENT_STARTED - Agent execution begins

    {"run_id": "abc123", "query": "...", "tools": ["get_weather", "send_email"]}
  2. AGENT_TOOL_CALL - LLM requests tool execution

    {
      "tool_call_id": "call_xyz",
      "tool_name": "get_weather",
      "tool_input": {"city": "Paris"},
      "parallel": false
    }
  3. AGENT_TOOL_RESULT - Tool execution completes

    {
      "tool_call_id": "call_xyz",
      "tool_name": "get_weather",
      "result": {"temp": 20, "condition": "sunny"},
      "error": null
    }
  4. AGENT_RESPONSE - LLM generates user-facing response

    {"response": "The weather in Paris is 20°C and sunny."}
  5. AGENT_COMPLETED / AGENT_FAILED
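
A plausible declaration for these event types, matching the names above (a sketch; the real enum may carry more variants):

from enum import Enum

class EventType(str, Enum):
    AGENT_STARTED = "agent_started"
    AGENT_TOOL_CALL = "agent_tool_call"
    AGENT_TOOL_RESULT = "agent_tool_result"
    AGENT_RESPONSE = "agent_response"
    AGENT_COMPLETED = "agent_completed"
    AGENT_FAILED = "agent_failed"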

Implementation Details

Tool Schema Generation

PyWorkflow automatically generates tool schemas from step functions:

from typing import Any

from pyworkflow import step

@step()
async def search_database(
    query: str,
    filters: dict[str, Any] | None = None,
    limit: int = 10
) -> list[dict]:
    """
    Search the database for records.
    
    Args:
        query: Search query string
        filters: Optional key-value filters
        limit: Maximum number of results (default: 10)
    """
    pass

# Auto-generated Claude tool schema:
{
  "name": "search_database",
  "description": "Search the database for records.",
  "input_schema": {
    "type": "object",
    "properties": {
      "query": {"type": "string", "description": "Search query string"},
      "filters": {
        "type": "object",
        "description": "Optional key-value filters",
        "additionalProperties": true
      },
      "limit": {"type": "integer", "description": "Maximum number of results (default: 10)", "default": 10}
    },
    "required": ["query"]
  }
}
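
One way the generator could work: read the signature with inspect, map Python annotations to JSON Schema types, and mark parameters without defaults as required. A rough sketch (it skips docstring Args parsing, unions, and nested generics, all of which a real implementation would need):

import inspect
from typing import get_origin

_JSON_TYPES = {str: "string", int: "integer", float: "number",
               bool: "boolean", dict: "object", list: "array"}

def tool_schema(fn) -> dict:
    """Derive a Claude-style tool schema from a step function's signature."""
    doc_lines = (inspect.getdoc(fn) or "").splitlines()
    props, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        # dict[str, Any] -> dict, list[dict] -> list, plain types pass through
        base = get_origin(param.annotation) or param.annotation
        props[name] = {"type": _JSON_TYPES.get(base, "string")}  # crude fallback
        if param.default is inspect.Parameter.empty:
            required.append(name)
        elif param.default is not None:
            props[name]["default"] = param.default
    return {
        "name": fn.__name__,
        "description": doc_lines[0] if doc_lines else "",
        "input_schema": {"type": "object", "properties": props,
                         "required": required},
    }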

Error Handling

When tools fail, the agent receives error information:

try:
    result = await tool_function(**params)
except Exception as e:
    # Record error in event log
    await ctx.record_event(EventType.AGENT_TOOL_RESULT, {
        "tool_call_id": call_id,
        "tool_name": tool_name,
        "result": None,
        "error": str(e),
        "is_error": true
    })
    
    # Pass error back to LLM for handling
    tool_result = {
        "type": "tool_result",
        "tool_use_id": call_id,
        "is_error": true,
        "content": f"Error: {e}"
    }

The LLM can then retry with different parameters or inform the user.
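
For transient failures (timeouts, rate limits) the framework could also retry the step before surfacing the error at all; a minimal backoff wrapper (the policy knobs are assumptions):

import asyncio

async def execute_with_retries(execute, params, *, attempts=3, base_delay=1.0):
    """Retry a tool call with exponential backoff before reporting failure."""
    for attempt in range(1, attempts + 1):
        try:
            return await execute(**params)
        except Exception:
            if attempt == attempts:
                raise  # out of retries: the agent loop records the error as above
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))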

PyWorkflow Integration

# Internal implementation (pyworkflow/agents/tool_calling.py)
class ToolCallingAgent:
    async def run(self, query: str):
        ctx = get_context()
        messages = [{"role": "user", "content": query}]
        
        while True:
            # Call LLM with tools
            response = await self.llm.generate(
                messages=messages,
                tools=self.tool_schemas,
                model=self.model
            )
            
            # Check if LLM wants to use tools
            if response.stop_reason == "tool_use":
                tool_calls = [
                    block for block in response.content if block.type == "tool_use"
                ]

                # Execute tools (as PyWorkflow steps - durable!)
                tool_results = []
                for tool_call in tool_calls:
                    await ctx.record_event(EventType.AGENT_TOOL_CALL, {
                        "tool_call_id": tool_call.id,
                        "tool_name": tool_call.name,
                        "tool_input": tool_call.input
                    })

                    # Execute step (retryable, recorded in event log)
                    result = await self._execute_tool(tool_call.name, tool_call.input)

                    await ctx.record_event(EventType.AGENT_TOOL_RESULT, {
                        "tool_call_id": tool_call.id,
                        "result": result
                    })

                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": tool_call.id,
                        "content": result
                    })

                # Continue conversation with tool results
                messages.append({"role": "assistant", "content": response.content})
                messages.append({"role": "user", "content": tool_results})
            
            elif response.stop_reason == "end_turn":
                # LLM finished - extract final answer
                await ctx.record_event(EventType.AGENT_RESPONSE, {"response": response.text})
                return response.text
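
One gap worth closing in this sketch: a model that keeps emitting tool calls would spin forever in the while True loop. A bounded variant (MaxIterationsError and max_iterations are made-up names):

class MaxIterationsError(RuntimeError):
    """Raised when the agent exhausts its tool-call budget without answering."""

# Inside ToolCallingAgent.run, replace `while True:` with:
#
#     for _ in range(self.max_iterations):  # e.g. 25, configurable per agent
#         ...  # same LLM call and tool dispatch as above
#     raise MaxIterationsError(
#         f"no final answer after {self.max_iterations} LLM turns"
#     )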

Trade-offs

Pros

  • ~40% faster than ReAct (no reasoning text overhead)
  • Lower cost: Fewer tokens generated
  • Simpler: No prompt engineering for thought format
  • Native API support: Leverages optimized LLM function calling
  • Parallel tool calls: Supported by Claude 3.7+ and GPT-4
  • Type safety: Tool inputs are validated against JSON Schemas by the provider API

Cons

  • Less explainable: No explicit reasoning traces
  • Black box: Hard to debug why LLM chose a tool
  • Limited transparency: Can't see "thinking" between tool calls
  • Trust required: Must trust LLM's tool selection logic

Comparison to ReAct

Aspect              Tool-Calling          ReAct
------------------  --------------------  -----------------
Speed               ~40% faster           Slower
Cost                Lower                 Higher
Explainability      Low                   High
Prompt engineering  Minimal               Complex
Use case            Most production apps  Complex reasoning

When to Use Tool-Calling vs ReAct

Use Tool-Calling (this pattern) when:

  • Speed and cost are priorities
  • Tools are straightforward (API calls, database queries)
  • Explainability is not critical
  • You trust the LLM's tool selection

Use ReAct when:

  • Debugging/explainability is critical
  • Complex multi-step reasoning required
  • Need to audit decision-making process
  • Human-in-the-loop approval needed

Implementation Checklist

  • Create pyworkflow/agents/tool_calling.py with ToolCallingAgent class
  • Implement automatic tool schema generation from @step functions
  • Add Claude API integration (messages API + tools)
  • Add OpenAI API integration (function calling)
  • Add parallel tool execution support (Claude 3.7+)
  • Implement error handling and retry logic
  • Add event types: AGENT_TOOL_CALL, AGENT_TOOL_RESULT
  • Create @agent() decorator (defaults to tool_calling pattern)
  • Add tests with mock LLM responses (see the sketch after this list)
  • Document tool-calling agent in examples/
  • Add integration tests with real APIs
  • Support Gemini function calling
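
For the mock-LLM tests, one approach is a scripted test double that replays canned responses carrying the fields the agent loop reads (stop_reason, content, text). A sketch assuming pytest-asyncio and a ToolCallingAgent constructor shape this proposal doesn't pin down:

from types import SimpleNamespace

import pytest

class ScriptedLLM:
    """Test double that replays canned LLM responses in order."""
    def __init__(self, responses):
        self._responses = iter(responses)

    async def generate(self, **_kwargs):
        return next(self._responses)

def tool_use_response(name, tool_input):
    block = SimpleNamespace(type="tool_use", id="call_1", name=name, input=tool_input)
    return SimpleNamespace(stop_reason="tool_use", content=[block])

def end_turn_response(text):
    return SimpleNamespace(stop_reason="end_turn", content=[], text=text)

@pytest.mark.asyncio
async def test_weather_tool_then_answer():
    llm = ScriptedLLM([
        tool_use_response("get_weather", {"city": "Paris"}),
        end_turn_response("It's 20°C and sunny in Paris."),
    ])
    # Constructor shape assumed; also presumes an active workflow context
    # so ctx.record_event calls inside run() succeed.
    agent = ToolCallingAgent(llm=llm, tools=[get_weather], model="test-model")
    answer = await agent.run("What's the weather in Paris?")
    assert answer == "It's 20°C and sunny in Paris."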
