Tool-Calling Agent Pattern: Direct Function Calling #156

@yasha-dev1

Description

Overview

The Tool-Calling Agent pattern leverages LLM function calling APIs (Claude, OpenAI, Gemini) to enable agents to invoke tools via structured outputs. This is the simplest and fastest agent pattern, outputting only tool calls without intermediate reasoning text. It's approximately 40% faster than ReAct agents and is the recommended default for most use cases.

How It Works

The agent uses the LLM's native function calling capability:

  1. LLM receives user query + tool schemas
  2. LLM outputs structured tool call(s) via function calling API
  3. Tool executes and returns result
  4. LLM receives tool result and generates response
  5. Repeat if needed (multi-step)

Example trace:

User: "What's the weather in Paris and convert 20°C to Fahrenheit?"

LLM → [tool_call: get_weather(city="Paris")]
Tool → {"temp_celsius": 20, "condition": "sunny"}

LLM → [tool_call: celsius_to_fahrenheit(celsius=20)]
Tool → {"fahrenheit": 68}

LLM → "The weather in Paris is 20°C (68°F) and sunny."

Note that there are no explicit "Thought" steps: the LLM outputs tool calls directly.
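
Stripped of the framework, this loop maps directly onto the provider's messages API. A minimal sketch using the Anthropic Python SDK, with the tool execution stubbed out (the schema and stub values are illustrative, not part of PyWorkflow):

import anthropic

client = anthropic.Anthropic()

tools = [{
    "name": "get_weather",
    "description": "Get current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

messages = [{"role": "user", "content": "What's the weather in Paris?"}]

while True:
    response = client.messages.create(
        model="claude-sonnet-4-5-20250929",
        max_tokens=1024,
        tools=tools,
        messages=messages,
    )
    if response.stop_reason != "tool_use":
        break  # end_turn: the final answer is in response.content

    # Echo the assistant turn, then answer every tool_use block it contains.
    messages.append({"role": "assistant", "content": response.content})
    tool_results = [
        {
            "type": "tool_result",
            "tool_use_id": block.id,
            "content": '{"temp_celsius": 20, "condition": "sunny"}',  # stubbed
        }
        for block in response.content
        if block.type == "tool_use"
    ]
    messages.append({"role": "user", "content": tool_results})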

Proposed PyWorkflow Implementation

from pyworkflow import workflow, step, agent
from pyworkflow.agents import ToolCallingAgent, Tool

# Define tools as PyWorkflow steps
@step()
async def get_weather(city: str) -> dict:
    """Get current weather for a city."""
    return {"city": city, "temp": 20, "condition": "sunny"}

@step()
async def send_email(to: str, subject: str, body: str) -> bool:
    """Send an email."""
    # `email_service` is a placeholder for the application's mail client
    await email_service.send(to, subject, body)
    return True

# Create tool-calling agent (DEFAULT pattern)
@agent(
    model="claude-sonnet-4-5-20250929",
    tools=[get_weather, send_email],
    # pattern="tool_calling" is the default, can be omitted
)
async def assistant_agent(query: str):
    """
    General-purpose assistant using function calling.
    Fastest and most cost-effective agent pattern.
    """
    pass  # Agent loop handled by framework

# Use the agent
result = await assistant_agent.run(
    "Check weather in London and email the forecast to user@example.com"
)

# Access execution trace
print(result.answer)
print(result.tool_calls)  # List of tool invocations
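
The shape of the returned result object isn't specified in this proposal; one plausible sketch (names are assumptions, not existing PyWorkflow API):

from dataclasses import dataclass, field
from typing import Any

@dataclass
class AgentResult:
    """Hypothetical return type for agent.run()."""
    answer: str                                                      # final LLM response text
    tool_calls: list[dict[str, Any]] = field(default_factory=list)  # one entry per invocation
    run_id: str | None = None                                        # links back to the event log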

Parallel Tool Calling

Claude 3.7 Sonnet and later, and GPT-4-class OpenAI models, support parallel tool calls:

@agent(
    model="claude-3-7-sonnet-20250219",
    tools=[get_weather, get_stock_price],
    enable_parallel_tools=True,  # dispatch independent tool calls concurrently
)
async def parallel_agent(query: str):
    pass

# Single query triggers multiple tools in parallel
result = await parallel_agent.run(
    "What's the weather in Paris, London, and Tokyo?"
)
# → get_weather("Paris") + get_weather("London") + get_weather("Tokyo") execute concurrently
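
Inside the framework, a parallel turn would presumably fan the tool_use blocks out with asyncio. A minimal sketch of that dispatch (execute_tool and the tool-call block shape are assumptions):

import asyncio

async def run_tool_calls_concurrently(tool_calls, execute_tool):
    """Run all tool calls from a single LLM turn concurrently."""
    results = await asyncio.gather(
        *(execute_tool(tc.name, tc.input) for tc in tool_calls),
        return_exceptions=True,  # one failing tool shouldn't cancel the others
    )
    return [
        {
            "tool_use_id": tc.id,
            "result": None if isinstance(res, Exception) else res,
            "error": str(res) if isinstance(res, Exception) else None,
        }
        for tc, res in zip(tool_calls, results)
    ]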

Event Types

Tool-calling agents record these events:

  1. AGENT_STARTED - Agent execution begins

    {"run_id": "abc123", "query": "...", "tools": ["get_weather", "send_email"]}
  2. AGENT_TOOL_CALL - LLM requests tool execution

    {
      "tool_call_id": "call_xyz",
      "tool_name": "get_weather",
      "tool_input": {"city": "Paris"},
      "parallel": false
    }
  3. AGENT_TOOL_RESULT - Tool execution completes

    {
      "tool_call_id": "call_xyz",
      "tool_name": "get_weather",
      "result": {"temp": 20, "condition": "sunny"},
      "error": null
    }
  4. AGENT_RESPONSE - LLM generates user-facing response

    {"response": "The weather in Paris is 20°C and sunny."}
  5. AGENT_COMPLETED / AGENT_FAILED
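
A plausible declaration for these event types, matching the names above (a sketch; the real enum may carry more variants):

from enum import Enum

class EventType(str, Enum):
    AGENT_STARTED = "agent_started"
    AGENT_TOOL_CALL = "agent_tool_call"
    AGENT_TOOL_RESULT = "agent_tool_result"
    AGENT_RESPONSE = "agent_response"
    AGENT_COMPLETED = "agent_completed"
    AGENT_FAILED = "agent_failed"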

Implementation Details

Tool Schema Generation

PyWorkflow automatically generates tool schemas from step functions:

from typing import Any

from pyworkflow import step

@step()
async def search_database(
    query: str,
    filters: dict[str, Any] | None = None,
    limit: int = 10
) -> list[dict]:
    """
    Search the database for records.
    
    Args:
        query: Search query string
        filters: Optional key-value filters
        limit: Maximum number of results (default: 10)
    """
    pass

# Auto-generated Claude tool schema:
{
  "name": "search_database",
  "description": "Search the database for records.",
  "input_schema": {
    "type": "object",
    "properties": {
      "query": {"type": "string", "description": "Search query string"},
      "filters": {
        "type": "object",
        "description": "Optional key-value filters",
        "additionalProperties": true
      },
      "limit": {"type": "integer", "description": "Maximum number of results (default: 10)", "default": 10}
    },
    "required": ["query"]
  }
}
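
One way the generator could work: read the signature with inspect, map Python annotations to JSON Schema types, and mark parameters without defaults as required. A rough sketch (it skips docstring Args parsing, unions, and nested generics, all of which a real implementation would need):

import inspect
from typing import get_origin

_JSON_TYPES = {str: "string", int: "integer", float: "number",
               bool: "boolean", dict: "object", list: "array"}

def tool_schema(fn) -> dict:
    """Derive a Claude-style tool schema from a step function's signature."""
    doc_lines = (inspect.getdoc(fn) or "").splitlines()
    props, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        # dict[str, Any] -> dict, list[dict] -> list, plain types pass through
        base = get_origin(param.annotation) or param.annotation
        props[name] = {"type": _JSON_TYPES.get(base, "string")}  # crude fallback
        if param.default is inspect.Parameter.empty:
            required.append(name)
        elif param.default is not None:
            props[name]["default"] = param.default
    return {
        "name": fn.__name__,
        "description": doc_lines[0] if doc_lines else "",
        "input_schema": {"type": "object", "properties": props,
                         "required": required},
    }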

Error Handling

When tools fail, the agent receives error information:

try:
    result = await tool_function(**params)
except Exception as e:
    # Record error in event log
    await ctx.record_event(EventType.AGENT_TOOL_RESULT, {
        "tool_call_id": call_id,
        "tool_name": tool_name,
        "result": None,
        "error": str(e),
        "is_error": true
    })
    
    # Pass error back to LLM for handling
    tool_result = {
        "type": "tool_result",
        "tool_use_id": call_id,
        "is_error": true,
        "content": f"Error: {e}"
    }

The LLM can then retry with different parameters or inform the user.
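
For transient failures (timeouts, rate limits) the framework could also retry the step before surfacing the error at all; a minimal backoff wrapper (the policy knobs are assumptions):

import asyncio

async def execute_with_retries(execute, params, *, attempts=3, base_delay=1.0):
    """Retry a tool call with exponential backoff before reporting failure."""
    for attempt in range(1, attempts + 1):
        try:
            return await execute(**params)
        except Exception:
            if attempt == attempts:
                raise  # out of retries: the agent loop records the error as above
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))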

PyWorkflow Integration

# Internal implementation (pyworkflow/agents/tool_calling.py)
class ToolCallingAgent:
    async def run(self, query: str):
        ctx = get_context()
        messages = [{"role": "user", "content": query}]
        
        while True:
            # Call LLM with tools
            response = await self.llm.generate(
                messages=messages,
                tools=self.tool_schemas,
                model=self.model
            )
            
            # Check if LLM wants to use tools
            if response.stop_reason == "tool_use":
                tool_calls = [
                    block for block in response.content if block.type == "tool_use"
                ]

                # Execute tools (as PyWorkflow steps - durable!)
                tool_results = []
                for tool_call in tool_calls:
                    await ctx.record_event(EventType.AGENT_TOOL_CALL, {
                        "tool_call_id": tool_call.id,
                        "tool_name": tool_call.name,
                        "tool_input": tool_call.input
                    })

                    # Execute step (retryable, recorded in event log)
                    result = await self._execute_tool(tool_call.name, tool_call.input)

                    await ctx.record_event(EventType.AGENT_TOOL_RESULT, {
                        "tool_call_id": tool_call.id,
                        "result": result
                    })

                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": tool_call.id,
                        "content": result
                    })

                # Continue conversation with tool results
                messages.append({"role": "assistant", "content": response.content})
                messages.append({"role": "user", "content": tool_results})
            
            elif response.stop_reason == "end_turn":
                # LLM finished - extract final answer
                await ctx.record_event(EventType.AGENT_RESPONSE, {"response": response.text})
                return response.text
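
One gap worth closing in this sketch: a model that keeps emitting tool calls would spin forever in the while True loop. A bounded variant (MaxIterationsError and max_iterations are made-up names):

class MaxIterationsError(RuntimeError):
    """Raised when the agent exhausts its tool-call budget without answering."""

# Inside ToolCallingAgent.run, replace `while True:` with:
#
#     for _ in range(self.max_iterations):  # e.g. 25, configurable per agent
#         ...  # same LLM call and tool dispatch as above
#     raise MaxIterationsError(
#         f"no final answer after {self.max_iterations} LLM turns"
#     )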

Trade-offs

Pros

  • ~40% faster than ReAct (no reasoning text overhead)
  • Lower cost: Fewer tokens generated
  • Simpler: No prompt engineering for thought format
  • Native API support: Leverages optimized LLM function calling
  • Parallel tool calls: Supported by Claude 3.7+ and GPT-4
  • Type safety: Tool inputs are validated against JSON Schemas by the provider API

Cons

  • Less explainable: No explicit reasoning traces
  • Black box: Hard to debug why LLM chose a tool
  • Limited transparency: Can't see "thinking" between tool calls
  • Trust required: Must trust LLM's tool selection logic

Comparison to ReAct

Aspect              Tool-Calling          ReAct
------------------  --------------------  -----------------
Speed               ~40% faster           Slower
Cost                Lower                 Higher
Explainability      Low                   High
Prompt engineering  Minimal               Complex
Use case            Most production apps  Complex reasoning

When to Use Tool-Calling vs ReAct

Use Tool-Calling (this pattern) when:

  • Speed and cost are priorities
  • Tools are straightforward (API calls, database queries)
  • Explainability is not critical
  • You trust the LLM's tool selection

Use ReAct when:

  • Debugging/explainability is critical
  • Complex multi-step reasoning required
  • Need to audit decision-making process
  • Human-in-the-loop approval needed

Implementation Checklist

  • Create pyworkflow/agents/tool_calling.py with ToolCallingAgent class
  • Implement automatic tool schema generation from @step functions
  • Add Claude API integration (messages API + tools)
  • Add OpenAI API integration (function calling)
  • Add parallel tool execution support (Claude 3.7+)
  • Implement error handling and retry logic
  • Add event types: AGENT_TOOL_CALL, AGENT_TOOL_RESULT
  • Create @agent() decorator (defaults to tool_calling pattern)
  • Add tests with mock LLM responses (see the sketch after this list)
  • Document tool-calling agent in examples/
  • Add integration tests with real APIs
  • Support Gemini function calling
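
For the mock-LLM tests, one approach is a scripted test double that replays canned responses carrying the fields the agent loop reads (stop_reason, content, text). A sketch assuming pytest-asyncio and a ToolCallingAgent constructor shape this proposal doesn't pin down:

from types import SimpleNamespace

import pytest

class ScriptedLLM:
    """Test double that replays canned LLM responses in order."""
    def __init__(self, responses):
        self._responses = iter(responses)

    async def generate(self, **_kwargs):
        return next(self._responses)

def tool_use_response(name, tool_input):
    block = SimpleNamespace(type="tool_use", id="call_1", name=name, input=tool_input)
    return SimpleNamespace(stop_reason="tool_use", content=[block])

def end_turn_response(text):
    return SimpleNamespace(stop_reason="end_turn", content=[], text=text)

@pytest.mark.asyncio
async def test_weather_tool_then_answer():
    llm = ScriptedLLM([
        tool_use_response("get_weather", {"city": "Paris"}),
        end_turn_response("It's 20°C and sunny in Paris."),
    ])
    # Constructor shape assumed; also presumes an active workflow context
    # so ctx.record_event calls inside run() succeed.
    agent = ToolCallingAgent(llm=llm, tools=[get_weather], model="test-model")
    answer = await agent.run("What's the weather in Paris?")
    assert answer == "It's 20°C and sunny in Paris."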
