Claude provider: agentic loop max_iterations=10 too low and not configurable
Problem
The Claude provider's _execute_agentic_loop() has a hardcoded max_iterations=10 default. This is the number of tool-use roundtrips a single agent can make before the provider throws a ProviderError. For any agent that needs to read multiple files, write code, and verify results, 10 iterations is far too few.
The Copilot provider has no equivalent iteration limit — it uses the SDK's built-in agentic loop with a 30-minute wall-clock timeout (max_session_seconds=1800) and a 5-minute idle timeout (idle_timeout_seconds=300). There is no cap on the number of tool calls.
Impact
Complex agents (e.g., a "coder" agent that reads a plan, explores the codebase, writes implementation code, and verifies changes) routinely need 20-40+ tool-use iterations. With the current limit of 10, these agents always fail with:
Agentic loop exceeded maximum iterations (10)
💡 Suggestion: The agent may be stuck in a tool-use loop. Check your MCP tools.
This makes the Claude provider unusable for any non-trivial coding workflow.
Root Cause
In conductor/providers/claude.py, _execute_agentic_loop():
    async def _execute_agentic_loop(
        self,
        messages: list[dict[str, Any]],
        model: str,
        temperature: float | None,
        max_tokens: int,
        tools: list[dict[str, Any]] | None,
        output_schema: dict[str, OutputField] | None,
        has_output_schema: bool,
        max_iterations: int = 10,  # <-- hardcoded, not configurable
        ...
    ) -> tuple[ClaudeResponse, int | None, bool]:
The value is never overridden — the caller in _execute_with_retry() doesn't pass it, and there's no way to set it from the workflow YAML runtime config or per-agent config.
Suggested Fix
Option A: Make it configurable (preferred)
- Add a max_agent_iterations field to the Claude provider's runtime config (similar to max_tokens, temperature, etc.)
- Allow per-agent override via the agent config in the workflow YAML
- Set a reasonable default (e.g., 50) that is high enough not to interrupt typical runs, which approximates the Copilot provider's practical behavior of having no tool-call cap
Example workflow YAML:
    runtime:
      provider: claude
      config:
        max_agent_iterations: 50  # or per-agent override
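One way the override precedence for Option A could work is sketched below: a per-agent value wins over the runtime value, which wins over the built-in default. This is only an illustration. The helper `resolve_max_iterations`, the constant `DEFAULT_MAX_AGENT_ITERATIONS`, and the plain-dict config shapes are hypothetical and are not part of conductor's current API.

```python
from __future__ import annotations

from typing import Any, Mapping

# Hypothetical default, taken from the value proposed in this issue.
DEFAULT_MAX_AGENT_ITERATIONS = 50


def resolve_max_iterations(
    runtime_config: Mapping[str, Any] | None,
    agent_config: Mapping[str, Any] | None,
    default: int = DEFAULT_MAX_AGENT_ITERATIONS,
) -> int:
    """Return the iteration cap: per-agent value, then runtime value, then default."""
    # Check the most specific source first so an agent can override the runtime.
    for source in (agent_config, runtime_config):
        if source and source.get("max_agent_iterations") is not None:
            value = source["max_agent_iterations"]
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int) or value < 1:
                raise ValueError(
                    f"max_agent_iterations must be a positive integer, got {value!r}"
                )
            return value
    return default
```

The resolved value would then be passed as max_iterations by the caller in _execute_with_retry(), which today omits the argument.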
Option B: Match Copilot provider behavior
Replace the iteration count limit with a wall-clock timeout, matching the Copilot provider's IdleRecoveryConfig pattern:
- max_session_seconds: 1800 (30 minutes per agent)
- idle_timeout_seconds: 300 (5 minutes without API activity)
This would provide true feature parity.
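A rough sketch of what a time-based limit for Option B might look like follows. It does not reproduce the Copilot provider's IdleRecoveryConfig, whose internals are not shown in this issue. The names `run_with_deadlines`, `SessionTimeoutError`, and the `step` callable are hypothetical, and "idle" is simplified here to mean that a single roundtrip takes longer than idle_timeout_seconds.

```python
from __future__ import annotations

import asyncio
import time
from typing import Awaitable, Callable


class SessionTimeoutError(Exception):
    """Hypothetical error raised when a session or idle deadline is exceeded."""


async def run_with_deadlines(
    step: Callable[[], Awaitable[bool]],
    max_session_seconds: float = 1800.0,
    idle_timeout_seconds: float = 300.0,
) -> int:
    """Call step() until it returns True; return the number of roundtrips made.

    There is no cap on the roundtrip count. Only the total wall-clock time and
    the duration of each individual roundtrip are bounded.
    """
    started = time.monotonic()
    iterations = 0
    while True:
        remaining = max_session_seconds - (time.monotonic() - started)
        if remaining <= 0:
            raise SessionTimeoutError("session exceeded max_session_seconds")
        # Whichever deadline is closer bounds this roundtrip.
        session_is_closer = remaining <= idle_timeout_seconds
        try:
            done = await asyncio.wait_for(
                step(), timeout=min(idle_timeout_seconds, remaining)
            )
        except asyncio.TimeoutError:
            reason = (
                "session exceeded max_session_seconds"
                if session_is_closer
                else "no progress within idle_timeout_seconds"
            )
            raise SessionTimeoutError(reason) from None
        iterations += 1
        if done:
            return iterations
```

Here step stands in for one tool-use roundtrip (send the messages, run any requested tools, append the results) and returns True once the model produces a final answer.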
Option C: Quick fix (minimum viable)
Just increase the default from 10 to 50. This covers most practical use cases without architectural changes.
Workaround
Patch the default locally:
sed -i 's/max_iterations: int = 10,/max_iterations: int = 50,/' \
"$(python -c 'import conductor.providers.claude; print(conductor.providers.claude.__file__)')"
Environment
- conductor-cli: installed from git+https://github.com/microsoft/conductor.git
- Provider: claude
- Workflow: implement.yaml (coder agent with filesystem MCP server)
- The coder agent typically makes 20-40 tool calls per epic implementation