@cagataycali
Member

Description

Fixes a flaky integration test in test_structured_output_agent_loop.py where test_tool_use_with_structured_output would fail intermittently.

Root Cause

The LLM sometimes outputs '+' instead of 'add' as the operation string. The original calculator tool only accepted exact string matches like "add", "subtract", etc., causing the function to return 0 when the LLM used symbols.

Example of failure:

# LLM called: calculator(operation='+', a=2, b=2)
# Expected: 4
# Got: 0 (because '+' didn't match 'add')
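The failure mode can be reproduced with a minimal sketch. The function below is a hypothetical reconstruction, not the original fixture: the name `strict_calculator` is made up for illustration, and the fall-through to 0 is inferred from the description above.

```python
def strict_calculator(operation: str, a: float, b: float) -> float:
    """Hypothetical exact-match calculator: only word forms are recognized."""
    if operation == "add":
        return a + b
    if operation == "subtract":
        return a - b
    # Any unrecognized operation, including a symbol such as '+',
    # falls through to this default (assumed from the PR description).
    return 0


print(strict_calculator("add", 2, 2))  # 4
print(strict_calculator("+", 2, 2))    # 0, the symptom seen in the flaky test
```

Because the model is free to choose either spelling on any given run, a fixture like this passes or fails depending on the sampled output rather than on the code under test.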

This was observed in PR #1444 CI run: https://github.com/strands-agents/sdk-python/actions/runs/20875011420/job/59982925582?pr=1444

Changes

  • Accept both word and symbol forms for all operations:
    • add, +
    • subtract, -, sub
    • multiply, *, mul
    • divide, /, div
    • power, **, pow
  • Normalize input with lower() and strip()
  • Fix divide operation (was b/a, now a/b to match standard math notation)
  • Improve docstring with Args section
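The changes listed above could look roughly like the following sketch. It is an illustration under stated assumptions rather than the merged code: the `_ALIASES` table, the exact signature, and the fallback of returning 0 for unknown operations are all hypothetical.

```python
# Hypothetical alias table mapping every accepted spelling to a canonical name.
_ALIASES = {
    "add": "add", "+": "add",
    "subtract": "subtract", "-": "subtract", "sub": "subtract",
    "multiply": "multiply", "*": "multiply", "mul": "multiply",
    "divide": "divide", "/": "divide", "div": "divide",
    "power": "power", "**": "power", "pow": "power",
}


def calculator(operation: str, a: float, b: float) -> float:
    """Apply a basic arithmetic operation to two numbers.

    Args:
        operation: Word, symbol, or abbreviation (e.g. "add", "+", "sub").
        a: Left operand.
        b: Right operand.
    """
    # Normalize case and surrounding whitespace before the lookup.
    op = _ALIASES.get(operation.lower().strip())
    if op == "add":
        return a + b
    if op == "subtract":
        return a - b
    if op == "multiply":
        return a * b
    if op == "divide":
        return a / b  # a/b, not b/a; raises ZeroDivisionError when b == 0
    if op == "power":
        return a ** b
    return 0  # assumed fallback for unrecognized operations


print(calculator(" + ", 2, 2))  # 4
print(calculator("DIV", 8, 2))  # 4.0
```

Keeping the aliases in a single table means a new spelling emitted by the model can be accommodated with a one-line change instead of another branch.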

Related Issues

Documentation PR

No documentation changes required - this is a test fixture.

Type of Change

Bug fix (test reliability)

Testing

  • Manual testing confirms calculator now handles both 'add' and '+'
  • Change is minimal and isolated to test file

Checklist

  • I have read the CONTRIBUTING document
  • My changes generate no new warnings
  • This is a test-only change

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.


Created by strands-coder autonomous agent 🦆

The test_tool_use_with_structured_output test was flaky because the LLM
sometimes uses '+' instead of 'add' as the operation string. The calculator
tool now accepts both formats for all operations.

Changes:
- Accept both word and symbol forms: add (+), subtract (-), multiply (*), divide (/), power (**)
- Also accept common abbreviations: sub, mul, div, pow
- Normalize input with lower() and strip()
- Fix divide operation (was b/a, now a/b)
- Improve docstring with Args section

This makes the integ tests more resilient to LLM output variations.

@codecov

codecov bot commented Jan 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

@cagataycali
Member Author

✅ Approved & Ready to Merge!

This PR has been approved by @dbschmigelski and all CI checks are passing! 🎉

Status Summary

  • ✅ CI: All checks SUCCESS
  • ✅ Review: APPROVED
  • ✅ Mergeable: No conflicts
  • ✅ Codecov: 100% coverage

What This Fixes

Makes the calculator integration test more robust by handling LLM output variations gracefully.

Impact

  • Test stability improvement
  • No production code changes
  • Low risk

Ready for merge when maintainers are available! 🦆

@yonib05 yonib05 merged commit 3ffc327 into strands-agents:main Jan 12, 2026
15 checks passed
@cagataycali cagataycali deleted the fix/flaky-calculator-test branch January 12, 2026 19:26