Include reason in relevance check failures #1584

Merged
dgageot merged 1 commit into docker:main from dgageot:relevance-mismatch on Feb 4, 2026

dgageot (Member) commented Feb 4, 2026

When running cagent eval with relevance checks, the LLM judge now provides a reason explaining why each check passed or failed. The reason is captured and displayed in the output, making it easier to understand why a relevance check failed.

Output format changed from:
✗ relevance: <criterion>
To:
✗ relevance: <criterion> (reason: <explanation>)
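
As a rough illustration of the new output line, here is a minimal Go sketch. The `RelevanceResult` field names and the `formatFailure` helper are assumptions made for this example and are not taken from the diff; the sketch also assumes the `(reason: ...)` suffix is omitted when the judge returns no reason.

```go
package main

import "fmt"

// RelevanceResult is a hypothetical shape for one judged criterion.
// The field names are assumptions for illustration only.
type RelevanceResult struct {
	Criterion string
	Reason    string
}

// formatFailure (hypothetical helper) renders one failed relevance check.
// The reason suffix is appended only when a reason is present.
func formatFailure(r RelevanceResult) string {
	if r.Reason == "" {
		return fmt.Sprintf("✗ relevance: %s", r.Criterion)
	}
	return fmt.Sprintf("✗ relevance: %s (reason: %s)", r.Criterion, r.Reason)
}
```

Calling `formatFailure` with an empty `Reason` falls back to the old output format, so evaluations where the judge gave no explanation would print the same line as before.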

Assisted-By: cagent

@dgageot dgageot requested a review from a team as a code owner February 4, 2026 12:44
rumpl previously approved these changes Feb 4, 2026
github-actions bot previously approved these changes Feb 4, 2026

@github-actions github-actions bot left a comment

Review Summary

Approved - No bugs found in the changed code.

This PR successfully adds reason tracking to relevance check failures. The implementation is solid:

  • The new RelevanceResult struct properly encapsulates both criterion and reason
  • The refactored parseJudgeResponse correctly returns structured results with proper fallback handling
  • Concurrent processing in CheckRelevance is safe with pre-allocated indexed results
  • Error handling is consistent across all code paths
  • Empty reason strings are handled gracefully with conditional formatting
  • All tests have been updated to verify the new behavior
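
The structured parsing with fallback mentioned in the list above could look roughly like the sketch below. The judge's reply format is not shown on this page, so the JSON shape (`result` and `reason` keys), the `judgeVerdict` type, and the `parseVerdict` name are all assumptions; the fallback chosen here (treat unparseable replies as failures and keep the raw text as the reason) is one possible policy, not necessarily the one in the PR.

```go
package main

import (
	"encoding/json"
	"strings"
)

// judgeVerdict is a hypothetical parsed form of one judge reply.
type judgeVerdict struct {
	Passed bool
	Reason string
}

// parseVerdict (hypothetical) decodes a reply assumed to look like
// {"result":"pass","reason":"..."}.
func parseVerdict(raw string) judgeVerdict {
	var payload struct {
		Result string `json:"result"`
		Reason string `json:"reason"`
	}
	if err := json.Unmarshal([]byte(raw), &payload); err != nil {
		// Fallback (assumed policy): count the check as failed and
		// surface the raw reply so the user can see what the judge said.
		return judgeVerdict{Passed: false, Reason: strings.TrimSpace(raw)}
	}
	return judgeVerdict{
		Passed: strings.EqualFold(strings.TrimSpace(payload.Result), "pass"),
		Reason: strings.TrimSpace(payload.Reason),
	}
}
```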

The changes improve debuggability by providing explanations for failed relevance checks without introducing any logic errors or bugs.
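
For readers unfamiliar with the "pre-allocated indexed results" pattern the review refers to, a minimal Go sketch follows. It is not the code from this PR: `checkAll`, `judgeFunc`, and the `Passed` field are hypothetical names introduced for illustration. The idea is that the result slice is sized up front and each goroutine writes only to its own index, so no mutex is needed and the output order matches the input order.

```go
package main

import "sync"

// RelevanceResult here carries an extra Passed field, which is an
// assumption for this sketch.
type RelevanceResult struct {
	Criterion string
	Passed    bool
	Reason    string
}

// judgeFunc is a hypothetical stand-in for the LLM judge call.
type judgeFunc func(criterion string) (passed bool, reason string)

// checkAll (hypothetical) judges every criterion concurrently.
func checkAll(criteria []string, judge judgeFunc) []RelevanceResult {
	// Pre-allocate one slot per criterion.
	results := make([]RelevanceResult, len(criteria))

	var wg sync.WaitGroup
	for i, c := range criteria {
		wg.Add(1)
		go func(i int, c string) {
			defer wg.Done()
			passed, reason := judge(c)
			// Each goroutine touches only results[i], so the writes
			// never overlap.
			results[i] = RelevanceResult{Criterion: c, Passed: passed, Reason: reason}
		}(i, c)
	}
	wg.Wait() // all writes complete before the slice is read
	return results
}
```

Passing `i` and `c` as arguments to the goroutine keeps the sketch correct on Go versions before 1.22, where loop variables were shared across iterations.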

@dgageot dgageot dismissed stale reviews from github-actions[bot] and rumpl via 6bc54fa February 4, 2026 13:11
@dgageot dgageot force-pushed the relevance-mismatch branch from 65c5973 to 6bc54fa Compare February 4, 2026 13:11
When running cagent eval with relevance checks, the LLM judge now provides
a reason explaining why each check passed or failed. This reason is now
captured and displayed in the output, making it easier to understand why
a relevance check failed.

Output format changed from:
  ✗ relevance: <criterion>
To:
  ✗ relevance: <criterion> (reason: <explanation>)

Assisted-By: cagent
@dgageot dgageot force-pushed the relevance-mismatch branch from 6bc54fa to 7b17fdb Compare February 4, 2026 14:38
@dgageot dgageot merged commit bfd57a3 into docker:main Feb 4, 2026
5 checks passed