Expose all tool_trajectory_avg_score match types#89
Conversation
krisztianfekete
left a comment
There was a problem hiding this comment.
Thanks for the PR, added some comments!
Can you also add tests please?
There was a problem hiding this comment.
Can we remove the synthetic JSON files from sample, and add tests instead?
There was a problem hiding this comment.
fixed please check test format!
| if match_type | ||
| else ToolTrajectoryCriterion.MatchType.EXACT | ||
| ) | ||
| criterion = ToolTrajectoryCriterion(threshold=effective_threshold, matchType=_match) |
There was a problem hiding this comment.
Let's keep this consistent with all other ADK objects in this file that are constructed with snake_case kwargs.
| ) | ||
| @click.option( | ||
| "--trajectory-match-type", | ||
| type=click.Choice(["EXACT", "IN_ORDER", "ANY_ORDER"], case_sensitive=False), |
There was a problem hiding this comment.
Can you please perform the same validation at all the layers this can be a problem in the codebase?
| threshold: float | None, | ||
| eval_semaphore: asyncio.Semaphore, | ||
| trajectory_match_type: str | None = None, | ||
| eval_semaphore: asyncio.Semaphore = None, |
There was a problem hiding this comment.
Can we add trajectory_match_type: str | None = None after the existing optional parameters instead or reordering or changing the signature?
There was a problem hiding this comment.
Now we don't have to change the semaphore anymore.
| metrics: state.selectedMetrics, | ||
| judgeModel: state.judgeModel, | ||
| threshold: state.threshold, | ||
| trajectoryMatchType: state.trajectoryMatchType !== 'EXACT' ? state.trajectoryMatchType : undefined, |
There was a problem hiding this comment.
Let's always send this to avoid breaking this if we change what's the default on the backend side.
There was a problem hiding this comment.
ok fixed this makes sense!
|
@krisztianfekete please let me know if there are other fixes I can do here! |
| threshold: float | None, | ||
| eval_semaphore: asyncio.Semaphore, | ||
| trajectory_match_type: str | None = None, | ||
| eval_semaphore: asyncio.Semaphore = None, |
There was a problem hiding this comment.
Now we don't have to change the semaphore anymore.
| eval_set_id: str | ||
| metrics: list[str] = ["tool_trajectory_avg_score"] | ||
| judge_model: str = "gemini-2.5-flash" | ||
| trajectory_match_type: str | None = None |
There was a problem hiding this comment.
Can we use a literal here?
fix #84 :
Expose all tool_trajectory_avg_score match types: EXACT, IN_ORDER, or ANY_ORDER.