Problem or motivation
We want a dedicated skills evaluator (probably similar to rubric_based_tool_use_quality_v1) that checks whether specific skills were called and, optionally, whether they were called in the expected order.
Proposed solution
Add a dedicated skills evaluator (probably similar to rubric_based_tool_use_quality_v1) that checks whether specific skills were called and, optionally, whether they were called in the expected order.
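
To make the request concrete, here is a minimal sketch of the check such an evaluator might perform. It is an illustration only, not an existing API: the names `evaluate_skill_calls` and `SkillCheckResult` and the `check_order` flag are hypothetical, and skill calls are assumed to be available as a plain list of skill names in the order they were invoked.

```python
from dataclasses import dataclass


@dataclass
class SkillCheckResult:
    """Outcome of a skills check (hypothetical structure)."""
    missing: list[str]  # expected skills that were never called
    in_order: bool      # True if expected skills appear in the given order
    passed: bool        # overall verdict


def evaluate_skill_calls(called, expected, check_order=False):
    """Check that every expected skill was called, optionally in order.

    `called` is the list of skill names in invocation order; `expected`
    is the list of skill names the test case requires. Extra calls in
    `called` are tolerated.
    """
    missing = [skill for skill in expected if skill not in called]

    # Subsequence check: each expected skill must be found after the
    # previous one, because `in` advances the shared iterator.
    remaining = iter(called)
    in_order = all(skill in remaining for skill in expected)

    passed = not missing and (in_order or not check_order)
    return SkillCheckResult(missing=missing, in_order=in_order, passed=passed)
```

With `check_order=False` only presence matters; with `check_order=True` the expected skills must also appear as a subsequence of the actual calls. A rubric-based variant could replace the exact-match logic with an LLM judge, but that is outside this sketch.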
Alternatives considered
No response
Additional context
No response
Human confirmation