bug: contains grader doc claims case-insensitive default but implementation is case-sensitive #1154

@christso

Objective

plugins/agentv-dev/skills/agentv-bench/agents/grader.md:42 documents:

contains | Check if response includes the value substring (case-insensitive by default)

The implementation at packages/core/src/evaluation/graders/assertions.ts:14-25 does:

export function runContainsAssertion(output: string, value: string): AssertionResult {
  const passed = output.includes(value);  // case-sensitive — no toLowerCase
  ...
}

contains-any (assertions.ts:28-45) and contains-all (assertions.ts:48-63) are also case-sensitive (raw .includes()). The icontains* family at assertions.ts:68-123 explicitly lowercases both sides — which only makes sense as a variant if the bare contains* functions are case-sensitive.

So grader.md:42 is both factually wrong and internally inconsistent with the icontains* entries at grader.md:45.
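To make the mismatch concrete, here is a minimal sketch of the two families as described above. The names `SketchResult`, `containsSketch`, and `icontainsSketch` are hypothetical stand-ins, and the result shape is an assumption; the real `AssertionResult` in assertions.ts may carry more fields.

```typescript
// Hypothetical result shape (assumption); not the project's actual AssertionResult.
interface SketchResult {
  passed: boolean;
  score: number;
}

// Mirrors the behaviour described for bare contains: raw substring check.
function containsSketch(output: string, value: string): SketchResult {
  const passed = output.includes(value);
  return { passed, score: passed ? 1 : 0 };
}

// Mirrors the behaviour described for icontains: lowercase both sides first.
function icontainsSketch(output: string, value: string): SketchResult {
  const passed = output.toLowerCase().includes(value.toLowerCase());
  return { passed, score: passed ? 1 : 0 };
}

const response = "Hello, world!";
console.log(containsSketch(response, "hello").passed);  // false
console.log(icontainsSketch(response, "hello").passed); // true
```

If bare contains already ignored case, the two functions would return identical results and the icontains variant would be redundant.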

Reproducer

tests:
  - id: t
    input: test
    assertions:
      - name: has_hello
        type: contains
        value: hello

Response "Hello, world!" → assertion fails. The auto-generated failure text "Output does not contain \"hello\"" comes from assertions.ts:20 and is a grep anchor for the case-sensitive branch.

Design latitude

  1. Fix the doc (recommended): update grader.md:42 to state that contains is case-sensitive by default, and direct users to icontains* for case-insensitive matching. Aligns with the existing icontains convention.
  2. Fix the implementation: make contains* case-insensitive by default (change assertions.ts:15, :32, :52). Breaking change; any eval relying on case-sensitive contains would start passing incorrectly.

Option 1 is the YAGNI path unless there's concrete evidence users expected case-insensitive behavior from bare contains. icontains* already covers the case-insensitive use case.
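The breaking-change risk of Option 2 can be sketched as follows. This is an illustration only: `containsCurrent`, `containsOption2`, and `flipsToPass` are hypothetical helpers, and the sample outputs are invented for the example rather than taken from any real eval.

```typescript
// Current behaviour per this issue: raw, case-sensitive substring check.
function containsCurrent(output: string, value: string): boolean {
  return output.includes(value);
}

// Hypothetical Option 2 behaviour: lowercase both sides before comparing.
function containsOption2(output: string, value: string): boolean {
  return output.toLowerCase().includes(value.toLowerCase());
}

// True when an assertion that fails today would pass after Option 2.
function flipsToPass(output: string, value: string): boolean {
  return !containsCurrent(output, value) && containsOption2(output, value);
}

// Invented example: an eval that looks for the literal log level "ERROR".
console.log(flipsToPass("No error occurred.", "ERROR")); // true
console.log(flipsToPass("ERROR: disk full", "ERROR"));   // false
```

An eval that deliberately distinguishes "ERROR" from "error" would silently change its verdict under Option 2, which is the regression the recommendation avoids.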

Acceptance signals

  • grader.md:42-44 accurately describes the case-sensitivity of contains, contains-any, and contains-all (case-sensitive if Option 1 is chosen).
  • Regression test in packages/core/test/evaluation/graders/ pinning the chosen behavior, e.g. for Option 1: expect(runContainsAssertion("Hello", "hello").score).toBe(0) and expect(runContainsAssertion("hello", "hello").score).toBe(1).
  • No other skill/doc file claims contains* is case-insensitive.
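The regression test above covers runContainsAssertion; a similar pin for contains-any and contains-all might look like the sketch below. The functions here are local stand-ins, not the real exports: the `(output, values[])` signatures and the some/every semantics are assumptions inferred from the assertion names, and the `check` helper is hypothetical, used in place of whatever test framework the repository uses.

```typescript
// Local stand-ins (assumed semantics): raw .includes() per value, no lowercasing.
function containsAnySketch(output: string, values: string[]): boolean {
  return values.some((v) => output.includes(v));
}

function containsAllSketch(output: string, values: string[]): boolean {
  return values.every((v) => output.includes(v));
}

// Hypothetical minimal assertion helper; throws on mismatch.
function check(label: string, actual: boolean, expected: boolean): void {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${expected}, got ${actual}`);
  }
}

const output = "Hello, world!";
check("any: case mismatches fail", containsAnySketch(output, ["hello", "WORLD"]), false);
check("any: one exact match passes", containsAnySketch(output, ["hello", "world"]), true);
check("all: one case mismatch fails", containsAllSketch(output, ["Hello", "World"]), false);
check("all: exact matches pass", containsAllSketch(output, ["Hello", "world"]), true);
```

In the real suite these checks would target the actual functions in assertions.ts and compare `.score`, as in the runContainsAssertion example.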

Non-goals

  • equals (assertions.ts:196), starts-with (:126), ends-with (:140) are all case-sensitive and their grader.md:46-49 entries do not claim otherwise — explicitly out of scope.
  • regex case flags are handled via flags parameter — out of scope.

Labels: bug
Status: Done