Document and implement per-test execution output and ensure suite headings are deduplicated. Updates include: docs/specs to describe per-test lines (passed/failed, model, tokens), tests in src/lib/it/it.test.ts that cover failed async test logging and suite-heading deduplication, change in src/lib/it/it.ts to log based on actual test result (didPass()), and refactor src/lib/output/testLogging.ts to track logged suites with a Set and reset it. These changes ensure accurate pass/fail messages and that each suite header is printed only once.
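The Set-based suite-heading deduplication described in this commit could look roughly like the sketch below. The function names (`suiteHeadingOnce`, `resetLoggedSuites`) and the heading text are hypothetical and are not taken from `src/lib/output/testLogging.ts`.

```typescript
// Hypothetical sketch: names and heading format are illustrative only,
// not the actual testLogging.ts API.
const loggedSuites = new Set<string>();

/** Returns a heading line the first time a suite is seen, otherwise null. */
function suiteHeadingOnce(suitePath: string): string | null {
  if (loggedSuites.has(suitePath)) {
    return null; // heading already printed for this suite
  }
  loggedSuites.add(suitePath);
  return `Suite "${suitePath}"`;
}

/** Clears the tracking state, e.g. between runs or between unit tests. */
function resetLoggedSuites(): void {
  loggedSuites.clear();
}
```

Keeping the reset as a separate function lets unit tests start from a clean state without reloading the module.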
This reverts commit e63404d.
Replace per-test pass/fail lines with a unified "Finished in <N ms>" message and adjust duration formatting (adds space before "ms"). Remove didPass parameter from logging APIs and simplify logCurrentContextExecution/logTestExecution to always emit timing (and include model/token lines when present). Update CLI summary layout for aligned labels and cyan-bolded values. Bump spec example model to gpt-5.2 and update docs/specs/tests to match the new output format; remove some getFailedTestCount usage in it.ts.
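As a rough illustration of the unified per-test output, the sketch below builds a "Finished in" line with a space before "ms" and adds model and token lines only when they are present. The `ExecutionInfo` shape, the `formatTestExecution` name, and the exact wording of the model and token lines are assumptions, not the real `logTestExecution` implementation.

```typescript
// Hypothetical sketch: ExecutionInfo, formatTestExecution, and the
// model/token line wording are assumptions for illustration.
const CYAN_BOLD = "\u001b[1;36m"; // ANSI SGR: bold + cyan
const RESET = "\u001b[0m";

interface ExecutionInfo {
  durationMs: number;
  model?: string;
  tokens?: number;
}

/** Formats a duration with a space before the unit, e.g. "12 ms". */
function formatDuration(ms: number): string {
  return `${Math.round(ms)} ms`;
}

/** Builds the per-test lines; no pass/fail indicator is emitted. */
function formatTestExecution(info: ExecutionInfo): string[] {
  const lines = [
    `- Finished in ${CYAN_BOLD}${formatDuration(info.durationMs)}${RESET}`,
  ];
  if (info.model !== undefined) {
    lines.push(`- Model ${info.model}`);
  }
  if (info.tokens !== undefined) {
    lines.push(`- Tokens used ${info.tokens}`);
  }
  return lines;
}
```

Because the outcome no longer affects the line, callers do not need to pass a `didPass` flag.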
Pull request overview
This pull request standardizes Katt’s per-test execution log lines and CLI summary output formatting by removing per-test pass/fail indicators and switching to a consistent “Finished in …” duration line, while updating tests and documentation/specs to match.
Changes:
- Simplifies per-test logging by removing pass/fail outcome tracking and emitting `- Finished in <duration>` consistently.
- Updates CLI summary formatting so only the values (e.g., `1 passed`) are cyan+bold, not the labels, with aligned columns.
- Revises unit tests and docs/spec examples to match the new output format.
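One way to get aligned labels with only the values highlighted is sketched below. The `formatSummary` helper and the `Duration` row in the usage example are hypothetical; the actual layout logic in `src/cli/runCli.ts` may differ.

```typescript
// Hypothetical sketch: formatSummary is illustrative, not the runCli.ts code.
const CYAN_BOLD = "\u001b[1;36m"; // ANSI SGR: bold + cyan
const RESET = "\u001b[0m";

/**
 * Pads every label to the width of the longest one so the values line up,
 * and wraps only the value in cyan + bold. Labels keep the default style.
 */
function formatSummary(rows: Array<[string, string]>): string[] {
  const width = Math.max(...rows.map(([label]) => label.length));
  return rows.map(
    ([label, value]) => `${label.padEnd(width)} ${CYAN_BOLD}${value}${RESET}`,
  );
}

// Usage: the "Duration" row is an assumed label added to show the padding.
for (const line of formatSummary([
  ["Files", "1 passed"],
  ["Evals", "27 passed"],
  ["Duration", "120 ms"],
])) {
  console.log(line);
}
```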
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/lib/output/testLogging.ts | Removes didPass from logging API and prints Finished in duration format. |
| src/lib/it/it.ts | Drops pass/fail inference and logs execution duration uniformly across sync/async paths. |
| src/lib/it/it.test.ts | Updates log output assertions to match new per-test format. |
| src/lib/expect/toBeClassifiedAs.ts | Updates logging call signature after removing didPass. |
| src/lib/expect/toBeClassifiedAs.test.ts | Updates log output assertions to match new per-test format. |
| src/lib/expect/promptCheck.ts | Updates logging call signature after removing didPass. |
| src/lib/expect/promptCheck.test.ts | Updates log output assertions to match new per-test format. |
| src/index.test.ts | Updates end-to-end CLI output assertions for per-test logs and summary formatting. |
| src/cli/runCli.ts | Adjusts summary output to color only values and align labels with spacing. |
| src/cli/runCli.test.ts | Updates summary formatting assertions to match aligned/colored output. |
| specs/feature-snapshot.spec.md | Minor spec formatting change (trailing newline). |
| specs/execution.spec.md | Updates examples and summary highlighting description to reflect new output format (one inconsistency remains). |
| docs/api-documentation.md | Documents the new per-test and summary output formats. |
  - The printed summary description words should be in the color `cyan` and `bold` style.
- - Example: `Tests` in the above summary should be cyan and bold while `27 passed (27)` should be in default color and style.
- - It exits with code `0`
+ - Example: `Evals` in the above summary should be in default color and style while `27 passed` should be cyan and bold .
+ - It exits with code `0`.
In the summary section, line 60 still states that the labels ("summary description words") should be cyan+bold, but the updated behavior and the example on line 61 indicate only the values (e.g., "27 passed") should be highlighted. Please update this requirement text to match the new output rules, and also remove the extra space before the period in "cyan and bold .".
This pull request updates the test execution logging format and summary output for the CLI and test framework. The changes standardize log lines to use "Finished in" instead of "Passed in" or "Failed in", remove explicit pass/fail indicators from per-test logs, and update summary formatting to highlight only the counts in cyan and bold. The implementation and test files are updated for consistency, and related documentation/specs are revised to match the new output.
Logging and Output Format Changes
- Per-test log lines now use `- Finished in <cyan bold N ms>` instead of `- ✅ Passed in <cyan bold Nms>` or `- ❌ Failed in <cyan bold Nms>`, removing explicit pass/fail indicators.
- The CLI summary now shows only the values (e.g., `1 passed`) in cyan and bold, not the labels (e.g., `Files`, `Evals`), with updated formatting for alignment and clarity.
Implementation Refactoring
- Removed pass/fail tracking (the `didPass` parameter and outcome calculation), simplifying `logTestExecution` and related APIs.
Documentation and Spec Updates
Test Updates
Code Cleanup
- Removed some `getFailedTestCount` usage in `it.ts`.
These changes ensure a more consistent and streamlined test logging experience, with clearer output and easier maintenance.