Skip to content

Conversation

@dsarno
Copy link
Collaborator

@dsarno dsarno commented Dec 24, 2025

couple of fixes to CI test harness (still need to set it up so it runs with coplay unity license on main repo).

Summary by Sourcery

Tighten NL/T CI test harness behavior and reporting for malformed fragments, add a CI gate enforcing NL/T coverage and pass status, and clarify the Unity NL suite prompt’s required XML output shape.

Enhancements:

  • Preserve malformed per-test XML fragments for debugging and mark them as explicit failures in the synthesized JUnit results.
  • Improve NL/T summary markdown by attaching human-friendly default titles to each test ID when no explicit name is emitted.
  • Clarify the Unity NL suite authoring prompt to require raw XML output without markdown fences and to emphasize well-formed single- documents.

CI:

  • Add a CI gate step that inspects the NL/T JUnit report and fails the job when any required NL/T test is missing or failing.

Summary by CodeRabbit

  • New Features

    • Display names for test identifiers in reports and checklists for better readability.
    • CI enforcement gate validates test completeness and fails builds on missing or invalid tests.
  • Bug Fixes

    • Improved detection and handling of malformed test XML with content preservation for debugging.
    • Enhanced distinction between missing versus malformed test results in report generation.

✏️ Tip: You can customize this high-level summary in your review settings.

@sourcery-ai
Copy link
Contributor

sourcery-ai bot commented Dec 24, 2025

Reviewer's Guide

CI harness improvements for the Claude NL Unity suite: make malformed XML fragments visible and counted as failures, enrich NL/T test names in the summary with sensible defaults, and add a hard CI gate that fails the workflow when required NL/T tests are missing or failing, plus clarify the Unity prompt about writing raw XML testcases without markdown fences.

File-Level Changes

Change Details Files
Preserve malformed NL fragment XML and mark them explicitly as failing tests instead of treating them as missing outputs.
  • Import shutil and introduce a bad set to track malformed fragment IDs.
  • When parsing a reports/*_results.xml file fails but the file is non‑empty and matches a desired ID, copy it into reports/_staging/_malformed.xml for later inspection.
  • Mark such IDs as bad so they are later backfilled as explicit failing testcases instead of "not produced".
  • Adjust the backfill logic so that bad IDs get a failure element with message "malformed xml" and a detailed explanation pointing to the preserved staging file, while other missing IDs remain "not produced" failures.
.github/workflows/claude-nl-suite.yml
Improve the NL/T test summary output with stable, human-readable default titles while preserving emitted testcase names when present.
  • Define a default_titles mapping from each NL/T test ID to a descriptive title string.
  • Add a display_name helper that prefers the testcase name from name_map, falling back to the default title or the bare ID.
  • Update the checklist rendering to print each test as its display_name (with ID and title) and status, instead of just the raw ID.
.github/workflows/claude-nl-suite.yml
Introduce a CI gate that fails the workflow when any required NL/T JUnit testcase is missing or has failure/error, using the generated junit-nl-suite.xml.
  • Add a new always-running bash step that executes an inline Python script to enforce NL/T coverage and pass/fail status.
  • Have the script load the JUnit output from JUNIT_OUT (default reports/junit-nl-suite.xml), handling both testsuite and testsuites roots.
  • Derive a test ID for each testcase from its name or system-out content using regexes, mirroring the summary builder logic.
  • Build an id_status map for the required test IDs, detect missing and failing IDs, and emit GitHub ::error:: annotations listing them.
  • Exit non‑zero when any required ID is missing or failing so the job fails; otherwise print a success message indicating the NL/T CI gate passed.
.github/workflows/claude-nl-suite.yml
Clarify the Claude Unity NL prompt so agents write raw XML testcases directly (no markdown fences) and ensure files end with a closing testcase tag.
  • Reword the instructions from "Use this exact format" to "Use this exact shape" and explicitly tell the agent to write XML directly into the file without fences.</li><li>Remove the embeddedxml code fence from the example so the prompt itself is not markdown-encouraging.
  • Add an explicit requirement that the output must end with the closing tag, emphasizing well‑formed XML.
.claude/prompts/nl-unity-suite-nl.md

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 24, 2025

Caution

Review failed

The pull request is closed.

📝 Walkthrough

Walkthrough

The changes update test XML emission rules to require direct, well-formed XML output without code fences and enforce proper document structure. The NL/T report workflow is enhanced to detect and handle malformed XML fragments, distinguish between missing and malformed results, introduce human-friendly test labels, and add a CI gate to enforce NL/T completeness.

Changes

Cohort / File(s) Summary
Test XML Formatting
.claude/prompts/nl-unity-suite-nl.md
Reworks XML emission rules: removes code fence requirement (previously ```xml...```), mandates direct XML output, enforces well-formed structure with single <testcase> root, and requires explicit closing tag </testcase> at file end.
NL/T Report & CI Robustness
.github/workflows/claude-nl-suite.yml
Adds malformed XML fragment detection and routing to staging with preserved IDs; introduces display\_name mapping for human-friendly test labels; distinguishes between "not produced" vs "malformed XML" failures; implements CI gate to parse JUnit, enforce NL/T completeness, and fail with explicit missing/failed ID lists.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 XML formatted clean and bright,
No fences hiding fragments tight,
CI gates now stand to keep the test,
Malformed or missing—we know the rest!
A rabbit's code runs quick and right! 🚀

✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 91b6f4d and 2f0c698.

📒 Files selected for processing (2)
  • .claude/prompts/nl-unity-suite-nl.md
  • .github/workflows/claude-nl-suite.yml

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've left some high level feedback:

  • The desired test ID list and JUnit parsing/id extraction logic are now duplicated between the summary builder and the new CI gate; consider extracting this into a shared script or composite action so it stays in sync.
  • In the malformed XML preservation block, the filename is derived from fid; although guarded by fid in DESIRED, you might still want a safer fallback (e.g., using p.stem) so future changes to DESIRED/matching can’t accidentally generate None_malformed.xml.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The `desired` test ID list and JUnit parsing/id extraction logic are now duplicated between the summary builder and the new CI gate; consider extracting this into a shared script or composite action so it stays in sync.
- In the malformed XML preservation block, the filename is derived from `fid`; although guarded by `fid in DESIRED`, you might still want a safer fallback (e.g., using `p.stem`) so future changes to `DESIRED`/matching can’t accidentally generate `None_malformed.xml`.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@dsarno dsarno merged commit 8e3bd1d into CoplayDev:main Dec 24, 2025
1 of 2 checks passed
@dsarno dsarno deleted the fix/CI-cleanup branch January 2, 2026 19:19
@coderabbitai coderabbitai bot mentioned this pull request Jan 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant