feat: add web_test, build_runner, and enhanced managed test by tsavo-at-pieces · Pull Request #29 · open-runtime/runtime_ci_tooling

tsavo-at-pieces · 2026-02-24T22:47:02Z

Summary

Add web_test CI feature: standalone Ubuntu job with deterministic Chrome provisioning via browser-actions/setup-chrome@v2, configurable concurrency and optional test path filtering
Enable build_runner feature: runs dart run build_runner build --delete-conflicting-outputs before analyze/test steps to regenerate .g.dart codegen files
Enhance managed test: pipe output through tee for log capture, use PIPESTATUS for correct exit codes, upload test logs as artifacts
Add web_test validation (concurrency bounds, path traversal/injection safety) with 16 new tests
Document both features in SETUP.md and USAGE.md
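
The validation bullet above can be illustrated with a small sketch. This is not the PR's implementation: the function name `validateWebTest`, the config keys `concurrency` and `paths`, the lower bound of 1, and the character allow-list are all assumptions made for illustration. The upper bound of 32 is the cap mentioned in a later commit on this PR.

```dart
/// Hypothetical validator for a `ci.web_test` config map.
/// Returns a list of human-readable errors (empty when the config is valid).
List<String> validateWebTest(Map<String, Object?> config) {
  final errors = <String>[];

  // Assumed bounds: 1..32 (32 is the cap cited later in this PR).
  final concurrency = config['concurrency'];
  if (concurrency != null &&
      (concurrency is! int || concurrency < 1 || concurrency > 32)) {
    errors.add('web_test.concurrency must be an integer in 1..32');
  }

  final paths = config['paths'];
  if (paths == null) return errors;
  if (paths is! List) {
    errors.add('web_test.paths must be a list of strings');
    return errors;
  }

  // Conservative allow-list: rejects spaces and shell metacharacters.
  final safeChars = RegExp(r'^[A-Za-z0-9_./-]+$');
  final seen = <String>{};
  for (final path in paths) {
    if (path is! String || !safeChars.hasMatch(path)) {
      errors.add('unsafe characters in path: $path');
    } else if (path.startsWith('/') || path.startsWith('-')) {
      errors.add('path must be relative and must not start with "-": $path');
    } else if (path.split('/').contains('..')) {
      errors.add('path traversal is not allowed: $path');
    } else if (!seen.add(path)) {
      errors.add('duplicate path: $path');
    }
  }
  return errors;
}
```

Returning a list of errors rather than throwing on the first one lets a caller report every problem in a config at once, which is roughly what the later "multiple errors" render-guard tests suggest the real validator does.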

Test plan

All 88+ existing + new validation tests pass (dart test)
Generate CI with web_test: true — verify web-test job with Chrome setup steps
Generate CI with web_test: false — verify no web-test content
Verify build_runner steps appear in analyze and test jobs
Verify managed test log capture and artifact upload in generated workflow

🤖 Generated with Claude Code

…d test

- Add `web_test` CI feature: standalone Ubuntu job with deterministic Chrome provisioning via browser-actions/setup-chrome@v2, configurable concurrency and optional test path filtering
- Enable `build_runner` feature: runs `dart run build_runner build --delete-conflicting-outputs` before analyze/test steps to regenerate .g.dart codegen files
- Enhance managed test: pipe test output through tee for log capture, use PIPESTATUS for correct exit codes, upload test logs as artifacts
- Add web_test validation (concurrency bounds, path traversal/injection safety checks) with 16 new tests
- Document build_runner and web_test features in SETUP.md and USAGE.md
- Regenerate CI workflow with all new features

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…maries

Two-layer capture strategy for comprehensive test output:
- Layer 1 (zone-aware):
  - --file-reporter json: captures all print() calls as PrintEvent objects with test attribution
  - --file-reporter expanded: captures human-readable output with all prints for all tests
- Layer 2 (shell-level): 2>&1 | tee in CI template captures stdout.write(), isolate prints, and FFI output that bypass Dart zones

TestCommand (live CI path via bin/manage_cicd.dart) now:
- Uses Process.start() with piped streams for real-time + captured output
- Writes structured logs to TEST_LOG_DIR (CI) or .dart_tool/test-logs/
- Parses NDJSON results for per-test pass/fail/skip with durations
- Generates rich GitHub Actions job summaries with alert boxes, tables, and collapsible per-failure details (error, stack trace, captured prints)
- Accumulates multiple errors per test (e.g. test + tearDown failures)
- Dynamic code fence delimiters to prevent backtick injection
- Cross-platform process.kill() (no ProcessSignal.sigkill on Windows)
- UTF-8 decoder (dart test always outputs UTF-8)
- 45-minute process timeout with 30-second stream drain timeout

CI template updates:
- shell: bash + set -o pipefail for cross-platform consistency
- TEST_LOG_DIR env var passes log directory to Dart command
- Artifact upload with if: always() and 14-day retention
- Separate managed/unmanaged artifact upload paths

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
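
The "dynamic code fence delimiters" item is worth a short illustration. The sketch below is an assumption about how such a helper might look (the name `fenced` is hypothetical): it picks a fence one backtick longer than the longest backtick run in the content, so captured test output cannot close the fence early. It does not include the bounded max-run handling that a later review finding on this PR asks for.

```dart
import 'dart:math' as math;

/// Wraps [content] in a Markdown code fence that is longer than any run of
/// backticks inside it, so the content cannot terminate the fence early.
String fenced(String content, {String lang = ''}) {
  var longestRun = 0;
  for (final match in RegExp(r'`+').allMatches(content)) {
    longestRun = math.max(longestRun, match.end - match.start);
  }
  // A fence needs at least three backticks.
  final fence = '`' * math.max(3, longestRun + 1);
  return '$fence$lang\n$content\n$fence';
}
```

For example, content containing a run of four backticks would be wrapped in a five-backtick fence, while ordinary content gets the usual three.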

Copilot

Pull request overview

Adds new CI-generation capabilities to runtime_ci_tooling, including an optional Chrome-based web test job, optional build_runner pre-steps, and improved managed test logging/artifacts for better debuggability in GitHub Actions.

Changes:

Add ci.features.web_test and ci.web_test (concurrency + path filtering) with validation tests.
Add optional build_runner steps to generated workflows.
Enhance managed test execution to capture logs via file reporters + shell tee, and upload logs as artifacts.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.

Show a summary per file

File	Description
`lib/src/cli/utils/workflow_generator.dart`	Adds web_test feature flag + web_test config resolution and validation.
`templates/github/workflows/ci.skeleton.yaml`	Updates managed test steps (tee + artifacts) and introduces a new `web-test` job.
`templates/config.json`	Documents and adds default config for `web_test` and feature flag.
`test/workflow_generator_test.dart`	Adds validation test coverage for `ci.web_test` config.
`lib/src/cli/manage_cicd.dart`	Enhances managed test runner with structured reporters, log files, and job summary.
`lib/src/cli/commands/test_command.dart`	Adds similar output capture + summary behavior for CLI `test` command.
`USAGE.md`	Documents new optional `build_runner` / `web_test` features.
`SETUP.md`	Adds configuration table entries for `build_runner` and `web_test`.
`.runtime_ci/template_versions.json`	Updates template hashes/timestamps for the changed templates.
`.runtime_ci/config.json`	Enables managed_test + build_runner for this repo’s generated workflows.
`.github/workflows/ci.yaml`	Regenerated workflow showing build_runner + managed test log capture/upload.

lib/src/cli/commands/test_command.dart

lib/src/cli/manage_cicd.dart

lib/src/cli/utils/workflow_generator.dart

templates/github/workflows/ci.skeleton.yaml

lib/src/cli/commands/test_command.dart

…d test coverage

Address all findings from PR #29 code review:
- Add validate() guard in render() for defense-in-depth against shell injection
- Cap web_test concurrency at 32 (Chrome instances are heavy)
- Detect duplicate paths, unknown keys, and cross-validate feature vs config
- Shell-quote and normalize web_test paths in rendered output
- HTML-escape platformId in GitHub step summaries
- Add sync comments and proto setup to web-test job template
- Extract duplicated TestFailure/TestResults/parsing into shared test_results_util
- Add escapeHtml to StepSummary utility class
- Add 15+ new validation tests (traversal, shell metacharacters, duplicates, etc.)
- Update SETUP.md concurrency range and paths wording

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
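
As an illustration of the "shell-quote" step, here is a minimal sketch of POSIX single-quote escaping. The helper name `shellQuote` is hypothetical and the PR's actual quoting code is not shown in this conversation; the technique itself (wrap in single quotes, rewrite each embedded `'` as `'\''`) is the standard one for POSIX shells.

```dart
/// Quotes [value] so a POSIX shell treats it as one literal word.
/// Inside single quotes every character is literal, so the only character
/// that needs special handling is the single quote itself.
String shellQuote(String value) {
  if (value.isEmpty) return "''";
  // Close the quote, emit an escaped quote, then reopen: ' -> '\''
  final escaped = value.replaceAll("'", r"'\''");
  return "'$escaped'";
}
```

Quoting complements validation rather than replacing it: even if a path slips past the validator, the rendered workflow still passes it to the shell as inert text.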

cursor · 2026-02-24T23:36:18Z

PR Summary

Medium Risk
Touches CI workflow generation and test execution/reporting, which can affect reliability and signal in all pipelines. New config validation reduces injection risk, but changes may alter test behavior/output and artifact retention across platforms.

Overview
Adds new CI capabilities driven by config: optional build_runner execution before analyze/test and a new optional web-test job that provisions Chrome on Ubuntu and runs dart test -p chrome with configurable concurrency and validated path filtering.

Upgrades managed test execution to capture full stdout/stderr and structured JSON results, write rich $GITHUB_STEP_SUMMARY output, and always upload test logs as artifacts; generated workflows now run tests via bash with pipefail and tee to preserve correct exit codes.

Updates templates/default config and docs to expose ci.features.web_test, ci.web_test settings, and enables managed_test/build_runner in this repo’s CI; adds extensive validation + render tests to prevent unsafe web_test.paths interpolation and enforce concurrency bounds.

Written by Cursor Bugbot for commit 0b65313.

cursor

Cursor Bugbot has reviewed your changes and found 2 potential issues.

templates/config.json

lib/src/cli/manage_cicd.dart

…d tests

- Move TestFailure/TestResults/parsing from test_results_util.dart into step_summary.dart (single shared location)
- Delete standalone test_results_util.dart
- Add render() guard tests: invalid config, multiple errors, invalid web_test
- Add cross-validation edge case tests (null config with feature enabled)
- Expand web-test skeleton with additional sync markers
- Additional linter-driven cleanup across manage_cicd and workflow_generator

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

lib/src/cli/commands/test_command.dart

…fety, PLATFORM_ID

- Add cumulative 1 MiB size guard to StepSummary.write() to prevent exceeding GitHub step summary limit
- Store stream subscriptions separately and cancel on timeout to prevent resource leaks in TestCommand and _runTest()
- Guard logDir creation and log file writes with FileSystemException catch blocks for robustness
- Use package:path (p.join) for all path construction in TestCommand for cross-platform safety
- Add single-quote escaping to StepSummary.escapeHtml()
- HTML-escape collapsible() title parameter
- Add PLATFORM_ID env var to multi-platform CI template from matrix.platform_id for meaningful test summary headings
- Use $TEST_LOG_DIR consistently in CI template run blocks instead of hardcoded $RUNNER_TEMP/test-logs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
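
A cumulative size guard of the kind described in the first bullet might look like the following sketch. The class name `SummaryBudget` and its API are hypothetical and are not the PR's `StepSummary.write()`. The sketch counts UTF-8 encoded bytes rather than `String.length`, which matches the byte-accounting fix listed in the final verification on this PR.

```dart
import 'dart:convert';

/// Hypothetical accumulator that refuses writes once the running total
/// would exceed [maxBytes] (1 MiB by default, the limit cited above).
class SummaryBudget {
  SummaryBudget({this.maxBytes = 1024 * 1024});

  final int maxBytes;
  final StringBuffer _buffer = StringBuffer();
  int _usedBytes = 0;

  int get usedBytes => _usedBytes;

  /// Appends [markdown] and returns true if it fits; otherwise leaves the
  /// buffer unchanged and returns false.
  bool tryWrite(String markdown) {
    // Count encoded bytes, not String.length (which is UTF-16 code units).
    final size = utf8.encode(markdown).length;
    if (_usedBytes + size > maxBytes) return false;
    _buffer.write(markdown);
    _usedBytes += size;
    return true;
  }

  @override
  String toString() => _buffer.toString();
}
```

The distinction matters for non-ASCII output: a string such as `é` has length 1 but occupies two bytes once encoded, so a character-based guard would undercount.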

lib/src/cli/utils/step_summary.dart

- Add _isSafeSecretIdentifier, _isSafeRunnerLabel, _isSafeSubPackageName validators for all Mustache-interpolated values
- Validate secrets keys/values, pat_secret, line_length (numeric 1-10000), sub_packages name, and runner_overrides values
- Reject hyphen-prefix and repo-root paths for sub_packages and web_test
- Pin browser-actions/setup-chrome to SHA (v2.1.1)
- Add -- end-of-options before web_test paths in skeleton
- Shell-escape all interpolated values in manage_cicd.dart
- Add RepoUtils: resolveTestLogDir, isSymlinkPath, ensureSafeDirectory, writeFileSafely for TEST_LOG_DIR safety
- Wire WorkflowGenerator.validate() into validate command
- Extract TestResultsUtil from StepSummary, add escapeHtml
- Fix templates/config.json cross-validation (empty web_test: {})
- Add proto setup to web-test job, guard artifact upload with hashFiles
- Add cli_utils_test.dart, expand workflow_generator_test.dart (214 tests)
- Document path validation rules and cross-validation in SETUP.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Format checks now run across the repository and stage only tracked *.dart changes, preventing unrelated files from being committed while still capturing non-lib Dart edits. Regenerated workflow templates and synced related docs/template metadata.

Enforce uppercase-only secret identifiers and digits-only line_length string values, with corresponding test coverage updates. Include pending workflow template dependency change currently in local working tree.
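
The two rules in this commit can be sketched as follows. Both functions are hypothetical: the exact patterns used by the PR's `_isSafeSecretIdentifier` and its `line_length` check are not shown in this conversation, so the regular expressions below are assumptions. The 1 to 10000 range is the one stated in the preceding commit.

```dart
/// Hypothetical check: uppercase letters, digits, and underscores only,
/// and not starting with a digit (for example `GH_PAT`).
bool isSafeSecretIdentifier(String value) =>
    RegExp(r'^[A-Z_][A-Z0-9_]*$').hasMatch(value);

/// Accepts a `line_length` given as an int or as a digits-only string and
/// returns it if it lies in 1..10000; returns null otherwise.
int? parseLineLength(Object? value) {
  final int? parsed;
  if (value is int) {
    parsed = value;
  } else if (value is String && RegExp(r'^[0-9]+$').hasMatch(value)) {
    // Digits-only: rejects signs, whitespace, hex prefixes, and so on.
    parsed = int.tryParse(value);
  } else {
    parsed = null;
  }
  if (parsed == null || parsed < 1 || parsed > 10000) return null;
  return parsed;
}
```

Checking the string against a digits-only pattern before parsing is the point of the rule: `int.tryParse` on its own also accepts inputs such as `+120` or `0x78`, which should not be interpolated into a workflow template.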

…on tests

- Add 6 _preserveUserSections tests: unknown section name, missing END marker, mismatched BEGIN/END names, regex-special characters in content, null existingContent, unrelated existingContent
- Add 13 feature flag combination tests: format_check+web_test, build_runner+web_test, proto+web_test, multi-platform+web_test, single-platform+web_test (no analyze-and-test dep), secrets in web-test, lfs+web_test, managed_test in multi-platform, all features enabled, no features enabled, sub_packages render, runner_overrides
- Total: 240 tests, all passing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

This addresses the consolidated review findings by improving process/stream robustness, summary safety and parsing limits, and workflow template correctness across platform dependencies, artifact policy, and cache paths. It also adds focused regression tests for test-command behavior, summary edge cases, and workflow rendering contracts.

tsavo-at-pieces · 2026-02-25T02:57:30Z

Final Verification Against HEAD

I re-verified all 21 adversarial findings against current HEAD on feat/enhanced-managed-test.

Verdict

All 21 findings are resolved in current HEAD (fixed in code or invalidated by current implementation).

Per-issue status (1–21)

1. Byte vs char size guard: fixed (utf8.encode(...) byte accounting in step summary writes).
2. Invalid UTF-8 decode crash risk: fixed (Utf8Decoder(allowMalformed: true) in test stream handling).
3. Unbounded test output buffers: fixed (bounded buffers for test output and pub get).
4. Skeleton/rendered PIPESTATUS drift: resolved (no PIPESTATUS usage; set -o pipefail path is consistent).
5. SIGTERM-only timeout kill risk: fixed (kill escalation strategy in process timeout handling).
6. Stream subscription cleanup path: fixed (cleanup occurs in finally).
7. Missing single-platform PLATFORM_ID: fixed (single-platform ID is generated and used).
8. pub get timeout robustness: fixed (timeout-aware pub get execution path with termination).
9. _runTest/repoRoot contract mismatch: fixed (manage_cicd delegates with explicit repoRoot).
10. Missing unit tests for primary feature: fixed (new/expanded tests for command + summary/result utilities).
11. Artifact upload on cancelled runs (always()): fixed (success() || failure()).
12. Exit without flush risk: fixed (exitWithCode() flushes stdout/stderr before exit).
13. Empty NDJSON marked as parsed: fixed (empty input remains unparsed).
14. Malformed JSON warning flood: fixed (circuit-breaker warning cap).
15. Sub-package tests lacking structured capture: fixed (JSON/expanded reporters + parsed summaries).
16. Weak render assertion: fixed (render tests parse YAML and assert structure).
17. collapsible() content escaping: fixed (content is escaped).
18. _codeFence pathological behavior: fixed (bounded max-run strategy).
19. Unbounded failures list growth: fixed (failures list cap + truncation behavior).
20. Retention-days inconsistency: fixed (single template variable policy applied consistently).
21. Windows pub-cache path: fixed (platform-aware cache path using LOCALAPPDATA on Windows).
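
The invalid UTF-8 item in the list above relies on a documented `dart:convert` option. As a small sketch (the two wrapper function names are hypothetical), lenient decoding substitutes U+FFFD for each malformed sequence, whereas strict decoding throws:

```dart
import 'dart:convert';

/// Decodes process output without throwing on invalid byte sequences;
/// each malformed sequence becomes U+FFFD (the replacement character).
String decodeLossy(List<int> bytes) =>
    const Utf8Decoder(allowMalformed: true).convert(bytes);

/// Strict decoding, for comparison: throws FormatException on bad input.
String decodeStrict(List<int> bytes) => utf8.decode(bytes);
```

The same decoder instance can also be used as a stream transformer, for example `process.stdout.transform(const Utf8Decoder(allowMalformed: true))`, so a stray byte from FFI or native output does not abort log capture.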

Corrections to previous draft notes

#16 is not N/A anymore; it is now explicitly covered by stronger workflow render tests.
“Single-platform web-test independence” is outdated wording; current intended behavior is dependency on analyze-and-test in single-platform mode.
Test count in previous note is stale; current full suite is 257 passing tests.

Current validation snapshot

dart analyze ✅
dart test ✅ (257 passed)

Address post-review reliability and safety gaps by adding timeout-safe process execution with explicit kill semantics, bounded output capture, and fatal-path flush behavior. Also harden runtime sub-package loading with workflow-equivalent validation, align Windows pub-cache path handling across workflow templates, and gate issue-triage npm installation behind trigger conditions. Expand utility coverage for timeout handling, fatal-exit probe behavior, and runtime sub-package validation paths.
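
The Windows pub-cache alignment mentioned here can be sketched as a pure function. The name `resolvePubCache` and its signature are assumptions made for illustration; the `%LOCALAPPDATA%\Pub\Cache` default on Windows is the one named elsewhere in this PR, and `PUB_CACHE` and `~/.pub-cache` are pub's documented override variable and its default on Linux and macOS.

```dart
/// Resolves the pub cache directory from an environment map. Takes
/// [isWindows] as a parameter (instead of reading Platform.isWindows)
/// so that both branches can be unit-tested on any host.
String? resolvePubCache(Map<String, String> env, {required bool isWindows}) {
  // An explicit PUB_CACHE always wins.
  final override = env['PUB_CACHE'];
  if (override != null && override.isNotEmpty) return override;

  if (isWindows) {
    final localAppData = env['LOCALAPPDATA'];
    return localAppData == null ? null : '$localAppData\\Pub\\Cache';
  }
  final home = env['HOME'];
  return home == null ? null : '$home/.pub-cache';
}
```

In real code the arguments would typically be `Platform.environment` and `Platform.isWindows` from `dart:io`.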

…figurable, update docs

- Fix Windows pub-cache path in ci.skeleton.yaml: forward slashes → backslashes to match release/triage templates and Dart's default %LOCALAPPDATA%\Pub\Cache
- Make artifact_retention_days configurable via ci config (1-90, default 7) with full validation matching line_length pattern
- Extract Utf8BoundedBuffer utility for testable byte-bounded stream capture
- Extract _resolveLogDirOrExit to eliminate late-final fragility
- Update API_REFERENCE.md with all new public APIs (TestResultsUtil, TestResults, TestFailure, RepoUtils methods, Utf8BoundedBuffer, exitWithCode)
- Document artifact retention policy in SETUP.md

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
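
To make the idea of byte-bounded stream capture concrete, here is a minimal sketch. It is deliberately named `BoundedByteBuffer` to avoid implying it is the PR's `Utf8BoundedBuffer`, whose API is not shown in this conversation; the drop-the-tail policy and the `truncated` flag are assumptions.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// Hypothetical capture buffer that keeps at most [maxBytes] bytes.
class BoundedByteBuffer {
  BoundedByteBuffer(this.maxBytes);

  final int maxBytes;
  final BytesBuilder _bytes = BytesBuilder();
  bool _truncated = false;

  /// True once any input has been dropped.
  bool get truncated => _truncated;
  int get length => _bytes.length;

  /// Appends as much of [chunk] as still fits and drops the rest.
  void add(List<int> chunk) {
    final remaining = maxBytes - _bytes.length;
    if (chunk.length <= remaining) {
      _bytes.add(chunk);
    } else {
      if (remaining > 0) _bytes.add(chunk.sublist(0, remaining));
      _truncated = true;
    }
  }

  /// Decodes leniently, since truncation may cut a multi-byte character.
  @override
  String toString() =>
      const Utf8Decoder(allowMalformed: true).convert(_bytes.toBytes());
}
```

A listener on `process.stdout` could call `add` for each chunk while still forwarding the chunk to the console, so live output is unaffected and only the retained copy is capped.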

Switch test-results parsing to async line-by-line streaming to avoid loading large NDJSON files into memory. Add injectable exit handling and timeout controls in TestCommand so failure paths are testable, and expand tests/docs to cover the async parser contract.
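
A line-by-line streaming parse of this kind might look like the sketch below. The function name `tallyResults` and the shape of its return value are hypothetical; the event fields used (`type`, `result`, `skipped`, `hidden` on `testDone` events) are those of the `dart test` JSON reporter. The sketch only counts outcomes and does not reproduce the PR's failure details, durations, or warning cap.

```dart
import 'dart:async';
import 'dart:convert';

/// Tallies `testDone` events from the `dart test` JSON reporter, reading
/// one NDJSON line at a time so the whole file is never held in memory.
/// Hidden entries (such as synthetic "loading" tests) are not counted.
Future<Map<String, int>> tallyResults(Stream<List<int>> bytes) async {
  final counts = {'passed': 0, 'failed': 0, 'skipped': 0, 'malformed': 0};
  final lines = bytes
      .transform(const Utf8Decoder(allowMalformed: true))
      .transform(const LineSplitter());

  await for (final line in lines) {
    if (line.trim().isEmpty) continue;
    Object? event;
    try {
      event = jsonDecode(line);
    } on FormatException {
      // Tolerate stray non-JSON lines instead of aborting the parse.
      counts['malformed'] = counts['malformed']! + 1;
      continue;
    }
    if (event is! Map || event['type'] != 'testDone') continue;
    if (event['hidden'] == true) continue;

    final String key;
    if (event['skipped'] == true) {
      key = 'skipped';
    } else if (event['result'] == 'success') {
      key = 'passed';
    } else {
      key = 'failed'; // result is "failure" or "error"
    }
    counts[key] = counts[key]! + 1;
  }
  return counts;
}
```

With `dart:io` available, the input would usually be `File(path).openRead()`, which already has the `Stream<List<int>>` type this function expects.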

Expand workflow generator coverage for new validation and render behaviors, including retention-days configuration, GH_PAT env indirection, single-platform IDs, and user-section preservation edge cases.

Merge latest main into feat/enhanced-managed-test, resolve API reference conflict, and regenerate workflows/template tracking for runtime_ci_tooling v0.14.1 consistency.

tsavo-at-pieces and others added 2 commits February 24, 2026 17:12

Copilot AI review requested due to automatic review settings February 24, 2026 22:47

Copilot started reviewing on behalf of tsavo-at-pieces February 24, 2026 22:47

Copilot AI reviewed Feb 24, 2026

cursor bot reviewed Feb 24, 2026

tsavo-at-pieces and others added 2 commits February 24, 2026 19:50

bot(format): apply dart format --line-length 120 [skip ci]

9292802

sentry bot reviewed Feb 25, 2026

sentry bot reviewed Feb 25, 2026

tsavo-at-pieces and others added 5 commits February 24, 2026 20:44

fix: tighten ci config validation rules

e4ad440

This was referenced Feb 25, 2026

Stream NDJSON parsing to reduce memory pressure for large test suites #30

Open

Add rendered YAML golden file test to catch template/output drift #31

Open

Expand TestCommand test coverage: timeout, failure, and sub-package paths #32

Open

tsavo-at-pieces mentioned this pull request Feb 25, 2026

Systematically harden CLI process trust boundaries #33

Open

6 tasks

tsavo-at-pieces and others added 5 commits February 24, 2026 22:17

test: broaden workflow generator validation assertions

b711648

merge: bring branch up to date with main

877a146

bot(format): apply dart format --line-length 120 [skip ci]

dcae6c2

tsavo-at-pieces wants to merge 17 commits into main from feat/enhanced-managed-test
