Add end-to-end install integration test harness#56

Open
Copilot wants to merge 2 commits into main from copilot/add-e2e-install-integration-test
Contributor

Copilot AI commented Apr 23, 2026

Manual-dispatch test that drives install.md end-to-end against a long-lived target repo (mrjf/autoloop-test) using the Copilot CLI as the agent, then exercises one iteration each across the program-source × strategy matrix and tears down test debris on exit. Catches regressions in install.md, gh aw compile idempotency, scheduler discovery, strategy-section bleed, and first-iteration completion — none of which any other test covers.

Driver (tests/install-integration/run.sh)

  • Pre-flight (gh, copilot, python3, git, gh auth), captures origin/main SHA, pre-test reset (idempotent), clones target to a temp dir.
  • Feeds prompt.md to copilot --allow-all-tools; greps INSTALL_PR=<url> from stdout.
  • Phase 1 against the install-branch checkout → merges PR → Phase 2.
  • Phase 2: writes two file-based programs to main and creates one autoloop-program-labelled issue, then sequentially gh workflow run autoloop.lock.yml -f program=…, polls, and calls verify-phase2.sh per program.
  • Teardown via trap EXIT; honors --keep / KEEP_STATE_ON_FAILURE=1.
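
The keep-state decision made by the exit handler can be sketched as below. This is a Python illustration, not the shell in run.sh. The name `should_teardown` is hypothetical, and the semantics are an assumption: `--keep` is taken to preserve state unconditionally, while `KEEP_STATE_ON_FAILURE=1` is taken to preserve it only after a failed run.

```python
import os


def should_teardown(exit_code, keep_flag=False, env=None):
    """Decide whether the exit handler should clean up the target repo.

    Assumed semantics (not confirmed by run.sh):
      * --keep always preserves state;
      * KEEP_STATE_ON_FAILURE=1 preserves state only after a failed run.
    """
    env = os.environ if env is None else env
    if keep_flag:
        return False
    failed = exit_code != 0
    if failed and env.get("KEEP_STATE_ON_FAILURE") == "1":
        return False
    return True
```

Keeping the decision in one pure function makes it easy to check each flag combination without touching the target repo.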

Prompt (prompt.md)

External so it can be edited without touching the driver. Tells the agent to follow install.md exactly through Step 5 and emit INSTALL_PR=<url>; explicitly stops before Step 6 so program creation stays deterministic.
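
A minimal sketch of pulling the `INSTALL_PR=<url>` marker out of the agent's stdout, written in Python for illustration (the driver itself uses grep). The function name `extract_install_pr` is hypothetical, and two details are assumptions: that the marker sits alone on its own line, and that the last occurrence wins if the agent prints it more than once.

```python
import re

# Assumed marker shape: a line of the form INSTALL_PR=<url>.
_INSTALL_PR = re.compile(r"^\s*INSTALL_PR=(https?://\S+)\s*$", re.MULTILINE)


def extract_install_pr(stdout):
    """Return the last INSTALL_PR URL found in the agent output, or None."""
    matches = _INSTALL_PR.findall(stdout)
    return matches[-1] if matches else None
```

Returning `None` instead of raising lets the caller fail the run with its own message when the agent never emits the marker.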

Phase 1 (verify-phase1.sh)

Phase 2 (verify-phase2.sh <repo> <program> <run-id> <strategy>)

Per-program assertions matching the comment's seven points:

  1. Run conclusion == success.
  2. [Autoloop: <name>] issue exists.
  3. <!-- AUTOLOOP:STATUS --> comment present.
  4. State file <name>.md on memory/autoloop with a Machine State table.
  5. autoloop/<name> branch exists OR latest comment carries a rejection marker.
  6. Strategy-specific subsection — and for plain, a negative assertion that neither ## 🧬 Population nor ## ✅ Test Harness appears (catches strategy-discovery bleed).

| # | Source | Strategy | Asserts |
|---|--------|----------|---------|
| 1 | file-based | OpenEvolve | ## 🧬 Population |
| 2 | issue-based | Test-Driven | ## ✅ Test Harness |
| 3 | file-based | plain | ## 📊 Iteration History + negative |
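
The strategy-specific check, including the negative assertion for plain, might look roughly like the Python sketch below. It is an illustration, not the contents of verify-phase2.sh. `check_strategy_sections` and `has_heading` are hypothetical names, and the sketch assumes the headings appear at the start of a line in whatever text the harness inspects (the description does not say whether that is the state file or the issue body).

```python
# Expected heading per strategy, taken from the matrix above.
EXPECTED = {
    "OpenEvolve": "## 🧬 Population",
    "Test-Driven": "## ✅ Test Harness",
    "plain": "## 📊 Iteration History",
}
# Headings that must NOT appear for the plain strategy.
FORBIDDEN_FOR_PLAIN = ("## 🧬 Population", "## ✅ Test Harness")


def has_heading(text, heading):
    """True if any line starts with the given heading (assumed match rule)."""
    return any(line.strip().startswith(heading) for line in text.splitlines())


def check_strategy_sections(text, strategy):
    """Return a list of failure messages; an empty list means the check passed."""
    failures = []
    if not has_heading(text, EXPECTED[strategy]):
        failures.append(f"missing {EXPECTED[strategy]!r}")
    if strategy == "plain":
        for heading in FORBIDDEN_FOR_PLAIN:
            if has_heading(text, heading):
                failures.append(f"unexpected {heading!r} (strategy bleed)")
    return failures
```

Collecting failures in a list, instead of stopping at the first one, reports a missing heading and a bleed in the same run.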

Teardown (teardown.sh <repo> <base-sha>)

Idempotent. Force-resets main to the captured base SHA only if drifted, closes autoloop-program-labelled and [Autoloop:-titled issues, closes test PRs (autoloop/*, install-autoloop) with --delete-branch, then sweeps remaining autoloop/* / install-autoloop / memory/autoloop refs via gh api -X DELETE.
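
The two idempotency decisions in teardown (reset only on drift, sweep only matching refs) can be sketched in Python as follows. This is an illustration, not teardown.sh. `needs_reset` and `refs_to_sweep` are hypothetical names, and the sketch assumes `autoloop/*` is a glob while `install-autoloop` and `memory/autoloop` are exact branch names.

```python
from fnmatch import fnmatchcase

# Ref patterns named in the teardown description.
SWEEP_PATTERNS = ("autoloop/*", "install-autoloop", "memory/autoloop")


def needs_reset(current_sha, base_sha):
    """main is force-reset only when it has drifted from the captured base SHA."""
    return current_sha != base_sha


def refs_to_sweep(branches):
    """Return the branch names that teardown should delete, sorted."""
    return sorted(
        b for b in branches
        if any(fnmatchcase(b, pattern) for pattern in SWEEP_PATTERNS)
    )
```

Because both functions return an empty or negative result on an already-clean repo, running them twice in a row leads to no further action, which is the property the teardown relies on.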

Actions wrapper (.github/workflows/install-integration-test.yml)

workflow_dispatch only — not on push, PR, or schedule. Inputs: keep_state_on_failure, install_test_repo. Installs gh-aw and @github/copilot, runs run.sh with INSTALL_TEST_TOKEN (PAT with repo scope on the target).

Notes for reviewers

  • Shell utilities are invoked with portable flags (shasum -a 256, base64 --decode) so local mode works on macOS as well as Linux runners.
  • Programs run sequentially to avoid branch-name collisions and races on the memory/autoloop branch.
  • Phase-2 program 1's program.md references strategy/openevolve.md and program 2 references strategy/test-driven.md; the test treats those strategy files as the iteration's responsibility (issue body's expected behavior), which lets the harness ship before the strategy templates do.

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • https://api.github.com/repos/bogus.invalid/bogus/issues
    • Triggering command: /usr/bin/python /usr/bin/python /home/REDACTED/work/autoloop/autoloop/workflows/scripts/autoloop_scheduler.py (http block)
  • https://api.github.com/repos/bogus.invalid/bogus/pulls
    • Triggering command: /usr/bin/python /usr/bin/python /home/REDACTED/work/autoloop/autoloop/workflows/scripts/autoloop_scheduler.py (http block)


Copilot AI changed the title from "[WIP] Add end-to-end install integration test for Copilot CLI" to "Add end-to-end install integration test harness" Apr 23, 2026
Copilot finished work on behalf of mrjf April 23, 2026 21:25
Copilot AI requested a review from mrjf April 23, 2026 21:25
mrjf marked this pull request as ready for review April 23, 2026 21:46

Development

Successfully merging this pull request may close these issues.

End-to-end install integration test: local + Actions modes, Copilot CLI as the agent, targets a scratch repo
