54 changes: 54 additions & 0 deletions .github/workflows/install-integration-test.yml
@@ -0,0 +1,54 @@
name: Install Integration Test

# End-to-end test of install.md against a long-lived target repo.
# Manual-dispatch only -- this exercises real LLM calls and force-pushes a
# remote branch, so it must not run on PRs or schedules.

on:
  workflow_dispatch:
    inputs:
      keep_state_on_failure:
        description: "Leave test repo in failure state for inspection"
        type: boolean
        default: false
      install_test_repo:
        description: "Target repo for the install (owner/repo)"
        type: string
        default: "mrjf/autoloop-test"

jobs:
  install-integration:
    runs-on: ubuntu-latest
    timeout-minutes: 30
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install gh aw extension
        env:
          GH_TOKEN: ${{ secrets.INSTALL_TEST_TOKEN }}
        run: gh extension install github/gh-aw

      - name: Install Copilot CLI
        env:
          GH_TOKEN: ${{ secrets.INSTALL_TEST_TOKEN }}
        # The Copilot CLI is distributed as an npm package. If the install
        # path changes upstream, update this single step.
        run: npm install -g @github/copilot

      - name: Verify gh auth
        env:
          GH_TOKEN: ${{ secrets.INSTALL_TEST_TOKEN }}
        run: gh auth status

      - name: Run integration test
        env:
          GH_TOKEN: ${{ secrets.INSTALL_TEST_TOKEN }}
          INSTALL_TEST_REPO: ${{ inputs.install_test_repo }}
          KEEP_STATE_ON_FAILURE: ${{ inputs.keep_state_on_failure && '1' || '0' }}
        run: ./tests/install-integration/run.sh
67 changes: 67 additions & 0 deletions tests/install-integration/README.md
@@ -0,0 +1,67 @@
# install-integration

End-to-end integration test for [`install.md`](../../install.md). Runs the
install flow as a real coding agent (Copilot CLI) against a long-lived
target repo (`mrjf/autoloop-test` by default), then exercises Phase 2 by
running one iteration each of three programs across the program-source ×
strategy matrix.

This test is **manual-dispatch only**. It is not part of CI.

## Local mode

```bash
# from the autoloop repo root:
./tests/install-integration/run.sh
```

Requirements:

- `gh` CLI authenticated as a user with write access to the target repo.
- `copilot` CLI on PATH.
- `python3` and `git` on PATH.

Optional env / flags:

- `INSTALL_TEST_REPO=<owner>/<repo>` -- override the target (default
  `mrjf/autoloop-test`).
- `--keep` (or `KEEP_STATE_ON_FAILURE=1`) -- skip teardown on failure so
  the failure state can be inspected. Run `teardown.sh <repo> <base-sha>`
  manually afterwards.
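
The defaults and the `--keep` flag above could be resolved along these lines. This is a minimal sketch, not the logic in `run.sh`: the function name `resolve_settings` and its output format are hypothetical, and only the variable names, the flag, and the default repo come from this README.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: resolve the target repo and the keep-state setting.
# Env vars supply the defaults; a --keep argument forces keep=1.
resolve_settings() {
  local repo="${INSTALL_TEST_REPO:-mrjf/autoloop-test}"
  local keep="${KEEP_STATE_ON_FAILURE:-0}"
  local arg
  for arg in "$@"; do
    case "$arg" in
      --keep) keep=1 ;;
    esac
  done
  printf 'repo=%s keep=%s\n' "$repo" "$keep"
}
```

With neither variable set, `resolve_settings --keep` prints `repo=mrjf/autoloop-test keep=1`.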

## Actions mode

Trigger the **Install Integration Test** workflow from the Actions tab. It
runs the same script on a GitHub-hosted runner. Requires the
`INSTALL_TEST_TOKEN` repo secret -- a PAT with `repo` scope and write
access to the target repo (the default `GITHUB_TOKEN` is scoped to the
repository running the workflow and cannot reach other repos).
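
The workflow can also be dispatched from a terminal with `gh workflow run`, passing the two inputs with `-f`. The wrapper below is a sketch: the function name and the `DRY_RUN` switch are hypothetical conveniences, and it assumes you run it from a clone of the repo that hosts the workflow.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around `gh workflow run` for the workflow file
# install-integration-test.yml. DRY_RUN=1 prints the command instead of
# running it (an illustration aid, not a gh feature).
dispatch_install_test() {
  local repo="$1" keep="${2:-false}"
  local cmd=(gh workflow run install-integration-test.yml
             -f "install_test_repo=${repo}"
             -f "keep_state_on_failure=${keep}")
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}
```

For example, `DRY_RUN=1 dispatch_install_test mrjf/autoloop-test true` shows the command that would be sent without triggering a run.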

## What it tests

See [the issue that introduced this harness](https://github.com/githubnext/autoloop/issues)
for the full motivation. In short:

- **Phase 1** (file presence + lock idempotency) -- catches regressions in
  `install.md` and in `gh aw compile`.
- **Phase 2** (3 programs × 1 iteration each) -- catches regressions in
  the scheduler, in strategy discovery, and in the iteration loop. The
  three programs cover:

  | # | Source      | Strategy        |
  |---|-------------|-----------------|
  | 1 | file-based  | OpenEvolve      |
  | 2 | issue-based | Test-Driven     |
  | 3 | file-based  | plain (default) |

- **Phase 3** (teardown) -- resets the target repo to the captured base
  SHA, closes test issues/PRs, and deletes test branches.
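
One plausible reading of "lock idempotency" is that re-running the compile step leaves the generated lock file byte-identical. The helper below sketches that check under that assumption. `assert_idempotent` is a hypothetical name, and the actual assertions in `verify-phase1.sh` may differ.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command twice and confirm that the file it
# generates has the same checksum after the second run as after the first.
# Usage: assert_idempotent <file> <command> [args...]
assert_idempotent() {
  local file="$1"; shift
  local first second
  "$@" || return 1
  first="$(cksum < "$file")" || return 1
  "$@" || return 1
  second="$(cksum < "$file")" || return 1
  [ "$first" = "$second" ]
}
```

A call might look like `assert_idempotent <lock-file> gh aw compile`, where `<lock-file>` stands for whichever generated file is being checked.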

## Files

| File               | Purpose                                                       |
|--------------------|---------------------------------------------------------------|
| `run.sh`           | Driver. Orchestrates phases 1-3.                              |
| `prompt.md`        | Prompt fed to Copilot CLI (edit without touching the driver). |
| `verify-phase1.sh` | File-presence + lock-idempotency assertions.                  |
| `verify-phase2.sh` | Per-program assertions (one call per program).                |
| `teardown.sh`      | Idempotent cleanup. Safe to re-run.                           |
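
`prompt.md` asks the agent to finish by printing a single `INSTALL_PR=<url>` line. A driver could pull the URL out of the captured agent output roughly as follows. This is a sketch only: `extract_install_pr` is a hypothetical name and is not taken from `run.sh`.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read agent output on stdin and print the URL from the
# last line that starts with INSTALL_PR=. Returns non-zero if none is found.
extract_install_pr() {
  local line
  line="$(grep -E '^INSTALL_PR=' | tail -n 1 || true)"
  [ -n "$line" ] || return 1
  printf '%s\n' "${line#INSTALL_PR=}"
}
```

Taking the last match rather than the first tolerates an agent that echoes the instruction text before printing the real result.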
17 changes: 17 additions & 0 deletions tests/install-integration/prompt.md
@@ -0,0 +1,17 @@
You are installing autoloop into a freshly-reset GitHub repository.

Your working directory is the root of that repository, cloned locally. The
repository is empty except for the base fixtures in `src/` and `tests/`.

Follow the install instructions at the URL below, EXACTLY AS WRITTEN. Execute
each step using shell commands. Do not skip steps. Do not improvise. Do not
optimize or "improve" the instructions.

Stop after Step 5 (the install PR is opened). Do NOT proceed to Step 6
("Create Your First Program") -- the test harness handles program creation
itself in a deterministic way.

When you finish, print a single line `INSTALL_PR=<url>` with the URL of the
PR you opened in Step 5. Then stop.

Install instructions: https://github.com/githubnext/autoloop/blob/main/install.md