Halve e2e CI minutes by sharing the build#662
Conversation
Build emanote once and ship the Nix store closure as a workflow artifact; the e2e-tests job downloads, imports, and runs both `live` and `static` modes serially against the same binary. Replaces the previous matrix that rebuilt emanote twice in parallel. - Halves Actions minutes for the e2e workflow (~10 → ~5 min/run) - A flaky test re-run no longer forces a 4-min rebuild — re-run the e2e-tests job alone.
The 'bin=$OUT/bin/emanote' output (consumed via needs.build.outputs.bin in the e2e-tests job) hard-codes the Nix derivation's bin/-subpath layout. If the derivation ever moves the binary, the mismatch would not surface until cucumber tries to spawn it. Assert the path exists in the build job so the failure is local and immediate.
nix-store --import accepts an incomplete closure without complaint; the failure would surface only when cucumber tries to spawn the binary. Add a post-import executable check so the failure is local to the import step instead of leaking into test diagnostics.
The previous matrix(mode: [live, static]) was a seam — adding a per-mode env var or timeout was a one-row config change. With the matrix collapsed into two near-identical inline steps the seam is gone. Note the tradeoff so a future divergence is restructured (matrix or split jobs) rather than handled with inline conditionals.
Three places referenced needs.build.outputs.bin: the import-step sanity check and both E2E steps. Lifting it to jobs.e2e-tests.env collapses those into a single declaration and lets the two E2E steps' env: blocks differ visibly only in EMANOTE_MODE — sharpening the contract the mode-serialization comment was already trying to make.
Hickey/Lowy Analysis
Hickey rationaleThe build/test split cleanly separates artifact production from test execution — two genuinely independent concepts. The matrix collapse is correct: both modes share one binary, so they belong in one job. Three findings worth fixing in the diff: the build job's Lowy rationaleVolatility map: the build axis (Haskell source, |
|
| Step | Status | Duration | Verification |
|---|---|---|---|
| sync | ✓ | 1s | git fetch ok; forge=github |
| research | ✓ | 39s | Confirmed Nix closure transfer pattern is sound |
| branch | ✓ | 3s | Created branch ci-split-build-test from origin/master |
| implement | ✓ | 22s | Split e2e-tests into build (uploads Nix closure artifact) + e2e-tests (downloads, imports, runs both modes serially) |
| check | ✓ | 1m 2s | cabal build all succeeded |
| docs | — | 10s | Skipped — CI-only change with no user-facing impact |
| fmt | ✓ | 9s | just fmt: cabal-fmt, fourmolu, hlint, nixpkgs-fmt all passed |
| commit | ✓ | 5s | Committed 12df28d9 and pushed |
| hickey+lowy | ✓ | 5m 14s | 3 fix commits applied; 2 findings no-op'd with rationale |
| police | ✓ | 4m 27s | 3 passes clean; elegance applied 1 fix (990b2ea0) |
| test | — | 13s | Skipped — no Haskell/cucumber test exercises CI YAML |
| create-pr | ✓ | 1m 16s | Draft PR #662 created; hickey/lowy comment posted |
| ci | ✓ | 9m 18s | GH Actions green at HEAD 990b2ea0 (build 7m25s, e2e-tests 1m42s) |
| Total | 25m 14s |
Optimization suggestions
The CI numbers landed close to the pre-PR baseline, not halved as the description predicted. Honest accounting:
- Wall-clock regressed from ~5min → ~9min. The old parallel matrix finished in ~5min because both matrix entries built emanote concurrently. The new sequential
build → e2e-testsshape has no parallelism, so the build's ~7min wall-clock dominates. - Actions-minutes improved only marginally (~10 → ~9). The build job grew from ~4min (pre-PR) to ~7min, eating most of the savings from de-duplicating the second build. The closure-transfer overhead is real: ~30s upload + 17s download + 46s import = ~1m30s on top of the build itself.
Three follow-ups worth considering, in priority order:
- Reconsider the build/test split. A simpler "collapse the matrix into one job" version (build once, run both modes serially in the same job, no artifact dance) would land at ~5min wall-clock and ~5 actions-min — strictly better than this PR on both axes. The build/test split's only retained benefit is "re-run flaky tests without rebuilding," which may not be worth the closure-transfer overhead unless flakes happen often.
- Drop the explicit
gziponnix-store --export.actions/upload-artifact@v4deflates internally — the explicitgzipis double-compression that costs CPU on both ends. Likely saves 30–60s. - Add a Nix binary cache (cachix or magic-nix-cache). Would drop the
nix buildstep itself from ~4min to ~30s on cache hit, dwarfing every other optimization here. This was option Start using HTML templating #1 in the original/talkanalysis and remains the highest-leverage change.
Workflow completed at 2026-04-25T23:23.
The e2e workflow fanned out as a 2-entry matrix on
mode: [live, static], which rebuilt emanote twice in parallel — ~8 actions-minutes per run for a 5-second test. The new shape: abuildjob runsnix buildonce and uploads its Nix closure as a workflow artifact;e2e-testsimports the closure and runs both modes serially against the same binary.Wall-clock is roughly unchanged (the bottleneck is still the ~4-minute Haskell rebuild), but actions-minutes drop ~50%, and a flaky test re-run no longer drags a fresh build along with it — re-running just
e2e-testspicks up the already-uploaded closure.A few follow-up commits add a build-side
test -xguard (a relocated binary now fails locally instead of leaking into a 30-minute test run), the same guard post-nix-store --import(since import accepts an incomplete closure without complaint), and liftEMANOTE_BINto the job'senv:so the two E2E steps differ visibly only inEMANOTE_MODE.