Autoloop pre-step can't read state files — build-tsb starved since 2026-04-12

## Symptom

Issue #1 (the `build-tsb-pandas-typescript-migration` autoloop program) has received no new comments since 2026-04-12 — over 8 days. The state file on `memory/autoloop` says `Last Run: 2026-04-12T11:15:07Z`, `Iteration Count: 230`, `Paused: false`, `Completed: false`. The program is healthy by every recorded criterion, but it is never selected.

Meanwhile, `perf-comparison` is selected on every scheduled run and executes every 30 minutes.

## Root cause

The autoloop pre-step ("Check which programs are due") reads state files from `/tmp/gh-aw/repo-memory/autoloop/`, but the memory-clone step runs **after** the pre-step and clones to `/tmp/gh-aw/repo-memory/default/`. Wrong directory *and* wrong order — the pre-step never sees state for any program.

Confirmed in the agent-job log for run 24642957273:

```
Found issue-based program: 'build-tsb-pandas-typescript-migration' (issue #1)
perf-comparison: no state file found (first run)
perf-comparison: no state found (first run)
build-tsb-pandas-typescript-migration: no state file found (first run)
build-tsb-pandas-typescript-migration: no state found (first run)
=== Autoloop Program Check ===
Selected program:      perf-comparison (.autoloop/programs/perf-comparison/program.md)
Deferred (next run):   ['build-tsb-pandas-typescript-migration']
Programs skipped:      (none)
```

Relevant code in `.github/workflows/autoloop.md`:

- L107: `repo_memory_dir = "/tmp/gh-aw/repo-memory/autoloop"` — where the Python pre-step looks for state files.
- The memory-clone step (downstream of the pre-step) sets `MEMORY_DIR=/tmp/gh-aw/repo-memory/default` and clones `memory/autoloop` there after the Python script has already run.
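A minimal sketch of why the wrong directory is silent rather than fatal, assuming the pre-step looks up one `<program>.md` file per program and parses `Key: value` lines (as in the `Last Run: ...` fields quoted above). The function name `read_state` and the parsing details are hypothetical, not copied from the workflow:

```python
from pathlib import Path
from typing import Optional


def read_state(repo_memory_dir: str, program: str) -> Optional[dict]:
    """Return parsed state for `program`, or None when no state file exists.

    Hypothetical helper: the real pre-step may parse state differently.
    """
    state_file = Path(repo_memory_dir) / f"{program}.md"
    if not state_file.is_file():
        # A missing directory and a missing file look identical here, so an
        # uncloned memory branch is indistinguishable from a first run.
        return None
    state = {}
    for line in state_file.read_text(encoding="utf-8").splitlines():
        # Split on the first colon only, so timestamps keep their colons.
        key, sep, value = line.partition(":")
        if sep:
            state[key.strip()] = value.strip()
    return state
```

Under these assumptions, calling `read_state("/tmp/gh-aw/repo-memory/autoloop", "build-tsb-pandas-typescript-migration")` before anything has been cloned returns `None`, which matches the "no state file found (first run)" lines in the log.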

## Why build-tsb gets starved

Without state, every program looks like "first run" (no `last_run`, nothing to order by). The selection tiebreaker picks programs in `program_files` order, which appends file-based programs first (`perf-comparison`) and issue-based programs last (`build-tsb-pandas-typescript-migration`). So every run:

1. Both programs discovered.
2. Both "no state found → treat as first run".
3. `perf-comparison` wins the tiebreaker.
4. `build-tsb-pandas-typescript-migration` is deferred to "next run".
5. Next run: go to step 1.

Nothing breaks the cycle. `build-tsb` has been in "deferred" purgatory for 8 days.
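The cycle can be reproduced with a toy model of the selector. This is a sketch under stated assumptions, not the workflow's actual code: `select_program` is a hypothetical stand-in that prefers programs without `last_run`, then the oldest `last_run`, and breaks ties by position in `program_files`.

```python
def select_program(program_files, state_by_name):
    """Pick one program: no state first, then oldest last_run, then list order."""

    def key(indexed):
        index, name = indexed
        last_run = (state_by_name.get(name) or {}).get("last_run")
        # An empty string sorts before any ISO-8601 timestamp, so programs
        # without state win; the list index breaks the remaining ties.
        return (last_run or "", index)

    return min(enumerate(program_files), key=key)[1]


# File-based programs are appended first, issue-based programs last.
program_files = ["perf-comparison", "build-tsb-pandas-typescript-migration"]

# Bug: the pre-step sees no state on any run, so every run is a tie.
starved_runs = [select_program(program_files, {}) for _ in range(5)]

# With state visible, the older last_run wins. The perf-comparison
# timestamp is an illustrative value, not taken from the real state file.
visible_state = {
    "perf-comparison": {"last_run": "2026-04-20T10:00:00Z"},
    "build-tsb-pandas-typescript-migration": {"last_run": "2026-04-12T11:15:07Z"},
}
fixed_choice = select_program(program_files, visible_state)
```

In this model `starved_runs` contains only `perf-comparison`, while `fixed_choice` is `build-tsb-pandas-typescript-migration` as soon as state is readable.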

## Fix

Clone the `memory/autoloop` branch into `/tmp/gh-aw/repo-memory/autoloop/` **before** the "Check which programs are due" step runs. Options:

**Option A (minimal):** Add a shell step at the top of the `steps:` list that does the clone:

```yaml
steps:
  - name: Clone repo-memory for scheduler
    env:
      GITHUB_TOKEN: ${{ github.token }}
      GITHUB_REPOSITORY: ${{ github.repository }}
    run: |
      mkdir -p /tmp/gh-aw/repo-memory
      git clone --depth=1 --branch memory/autoloop \
        "https://x-access-token:${GITHUB_TOKEN}@github.com/${GITHUB_REPOSITORY}.git" \
        /tmp/gh-aw/repo-memory/autoloop \
        || mkdir -p /tmp/gh-aw/repo-memory/autoloop  # branch may not exist on first run

  - name: Check which programs are due
    # (existing step)
```

**Option B:** Do the `memory/autoloop` fetch inline within the existing Python script using the GitHub Contents API (no separate shell step; single source of truth for where state lives). Heavier rewrite but fewer moving parts.
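For comparison, Option B might look roughly like the sketch below. It assumes each state file sits at the root of the `memory/autoloop` branch as `<program>.md` and that a token with read access is available; `fetch_state_text` and its `opener` parameter are hypothetical names introduced here so the function can be exercised without network access.

```python
import base64
import json
import urllib.error
import urllib.request
from urllib.parse import quote


def fetch_state_text(repository, program, token,
                     branch="memory/autoloop",
                     opener=urllib.request.urlopen):
    """Return the state file's text via the Contents API, or None on 404."""
    url = (
        f"https://api.github.com/repos/{repository}/contents/"
        f"{quote(program + '.md')}?ref={quote(branch, safe='')}"
    )
    request = urllib.request.Request(url, headers={
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    })
    try:
        with opener(request) as response:
            payload = json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 404:
            # Branch or file missing: a genuine first run.
            return None
        raise
    # The Contents API returns file bodies base64-encoded with line breaks.
    return base64.b64decode(payload["content"]).decode("utf-8")
```

Unlike the clone in Option A, this makes one request per program, and any failure other than a 404 is raised instead of being mistaken for a first run.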

**Option C:** Reorder so gh-aw's built-in repo-memory clone runs before the pre-step, and change the pre-step to read from wherever the built-in clone lands. Requires coordination with gh-aw plumbing; brittle.

Prefer **Option A** — it's additive, doesn't touch gh-aw internals, and it's obvious from reading the workflow why the clone is there.

## Secondary fix — deterministic tie-breaking

Even once state is read correctly, programs that have genuinely never run still tie with each other, and the tiebreaker should not allow permanent starvation. Prefer:

- Among programs with no `last_run`, pick the one whose `schedule` is shortest (so "every 30m" beats "every 6h"), then fall back to alphabetical by name.
- This way `build-tsb` (every 30m) would beat `perf-comparison` (every 6h) on the first run after the fix, then state catches up and ordinary `last_run` ordering takes over.
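The rule above can be expressed as a sort key. This is a minimal sketch, assuming each program is a dict with `name`, a `schedule` given as a `timedelta`, and an optional ISO-8601 UTC `last_run`. These field names, and the choice to rank never-run programs ahead of ones with state, are assumptions rather than the pre-step's existing data model.

```python
from datetime import timedelta


def selection_key(program):
    """Sort key: never-run programs first (shortest schedule, then name),
    followed by programs ordered by oldest last_run."""
    last_run = program.get("last_run")
    if last_run is None:
        return (0, program["schedule"].total_seconds(), program["name"])
    # Same-format UTC timestamps compare correctly as plain strings.
    return (1, last_run, program["name"])


programs = [
    {"name": "perf-comparison", "schedule": timedelta(hours=6), "last_run": None},
    {"name": "build-tsb-pandas-typescript-migration",
     "schedule": timedelta(minutes=30), "last_run": None},
]

first_pick = min(programs, key=selection_key)["name"]
```

With both programs lacking `last_run`, `first_pick` is `build-tsb-pandas-typescript-migration` regardless of list order, because the 30-minute schedule sorts ahead of the 6-hour one.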

## Acceptance

- After merge, the next autoloop run logs `build-tsb-pandas-typescript-migration: last_run=2026-04-12T11:15:07Z, iteration_count=230` (state successfully read).
- Issue #1 receives a new Autoloop comment within one scheduled window.
- `perf-comparison` and `build-tsb` alternate naturally based on `last_run`, with neither starved.

## Context

- Failing symptom: https://github.com/githubnext/tsessebe/issues/1 (no comments since 2026-04-12T04:08Z).
- State file evidence: `build-tsb-pandas-typescript-migration.md` on `memory/autoloop` branch, `Last Run: 2026-04-12T11:15:07Z`.
- Example run showing the bug: https://github.com/githubnext/tsessebe/actions/runs/24642957273 (agent job logs).
