[Gastown] Polecat stuck in zombie hook state after clone failure causes cascading sling failures #411

## Parent Issue

#204 (Gastown Cloud — Phase 2)

## Bug Description

When a polecat agent fails to start in the container (e.g., `git clone` fails because the repo's default branch doesn't match), the polecat is left in a **zombie hook state**: `status=idle` but `current_hook_bead_id` is still set. Subsequent `slingBead` calls pick this agent (it looks idle) but fail with a hook conflict.

### Root Cause Chain

1. **Rig created with wrong default branch**: User creates a rig with `defaultBranch: 'main'` but the repo actually uses `master`. This can happen because the `CreateRigDialog` defaults to `'main'` and the user doesn't change it.

2. **Clone failure in container**: `startAgentInContainer` sends the start request to the container, which calls `git clone --no-checkout --branch main <url>` → `fatal: Remote branch main not found in upstream origin`.

3. **Agent left in zombie state**: The polecat was already created and hooked to the bead in `slingBead()` *before* the container start was attempted. When the container rejects the start, `schedulePendingWork` logs the failure and retries on the next alarm, but the agent remains hooked to the bead with `status=idle`.

4. **Cascading failures**: The next `slingBead` call finds this zombie polecat (idle), tries to hook a new bead, and gets `Agent is already hooked to bead <old-bead-id>`.

### Logs

```
[Rig.do] startAgentInContainer: error response: {"error":"git clone --no-checkout --branch main https://github.com/jrf0110/8track.git ... failed: fatal: Remote branch main not found"}
[Rig.do] schedulePendingWork: FAILED to start agent in container (attempt 1/5)
...
[Rig.do] getOrCreateAgent: found existing agent id=... name=Maple role=polecat status=idle current_hook=d97de1ce-...
[Rig.do] getOrCreateAgent: returning existing agent (idle=true, singleton=false)
[Rig.do] hookBead: CONFLICT - agent ... already hooked to d97de1ce-...
```

### Fix Required

Two issues need to be addressed, plus one preventive improvement:

**1. `getOrCreateAgent` should skip agents with existing hooks when looking for idle polecats**

The query currently sorts by `status=idle` first, but doesn't filter out agents that still have `current_hook_bead_id` set. An agent with `status=idle` AND a non-null hook is in an inconsistent state — it should either be cleaned up or skipped.

```sql
-- Current (broken)
WHERE role = ? ORDER BY CASE WHEN status = 'idle' THEN 0 ELSE 1 END

-- Fixed
WHERE role = ? AND (status != 'idle' OR current_hook_bead_id IS NULL)
ORDER BY CASE WHEN status = 'idle' THEN 0 ELSE 1 END
```

Alternatively, `getOrCreateAgent` could unhook the zombie agent before returning it.
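As a rough illustration of the filtering rule, here is an in-memory TypeScript sketch of the same predicate. The `AgentRow` shape and the `isZombie` and `selectCandidate` helpers are hypothetical and exist only for this example. Only the column names come from this issue, and the real lookup runs as SQL inside `getOrCreateAgent`.

```typescript
// Hypothetical row shape; field names mirror the columns named in this issue.
interface AgentRow {
  id: string;
  role: string;
  status: string;
  current_hook_bead_id: string | null;
}

// A zombie reports idle but still holds a hook.
function isZombie(agent: AgentRow): boolean {
  return agent.status === "idle" && agent.current_hook_bead_id !== null;
}

// Mirrors the fixed WHERE / ORDER BY: drop zombies, then prefer idle agents.
function selectCandidate(agents: AgentRow[], role: string): AgentRow | undefined {
  return agents
    .filter((a) => a.role === role && !isZombie(a))
    .sort((a, b) => Number(a.status !== "idle") - Number(b.status !== "idle"))[0];
}
```

With this predicate a zombie is never returned, so the caller would fall through to whatever it already does when no reusable agent exists.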

**2. Failed container starts should unhook the agent**

In `schedulePendingWork`, when `startAgentInContainer` fails, the error is logged but the agent is not unhooked. The bead stays `in_progress` with the agent still hooked, creating the zombie state. On failure, the agent should be unhooked and the bead reverted to `open`.
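A minimal sketch of the cleanup step, assuming simplified in-memory `Agent` and `Bead` shapes. `releaseAfterFailedStart` and `startOrRelease` are hypothetical names, and the `start` callback stands in for `startAgentInContainer`. The real code would persist these changes in the Rig's storage instead of mutating objects.

```typescript
type BeadStatus = "open" | "in_progress";

// Simplified, hypothetical shapes for illustration only.
interface Agent {
  id: string;
  status: string;
  current_hook_bead_id: string | null;
}

interface Bead {
  id: string;
  status: BeadStatus;
}

// Undo the hook made in slingBead(): release the agent and reopen the bead.
function releaseAfterFailedStart(agent: Agent, bead: Bead): void {
  if (agent.current_hook_bead_id !== bead.id) return; // not hooked to this bead
  agent.current_hook_bead_id = null;
  agent.status = "idle";
  bead.status = "open";
}

// `start` stands in for startAgentInContainer; a rejection or `false`
// result is treated as a failed start.
async function startOrRelease(
  agent: Agent,
  bead: Bead,
  start: () => Promise<boolean>,
): Promise<boolean> {
  let started = false;
  try {
    started = await start();
  } catch {
    started = false;
  }
  if (!started) releaseAfterFailedStart(agent, bead);
  return started;
}
```

Whether the release should happen on every failed attempt or only once the retries are exhausted is left open here. The sketch releases on each failure, as described above.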

**3. (Preventive) Auto-detect default branch**

The `CreateRigDialog` defaults to `main`, but many repos use `master`. Consider using the GitHub/GitLab API to fetch the actual default branch when a repo is selected from integrations, or running `git ls-remote --symref <url> HEAD` during rig creation to detect it.
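For the `git ls-remote` route, the symref line in the command output has the form `ref: refs/heads/<branch>`, a tab, then `HEAD`. The sketch below only parses that output. `parseDefaultBranch` is a hypothetical helper, and how the command is actually executed during rig creation is not covered here.

```typescript
// Parse the output of `git ls-remote --symref <url> HEAD`.
// Returns the branch name, or null when no symref line is present.
function parseDefaultBranch(lsRemoteOutput: string): string | null {
  for (const line of lsRemoteOutput.split("\n")) {
    const match = /^ref: refs\/heads\/(.+)\tHEAD$/.exec(line.trim());
    if (match) return match[1] ?? null;
  }
  return null;
}

// Example with made-up output (the SHA is a placeholder):
const sampleOutput =
  "ref: refs/heads/master\tHEAD\n" +
  "0123456789abcdef0123456789abcdef01234567\tHEAD\n";
const detectedBranch = parseDefaultBranch(sampleOutput); // "master"
```

When the result is `null`, the dialog could fall back to the value the user entered.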
