Parent: #204 | Phase 4: Hardening
Revised: Focus on container resource limits and multi-agent concurrency within a single container, rather than cloud-agent-next session throughput.
Goal
Validate the system under load and identify container resource limits.
Test Scenarios
- Simulate 30 concurrent polecats across 5 rigs in a single Town Container
- Measure DO→container latency under load (tool call round-trip time)
- Measure container resource usage (CPU, memory, disk) with N concurrent Kilo CLI processes
- Identify when to scale to larger instance types or shard to multiple containers
- Identify DO SQLite size limits and implement archival (closed beads → purge from DO)
- Test container crash/restart/restore cycles (ephemeral disk recovery)
- Measure review queue throughput under concurrent PR submissions
Metrics to Capture
- Tool call round-trip latency (p50, p95, p99)
- DO alarm scheduling accuracy under load
- SQLite query performance vs row count
- Container CPU/memory per Kilo CLI process
- Git clone/worktree creation time
- Container cold start time (fresh start vs restore from R2 snapshot)
- Memory/CPU usage per DO
Acceptance Criteria
Parent: #204 | Phase 4: Hardening
Goal
Validate the system under load and identify container resource limits.
Test Scenarios
Metrics to Capture
Acceptance Criteria