Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
27 changes: 24 additions & 3 deletions .claude/skills/resolving-issues/SKILL.md
Original file line number Diff line number Diff line change
Expand Up @@ -181,11 +181,32 @@ Coordinator owns this check (metadata work, allowed under "coordinator never cod
### Waiting for implementer commits without polling
- Implementer agent runs in the background. Coordinator gets a notification when the agent's tool call completes — but the agent's *commit* may land seconds before that, and the harness's stop hook fires on uncommitted changes during cargo gates.
- **Don't sleep, don't poll, don't read the agent's output file.** The system explicitly blocks long sleeps and warns against reading sub-agent transcripts (context overflow).
- **Use `Monitor` with an `until` loop watching for HEAD to advance** past the prior known SHA. One notification fires when the loop exits; the coordinator stays idle in between. Pattern:
- **Capture the full pre-dispatch SHA via `git log -1 --format='%H' <branch>` BEFORE arming the wait.** Use that exact 40-char string in the until-condition. Never type/synthesize the hash from a short prefix (`31c7851` ≠ `31c7851bf8c6...` — the loop will exit immediately on the mismatched comparison and you'll spuriously conclude the agent committed). Pattern: capture into a shell variable, interpolate into the loop.
- **Use `Monitor` (or `Bash` background with an `until` loop) watching for HEAD to advance** past the captured SHA. One notification fires when the loop exits; the coordinator stays idle in between. Pattern:
```bash
cd /home/user/willow && until [ "$(git log -1 --format='%H')" != "<prior-sha>" ]; do sleep 5; done; echo "agent committed: $(git log -1 --oneline)"
PRIOR=$(git log -1 --format='%H' <branch>)
until [ "$(git log -1 --format='%H' <branch>)" != "$PRIOR" ]; do sleep 10; done
git log -1 --oneline <branch>
```
Set `timeout_ms` to ~5–10 minutes for typical implementer runs (longer for cross-crate gates). The coordinator can do GitHub-API metadata work (filing follow-up issues, drafting PR body) while waiting; just don't touch files in the implementer's scope.
Set `timeout` to ~10–15 minutes for typical implementer runs (longer for cross-crate gates). The coordinator can do GitHub-API metadata work (filing follow-up issues, drafting PR body) while waiting; just don't touch files in the implementer's scope.
- **Two independent signals: SHA-advance + agent-completion notification.** They arrive on their own schedules — agent-completion can lag the commit by minutes (or vice versa, agent can complete WITHOUT committing — see "Implementer cuts off mid-flight" below). Trust whichever signal answers your immediate question:
- "Did the implementer's work land on master?" → check git ref / wait for SHA-advance.
- "Did the implementer agent terminate?" → wait for agent-completion notification.
Don't treat them as redundant — and don't preemptively conclude the implementer crashed just because one arrived first.

### Implementer cuts off mid-flight (uncommitted edits, agent terminated)
- A surprisingly common failure mode: the implementer agent gets cut off (token budget, time slice, harness limit) AFTER making the substantive edits but BEFORE running the local merge gate or committing. Symptoms:
- Agent-completion notification arrives with a truncated/incomplete summary (e.g. "Spec doc looks good. Let me check the test result." mid-thought).
- `git status` shows uncommitted changes matching the issue's expected scope.
- Stop hook fires complaining about uncommitted changes.
- **Don't conclude the work is wrong** — inspect the diff. If it matches the issue's intent, the edits are usually correct; the agent just ran out of runway before the gate-and-commit step.
- **Don't restart the issue from scratch.** Dispatch a thin **finalize-implementer** agent: brief it on the existing uncommitted state, hand it the gate list + commit-and-push instructions verbatim, no need to re-do brainstorming or TDD-red. The previous implementer's substance stands; the finalize agent's job is the mechanical close-out (verify gates, commit, push, report SHA).
- **Brief the finalize agent on what's already done** — paste a `git diff --stat` summary so it knows the scope and doesn't redo the work or panic at the size. Cite the previous agent's brainstorm decision (Option A/B etc.) so it carries the same reasoning into the commit body.
- **Stop-hook noise is normal mid-implementer.** A `git status` showing uncommitted changes during a cargo gate run does NOT mean the implementer crashed. Only suspect crash-before-commit if (a) sufficient time has passed for the gates to finish (typically 5–10 min for a small fix, longer for `just check` workspace-wide), AND (b) the agent-completion notification has arrived AND no commit was made. Don't preemptively dispatch a finalize agent based on stop-hook chatter alone.
- **Two false-alarm patterns to avoid:**
1. *Wrong-SHA wait exits immediately, you assume the implementer never started.* (See "Capture the full pre-dispatch SHA" above.) Result: redundant finalize dispatch on a tree the original agent already cleanly committed.
2. *Agent-completion notification lags the commit.* By the time you see the agent's summary, the work has been on origin for minutes. The redundant finalize agent will report "no changes to commit, gates green" — harmless but wasteful (~5 min cargo lock contention).
Both are mitigated by: capture full SHA → arm SHA-watch → wait for either signal → on signal arrival, `git status` first to disambiguate (clean tree = real commit landed; uncommitted edits matching scope = mid-flight, dispatch finalize).

### Implementer-flagged out-of-scope rot
- When the implementer surfaces pre-existing rot it intentionally doesn't fix (e.g. unrelated wasm break under `--all-features`, dead-code warnings in untouched files), the coordinator files a follow-up GH issue using `mcp__github__issue_write` (metadata work, allowed under the Coordinator-never-codes rule). Cite the discovery context (which dispatch surfaced it, which commit + gate step) so the next run has full provenance.
Expand Down
2 changes: 1 addition & 1 deletion crates/network/src/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,7 @@ pub mod iroh;
pub mod topics;
pub mod traits;

#[cfg(any(test, feature = "test-utils"))]
#[cfg(all(not(target_arch = "wasm32"), any(test, feature = "test-utils")))]
pub mod mem;

// Re-export key types for convenience.
Expand Down
53 changes: 49 additions & 4 deletions crates/state/src/materialize.rs
Original file line number Diff line number Diff line change
Expand Up @@ -301,16 +301,23 @@ fn required_permission(kind: &EventKind) -> Option<Permission> {
// RevokePermission,
// RenameServer,
// SetServerDescription — admin-only, checked in the admin block above
// SetProfile — unrestricted (any member)
// UpdateProfile — unrestricted (any member; self-authorship
// is the only identity check)
// SetProfile — any current member; membership gate
// lives in apply_mutation (issue #177)
// UpdateProfile — any current member; membership gate
// lives in apply_mutation. Self-
// authorship is enforced structurally
// (only the author's own profile is
// mutated). See issue #177.
// PinMessage,
// UnpinMessage — unrestricted (any member)
// UnpinMessage — any current member; membership gate
// lives in apply_mutation (issue #177)
// ChannelRevive — membership check lives in apply_mutation
// (does not require SendMessages so a muted
// member can still un-archive)
// MuteChannel,
// MuteGrove — per-identity preference, never gated
// (preferences are not server state and
// survive a kick)
//
// If a new EventKind variant is added and is NOT listed here or
// in an arm above, it will silently get no permission check.
Expand Down Expand Up @@ -539,6 +546,15 @@ fn apply_mutation(state: &mut ServerState, event: &Event) -> ApplyResult {
}

EventKind::SetProfile { display_name } => {
// Defense-in-depth: reject if the author isn't a member of
// the server. `required_permission()` returns `None` for
// SetProfile (it's "any current member"), so the membership
// gate must live here. Without it, late-arriving events
// from a kicked or never-joined signer would silently
// mutate `state.profiles`. See issue #177.
if !state.members.contains_key(&event.author) {
return ApplyResult::Rejected(format!("author '{}' is not a member", event.author));
}
if display_name.chars().count() > 64 {
return ApplyResult::Rejected(format!(
"display name exceeds 64 chars ({} chars)",
Expand All @@ -556,6 +572,17 @@ fn apply_mutation(state: &mut ServerState, event: &Event) -> ApplyResult {
}

EventKind::UpdateProfile(delta) => {
// Defense-in-depth: reject if the author isn't a member of
// the server. `required_permission()` returns `None` for
// UpdateProfile (it's "any current member"; self-authorship
// is already structurally enforced by mutating only the
// author's own profile). The membership gate must live
// here so late-arriving events from a kicked or never-
// joined signer cannot silently mutate `state.profiles`.
// See issue #177.
if !state.members.contains_key(&event.author) {
return ApplyResult::Rejected(format!("author '{}' is not a member", event.author));
}
let crate::types::ProfileDelta {
display_name,
pronouns,
Expand Down Expand Up @@ -662,6 +689,15 @@ fn apply_mutation(state: &mut ServerState, event: &Event) -> ApplyResult {
channel_id,
message_id,
} => {
// Defense-in-depth: reject if the author isn't a member of
// the server. `required_permission()` returns `None` for
// PinMessage (it's "any current member"), so the membership
// gate must live here. Without it, late-arriving events
// from a kicked or never-joined signer would silently
// mutate `pinned_messages`. See issue #177.
if !state.members.contains_key(&event.author) {
return ApplyResult::Rejected(format!("author '{}' is not a member", event.author));
}
if let Some(ch) = state.channels.get_mut(channel_id) {
ch.pinned_messages.insert(*message_id);
}
Expand All @@ -671,6 +707,15 @@ fn apply_mutation(state: &mut ServerState, event: &Event) -> ApplyResult {
channel_id,
message_id,
} => {
// Defense-in-depth: reject if the author isn't a member of
// the server. `required_permission()` returns `None` for
// UnpinMessage (it's "any current member"), so the
// membership gate must live here. Without it, late-
// arriving events from a kicked or never-joined signer
// would silently mutate `pinned_messages`. See issue #177.
if !state.members.contains_key(&event.author) {
return ApplyResult::Rejected(format!("author '{}' is not a member", event.author));
}
if let Some(ch) = state.channels.get_mut(channel_id) {
ch.pinned_messages.remove(message_id);
}
Expand Down
13 changes: 10 additions & 3 deletions crates/state/src/sync.rs
Original file line number Diff line number Diff line change
Expand Up @@ -881,7 +881,11 @@ mod tests {
dag.insert(e).unwrap();
}

// Grant permissions to multiple peers.
// Grant permissions to multiple peers, recording each grant
// hash so peer SetProfile events can causally depend on it.
// (SetProfile requires membership — issue #177 — so the grant
// must topologically precede the SetProfile.)
let mut grant_hashes: std::collections::BTreeMap<_, _> = std::collections::BTreeMap::new();
for peer in [&peer_a, &peer_b, &peer_c] {
let e = dag.create_event(
&owner,
Expand All @@ -892,17 +896,20 @@ mod tests {
vec![],
0,
);
grant_hashes.insert(peer.endpoint_id(), e.hash);
dag.insert(e).unwrap();
}

// Set profiles from different peers.
// Set profiles from different peers, each depending on the
// grant that made them a member.
for (peer, name) in [(&peer_a, "Alice"), (&peer_b, "Bob"), (&peer_c, "Carol")] {
let dep = grant_hashes[&peer.endpoint_id()];
let e = dag.create_event(
peer,
EventKind::SetProfile {
display_name: name.into(),
},
vec![],
vec![dep],
0,
);
dag.insert(e).unwrap();
Expand Down
37 changes: 34 additions & 3 deletions crates/state/src/tests/dag.rs
Original file line number Diff line number Diff line change
Expand Up @@ -422,12 +422,28 @@ fn managed_dag_insert_remote_event_applies_to_state() {

// Simulate a remote event from a different peer.
let peer = Identity::generate();
// Grant peer membership first — `SetProfile` is gated on
// membership (issue #177); the grant adds peer to `state.members`.
let grant = managed.dag().create_event(
&owner,
EventKind::GrantPermission {
peer_id: peer.endpoint_id(),
permission: crate::event::Permission::SendMessages,
},
vec![],
50,
);
let grant_hash = grant.hash;
managed.insert_and_apply(grant).unwrap();

// Causally link the SetProfile to the grant so topological sort
// applies the grant first.
let event = managed.dag().create_event(
&peer,
EventKind::SetProfile {
display_name: "Alice".into(),
},
vec![],
vec![grant_hash],
100,
);
let outcome = managed.insert_and_apply(event).unwrap();
Expand All @@ -450,13 +466,28 @@ fn managed_dag_buffers_gap_events_and_resolves() {
let peer = Identity::generate();
let mut managed = ManagedDag::new(&owner, "Test", 5000).unwrap();

// Create peer's seq=1 event.
// Grant peer membership first — `SetProfile` is gated on
// membership (issue #177); the grant adds peer to `state.members`.
let grant = managed.dag().create_event(
&owner,
EventKind::GrantPermission {
peer_id: peer.endpoint_id(),
permission: crate::event::Permission::SendMessages,
},
vec![],
0,
);
let grant_hash = grant.hash;
managed.insert_and_apply(grant).unwrap();

// Create peer's seq=1 event, depending on the grant so topological
// sort guarantees the grant applies first.
let e1 = managed.dag().create_event(
&peer,
EventKind::SetProfile {
display_name: "first".into(),
},
vec![],
vec![grant_hash],
0,
);

Expand Down
Loading