fix(gastown): route MR bead failures through full review lifecycle to unblock convoys #1244
Closed
47 commits
6b0a7a9 fix(gastown): route MR bead failures through full review lifecycle to… (jrf0110)
e332f0d fix: address PR review — exclude pending MRs from orphan recovery, us… (jrf0110)
2a1278b refactor(gastown): remove superfluous ensureInitialized calls from To… (jrf0110)
7258e16 refactor(gastown): restrict setTownId to town creation paths only (jrf0110)
88431a6 refactor(gastown): extract scheduling module, parallelize alarm loop,… (jrf0110)
489823e fix(container): configure credential helper on bare repo for git-lfs (jrf0110)
19013be fix(gastown): add rehookOrphanedBeads patrol to recover stuck beads (jrf0110)
82a1a00 fix(gastown): add timeouts to container fetch calls and treat unknown… (jrf0110)
6d14c93 fix(gastown): clear dispatch cooldown on zombie recovery for immediat… (jrf0110)
4251f08 docs(gastown): document DO sub-module pattern in AGENTS.md (jrf0110)
627c711 fix(gastown): extend rehookOrphanedBeads to recover in_progress beads… (jrf0110)
7b957fb fix(gastown): close remaining recovery gaps for MR beads and orphaned… (jrf0110)
c92318f fix(gastown): use bead.bead_id instead of stale agent snapshot in dis… (jrf0110)
5b01044 fix(gastown): prevent recoverStuckReviews from resetting MR beads wit… (jrf0110)
969fb3e fix(gastown): add dispatch cooldown on failure and increase MAX_DISPA… (jrf0110)
8115179 fix(gastown): handle unhooked agent in agentDone gracefully instead o… (jrf0110)
0d403d5 fix(gastown): resolve kilocodeToken for refinery via town config fall… (jrf0110)
6468c95 fix(container): skip LFS smudge filter for all git operations in cont… (jrf0110)
b2a4e19 fix(container): add global .gitconfig to skip LFS smudge for agent user (jrf0110)
5a158f1 fix(gastown): prevent false zombie detection from resetting active re… (jrf0110)
4804898 fix(gastown): route dead agents through agentCompleted for proper bea… (jrf0110)
b6c0873 fix(gastown): don't reopen closed source beads when a stale MR bead f… (jrf0110)
5d80215 fix(gastown): add diagnostic logging for refinery dispatch failures (jrf0110)
d7c2c4c fix(gastown): recover refinery gt_done when agent was unhooked by zom… (jrf0110)
9f0fcd3 fix(gastown): enforce terminal state immutability and simplify zombie… (jrf0110)
b783597 debug: add temporary debugAgentMetadata endpoint (jrf0110)
319ddef fix(gastown): fix Zod parse failure in schedulePendingWork that silen… (jrf0110)
c966dc0 debug: capture container start error on refinery agent status message (jrf0110)
0d05bd2 fix(gastown): close stale MR beads when one MR merges for the same so… (jrf0110)
a3ec775 fix(gastown): skip popping MR beads whose source already has an in-fl… (jrf0110)
0a9b889 fix(gastown): never route refineries through agentCompleted from witn… (jrf0110)
6aabbd5 fix(gastown): eliminate refinery race conditions — never fail MR bead… (jrf0110)
c502aaa debug: add unauthenticated /debug/towns/:id/status endpoint and monit… (jrf0110)
4b16fee fix(gastown): don't fail MR beads when refinery start returns false (jrf0110)
b896676 fix(gastown): fix stale refinery hook deadlock in recoverStuckReviews (jrf0110)
bb2d7c5 fix(gastown): don't roll back bead status on dispatch failure for any… (jrf0110)
fb3f920 fix(gastown): eliminate all fire-and-forget rework dispatch races (jrf0110)
807efb4 fix(gastown): skip not_found for ALL agents in witnessPatrol + add me… (jrf0110)
c8832f5 fix(gastown): unhook stale refinery before re-hooking + fast recovery… (jrf0110)
ad150c0 fix(gastown): set refinery to idle on not_found (don't skip entirely) (jrf0110)
6db5722 fix(gastown): add refinery dispatch retry in processReviewQueue (jrf0110)
aaf31d5 fix(gastown): keep refinery hook on start failure + block popping whe… (jrf0110)
016d6cc fix(gastown): treat 'already running' container response as successfu… (jrf0110)
52b446d fix(gastown): check container status before retrying refinery dispatch (jrf0110)
ed68536 fix(gastown): fix PR-strategy MR beads stuck after external merge (jrf0110)
8b62a7e fix(gastown): unhook refinery from terminal MR beads at start of proc… (jrf0110)
f50b147 fix(gastown): check container status before freeing refinery from ter… (jrf0110)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,84 @@ | ||
| #!/bin/bash | ||
| # Continuously monitor a town's state via the debug endpoint. | ||
| # Usage: ./scripts/monitor-town.sh [townId] [interval_seconds] | ||
| | ||
| TOWN_ID="${1:-8a6f9375-b806-4ee0-ad6e-1697ea2dbfff}" | ||
| INTERVAL="${2:-15}" | ||
| BASE_URL="${GASTOWN_URL:-https://gastown.kiloapps.io}" | ||
| URL="${BASE_URL}/debug/towns/${TOWN_ID}/status" | ||
| | ||
| echo "Monitoring town ${TOWN_ID} every ${INTERVAL}s" | ||
| echo "Endpoint: ${URL}" | ||
| echo "Press Ctrl+C to stop" | ||
| echo "==========================================" | ||
| | ||
| while true; do | ||
|   RESP=$(curl -s --max-time 10 "${URL}" 2>/dev/null) | ||
|   if [ -z "$RESP" ]; then | ||
|     echo "$(date -u +%H:%M:%S) [ERROR] No response from ${URL}" | ||
|     sleep "$INTERVAL" | ||
|     continue | ||
|   fi | ||
| | ||
|   echo "$RESP" | python3 -c " | ||
| import sys, json, datetime | ||
| | ||
| try: | ||
|     d = json.load(sys.stdin) | ||
| except: | ||
|     print('$(date -u +%H:%M:%S) [ERROR] Invalid JSON response') | ||
|     sys.exit(0) | ||
| | ||
| ts = datetime.datetime.utcnow().strftime('%H:%M:%S') | ||
| alarm = d.get('alarmStatus', {}) | ||
| agents_info = alarm.get('agents', {}) | ||
| beads_info = alarm.get('beads', {}) | ||
| patrol_info = alarm.get('patrol', {}) | ||
| events = alarm.get('recentEvents', []) | ||
| | ||
| working = agents_info.get('working', 0) | ||
| idle = agents_info.get('idle', 0) | ||
| op = beads_info.get('open', 0) | ||
| ip = beads_info.get('inProgress', 0) | ||
| ir = beads_info.get('inReview', 0) | ||
| failed = beads_info.get('failed', 0) | ||
| orphaned = patrol_info.get('orphanedHooks', 0) | ||
| | ||
| # Agent details | ||
| agents = d.get('agentMeta', []) | ||
| hooked_agents = [a for a in agents if a.get('current_hook_bead_id')] | ||
| refinery = [a for a in agents if a.get('role') == 'refinery'] | ||
| | ||
| # Non-terminal beads | ||
| beads = d.get('beadSummary', []) | ||
| | ||
| print(f'{ts} W={working} I={idle} | open={op} prog={ip} review={ir} fail={failed} | hooks={orphaned} hooked={len(hooked_agents)}') | ||
| | ||
| # Show refinery state | ||
| for r in refinery: | ||
|     hook = r.get('current_hook_bead_id', 'NULL') or 'NULL' | ||
|     print(f' refinery: status={r.get(\"status\",\"?\"):8s} hook={hook[:12]:12s} dispatch={r.get(\"dispatch_attempts\",0)}') | ||
| | ||
| # Show non-terminal beads | ||
| if beads: | ||
|     for b in beads[:8]: | ||
|         assignee = str(b.get('assignee_agent_bead_id', '') or '')[:8] | ||
|         print(f' {b.get(\"status\",\"?\"):12s} {b.get(\"type\",\"?\"):16s} {str(b.get(\"bead_id\",\"\"))[:8]} agent={assignee:8s} {str(b.get(\"title\",\"\"))[:50]}') | ||
|     if len(beads) > 8: | ||
|         print(f' ... and {len(beads) - 8} more') | ||
| | ||
| # Show most recent event | ||
| if events: | ||
|     e = events[0] | ||
|     print(f' last: {e.get(\"time\",\"\")[:19]} {e.get(\"type\",\"\"):20s} {e.get(\"message\",\"\")[:70]}') | ||
| | ||
| # Show review outcomes | ||
| review_events = [e for e in events if e.get('type') == 'review_completed'] | ||
| for e in review_events[:2]: | ||
|     print(f' REVIEW: {e.get(\"time\",\"\")[:19]} {e.get(\"message\",\"\")[:70]}') | ||
| | ||
| print() | ||
| " 2>/dev/null | ||
| | ||
|   sleep "$INTERVAL" | ||
| done | ||
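
For readers of the monitor script, the response shape it appears to expect from the debug status endpoint can be sketched as TypeScript types. This is only an inference from the fields the script reads; the actual endpoint may return more fields or different types. The names `TownDebugStatus`, `AgentMeta`, and `headline` are hypothetical and are not taken from the PR.

```typescript
// Shape inferred solely from the keys monitor-town.sh reads (an assumption,
// not the endpoint's documented schema). Every field is optional because the
// script falls back to a default for each one.
interface AgentMeta {
  role?: string;
  status?: string;
  current_hook_bead_id?: string | null;
  dispatch_attempts?: number;
}

interface TownDebugStatus {
  alarmStatus?: {
    agents?: { working?: number; idle?: number };
    beads?: { open?: number; inProgress?: number; inReview?: number; failed?: number };
    patrol?: { orphanedHooks?: number };
    recentEvents?: { time?: string; type?: string; message?: string }[];
  };
  agentMeta?: AgentMeta[];
  beadSummary?: {
    bead_id?: string;
    status?: string;
    type?: string;
    title?: string;
    assignee_agent_bead_id?: string | null;
  }[];
}

// Hypothetical helper mirroring the script's one-line summary (without the
// timestamp prefix). Missing counters default to 0, as in the script.
function headline(d: TownDebugStatus): string {
  const alarm = d.alarmStatus;
  const hooked = (d.agentMeta ?? []).filter((m) => Boolean(m.current_hook_bead_id)).length;
  return [
    `W=${alarm?.agents?.working ?? 0} I=${alarm?.agents?.idle ?? 0}`,
    `open=${alarm?.beads?.open ?? 0} prog=${alarm?.beads?.inProgress ?? 0} ` +
      `review=${alarm?.beads?.inReview ?? 0} fail=${alarm?.beads?.failed ?? 0}`,
    `hooks=${alarm?.patrol?.orphanedHooks ?? 0} hooked=${hooked}`,
  ].join(" | ");
}

console.log(headline({}));
// W=0 I=0 | open=0 prog=0 review=0 fail=0 | hooks=0 hooked=0
```

Note that the `hooks=` column in the script reports `orphanedHooks` from the patrol section, while `hooked=` counts agents with a non-empty `current_hook_bead_id`.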
WARNING: `hostname` drops custom HTTPS ports

`URL.hostname` strips `:port`, so a repo URL like `https://git.example.com:8443/org/repo.git` writes `https://oauth2:...@git.example.com` / `https://x-access-token:...@git.example.com` into the credential store. Git credential matching treats `git.example.com:8443` as a different host, so LFS batch requests on GitHub/GitLab Enterprise with non-default ports will still miss the helper and fail. Use `url.host` (or `url.origin`) when composing the credential line.
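
The difference the review comment describes can be reproduced with the standard `URL` API. The sketch below is not the PR's code: `credentialLine` is a hypothetical helper, and the `https://user:token@host` line format is assumed from the examples quoted in the comment.

```typescript
// Hypothetical helper (not from the PR): builds one credential-store line.
// Uses url.host so that an explicit non-default port survives.
function credentialLine(repoUrl: string, username: string, token: string): string {
  const url = new URL(repoUrl);
  // url.protocol includes the trailing colon, e.g. "https:".
  return `${url.protocol}//${encodeURIComponent(username)}:${encodeURIComponent(token)}@${url.host}`;
}

const repo = "https://git.example.com:8443/org/repo.git";

console.log(new URL(repo).hostname); // git.example.com
console.log(new URL(repo).host);     // git.example.com:8443
console.log(credentialLine(repo, "oauth2", "TOKEN"));
// https://oauth2:TOKEN@git.example.com:8443
```

For URLs on the scheme's default port, `host` and `hostname` return the same value, so switching to `url.host` should not change the line written for ordinary `https://github.com/...` remotes.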