From d918775c2f8bf1fc2f005bd86b75395efe81b972 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Sat, 7 Feb 2026 06:01:03 +0000
Subject: [PATCH 1/5] Initial plan

From ec81ae4b13f1fbc00fa867e08bf1c492622b8ebf Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Sat, 7 Feb 2026 06:07:04 +0000
Subject: [PATCH 2/5] Add comprehensive analysis of projectOps + orchestration pain points with minimal, architectural recommendations

Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>
---
 PROJECTOPS_ORCHESTRATION_ANALYSIS.md | 402 +++++++++++++++++++++++++++
 1 file changed, 402 insertions(+)
 create mode 100644 PROJECTOPS_ORCHESTRATION_ANALYSIS.md

diff --git a/PROJECTOPS_ORCHESTRATION_ANALYSIS.md b/PROJECTOPS_ORCHESTRATION_ANALYSIS.md
new file mode 100644
index 00000000000..9e89427c20a
--- /dev/null
+++ b/PROJECTOPS_ORCHESTRATION_ANALYSIS.md
@@ -0,0 +1,402 @@
+# ProjectOps + Orchestration: Analysis and Recommendations
+
+## Executive Summary
+
+This document analyzes the current pain points in combining projectOps with orchestration patterns in GitHub Agentic Workflows and provides minimal, architectural recommendations that naturally fit into the existing architecture.
+
+## Current Architecture
+
+### Orchestration Pattern
+- **Orchestrator**: Dispatches workers and aggregators using `dispatch-workflow`
+- **Workers**: Process individual units of work (e.g., triage sub-issues)
+- **Aggregator**: Collects results and updates tracking issues/projects
+
+### ProjectOps Pattern
+- Safe outputs: `create-issue`, `link-sub-issue`, `update-issue`, `update-project`
+- Temporary ID system for referencing newly created issues before they exist
+- Island-based content updates using `replace-island` operation (currently run-id based only)
+
+## Pain Points Analysis
+
+### 1. Tool-Call Ordering Flakiness
+
+**Problem**: Agent sometimes tries to link a sub-issue before creating it, breaking temporary ID references.
+
+**Root Cause**:
+- LLM non-determinism in tool call sequencing
+- Lack of explicit dependency enforcement in the tool system
+
+**Current Workaround**:
+- Stricter prompt structure (STEP 1/STEP 2)
+- Explicit "DO NOT" constraints
+- Results: 90% reliability but still fragile
+
+**Recommended Solution**: **Safe Output Dependency Chains** (Minimal Addition)
+
+Add optional `depends_on` field to safe output messages that defers execution until dependencies resolve:
+
+```javascript
+// Agent output with dependency
+{
+  "type": "link_sub_issue",
+  "parent_issue_number": 123,
+  "sub_issue_number": "aw_temp_001",
+  "depends_on": ["aw_temp_001"]  // Defers until this temporary ID resolves
+}
+```
+
+**Implementation**:
+- Modify `actions/setup/js/handler_manager.cjs` to track unresolved dependencies
+- Add dependency resolution phase before processing safe output messages
+- Already partially implemented: `link_sub_issue.cjs` has deferred status for unresolved temp IDs (lines 70-88)
+
+**Impact**: Low complexity, natural extension of existing temporary ID system
+
+---
+
+### 2. Missing Optional Parameters (island_id)
+
+**Problem**: `island_id` missing on `update_issue` causes aggregator to duplicate compliance sections instead of replacing them.
+
+**Root Cause**:
+- Current `replace-island` operation uses `runId` as the island identifier (automatic)
+- No way to specify a **named island** for deterministic, cross-run updates
+- Aggregator can't update the same section across multiple runs
+
+**Current Workaround**:
+- Moved to deterministic flow: read → find section → remove → rewrite
+- Fragile and prone to formatting errors
+
+**Recommended Solution**: **Named Islands** (Minimal Addition)
+
+Extend `replace-island` operation to support optional `island_id` parameter:
+
+```yaml
+# Frontmatter
+safe-outputs:
+  update-issue:
+    body: true
+    operation: replace-island  # Allow named islands
+    max: 5
+```
+
+```javascript
+// Agent output with named island
+{
+  "type": "update_issue",
+  "issue_number": 123,
+  "operation": "replace-island",
+  "island_id": "compliance-summary",  // Named island (optional)
+  "body": "## Compliance Status\n\n- Worker 1: ✅ Complete\n- Worker 2: ✅ Complete"
+}
+```
+
+**Implementation Changes**:
+
+1. **Schema Update** (`pkg/parser/schemas/main_workflow_schema.json`):
+   - Add `island_id` as optional string field in update-issue tool
+
+2. **JavaScript Update** (`actions/setup/js/update_pr_description_helpers.cjs`):
+   ```javascript
+   // Extend buildIslandStartMarker to support named islands
+   function buildIslandStartMarker(runId, islandId) {
+     if (islandId) {
+       return ``;
+     }
+     return ``;
+   }
+   ```
+
+3. **Update Handler** (`actions/setup/js/update_issue.cjs`):
+   - Pass `island_id` from message to `updateBody()` function
+   - Default to `runId` if `island_id` not provided (backward compatible)
+
+**Benefits**:
+- Deterministic updates: Same island updated across multiple workflow runs
+- Aggregator can reliably update specific sections without duplicating
+- Backward compatible: Falls back to `runId` if `island_id` not specified
+- Minimal changes to existing code
+
+**Impact**: Low complexity, natural extension of existing replace-island feature
+
+---
+
+### 3. Orchestration Timing Dependencies
+
+**Problem**: Aggregator runs before workers finish, reports everything as "Pending"
+
+**Root Cause**:
+- `dispatch-workflow` launches workers asynchronously
+- Aggregator starts in parallel, doesn't wait for workers
+- No built-in synchronization mechanism
+
+**Current Workaround**:
+- Polling/retry in aggregator (fragile)
+- 90-second delays (still not enough)
+- Timing-dependent and unreliable
+
+**Recommended Solution A**: **Workflow Completion Events** (Requires GitHub Actions Enhancement)
+
+**NOT RECOMMENDED** - Requires GitHub Actions to support `workflow_run` events for `workflow_dispatch` triggers, which is not currently supported.
+
+**Recommended Solution B**: **Polling with Project Status Fields** (Minimal, Uses Existing Infrastructure)
+
+Use GitHub Projects as a coordination mechanism:
+
+```yaml
+# Worker workflow frontmatter
+safe-outputs:
+  update-project:
+    project: "https://github.com/orgs/github/projects/24060"
+    max: 1
+```
+
+```javascript
+// Worker: Update project status when complete
+update_project({
+  project: "...",
+  content_type: "draft_issue",
+  draft_title: "Worker Status",
+  fields: {
+    "Status": "Complete",  // Workers mark themselves complete
+    "Worker ID": "worker-1",
+    "Completed At": "2026-02-07T06:00:00Z"
+  }
+})
+```
+
+```javascript
+// Aggregator: Poll project items until all workers complete
+const allWorkersComplete = await checkProjectItems({
+  project: "...",
+  filter: { "Status": "Complete" },
+  expectedCount: 5  // Number of workers dispatched
+});
+
+if (!allWorkersComplete) {
+  // Defer aggregation or update status as "In Progress"
+  return;
+}
+
+// All workers done, proceed with aggregation
+```
+
+**Implementation**:
+- No new safe outputs needed - uses existing `update-project`
+- Aggregator includes polling logic to check project status fields
+- Workers update project status when complete
+
+**Benefits**:
+- Uses existing GitHub Projects infrastructure
+- Natural status tracking for monitoring
+- No new safe outputs or features required
+- Self-documenting: Project board shows worker progress
+
+**Alternative**: **Wait-for-Completion Safe Output** (New Feature, More Reliable)
+
+Add new `wait-for-workflows` safe output that blocks until dispatched workflows complete:
+
+```yaml
+safe-outputs:
+  dispatch-workflow:
+    workflows: [worker-a, worker-b]
+    max: 10
+
+  wait-for-workflows:
+    timeout: 300  # 5 minutes
+    poll-interval: 15  # Check every 15 seconds
+```
+
+```javascript
+// Orchestrator dispatches workers and waits
+worker_a({ tracker_id: 123 });
+worker_b({ tracker_id: 123 });
+
+// Wait for all dispatched workflows to complete
+wait_for_workflows({
+  timeout: 300  // 5 minutes max wait
+});
+
+// Now safe to aggregate
+```
+
+**Implementation Complexity**: Medium
+- Requires tracking dispatched workflow run IDs
+- Needs GitHub API polling for workflow status
+- Timeout and error handling
+
+**Impact**: Medium complexity, but provides reliable synchronization
+
+**Recommendation**: Start with **Solution B (Project Status Polling)** as it requires no new features and is immediately implementable. Consider **wait-for-workflows** if Project polling proves insufficient.
+
+---
+
+### 4. Event Trigger Re-entrancy / Cascading Runs
+
+**Problem**: Creating and labeling sub-issues triggers dispatcher again via `issues.labeled`, causing loops and unnecessary compute.
+
+**Root Cause**:
+- Workflows triggered by `issues.labeled` or `issues.opened`
+- Sub-issue creation/labeling creates events that re-trigger the dispatcher
+- No built-in filtering to distinguish orchestrator-created issues from user-created issues
+
+**Current Workaround**:
+- Filtering at GitHub Actions level (`if` conditions)
+- Checking for specific labels or markers in workflow YAML
+- Requires manual configuration in each workflow
+
+**Recommended Solution**: **Built-in Re-entrancy Protection** (Safe Output Enhancement)
+
+Add automatic re-entrancy markers to created issues:
+
+```yaml
+# Frontmatter - enable re-entrancy protection
+safe-outputs:
+  create-issue:
+    max: 10
+    labels: [task]
+    prevent-retrigger: true  # Add marker to prevent re-triggering
+```
+
+**Implementation**:
+
+1. **Automatic Marker Addition** (`actions/setup/js/create_issue.cjs`):
+   ```javascript
+   // Add hidden HTML comment marker to issue body
+   const reentrantMarker = ``;
+   const bodyWithMarker = reentrantMarker + "\n\n" + issueBody;
+   ```
+
+2. **Workflow Conditional Generation** (`pkg/workflow/compiler_yaml.go`):
+   ```yaml
+   # Generated workflow includes automatic re-entrancy check
+   on:
+     issues:
+       types: [opened, labeled]
+
+   jobs:
+     agent:
+       if: |
+         !contains(github.event.issue.body, '