feat(web): add URL query param deep-linking for dashboard nodes #126
Closed
PolyphonyRequiem wants to merge 56 commits into microsoft:main from
When multiple workflow runs start in the same second (common when
orchestrating parallel runs), time.strftime('%Y%m%d-%H%M%S') produces
identical timestamps, causing all runs to write to the same file.
This corrupts event logs, checkpoint files, and CLI log files by
interleaving events from different runs.
Append a random 8-character hex suffix (via secrets.token_hex(4))
to filenames across all three affected locations:
- EventLogSubscriber (event_log.py)
- CheckpointManager.save_checkpoint (checkpoint.py)
- generate_log_path (cli/run.py)
Filenames change from:
conductor-workflow-20260416-014816.events.jsonl
to:
conductor-workflow-20260416-014816-a3b7c9f1.events.jsonl
Backward compatible: existing tools that glob *.events.jsonl,
*.json, or *.log continue to work.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
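The naming scheme described above can be sketched roughly as follows. This is a minimal illustration, not the code from event_log.py, checkpoint.py, or cli/run.py; the helper name `unique_log_name` and its parameters are hypothetical.

```python
import secrets
import time


def unique_log_name(prefix: str, extension: str) -> str:
    """Build a filename that stays unique for runs started in the same second.

    Hypothetical helper: only the timestamp format and token_hex(4) come
    from the commit message.
    """
    timestamp = time.strftime("%Y%m%d-%H%M%S")
    suffix = secrets.token_hex(4)  # 4 random bytes -> 8 hex characters
    return f"{prefix}-{timestamp}-{suffix}.{extension}"


name = unique_log_name("conductor-workflow", "events.jsonl")
print(name)
```

Because the suffix is inserted before the extension, a glob such as `*.events.jsonl` still matches the new names.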
When a human gate is presented and a web dashboard has been started (e.g. via --web-bg), _handle_gate_with_web previously checked `self._web_dashboard.has_connections()` and fell back to the CLI-only path if no WebSocket client was currently connected.
In practice users almost always open the per-run dashboard *after* seeing the gate-waiting notification, so `has_connections()` is typically False at the moment the gate is presented. Under --web-bg there is no attached stdin, so the CLI task blocks forever on `input()`, and when the user later connects and clicks approve, the `gate_response` WebSocket message is enqueued to `_gate_response_queue` with no coroutine awaiting it. The workflow hangs indefinitely.
Fix: only bail to CLI-only when there is no web dashboard at all. Always start both the CLI task and the web-wait task in parallel when a dashboard exists; `wait_for_gate_response` happily awaits an empty queue until the user eventually connects and clicks.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
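The "race both tasks" idea can be sketched with plain asyncio. This is an illustration under stated assumptions, not the body of _handle_gate_with_web: the function `wait_for_gate`, its parameters, and the demo coroutines are hypothetical, and the never-completing CLI coroutine merely stands in for `input()` with no stdin attached.

```python
import asyncio


async def wait_for_gate(cli_prompt, web_wait, has_dashboard: bool):
    """Race the CLI prompt against the web response; the first to finish wins."""
    if not has_dashboard:
        # No dashboard at all: the CLI is the only possible source.
        return await cli_prompt()

    tasks = {asyncio.ensure_future(cli_prompt()), asyncio.ensure_future(web_wait())}
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    return next(iter(done)).result()


async def demo() -> str:
    queue: asyncio.Queue = asyncio.Queue()

    async def cli_prompt() -> str:
        await asyncio.Event().wait()  # never completes, like input() without stdin
        return "cli"

    async def web_wait() -> str:
        return await queue.get()  # waits on an empty queue until a click arrives

    async def late_click() -> None:
        await asyncio.sleep(0.01)  # the user connects after the gate is shown
        await queue.put("approved")

    click = asyncio.create_task(late_click())
    result = await wait_for_gate(cli_prompt, web_wait, has_dashboard=True)
    await click
    return result


result = asyncio.run(demo())
```

The web-wait task is started even though no client is connected yet, which is the behavior the fix relies on.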
`wait_for_gate_response` previously re-queued any message whose `agent_name` did not match the expected agent, then slept 10ms and retried. Because `asyncio.Queue` has no deduplication, the same stale message would immediately be dequeued again on the next iteration, producing a tight loop that re-enqueues and re-inspects the same dict forever (cpu-bound, 100 iter/sec per stale message).
In practice this never triggered during normal flow because conductor presents one gate at a time and the `has_connections()` short-circuit used to hide the issue. With that short-circuit now gone (PR to always race both tasks), and with any client ever producing a duplicate click, the spin becomes reachable.
Fix: since conductor only ever awaits one gate at a time, any non-matching `gate_response` is definitionally stale (a duplicate click from a dashboard that missed the first resolution, or a message for a gate that already completed). Re-queueing can never deliver it: the matching `wait_for_gate_response` call is already gone. Discard stale messages with a warning log so the queue drains cleanly and the next `await .get()` blocks properly on an empty queue.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
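A minimal sketch of the discard-instead-of-requeue loop follows. The standalone function signature, the logger name, and the message fields other than `agent_name` are assumptions; the real method presumably lives on the dashboard object.

```python
import asyncio
import logging

logger = logging.getLogger("gate-sketch")  # hypothetical logger name


async def wait_for_gate_response(queue: asyncio.Queue, agent_name: str) -> dict:
    """Return the first response for agent_name, dropping stale ones."""
    while True:
        message = await queue.get()  # blocks cleanly when the queue is empty
        if message.get("agent_name") == agent_name:
            return message
        # Stale message: log and drop it rather than putting it back on the queue.
        logger.warning("Discarding stale gate_response for %r", message.get("agent_name"))


async def demo():
    queue: asyncio.Queue = asyncio.Queue()
    queue.put_nowait({"agent_name": "old_gate", "choice": "approve"})
    queue.put_nowait({"agent_name": "review_gate", "choice": "reject"})
    response = await wait_for_gate_response(queue, "review_gate")
    return response, queue.qsize()


response, remaining = asyncio.run(demo())
```

Since nothing is re-enqueued, each stale message is inspected exactly once and the loop cannot spin.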
Adds TestWaitForGateResponse with two cases:
- Returns the matching message.
- Discards stale (non-matching) messages without re-queueing, regression-covering the busy-loop bug fixed in this PR.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…rosoft#101)
Add input_mapping field to AgentDef for type='workflow' agents. When present, each value is a Jinja2 expression rendered against the parent context to build sub-workflow inputs. When absent, existing behavior (forwarding parent's workflow.input.*) is preserved.
- Schema: Add input_mapping to AgentDef with validation for workflow-only
- Engine: Render input_mapping templates in _execute_subworkflow()
- Tests: Schema validation for all agent types
- Experimental workflows: test-input-mapping parent/child pair
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
)
Remove validator restriction blocking type='workflow' in for_each groups. Wire execute_single_item() to call _execute_subworkflow_with_inputs() for workflow agents, rendering input_mapping with loop variables in scope.
- Validator: Remove workflow rejection in for_each validation
- Engine: Add workflow branch in execute_single_item(), new helper _execute_subworkflow_with_inputs() for pre-built inputs
- Tests: Update test_workflow_in_for_each to validate (not reject)
- Experimental workflows: test-for-each-workflow parent/child pair
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…acking (microsoft#103)
Remove the circular reference path check that blocked workflows from referencing themselves. The existing MAX_SUBWORKFLOW_DEPTH=10 already prevents infinite recursion. Add optional per-agent max_depth field for tighter author-controlled bounds.
- Engine: Remove self-reference path equality check in both _execute_subworkflow() and _execute_subworkflow_with_inputs()
- Engine: Add per-agent max_depth enforcement alongside global limit
- Schema: Add max_depth field to AgentDef with validation
- Tests: Replace circular reference test with depth-limit tests
- Experimental: test-recursive.yaml self-referential countdown
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
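How a global limit and an optional per-agent max_depth might combine can be sketched as below. Only MAX_SUBWORKFLOW_DEPTH=10 comes from the commit message; `DepthLimitError`, `check_depth`, `countdown`, and the exact off-by-one semantics (refusing entry once depth equals the limit) are assumptions made for illustration.

```python
from typing import Optional

MAX_SUBWORKFLOW_DEPTH = 10  # global limit named in the commit message


class DepthLimitError(RuntimeError):
    """Hypothetical error type used only in this sketch."""


def check_depth(depth: int, agent_max_depth: Optional[int] = None) -> None:
    """Refuse to enter a sub-workflow once the tighter of the two limits is reached."""
    limit = MAX_SUBWORKFLOW_DEPTH
    if agent_max_depth is not None:
        limit = min(limit, agent_max_depth)
    if depth >= limit:
        raise DepthLimitError(f"sub-workflow depth {depth} reached limit {limit}")


def countdown(n: int, depth: int = 0, max_depth: Optional[int] = None) -> int:
    """Stand-in for a self-referential workflow: recurse until n hits zero."""
    if n == 0:
        return depth
    check_depth(depth, max_depth)  # checked before each nested invocation
    return countdown(n - 1, depth + 1, max_depth)


print(countdown(3))  # terminates on its own, well under the limit
```

With a depth check in place, a self-reference is safe without any path-equality test, since runaway recursion hits the limit and raises.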
Add a metadata field to WorkflowDef that allows workflow authors to
attach arbitrary key-value pairs for external tooling. The metadata
is included verbatim in the workflow_started event, enabling
downstream consumers (dashboards, trackers, enrichers) to adapt
behavior without parsing the YAML source.
Example usage in workflow YAML:
  workflow:
    name: twig-sdlc
    metadata:
      tracker: ado
      project_url: https://dev.azure.com/org/Project
      work_item_id_agent: intake
      work_item_id_field: epic_id
The field defaults to an empty dict, so existing workflows are
unaffected.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add --metadata / -m flag to 'conductor run' that accepts key=value
pairs, merged on top of YAML-declared metadata. This enables callers
to inject dynamic values at invocation time:
conductor run twig-sdlc.yaml --metadata work_item_id=1814
CLI metadata is:
- Parsed separately from --input (different binding path)
- Merged on top of YAML metadata (CLI wins on conflicts)
- Forwarded through --web-bg background process spawning
- Included in the workflow_started event alongside YAML metadata
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
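The parse-and-merge behavior for --metadata can be illustrated with a short sketch. The function names `parse_metadata` and `merge_metadata` are hypothetical, and treating every CLI value as a string is an assumption; the commit message only states the key=value form and that CLI values win on conflicts.

```python
def parse_metadata(pairs: list[str]) -> dict[str, str]:
    """Turn ['key=value', ...] into a dict, rejecting malformed pairs."""
    parsed: dict[str, str] = {}
    for pair in pairs:
        key, separator, value = pair.partition("=")
        if not separator or not key:
            raise ValueError(f"expected key=value, got {pair!r}")
        parsed[key] = value  # values may themselves contain '='
    return parsed


def merge_metadata(yaml_metadata: dict, cli_metadata: dict) -> dict:
    """Merge CLI metadata on top of YAML metadata; CLI wins on conflicts."""
    return {**yaml_metadata, **cli_metadata}


yaml_metadata = {"tracker": "ado", "work_item_id": "0"}
cli_metadata = parse_metadata(["work_item_id=1814", "note=a=b"])
merged = merge_metadata(yaml_metadata, cli_metadata)
print(merged)
```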
7 new tests verifying:
- Schema: metadata defaults to empty dict, accepts arbitrary keys, independent from input/context fields
- Loader: metadata round-trips through YAML, omission gives empty dict, nested values preserved, metadata and input are separate namespaces
All 140 config tests pass.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Propagate the event log's random hex suffix as a run_id across all systems:
- EventLogSubscriber: expose run_id property (was already generated)
- WorkflowEngine: accept run_id + log_file params, include in workflow_started event
- PID files: include run_id + log_file fields
- Web dashboard: add /api/info endpoint returning run_id, log_file, workflow_name, started_at, metadata
This enables the central dashboard to match per-run dashboards to event logs by exact run_id instead of fragile name/timestamp heuristics.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When a parent workflow calls a sub-workflow via type: workflow, the child engine now inherits the parent's agent outputs in its context. This allows sub-workflow agents to access parent agent state (e.g., task_manager.output, pr_group_manager.output) by declaring them in their input: list, even when input_mapping doesn't cover all fields. Previously, sub-workflow agents could only access workflow.input.* (populated via input_mapping) and their own sibling agents' outputs. Parent agent outputs were lost at the sub-workflow boundary, causing nested sub-workflows to see default values (0, '') instead of the actual parent state. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds full subworkflow awareness to the per-run web dashboard:
State pollution fix:
- Added wf_depth counter to workflow store; only depth-0 workflow_started initializes root context. Inner workflow events are routed to isolated SubworkflowContext objects.
- Each subworkflow invocation gets its own nodes/routes/agents maps, keyed by (parentAgent, iteration). Repeated runs of the same subworkflow no longer share state.
Subworkflow event handling:
- Added TypeScript types for subworkflow_started, subworkflow_completed, subworkflow_failed events (mirrors engine emit).
- Event handlers create/update child contexts and track the active context path for routing subsequent events.
Breadcrumb navigation:
- New BreadcrumbBar component shows the context stack above the graph (e.g., Root > twig-sdlc-planning > plan-issue).
- Click any breadcrumb to navigate to that context level.
- Double-click a workflow agent node in the graph to dive into its subworkflow context.
- Graph rebuilds automatically when context changes.
Context stack architecture:
- SubworkflowContext[] tree structure mirrors workflow nesting.
- activeContextPath tracks where live events are routed.
- viewContextPath tracks what the user is viewing (independent).
- getViewedContext() returns the correct nodes/routes for rendering.
- All event handlers use activeTarget() helper to route to the correct context's nodes/groupProgress.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Phase 4-5 of breadcrumb navigation feature:
WorkflowNode component:
- New node type for type:'workflow' agents with dashed border and Layers icon (visually distinct from regular agent nodes).
- Shows child workflow name, elapsed time, and a chevron indicator when a SubworkflowContext exists.
- Double-click to navigate into the subworkflow graph.
SubworkflowDetail panel:
- New detail component shown when a workflow agent is selected.
- Lists all subworkflow runs for that agent with status, agent count, and cost summary.
- Click any run to navigate into its context.
Context-aware rendering:
- All graph node components (AgentNode, ScriptNode, GateNode, GroupNode, AnimatedEdge) now read from getViewedContext().nodes instead of root state.nodes, which ensures correct status display when viewing child contexts.
- DetailPanel reads from viewed context for node lookup.
- GroupDetail reads groupProgress from viewed context.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…to install-combined
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
getViewedContext() creates a new object on every call, causing infinite re-render loops (React error #185) when used inside Zustand selectors.
New hooks in use-viewed-context.ts use useMemo with stable state references:
- useViewedNodes(): nodes map for current context
- useViewedGroupProgress(): group progress for current context
- useViewedHighlightedEdges(): edge highlights
- useViewedSubworkflowContexts(): child contexts
- useViewedGraphData(): full graph data for WorkflowGraph
All graph components and detail panels updated to use these hooks.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
# Conflicts: # src/conductor/web/static/index.html
# Conflicts: # src/conductor/engine/event_log.py
When a subworkflow agent is running, the shared interrupt_event was being silently consumed by _check_interrupt() in web mode (line 857: 'if self._web_dashboard is not None: return None'). This meant Stop would pause the current agent but then resume and continue; the interrupt never propagated to the parent engine.
Two fixes:
1. _check_interrupt(): when _subworkflow_depth > 0, raise InterruptError instead of silently consuming the interrupt. This unwinds the child engine back to the parent's _execute_subworkflow try/except, stopping the workflow.
2. _handle_web_pause(): in subworkflows, also watch interrupt_event alongside resume/kill/disconnect events. A second Stop click while an agent is paused now raises InterruptError immediately, without requiring Resume first.
Root-level (depth 0) behavior is unchanged: Stop still pauses the current agent with Resume/Kill options in the dashboard.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Terminal gates: wrap prompt text in Rich Markdown for proper heading,
bold, list, and link rendering in the CLI
- Web dashboard: add GET /api/files/{path} endpoint to serve local files
relative to the workflow root with security guards (path traversal
protection, extension allowlist, 1MB size limit)
- Web dashboard: add FileViewer modal component for viewing linked files
with Markdown rendering support
- Web dashboard: intercept relative links in gate prompts to open the
FileViewer instead of navigating away; external URLs still open in
new tabs
- Tests: 13 file API security tests + 3 gate Markdown rendering tests
(79 total tests pass across both test suites)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
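The three guards listed for GET /api/files/{path} (path traversal protection, extension allowlist, 1MB size limit) could look roughly like the following. This is a sketch under assumptions: the helper name `resolve_safe_path`, the specific extensions in the allowlist, and the choice of exception types are all hypothetical, and `Path.is_relative_to` requires Python 3.9+.

```python
import tempfile
from pathlib import Path

ALLOWED_EXTENSIONS = {".md", ".txt", ".json", ".yaml", ".yml"}  # assumed allowlist
MAX_FILE_BYTES = 1024 * 1024  # 1MB limit from the commit message


def resolve_safe_path(workflow_root: Path, requested: str) -> Path:
    """Resolve a requested path and apply the three guards in order."""
    root = workflow_root.resolve()
    candidate = (root / requested).resolve()  # collapses '..' and symlinks
    if not candidate.is_relative_to(root):
        raise PermissionError("path escapes the workflow root")
    if candidate.suffix.lower() not in ALLOWED_EXTENSIONS:
        raise PermissionError("file extension not allowed")
    if not candidate.is_file():
        raise FileNotFoundError(requested)
    if candidate.stat().st_size > MAX_FILE_BYTES:
        raise ValueError("file exceeds the 1MB limit")
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "workflow"
    root.mkdir()
    (root / "plan.md").write_text("# Plan\n")
    (Path(tmp) / "outside.md").write_text("secret\n")

    allowed_name = resolve_safe_path(root, "plan.md").name
    try:
        resolve_safe_path(root, "../outside.md")
        traversal_blocked = False
    except PermissionError:
        traversal_blocked = True
    try:
        resolve_safe_path(root, "run.exe")
        extension_blocked = False
    except PermissionError:
        extension_blocked = True
```

Resolving before comparing matters: checking the raw string for `..` would miss symlinks and absolute paths.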
…ment feat: enrich human gates with Markdown rendering and file links
On Windows with Python 3.14+, the proactor event loop's accept callback can fire after Server.close() sets _sockets = None during shutdown, causing an AssertionError in base_events.py:_attach that crashes the workflow process.
Fix:
- Add _guarded_serve() wrapper that catches AssertionError when the uvicorn server is in shutdown state (should_exit = True)
- Install a custom event-loop exception handler during server lifetime that suppresses the same race when it surfaces through callbacks
- _is_proactor_shutdown_race() validates: AssertionError type, server shutdown state, and asyncio-originating traceback frames
- Restore original exception handler in stop()
The guard is narrowly scoped: only AssertionError during server shutdown is suppressed. All other exceptions delegate to the original handler.
Tests: 9 new tests covering the race detection, exception handler delegation, guarded serve behavior, and edge cases.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
fix: guard against proactor accept-loop race on Windows (Python 3.14+)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…yers
When navigating between subworkflow layers via breadcrumbs or double-click, old React Flow edges from the previous layer persisted as floating links disconnected from any visible nodes.
Two fixes:
- WorkflowGraph: explicitly clear nodes and edges when switching to an empty context (subworkflow data not yet populated)
- graph-layout: filter edges against the actual node ID set to prevent orphan edges from routes referencing non-existent nodes
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add a markdown-aware post-processor that converts bare file paths and URLs in rendered gate prompts into clickable markdown links. This fixes the issue where Jinja2-rendered file paths appeared as plain grey text in the web dashboard instead of interactive file links.
Processing:
- Normalizes Jinja2 whitespace artifacts (collapses excessive blank lines)
- Auto-detects bare http(s):// URLs and wraps in markdown link syntax
- Auto-detects relative file paths (verified against workflow root), wraps in markdown links that the dashboard FileViewer can open
- Markdown-aware: skips fenced code blocks, inline code, existing links
- Reuses the server's extension allowlist for file detection
- Security: path traversal outside workflow root is blocked
Applied to both web dashboard (gate_presented event) and terminal (Rich markdown rendering in HumanGateHandler).
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Auto-inject runtime diagnostics (PID, platform, Python version, cwd, conductor version, started_at, run_id, log_file, bg_mode) into the workflow_started event. Dashboard port/URL included when --web is active; parent_pid included in --web-bg mode.
System metadata flows through:
- JSONL event log (via EventLogSubscriber)
- Web dashboard /api/info endpoint
- Checkpoint files (for resume context)
PID files are intentionally left unchanged.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds a web-based drag-and-drop visual designer for creating and editing conductor workflow YAML files. Launch with 'conductor designer' CLI command.
Backend:
- FastAPI server with REST endpoints for workflow CRUD, validation, export/import
- WorkflowConfig <-> JSON bridge reusing existing Pydantic validation
- YAML export via ruamel.yaml with clean formatting
- CLI integration: conductor designer [workflow.yaml] [--port] [--no-open]
Frontend (React + @xyflow/react + Zustand + TailwindCSS):
- WorkflowConfig as source of truth (ReactFlow nodes/edges derived)
- 8 node types: Agent, Gate, Script, Workflow, Parallel, ForEach, Start, End
- Property panel with type-specific editors + workflow settings
- Jinja2-aware prompt editor with syntax highlighting
- YAML preview, validation panel, undo/redo (Ctrl+Z/Ctrl+Shift+Z)
- Import/export YAML, save to disk, keyboard shortcuts (Ctrl+S)
- Dagre auto-layout for DAG visualization
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Agents with multiple routes to the same target (e.g., researcher has two routes to synthesizer with different conditions) would generate duplicate edge IDs, causing React key collisions. Fixed by appending route index to edge IDs. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…s NameError
Includes both fixes:
- workflow.input always available for script/sub-workflow templates in explicit mode
- _execute_subworkflow_with_inputs context parameter fix (for-each + sub-workflow)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… variables
Adds three new template variables available in all agent contexts:
- {{ workflow.dir }} - absolute path to the workflow YAML's directory
- {{ workflow.file }} - absolute path to the workflow YAML file
- {{ workflow.name }} - workflow name from the YAML config
These enable script agents to resolve co-located scripts relative to
the workflow file rather than CWD, which is critical for registry-based
workflows where scripts live alongside the YAML in the registry directory.
Example:
  args:
    - -File
    - {{ workflow.dir }}/scripts/detect-state.ps1
Available in all context modes (accumulate, last_only, explicit).
Empty strings are omitted from context to avoid polluting templates.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When a script agent's stdout is valid JSON, the parsed fields are now
merged into the output dict alongside stdout/stderr/exit_code. This
makes parsed fields accessible in route conditions and downstream
templates as output.field_name — matching LLM structured output behavior.
Example: a script outputting JSON like {"route": "planning"} now allows
route conditions like 'when: route == "planning"' instead of requiring
opaque exit codes.
Backward compatible: stdout/stderr/exit_code still present. Non-JSON
stdout is unchanged. Only JSON object stdout gets merged fields.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
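The merge rule described above (JSON object stdout contributes fields, everything else is left alone) can be sketched as follows. The function name `build_script_output` is hypothetical, and letting stdout/stderr/exit_code win over same-named JSON keys is an assumption the commit message does not settle.

```python
import json


def build_script_output(stdout: str, stderr: str, exit_code: int) -> dict:
    """Build a script agent's output dict, merging fields from JSON object stdout."""
    output = {"stdout": stdout, "stderr": stderr, "exit_code": exit_code}
    try:
        parsed = json.loads(stdout)
    except json.JSONDecodeError:
        return output  # non-JSON stdout: output is unchanged
    if isinstance(parsed, dict):  # only JSON objects are merged, not arrays or scalars
        for key, value in parsed.items():
            # Assumption: the three reserved keys are never overwritten.
            output.setdefault(key, value)
    return output


routed = build_script_output('{"route": "planning"}', "", 0)
plain = build_script_output("hello world", "", 0)
array = build_script_output("[1, 2, 3]", "", 0)
print(routed["route"])
```

A route condition such as `when: route == "planning"` then has a named field to test instead of an exit code.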
…ults
Optional workflow inputs without an explicit `default:` previously
defaulted to Python None, which renders as "None" in templates and
isn't caught by Jinja's `| default()` filter without the boolean flag.
Now uses type-appropriate zero values: "" for string, 0 for number,
false for boolean, [] for array, {} for object. This ensures templates
render cleanly without requiring `| default()` guards or `if X else Y`
workarounds.
Explicit `default:` values in the schema are still honored.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
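The type-to-zero-value mapping might be expressed like this. The dict-shaped input definition and the names `ZERO_VALUE_FACTORIES` and `default_for` are hypothetical; only the five type/zero-value pairs and the precedence of an explicit `default:` come from the commit message.

```python
# Factories rather than literals, so each call gets a fresh list or dict.
ZERO_VALUE_FACTORIES = {
    "string": str,    # ""
    "number": int,    # 0
    "boolean": bool,  # False
    "array": list,    # []
    "object": dict,   # {}
}


def default_for(input_def: dict):
    """Return the explicit default if declared, else the zero value for the type."""
    if "default" in input_def:
        return input_def["default"]  # explicit defaults are still honored
    return ZERO_VALUE_FACTORIES[input_def["type"]]()


print(repr(default_for({"type": "string"})))
print(default_for({"type": "number", "default": 5}))
```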
Enhanced `conductor validate` to scan Jinja2 templates in agent prompts,
script args, input_mapping, and workflow output for reference errors:
Level 1 — Template reference resolution:
- {{ X.output.Y }} where X is not a valid agent name → error
- {{ workflow.input.X }} where X is not a declared input → error
- In explicit mode, LLM agents referencing agent outputs not in their
input: list → warning
Also wires semantic validation (validate_workflow_config) into the CLI
`conductor validate` command, which previously only ran Pydantic schema
validation.
Catches errors like: stale agent name references after renames, missing
workflow input declarations, and undeclared dependencies in explicit mode
— all before runtime, at zero cost.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
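The Level 1 reference checks can be approximated with two regular expressions. This is a deliberately simplified sketch, not the validator itself: it only looks at the start of each `{{ ... }}` expression, assumes names made of word characters, and the function name `find_reference_errors` and the error wording are hypothetical.

```python
import re

# Match '{{ X.output.' and '{{ workflow.input.X' at the start of an expression.
AGENT_REF = re.compile(r"\{\{\s*(\w+)\.output\.\w+")
INPUT_REF = re.compile(r"\{\{\s*workflow\.input\.(\w+)")


def find_reference_errors(template: str, agent_names: set, input_names: set) -> list:
    """Report agent and workflow-input references that do not resolve."""
    errors = []
    for name in AGENT_REF.findall(template):
        if name not in agent_names:
            errors.append(f"unknown agent '{name}'")
    for name in INPUT_REF.findall(template):
        if name not in input_names:
            errors.append(f"undeclared workflow input '{name}'")
    return errors


template = (
    "Summarize {{ researcher.output.summary }} and {{ planer.output.steps }} "
    "for {{ workflow.input.topic }} aimed at {{ workflow.input.audience }}."
)
errors = find_reference_errors(
    template,
    agent_names={"researcher", "planner"},
    input_names={"topic"},
)
print(errors)
```

Here the misspelled `planer` stands in for a stale agent name left behind after a rename.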
The 2-part form '- workflow.input' in an agent's input list now injects all declared workflow inputs, matching the documented behavior. Previously it silently injected nothing, requiring the verbose 3-part form '- workflow.input.<param>' for each parameter.
- Add elif branch in _add_explicit_input() for len(parts)==2
- Update INPUT_REF_PATTERN regex to accept workflow.input without param
- Update validator error message to mention both forms
- Add tests for 2-part form (all inputs, merge with individual, optional)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… nodes
Parse ?agent={name} and ?subworkflow={name} query params on initial load
to auto-select and center the matching node in the workflow graph. This
enables the meta-dashboard (conductor-dashboard) to generate clickable
breadcrumb links that open the conductor UI focused on a specific node.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Opened against wrong repo by mistake, closing immediately.
Summary
Adds URL query parameter support to the conductor web dashboard for deep-linking to specific nodes in the workflow graph.
Query Parameters
- ?agent={name}: auto-selects and centers the graph on the named agent node
- ?subworkflow={name}: navigates into the named subworkflow context
Example URLs
Implementation
A small useDeepLink() hook (~60 lines) that:
- Parses agent and subworkflow from window.location.search on mount
- ?subworkflow=X: calls navigateIntoSubworkflow() to switch breadcrumb context
- ?agent=X: calls selectNode() then fitView() to center with animation
Mounted as <DeepLinkHandler /> inside <ReactFlow> in WorkflowGraph.tsx.
Motivation
Enables the meta-dashboard (conductor-dashboard) to generate clickable breadcrumb links that open the conductor UI focused on a specific workflow node.
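On the link-generating side, a caller such as the meta-dashboard could build these URLs along the following lines. This is a sketch only: the helper name `build_deep_link`, the base URL, and the node names are hypothetical, and the only facts taken from the description are the `agent` and `subworkflow` parameter names.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def build_deep_link(
    base_url: str,
    agent: Optional[str] = None,
    subworkflow: Optional[str] = None,
) -> str:
    """Append ?subworkflow=...&agent=... to a dashboard URL, URL-encoding values."""
    params = {}
    if subworkflow:
        params["subworkflow"] = subworkflow
    if agent:
        params["agent"] = agent
    query = urlencode(params)
    return f"{base_url}?{query}" if query else base_url


# Hypothetical host, port, and node names.
link = build_deep_link(
    "http://localhost:8080/", agent="task manager", subworkflow="plan-issue"
)
round_trip = parse_qs(urlsplit(link).query)
print(link)
```

Encoding the values keeps names containing spaces or `&` from corrupting the query string that the dashboard later reads from `window.location.search`.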
Also included (prior fork commits)
workflow.input in explicit context mode