…or, thread-safety docs
- Fix _file_stack from set to list to preserve insertion order in cycle detection
- Use append/pop instead of add/discard for proper LIFO tracking
- Use list initializations in load() and load_string() instead of sets
- Remove sorted() from cycle chain display (list preserves order naturally)
- Add UnicodeDecodeError handling with clear 'not valid UTF-8' message
- Add TestFileTagNonUtf8 test class with test_non_utf8_file_raises_configuration_error
- Clarify _create_file_tag_constructor_class docstring: cross-instance isolation only
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
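The set-to-list change above can be illustrated with a minimal sketch. This is not the loader's code: `resolve`, `CycleError`, and the `includes` mapping are hypothetical names used only to show why a list-based stack gives an ordered cycle chain and clean LIFO cleanup.

```python
class CycleError(Exception):
    """Hypothetical error for a file that includes itself, directly or not."""


def resolve(path, includes, stack=None):
    """Return files reachable from `path` in visit order; raise on a cycle.

    `includes` is an illustrative mapping of file -> files it references.
    """
    stack = [] if stack is None else stack
    if path in stack:
        # A list keeps insertion order, so the chain reads in include order
        # without any sorting.
        chain = " -> ".join(stack[stack.index(path):] + [path])
        raise CycleError(f"circular file reference: {chain}")
    stack.append(path)  # push before descending
    try:
        visited = [path]
        for child in includes.get(path, []):
            visited.extend(resolve(child, includes, stack))
        return visited
    finally:
        stack.pop()  # LIFO: always removes the most recently pushed file
```

Because each file is popped on the way out, a file included twice through different branches (a diamond) is not reported as a cycle; only a file that is still on the stack is.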
…low-syntax.md
- Add 'External File References' entry to Table of Contents
- Add comprehensive section after 'Tools' with syntax, content-type
detection, path resolution (with directory tree diagram), and
load_string() behavior notes
- Add four usage examples: prompt from .md, structured output schema
from .yaml, tool list from external file, nested inclusion pattern
- Add Environment Variables subsection explaining ${VAR} resolution
- Add Error Handling subsection with example ConfigurationError messages
- Add Limitations subsection (UTF-8 only, no globs, no URLs, etc.)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Epic 1: Extended AgentDef with type='script', command/args/env/working_dir/timeout fields, field/model validators for mutual exclusivity, and cross-reference validators rejecting scripts in parallel/for_each groups.
Epic 2: Created ScriptExecutor with asyncio subprocess execution, Jinja2 template rendering, per-script timeout handling, env merging, and FileNotFoundError/OSError handling.
Epic 3: Wired ScriptExecutor into WorkflowEngine's main run loop with context storage, iteration tracking, route evaluation, and workflow-level timeout enforcement via _execute_script() helper.
Epic 4: Created examples/script-step.yaml demonstrating script → route → agent pattern using simpleeval exit_code routing.
All 1192 tests pass.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
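For readers unfamiliar with the asyncio subprocess pattern that Epic 2 describes, here is a rough sketch. It is not the ScriptExecutor implementation: `run_script` and its parameters are assumed names, Jinja2 rendering is left out, and a missing executable simply surfaces as the `FileNotFoundError` raised by `asyncio.create_subprocess_exec`.

```python
import asyncio
import os
import sys


async def run_script(command, args=(), env=None, working_dir=None, timeout=None):
    """Run a command; return a dict with stdout, stderr, and exit_code."""
    merged_env = {**os.environ, **(env or {})}  # step env overrides inherited env
    proc = await asyncio.create_subprocess_exec(
        command, *args,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
        env=merged_env,
        cwd=working_dir,
    )
    try:
        out, err = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        proc.kill()        # do not leave the child running after a timeout
        await proc.wait()  # reap it so no zombie process remains
        raise
    return {
        "stdout": out.decode(errors="replace"),
        "stderr": err.decode(errors="replace"),
        "exit_code": proc.returncode,
    }


async def main():
    # Illustrative usage: run the current interpreter as the "script".
    result = await run_script(sys.executable, ["-c", "print('hello')"])
    print(result["exit_code"], result["stdout"].strip())
```

Returning the exit code instead of raising on non-zero status is what lets a later routing step branch on `exit_code`.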
- docs/workflow-syntax.md: update type field to include 'script', add Script Steps section documenting command/args/env/working_dir/timeout fields, output structure (stdout/stderr/exit_code), routing patterns, and restrictions (no parallel/for_each)
- AGENTS.md: add ScriptExecutor to executor package description and workflow execution flow
- README.md: add script steps to feature list and script-step.yaml to examples table
- executor/script.py: clarify _verbose_log deferred-import comment
- tests/test_executor/test_script.py: add test for working_dir with Jinja2 template and test documenting that env values are not rendered through Jinja2
- tests/test_engine/test_script_workflow.py: add negative integration test confirming script steps in parallel groups raise ConfigurationError at validation time
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…AGENTS.md numbering
- Render working_dir through template engine in ScriptExecutor for consistency with command/args rendering (instead of passing raw string to subprocess)
- Update test_working_dir_with_jinja2_template to test actual rendering behavior (remove manual pre-rendering workaround)
- Fix duplicate step number '6.' in AGENTS.md Workflow Execution Flow section (renumber to 7. and 8.)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Move plan docs (file-tag, script-execution, logging-redesign) into docs/projects/usability-features/
- Add usability-features.brainstorm.md consolidating feature plans
- Remove outdated architecture-decisions.md, planned-features.md, and brainstorming docs (caching, checkpointing, cost-tracking)
…t and LimitEnforcer
- Add WorkflowContext.to_dict() serializing workflow_inputs, agent_outputs, current_iteration, and execution_history with deep copies
- Add WorkflowContext.from_dict(data) classmethod reconstructing context from serialized dict with deep copy isolation
- Add LimitEnforcer.to_dict() serializing current_iteration, max_iterations, and execution_history (excludes transient start_time, current_agent)
- Add LimitEnforcer.from_dict(data, timeout_seconds) classmethod with restored iteration state, fresh start_time, and config-supplied timeout
- Add 34 tests in tests/test_engine/test_context_serialization.py covering round-trips, edge cases, JSON serializability, and behavioral integration
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
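The deep-copy round trip described above might look roughly like the following. `Context` is a hypothetical stand-in, not the real WorkflowContext; only the four field names come from the commit message.

```python
import copy
import json
from dataclasses import dataclass, field


@dataclass
class Context:
    """Hypothetical stand-in for the serializable part of a workflow context."""
    workflow_inputs: dict = field(default_factory=dict)
    agent_outputs: dict = field(default_factory=dict)
    current_iteration: int = 0
    execution_history: list = field(default_factory=list)

    def to_dict(self):
        # Deep copies keep the snapshot independent of later mutations.
        return {
            "workflow_inputs": copy.deepcopy(self.workflow_inputs),
            "agent_outputs": copy.deepcopy(self.agent_outputs),
            "current_iteration": self.current_iteration,
            "execution_history": copy.deepcopy(self.execution_history),
        }

    @classmethod
    def from_dict(cls, data):
        # Copy on the way in as well, so the caller's dict stays isolated.
        # Missing keys fall back to defaults (an assumption of this sketch).
        return cls(
            workflow_inputs=copy.deepcopy(data.get("workflow_inputs", {})),
            agent_outputs=copy.deepcopy(data.get("agent_outputs", {})),
            current_iteration=data.get("current_iteration", 0),
            execution_history=copy.deepcopy(data.get("execution_history", [])),
        )
```

A snapshot taken this way can be passed through `json.dumps` and `json.loads` and restored, as long as the stored values are themselves JSON-serializable.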
- Extract main execution loop into _execute_loop() shared by run() and resume()
- Save checkpoints on ConductorError, KeyboardInterrupt, and Exception
- Add _save_checkpoint_on_failure() helper that never raises
- Add resume() method that resets timeout and enters loop at specified agent
- Add set_context() and set_limits() for external state restoration
- Add workflow_path parameter to WorkflowEngine.__init__()
- Track _current_agent_name and _last_checkpoint_path as instance variables
- Change CheckpointManager.save_checkpoint() error param to BaseException
- Add 14-test suite covering all acceptance criteria in test_resume.py
- Update workflow-resume.plan.md: Epic 3 marked DONE
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
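A "never raises" checkpoint helper can be sketched as below. The function name, file name, and payload layout are assumptions for illustration; the real helper's format is not shown in this PR text. Typing the error as `BaseException` is what allows a `KeyboardInterrupt` to be recorded too.

```python
import json
import logging
from pathlib import Path

logger = logging.getLogger(__name__)


def save_checkpoint_on_failure(state, directory, error: BaseException):
    """Best-effort checkpoint write: return the path, or None on any failure."""
    try:
        path = Path(directory) / "checkpoint.json"  # hypothetical file name
        payload = {
            "state": state,
            "error": f"{type(error).__name__}: {error}",
        }
        path.write_text(json.dumps(payload), encoding="utf-8")
        return path
    except Exception:
        # Saving is a courtesy on the failure path; a secondary error here
        # must not mask the original one, so log it and carry on.
        logger.exception("could not save checkpoint")
        return None
```

The caller can then re-raise the original error unconditionally and mention the checkpoint path only when one was returned.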
… and remove weak hash-mismatch test
- Replace 30-line inline MCP server building block in run_workflow_async() with single call to await _build_mcp_servers(config), completing the helper extraction and eliminating code duplication between run_workflow_async() and resume_workflow_async()
- Remove test_hash_mismatch_warning which only asserted mock_resume.called without exercising the actual warning logic; test_hash_mismatch_warning_in_resume_async already provides genuine coverage by mocking at the engine level
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add _session_ids tracking and get_session_ids() to CopilotProvider
- Add set_resume_session_ids() and resume_session() attempt with fallback
- Wire session ID collection into WorkflowEngine checkpoint save
- Wire session ID restoration in resume_workflow_async() via ProviderRegistry
- Add unit tests for session ID tracking and resume fallback (11 tests)
- Fix redundant except clause: (RuntimeError, Exception) -> Exception
- Add checkpoint CLI commands and CheckpointError exception
- Add checkpoint unit tests
- Update usability brainstorm with workflow resume design
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Check last_activity_ref timestamp before declaring session stuck; if events arrived within the idle timeout window, reset and keep waiting instead of sending disruptive recovery prompts
- Reset recovery_attempts counter when new activity is detected, giving each task (tool call, reasoning step) its own budget of max_recovery_attempts rather than sharing across the full session
- Fix Rich Text.append() crashes when SDK returns None for event name attributes (use `or "unknown"` + str() wrapping)
- Add tests for active-session bypass and counter reset behavior
- Redesign KeyboardListener to use dedicated daemon reader thread delivering bytes via asyncio.Queue, eliminating run_in_executor thread leaks; wait_for timeouts on queue.get() are cleanly cancellable
- Fix Esc hint visibility: use _verbose_console.print() instead of verbose_log() so hint always displays regardless of --verbose flag
- Replace tautological TestListenerCreation tests with integration-style tests that call real run_workflow_async() with mocked dependencies
- Add TestReaderThread test class for the new reader thread architecture
- Remove TestQueueGetBlocking (no longer applicable with asyncio.Queue)
- Update all listener detection tests to feed bytes via asyncio.Queue.put_nowait() instead of mocking _read_byte_blocking
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Escape untrusted content in Rich panel using rich.markup.escape() to prevent markup injection in output previews and accumulated guidance
- Strip guidance text whitespace before storing in InterruptResult to prevent whitespace injection into subsequent agent prompts
- Add tests: test_panel_escapes_rich_markup_in_output_preview, test_panel_escapes_rich_markup_in_guidance, test_continue_guidance_is_stripped (35 tests total)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add user_guidance field, add_guidance(), get_guidance_prompt_section() to WorkflowContext
- Update to_dict()/from_dict() with backward-compatible user_guidance serialization
- Add optional guidance_section parameter to AgentExecutor.execute()
- Wire guidance retrieval in WorkflowEngine._execute_loop() for regular agents
- Add 8 tests for WorkflowContext guidance methods in test_context.py
- Add 6 tests for executor guidance injection in test_agent_guidance.py
- Update test_context_serialization.py to include user_guidance in empty context dict
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Wire interrupt event check into _execute_loop() and handle all InterruptResult actions.
- Add InterruptHandler instance creation in WorkflowEngine.__init__() with skip_gates
- Add _get_top_level_agent_names() helper method
- Add _check_interrupt() async method: checks event, clears it, builds output preview, delegates to InterruptHandler
- Add _handle_interrupt_result() async method: match/case for CONTINUE, SKIP, STOP, CANCEL
- Insert interrupt check at end of while loop body (after route evaluation) covering regular agents, parallel groups, and for-each groups
- Insert interrupt check before continue in script step path
- Backward compatible: interrupt_event=None short-circuits immediately
- 25 integration tests covering all actions, guidance accumulation, skip routing, checkpoint save, parallel group deferral
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Fix 4 review issues from previous implementation:
- FakeSession done_event: add optional done_event param; _deliver_post_abort sets it on completion so post-abort wait resolves immediately in tests
- RuntimeWarning: use close_coro_and_raise side_effect to properly close the coroutine before raising TimeoutError in test_post_abort_timeout
- send_followup model field: add model=self._default_model to AgentOutput constructor, consistent with all other execution paths
- _abort_supported guard: early-return in _abort_session when flag is False (strict identity check), making the flag functional not merely diagnostic
Add test_abort_skipped_when_previously_unsupported and assertion for model field in test_send_followup_sends_guidance. All 24 tests pass in 0.42s.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add interrupt_signal parameter to _execute_agentic_loop() with check at the top of each while loop iteration; if set, clear event and call _request_partial_output() to send a user message requesting emit_output
- Change _execute_agentic_loop return type to 3-tuple (response, tokens, is_partial)
- Update _execute_with_retry() to forward interrupt_signal and handle partial output (skip schema validation, return AgentOutput(partial=True))
- Update execute() to forward interrupt_signal through the call chain
- Add _request_partial_output() helper that appends a user message asking Claude to call emit_output with best partial results
- Add 17 tests in test_claude_interrupt.py covering interrupt detection, user message format, partial output parsing, schema validation skip, signal clearing, token accounting, guidance injection, and fresh conversation semantics
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add WorkflowEvent frozen dataclass with type, timestamp, data fields and to_dict() method
- Add WorkflowEventEmitter with subscribe(), unsubscribe(), emit() using threading.Lock
- Callback exception isolation: failing callbacks are logged but don't block others
- emit() snapshots subscriber list under lock before iterating
- Comprehensive unit tests covering all acceptance criteria
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
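The emitter design in this commit (lock-protected subscriber list, snapshot before iterating, per-callback exception isolation) can be sketched as follows. `Event` and `Emitter` are shortened, hypothetical names; this is an illustration of the pattern, not the WorkflowEventEmitter source.

```python
import logging
import threading
import time
from dataclasses import dataclass, field

logger = logging.getLogger(__name__)


@dataclass(frozen=True)
class Event:
    """Immutable event record (illustrative stand-in for WorkflowEvent)."""
    type: str
    timestamp: float = field(default_factory=time.time)
    data: dict = field(default_factory=dict)

    def to_dict(self):
        return {"type": self.type, "timestamp": self.timestamp, "data": dict(self.data)}


class Emitter:
    def __init__(self):
        self._lock = threading.Lock()
        self._subscribers = []

    def subscribe(self, callback):
        with self._lock:
            self._subscribers.append(callback)

    def unsubscribe(self, callback):
        with self._lock:
            if callback in self._subscribers:
                self._subscribers.remove(callback)

    def emit(self, event):
        # Snapshot under the lock, then call outside it, so a callback may
        # subscribe or unsubscribe without deadlocking or mutating the list
        # that is being iterated.
        with self._lock:
            snapshot = list(self._subscribers)
        for callback in snapshot:
            try:
                callback(event)
            except Exception:
                # One failing subscriber must not block the others.
                logger.exception("event subscriber failed")
```

Calling callbacks outside the lock also keeps a slow subscriber from stalling other threads that only want to subscribe or unsubscribe.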
- Narrow script try/except in workflow.py to only wrap _execute_script() so post-processing (record_execution, check_timeout, evaluate_routes, check_interrupt) propagates naturally without emitting spurious script_failed
- Add TestGateEvents class to test_event_emission.py with 3 tests: test_gate_presented_and_resolved, test_gate_resolved_to_end, test_gate_event_ordering, covering gate event emission, $end routing through gates, and event ordering
- Test count increased from 19 to 22; all 761 tests pass
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add cancelled() guard in start() polling loop before calling .exception()
to prevent CancelledError propagation when serve task is cancelled during startup
- Raise RuntimeError('Server task was cancelled before starting') in that path
- Rewrite test_start_raises_on_server_failure to call await dashboard.start()
with unittest.mock.patch.object mocking uvicorn.Server.serve to raise OSError,
validating the actual production code path instead of inlining polling logic
- Add test_start_raises_on_cancelled_task covering the new cancelled() guard path
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
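The reason for the cancelled() guard is that `asyncio.Task.exception()` raises `CancelledError` when the task was cancelled, instead of returning an exception object. A minimal sketch of such a startup polling loop follows; `wait_until_started`, `is_started`, and the "exited before starting" message are hypothetical, and only the cancellation message is taken from the commit.

```python
import asyncio


async def wait_until_started(task, is_started, poll_interval=0.01):
    """Poll until `is_started()` is true, surfacing an early task failure."""
    while not is_started():
        if task.done():
            # Check cancelled() first: exception() on a cancelled task
            # raises CancelledError rather than returning it.
            if task.cancelled():
                raise RuntimeError("Server task was cancelled before starting")
            exc = task.exception()
            if exc is not None:
                raise exc  # e.g. OSError when the port is already in use
            raise RuntimeError("Server task exited before starting")
        await asyncio.sleep(poll_interval)
```

With the guard in place the caller sees an ordinary `RuntimeError` for the cancelled case and the original exception (such as `OSError`) for a failed start.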
Fix 6 review bugs in index.html:
- BUG: Parallel groups now counted as single units in agentsTotal (removed child agent names from agentNames, added group name instead; groupAgents set prevents duplicate node creation in agent loop)
- BUG: For-each group names added to agentNames so they are counted in agentsTotal, fixing '1/0 agents' display for for-each-only workflows
- BUG: parallel_completed now uses server-authoritative data.failure_count instead of local groupProgress counter, consistent with for_each_completed and replay-safe
- CODE QUALITY: workflowFailure variable moved to state declarations block, eliminating reliance on var hoisting
- UX: workflow_failed handler calls setNodeState(data.agent_name, 'failed') to visually mark the running agent as failed on workflow failure
- RELIABILITY: CDN scripts pinned to specific patch versions (cytoscape@3.30.4, dagre@0.8.5, cytoscape-dagre@2.5.0)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add --web, --web-port, --web-bg CLI flags to the run command, wire up emitter and dashboard lifecycle in run_workflow_async(), and add the web optional dependency extra to pyproject.toml.
- Add [project.optional-dependencies] with web extra (fastapi, uvicorn, websockets) using PEP 621 syntax for pip extras compatibility
- Add --web, --web-port, --web-bg Typer options to run command in app.py
- Update run_workflow_async() with keyword-only web/web_port/web_bg params
- Create WorkflowEventEmitter and pass to WorkflowEngine(event_emitter=)
- Lazy-import WebDashboard with try/except ImportError giving actionable error message (pip install conductor-cli[web]) and exit code 1
- Non-fatal dashboard startup: wrap start() in try/except, warn and continue
- Print dashboard URL to stderr regardless of --silent/--quiet
- Post-execution lifecycle: --web-bg calls wait_for_clients_disconnect(), default --web blocks on asyncio.Event().wait() with Ctrl+C messaging
- Always call dashboard.stop() in finally block for cleanup
- Use contextlib.suppress(asyncio.CancelledError) per ruff SIM105
- Add 9 tests in test_web_flags.py covering flag acceptance, dependency check, and startup failure handling
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace blocking sys.stdin.buffer.read(1) in the keyboard listener's reader thread with select()-based polling (100ms timeout), allowing the thread to check _stop_flag and exit cleanly. Join the reader thread in stop() to ensure it releases the stdin lock before interpreter finalization, preventing the "could not acquire lock" fatal error. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
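The select()-based polling described above can be sketched on any readable file descriptor. This is an illustrative, POSIX-only sketch (on Windows, `select` works only on sockets); `PollingReader` and `on_byte` are hypothetical names, and the real listener reads stdin and feeds an asyncio.Queue instead of calling a plain callback.

```python
import os
import select
import threading


class PollingReader:
    """Read bytes from a file descriptor without blocking forever."""

    def __init__(self, fd, on_byte, poll_seconds=0.1):
        self._fd = fd
        self._on_byte = on_byte          # hypothetical delivery callback
        self._poll_seconds = poll_seconds
        self._stop_flag = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop_flag.set()
        # Joining guarantees the thread has left its read loop before the
        # caller proceeds; it returns within roughly one poll interval.
        self._thread.join()

    def _run(self):
        while not self._stop_flag.is_set():
            ready, _, _ = select.select([self._fd], [], [], self._poll_seconds)
            if not ready:
                continue  # timed out: loop back and re-check the stop flag
            data = os.read(self._fd, 1)
            if not data:
                break  # EOF
            self._on_byte(data)
```

The key difference from a plain blocking read is that the thread wakes up every `poll_seconds` even when no input arrives, so `stop()` can complete without waiting for a keypress.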
…ps required
- Stream SDK events (reasoning, tool calls, messages) from providers through the event callback chain to the web dashboard in real-time
- Add agent detail panel to dashboard with rendered prompt, activity stream (reasoning/tool calls/turn markers), and enriched output (input/output tokens)
- Add --web-bg flag as standalone background mode that forks a detached child process, prints the dashboard URL, and exits the CLI immediately
- Make web dependencies (fastapi, uvicorn, websockets) required instead of optional; remove [web] extras group and import guard
- Disable interactive interrupt (Esc guidance) when running in --web mode
- Update all documentation: README, CLI reference, AGENTS.md, examples, skill references
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Allow users to manually reposition nodes when the automatic dagre layout doesn't produce an ideal arrangement. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Multi-feature workflow for visually testing the web dashboard. Exercises script steps, parallel groups, conditional routing, loop-back patterns, human gates, and for-each dynamic parallel.
- Format workflow.py to pass ruff formatter check
- Wrap blocking Rich Prompt.ask/IntPrompt.ask in typed helper functions to satisfy ty's strict DefaultType checking when used with asyncio.to_thread()
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Large feature branch adding five major usability features to Conductor, each implemented as a series of epics with full test coverage.
1. Script Execution Steps (`type: script`)
- `command` + `args`
- `stdout`, `stderr`, `exit_code` into workflow context
- `working_dir`, configurable `timeout_seconds`
- `exit_code` for conditional branching

2. External File References (`!file` tag)
- `!file` tag resolves external files inline during loading
- `.md`, `.yaml`, `.json`, `.txt` with cycle detection

3. Workflow Resume (Checkpoint & Restore)
- `conductor resume` CLI command to pick up from last checkpoint
- `WorkflowContext` and `LimitEnforcer`

4. Mid-Agent Interrupt System

5. Real-Time Web Dashboard
- `--web` flag launches FastAPI + WebSocket dashboard alongside workflow
- `--web-bg` forks to background, prints URL, and exits

6. Supporting Improvements
- (`WorkflowEventEmitter`) decoupling engine from renderers

Stats
Testing