@t0mdavid-m t0mdavid-m commented Jan 26, 2026

Summary

This PR adds support for cancelling workflows that are running in Redis queue mode. Users can now stop long-running workflows between command executions, with proper cleanup and status reporting.

Key Changes

  • New WorkflowCancelled exception: Custom exception to signal user-initiated workflow cancellation
  • Cancellation check mechanism: Added set_cancellation_check() method to CommandExecutor that accepts a callable to check if a workflow should stop
  • Pre-command validation: _check_cancellation() is called before each command execution to allow stopping workflows between commands
  • Redis job integration: In execute_workflow(), set up a cancellation check that monitors the Redis job's is_stopped status
  • Graceful cancellation handling: Added dedicated exception handler for WorkflowCancelled that:
    • Logs cancellation to workflow log files
    • Cleans up the pid directory
    • Returns a proper response with cancelled: True flag
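
The executor-side pieces listed above can be sketched as follows. This is a hedged illustration, not the PR's actual code: the names WorkflowCancelled, _should_stop, set_cancellation_check(), _check_cancellation(), and run_command() come from the PR description, while the method bodies are simplified assumptions.

```python
# Illustrative sketch of the CommandExecutor changes; bodies are assumptions.
from typing import Callable, Optional


class WorkflowCancelled(Exception):
    """Signals a user-initiated workflow cancellation."""


class CommandExecutor:
    def __init__(self) -> None:
        # No check registered by default, preserving backward compatibility.
        self._should_stop: Optional[Callable[[], bool]] = None

    def set_cancellation_check(self, should_stop: Callable[[], bool]) -> None:
        """Register a predicate that returns True when the workflow should stop."""
        self._should_stop = should_stop

    def _check_cancellation(self) -> None:
        """Raise WorkflowCancelled if a registered check reports a stop request."""
        if self._should_stop is not None and self._should_stop():
            raise WorkflowCancelled("Workflow cancelled by user")

    def run_command(self, cmd: list) -> bool:
        # Cancellation is evaluated only at command boundaries, so a command
        # that is already running is never interrupted mid-flight.
        self._check_cancellation()
        ...  # launch the external command here
        return True
```

Because the check is an opt-in callable, callers that never invoke set_cancellation_check() see no behavior change.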

Implementation Details

  • The cancellation check is non-blocking and only evaluated before command execution, so a command that is already running is never interrupted mid-execution, making it safe for long-running workflows
  • The Redis job status is refreshed on each check to get the latest cancellation state
  • Cleanup operations are wrapped in try-except blocks so that a cleanup failure cannot mask the cancellation result
  • The solution maintains backward compatibility: the cancellation check is optional and only used when explicitly set
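
The tasks.py side of this flow can be sketched under a few assumptions: the job object exposes RQ-style refresh() and is_stopped members, and the pid-directory cleanup is reduced to a single best-effort rmtree. WorkflowCancelled is redefined inline to keep the sketch self-contained; in the real code it is imported from CommandExecutor. All other helper names below are hypothetical.

```python
import shutil
from pathlib import Path


class WorkflowCancelled(Exception):
    """Stand-in for the exception imported from CommandExecutor in the real code."""


def execute_workflow(job, pid_dir, run_command, commands):
    """Run commands in order, stopping at the next command boundary on cancellation."""

    def should_stop() -> bool:
        job.refresh()          # re-read the latest job state from Redis on every check
        return job.is_stopped  # becomes True once the user presses Stop

    completed = 0
    try:
        for cmd in commands:
            if should_stop():                   # pre-command validation only: non-blocking
                raise WorkflowCancelled(str(cmd))
            run_command(cmd)
            completed += 1
    except WorkflowCancelled:
        # Best-effort cleanup: a failure here must not mask the cancellation itself.
        shutil.rmtree(Path(pid_dir), ignore_errors=True)
        return {"cancelled": True, "completed": completed}
    return {"cancelled": False, "completed": completed}
```

Refreshing the job on every check trades one Redis round-trip per command for cancellation latency bounded by the longest single command.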

Summary by CodeRabbit

  • New Features
    • Workflow cancellation: Running workflows can now be cancelled between command executions. Cancellations are handled gracefully: the cancellation is logged to the workflow log files, a response with a cancelled: True flag is returned, and the pid directory is cleaned up to prevent resource leaks.


Check for job cancellation before each command execution in CommandExecutor.
This allows workflows running in Redis queue mode to stop at the next command
boundary when the user presses Stop, rather than requiring multiple presses.

https://claude.ai/code/session_013VFNJ1Sznugpony3r2Mgxt
coderabbitai bot commented Jan 26, 2026

📝 Walkthrough


The changes introduce a workflow cancellation mechanism by adding a WorkflowCancelled exception class and cancellation hook system to CommandExecutor, along with integration in tasks.py to register job state checks and handle cancellations with logging and resource cleanup.

Changes

  • Cancellation Exception & Hook System (src/workflow/CommandExecutor.py):
    Adds the WorkflowCancelled exception class, a _should_stop callable attribute, a set_cancellation_check() method to register cancellation predicates, and a _check_cancellation() method to evaluate the stop condition before command execution.
  • Cancellation Integration & Cleanup (src/workflow/tasks.py):
    Imports WorkflowCancelled, registers a cancellation check that evaluates the job's stop status, and adds exception handling to log cancellation notices, remove the pid directory, and return a cancellation response on workflow abort.

Sequence Diagram

sequenceDiagram
    participant Task
    participant CommandExecutor
    participant JobState
    
    Task->>CommandExecutor: set_cancellation_check(should_stop_func)
    CommandExecutor->>CommandExecutor: Store should_stop callable
    
    loop Command Execution Loop
        CommandExecutor->>CommandExecutor: _check_cancellation()
        CommandExecutor->>JobState: should_stop_func() invokes<br/>job.refresh() & check is_stopped
        alt Job is Stopped
            JobState-->>CommandExecutor: True
            CommandExecutor->>CommandExecutor: Raise WorkflowCancelled
            CommandExecutor-->>Task: WorkflowCancelled exception
            Task->>Task: Log cancellation notice<br/>Cleanup resources
        else Job is Running
            JobState-->>CommandExecutor: False
            CommandExecutor->>CommandExecutor: Execute run_command()
        end
    end

Poem

🐰 A workflow once raced without end,
But now with a stop, we can suspend!
Cancellation bells ring, and logs take the call,
The executor listens, and heeds them all.
hop hop — workflows now gracefully fall! 🛑

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 71.43%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)
  • Description Check ✅ Passed: Check skipped because CodeRabbit’s high-level summary is enabled.
  • Title Check ✅ Passed: The title clearly and specifically summarizes the main change: adding workflow cancellation support for Redis queue mode, which is the primary objective of the PR.




@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@src/workflow/CommandExecutor.py`:
- Around line 112-116: run_multiple_commands() currently swallows exceptions
raised by run_command() running in threads, so WorkflowCancelled may be ignored;
modify run_multiple_commands() to collect exceptions from each thread (e.g.,
append exception objects to a shared list or set via a thread-safe structure)
and also store per-task success flags, then after joining all threads check the
collected exceptions and if any contain a WorkflowCancelled instance re-raise
that exception (or raise the first WorkflowCancelled) before returning; ensure
run_command(), run_multiple_commands(), and any usage of _check_cancellation()
reflect that exceptions propagate out of the parallel join logic rather than
being masked by all(results).

Comment on lines +112 to +116
WorkflowCancelled: If the workflow was cancelled by the user.
Exception: If the command execution results in any errors.
"""
# Check for cancellation before starting the command
self._check_cancellation()

⚠️ Potential issue | 🟠 Major

Propagate WorkflowCancelled out of parallel execution

When run_command() raises inside run_multiple_commands() threads, the exception is swallowed and all(results) can still return True, so cancellations can be ignored. Consider capturing thread exceptions and re-raising after joins.

Suggested fix
-        results = []
+        results = []
+        exceptions = []
         lock = threading.Lock()
 
         def run_and_track(cmd):
-            success = self.run_command(cmd)
-            with lock:
-                results.append(success)
+            try:
+                success = self.run_command(cmd)
+                with lock:
+                    results.append(success)
+            except Exception as e:
+                with lock:
+                    exceptions.append(e)
@@
         for thread in threads:
             thread.join()
 
+        if exceptions:
+            for e in exceptions:
+                if isinstance(e, WorkflowCancelled):
+                    raise e
+            raise exceptions[0]
+
-        return all(results)
+        return len(results) == len(commands) and all(results)
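
The pattern behind the suggested fix can be shown as a self-contained runnable sketch: exceptions raised inside worker threads are collected under a lock and re-raised after the joins, with WorkflowCancelled given priority over other failures. This illustrates the technique only; it is not the repository's actual run_multiple_commands().

```python
# Demonstrates collecting thread exceptions and re-raising them after join(),
# so a cancellation raised inside a worker thread is not silently swallowed.
import threading


class WorkflowCancelled(Exception):
    """Stand-in for the exception defined in CommandExecutor."""


def run_all(tasks):
    results, exceptions = [], []
    lock = threading.Lock()

    def run_and_track(task):
        try:
            ok = task()
            with lock:
                results.append(ok)
        except Exception as exc:  # capture instead of letting the thread die silently
            with lock:
                exceptions.append(exc)

    threads = [threading.Thread(target=run_and_track, args=(t,)) for t in tasks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # Re-raise after all joins: a cancellation wins over other failures.
    for exc in exceptions:
        if isinstance(exc, WorkflowCancelled):
            raise exc
    if exceptions:
        raise exceptions[0]
    # Guard against partial results as well as explicit False returns.
    return len(results) == len(tasks) and all(results)
```

Without the try/except in the worker, an exception terminates only that thread, results simply ends up shorter, and all(results) on the survivors can still report success.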
