Skip to content

Conversation

@m4dcoder
Copy link
Collaborator

@m4dcoder m4dcoder commented Mar 25, 2020

Refactor evaluation of inbound criteria for join task on inbound task completion and not on inbound task transition. This allows for different use cases such as multiple task transition from the same inbound task, inbound task transition on completion or inbound task transition on failure. Include additional check in the state machine to detect case where there is an unreachable join task where inbound criteria is partially satisfied but all inbound tasks have already completed and it will never be satisfied.

Fixes #123
Fixes #190

Add commands to Makefile to run tox checks to simplify running tests and checks.
Run `make schemas` command and update the JSON schemas on the specs for workflow definition.
Given one or more tasks that joins on complete but then some of the tasks failed, the workflow state machine determines the workflow failed because the inbound criteria for the join task is not satisfied and there are no more tasks to run. An option is added to skip evaluation of the inbound criteria for the join task during evaluation in the workflow state machine. The workflow state machine will identify that there is a potential next task to run and keep the workflow state as running.
Refactor evaluation of inbound criteria for join task on inbound task completion and not on inbound task transition. This allows for different use cases such as multiple task transition from the same inbound task, inbound task transition on completion or inbound task transition on failure. Include additional check in the state machine to detect case where there is an unreachable join task where inbound criteria is partially satisfied but all inbound tasks have already completed and it will never be satisified.
Add more advance test cases of join all and join count. Identified a bug related to inbound branches (failure at upstream tasks) not reaching the join task.
Update the docs section on defining join task to include the alternate use case of `join: <integer>`.
@codecov-io
Copy link

codecov-io commented Mar 26, 2020

Codecov Report

Merging #194 into master will decrease coverage by 0.00%.
The diff coverage is 94.11%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #194      +/-   ##
==========================================
- Coverage   94.02%   94.02%   -0.01%     
==========================================
  Files          41       41              
  Lines        2696     2728      +32     
  Branches      530      540      +10     
==========================================
+ Hits         2535     2565      +30     
- Misses         99      100       +1     
- Partials       62       63       +1     
Impacted Files Coverage Δ
orquesta/utils/jsonify.py 92.59% <0.00%> (ø)
orquesta/conducting.py 93.24% <94.11%> (-0.16%) ⬇️
orquesta/constants.py 100.00% <100.00%> (ø)
orquesta/exceptions.py 97.61% <100.00%> (+0.11%) ⬆️
orquesta/graphing.py 91.74% <100.00%> (+0.15%) ⬆️
orquesta/machines.py 86.00% <100.00%> (+0.50%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5d10f6d...b4a874a. Read the comment docs.

@m4dcoder m4dcoder requested a review from blag March 26, 2020 04:55
Copy link
Contributor

@blag blag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid PR, just some small PEP 8 changes to make, and some smaller, optional cosmetic changes that would be nice (IMHO).

| |
+-- parallel_task_3 --+
|
+-- barrier_task (only requires 2 of 3 tasks) --+
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would draw this out like this:

.. code-block:: none

    =---- time (not to scale) ---->

    setup_task --+
                 |
                 +-- parallel_task_1 -------------------+
                 |
                 +-- parallel_task_2 --+
                 |                     |
                 +-- parallel_task_3 --+
                                       |
                                       +-- barrier_task (only requires 2 of 3 tasks) --+
                                                                                       |
                                                                                       +-- [finish]

That way it doesn't look like the join action is waiting on parallel_task_1.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree. This is not accurate description of the workflow. This shows that the parallel_task_1 is on its own and does not join with the other parallel tasks at the barrier_task. But in fact, the join is satisfied as long as 2 out of 3 parallel tasks complete regardless of which one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about something like this?

.. code-block:: none

    =---- time (not to scale) ---->

    setup_task --+
                 |
                 +-- parallel_task_1 -------------------+
                 |                                      . (joins to barrier_task)
                 +-- parallel_task_2 --+                .
                 |                     |                .
                 +-- parallel_task_3 --+                .
                                       |                .
                                       +-- barrier_task (only requires 2 of 3 tasks) --+
                                                                                       |
                                                                                       +-- [finish]

I think we should make it visually apparent that parallel_task_1 is started with the other parallel tasks, but since it isn't finished before the other two parallel tasks finish, it doesn't join to the barrier_task with the other parallel tasks before that gets scheduled.

I use the dots to indicate that it joins barrier_task but isn't required. How does that look to you?

Copy link
Contributor

@blag blag Mar 31, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We talked on Slack and compromised on using * instead of | to indicate the transition to barrier_task to differentiate this join: 2 case from the join: all case.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I changed to all asterisk for all lines from parallel tasks to the barrier task. Let's hope balance between accuracy and descriptive picture is good enough.

Copy link
Contributor

@blag blag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@m4dcoder m4dcoder merged commit 359a33f into master Apr 1, 2020
@m4dcoder m4dcoder deleted the fix-join-on-task-complete branch April 1, 2020 00:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Workflow join is not triggered on complete for failed task(s) Workflow succeeded before the join task is executed

4 participants