Prioritize task execution by recipe order #4
|
That would be great! |
|
Is this still a draft or can I already try it? This feature would help me a lot to get more robust numbers for the CPU time, which I want to report in the paper. |
|
You can already try it. I think it works, but I still need to change it a bit so that the priority of the input variables used for derivation is set correctly, and I need to add unit tests. |
|
I finally found some time to test this. Unfortunately the NCL scripts are crashing due to missing That's strange, since the |
|
Anything I can do to help with this? |
|
To fully make use of this feature, it would be useful to write the time required by each task in the debug log (this is done only for the provenance at the moment). In this way, the user can execute the recipe once and then reorder the tasks according to their runtimes. Is that possible? |
|
I tried to reorder the diagnostics in The tasks are now started in the order listed in the recipe, but this helps only for the preprocessor part of the diagnostics. See the list below: the two tasks to derive the variable So this PR helps a lot and should be merged, but it's probably not addressing #3 completely. Without task priority: With task priority: |
|
I added the run time and task priority to the info message; this should help with the ordering.
This is because the current implementation should start as many tasks as configured in config-user.yml, starting with the tasks with the lowest priority number (the ones listed first in the recipe). However, it can only start a task if all of its ancestors are ready. Could you post the results above, but including the task completed messages? Then we can see whether there is a problem or the tasks are simply waiting for ancestor tasks. |
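For readers following along, the scheduling rule described above can be sketched as a small simulation. This is not the implementation in this pull request: the `Task` class, the `start_order` function, the task names, and the durations are all hypothetical, and it assumes that a lower priority number means "listed earlier in the recipe".

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical stand-in for a task; not the real task class."""
    name: str
    priority: int      # assumption: lower number = listed earlier in the recipe
    duration: float    # assumed run time in seconds
    ancestors: list = field(default_factory=list)  # names that must finish first


def start_order(tasks, max_parallel_tasks):
    """Simulate the order in which tasks would be started."""
    waiting = sorted(tasks, key=lambda t: t.priority)
    running = []  # heap of (finish_time, task_name)
    done = set()
    order = []
    now = 0.0
    while waiting or running:
        # Fill free slots with ready tasks, lowest priority number first.
        for task in list(waiting):
            if len(running) >= max_parallel_tasks:
                break
            if all(name in done for name in task.ancestors):
                waiting.remove(task)
                heapq.heappush(running, (now + task.duration, task.name))
                order.append(task.name)
        if not running:
            raise ValueError("remaining tasks have ancestors that never finish")
        # Advance the clock to the next task completion.
        now, finished = heapq.heappop(running)
        done.add(finished)
    return order


# Made-up example: two parallel slots, one slow preprocessor task.
tasks = [
    Task("pre_a", 0, 10.0),
    Task("pre_b", 1, 2.0),
    Task("diag_a", 2, 1.0, ["pre_a"]),
    Task("pre_c", 3, 3.0),
    Task("diag_c", 4, 1.0, ["pre_c"]),
]
print(start_order(tasks, max_parallel_tasks=2))
```

With these made-up numbers, `diag_a` is started last even though its priority number is lower than that of `pre_c` and `diag_c`, because it has to wait for the slow `pre_a`. Changing the duration of `pre_a` to 1.0 moves `diag_a` up to third place, so in this sketch the start order after the first batch depends on how long the ancestors take.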
I think they are simply waiting for ancestors: This is a nice example, since |
In that case, I don't think there is any way in which changing the order could help any further with reducing runtime. Or do you have ideas? |
No, but I wonder why we don't get the 68 tasks of this recipe always executed in the same order. This is the starting order of the tasks in the 5 tests I've run with task1.txt The first 16 in the list are basically always the same (since I run with 16 parallel tasks). But afterwards the order changes quite randomly. For example, the bottleneck task Maybe I should try to look at the other tasks as well and try to optimize the order further. Should we mention this new feature somewhere in the documentation? |
That is to be expected: they are the highest-priority tasks without ancestors, and that set is always the same.
This is probably due to variation in the run time of the ancestor tasks. This variation can be caused by waiting for a shared resource. That could be many things: other preprocessing tasks, if you're running too many tasks on the same node, or access to storage that is affected by other users' demands...
Yes, that might help, but it really depends on what is causing the variation.
Good point, I'll add documentation to this pull request. |
|
Thanks for clarifying! |
|
@mattiarighi I ended up adding a lot more documentation than planned, because it looked like the concept of tasks was not explained anywhere, so it was difficult to talk about their priority. |
Well done, thanks! |
Changed the text accordingly |
Closes #3