Is your feature request related to a problem or challenge? Please describe what you are trying to do.
The implementation of get_next_schedulable_task() is inefficient because tasks are not classified by stage or by status, which becomes especially costly when there are thousands of tasks to schedule. The current design also makes it hard to support:
stage-level priority-based scheduling
stage-level retry
speculative task scheduling
Describe the solution you'd like
Following Spark's design, task scheduling should be split into two levels, stage scheduling and task scheduling:
Introduce QueryStageScheduler to schedule stages.
Introduce StageManager to manage job stages and the state machine for each stage.
Have TaskScheduler fetch tasks only from the running stages in StageManager.
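To make the proposal more concrete, here is a minimal Rust sketch of how a StageManager might keep a per-stage state machine and hand out tasks only from running stages. Everything in it is an assumption for illustration: the `StageState` variants, the allowed transitions, the `Stage` struct, and the method names `add_stage`, `transition`, and `next_task` are hypothetical and are not taken from the existing code. QueryStageScheduler is not modeled; it would be the component that calls `transition`.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// Hypothetical stage states; the issue does not define the actual state machine.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StageState {
    Pending,
    Running,
    Completed,
    Failed,
}

impl StageState {
    /// Transitions allowed in this sketch. Failed -> Pending models a stage-level retry.
    fn can_transition_to(self, next: StageState) -> bool {
        use StageState::*;
        matches!(
            (self, next),
            (Pending, Running) | (Running, Completed) | (Running, Failed) | (Failed, Pending)
        )
    }
}

/// Hypothetical per-stage bookkeeping: current state plus partitions not yet handed out.
struct Stage {
    state: StageState,
    pending_tasks: VecDeque<usize>,
}

/// Sketch of a StageManager. Running stages are indexed separately, so fetching a
/// task never scans pending, completed, or failed stages.
#[derive(Default)]
struct StageManager {
    stages: HashMap<usize, Stage>,
    running: BTreeSet<usize>, // ordered, so the lowest stage id is served first
}

impl StageManager {
    /// Register a stage in the Pending state with one task per partition.
    fn add_stage(&mut self, stage_id: usize, partitions: usize) {
        let stage = Stage {
            state: StageState::Pending,
            pending_tasks: (0..partitions).collect(),
        };
        self.stages.insert(stage_id, stage);
    }

    /// Move a stage to a new state, rejecting transitions the state machine forbids.
    fn transition(&mut self, stage_id: usize, next: StageState) -> Result<(), String> {
        let stage = self
            .stages
            .get_mut(&stage_id)
            .ok_or_else(|| format!("unknown stage {}", stage_id))?;
        if !stage.state.can_transition_to(next) {
            return Err(format!("invalid transition {:?} -> {:?}", stage.state, next));
        }
        stage.state = next;
        if next == StageState::Running {
            self.running.insert(stage_id);
        } else {
            self.running.remove(&stage_id);
        }
        Ok(())
    }

    /// What a TaskScheduler would call: returns (stage_id, partition) from a running stage.
    fn next_task(&mut self) -> Option<(usize, usize)> {
        for &stage_id in &self.running {
            if let Some(stage) = self.stages.get_mut(&stage_id) {
                if let Some(partition) = stage.pending_tasks.pop_front() {
                    return Some((stage_id, partition));
                }
            }
        }
        None
    }
}
```

With this layout, the cost of `next_task` depends on the number of running stages rather than on the total number of tasks. Replacing the ordered set with a priority queue keyed by a stage priority would be one possible way to get stage-level priority-based scheduling.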