Skip to content

Conversation

@dstandish
Copy link
Contributor

The start and _on_child_started methods have inconsistent signatures for the dag processor subclass.

This PR resolves that.

Why bother? Well, I need to change the path handling a little bit to pass rel path and bundle info separately. And we should get rid of the dags folder concept. And when I tried to make these changes in the execution context, it broke dag parsing, because of this signature issue.

@boring-cyborg boring-cyborg bot added area:Scheduler including HA (high availability) scheduler area:task-sdk labels Jan 10, 2025
@dstandish dstandish requested review from ashb and kaxil January 10, 2025 06:48
constructor_kwargs=dict(
selector=self.selector,
),
target=_parse_file_entrypoint,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both subclasses accept target -- do we need to pull this out here? It feels "leaky" to have a subclass but always have to pass the same value to it.

Comment on lines +430 to +431
def _on_child_started(self, *, requests_fd: int, **kwargs):
"""Send startup message to the subprocess."""
ti: TaskInstance = kwargs["ti"]
path: str | os.PathLike[str] = kwargs["path"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to do this change? Other than mypy thinking that subclasses have to have the same signature, isn't this functionally identical?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's just for mypy, and whatever god enforces liskov substitution principle

"""The handle connected to stdin of the child process"""

client: Client
client: Client | None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What broke with leaving this as it was?

I wonder if instead of doing this change it might be better to have three classes -- the "base" WatchedSubproces, and then create a new class like:

class ActivitySubprocess(WatchedSubprocess):
    id: UUID
    client: Client

    def _on_child_started(self, ti: TaskInstance, path: str | os.PathLike[str], requests_fd: int):
        ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no client is provided by dag processor it was explicitly given none; so clearly the base class is ok with that. it's only for task execution scenario that it is used and depended on. that's why the type checking asserts

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah but what broke by leaving it as it was?

Copy link
Member

@ashb ashb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, this was a bit messy I agree.

I think by having this as two subclasses (for a total of three classes) also means we can keep the public signature as it is taking ActivitySubprocess.start(dag_path, ti, client=client, logger=logger), and that could then pass the right things on to child_started_kwargs in the super call etc?

@dstandish
Copy link
Contributor Author

I think by having this as two subclasses (for a total of three classes) also means we can keep the public signature as it is taking ActivitySubprocess.start(dag_path, ti, client=client, logger=logger), and that could then pass the right things on to child_started_kwargs in the super call etc?

The dag processor does not pass a TI -- it only passes (optionally) a callback list.
I'll look into whether this can be further improved by having a base class but, i think if nothing else, this PR does not make the situation worse and just makes explicit and clear all the shenanigans that were sorta hidden before

dstandish and others added 2 commits January 10, 2025 09:34
…tures

The `start` and `_on_child_started` methods have inconsistent signatures for the dag processor subclass.

This PR resolves that.

Why bother?  Well, I need to change the path handling a little bit to pass rel path and bundle info separately.  And we should get rid of the dags folder concept.  And when I tried to make these changes in the execution context, it broke dag parsing, because of this signature issue.
Co-authored-by: Ash Berlin-Taylor <ash_github@firemirror.com>
@ashb
Copy link
Member

ashb commented Jan 10, 2025

By having a separate subclass for execution we get rid of the need to have if TYPE_CEHCKING: assert self.client which feels nicer

@dstandish dstandish force-pushed the unify-signatures-for-watched-subprocess-inheritors branch from b05006d to e7ea4bc Compare January 10, 2025 17:42
@dstandish dstandish requested review from ashb and kaxil January 10, 2025 18:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:Scheduler including HA (high availability) scheduler area:task-sdk

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants