Log state machine events #6092
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -3,6 +3,7 @@ | |
| import heapq | ||
| import sys | ||
| from collections.abc import Callable, Container, Iterator | ||
| from copy import copy | ||
| from dataclasses import dataclass, field | ||
| from functools import lru_cache | ||
| from typing import Collection # TODO move to collections.abc (requires Python >=3.9) | ||
|
|
@@ -353,8 +354,61 @@ class AddKeysMsg(SendMessageToScheduler): | |
|
|
||
| @dataclass | ||
| class StateMachineEvent: | ||
| __slots__ = ("stimulus_id",) | ||
| __slots__ = ("stimulus_id", "handled") | ||
| stimulus_id: str | ||
| #: timestamp of when the event was handled by the worker | ||
| # TODO Switch to @dataclass(slots=True), uncomment the line below, and remove the | ||
| # __new__ method (requires Python >=3.10) | ||
| # handled: float | None = field(init=False, default=None) | ||
| _classes: ClassVar[dict[str, type[StateMachineEvent]]] = {} | ||
|
|
||
| def __new__(cls, *args, **kwargs): | ||
| self = object.__new__(cls) | ||
| self.handled = None | ||
| return self | ||
|
|
||
| def __init_subclass__(cls): | ||
| StateMachineEvent._classes[cls.__name__] = cls | ||
|
|
||
| def to_loggable(self, *, handled: float) -> StateMachineEvent: | ||
| """Produce a variant version of self that is small enough to be stored in memory | ||
| in the medium term and contains meaningful information for debugging | ||
| """ | ||
| self.handled = handled | ||
| return self | ||
|
|
||
| def _to_dict(self, *, exclude: Container[str] = ()) -> dict: | ||
|
Member
This dictionary conversion seems necessary because |
||
| """Dictionary representation for debugging purposes. | ||
|
|
||
| See also | ||
| -------- | ||
| distributed.utils.recursive_to_dict | ||
| """ | ||
| info = { | ||
| "cls": type(self).__name__, | ||
| "stimulus_id": self.stimulus_id, | ||
| "handled": self.handled, | ||
| } | ||
| info.update({k: getattr(self, k) for k in self.__annotations__}) | ||
| info = {k: v for k, v in info.items() if k not in exclude} | ||
| return recursive_to_dict(info, exclude=exclude) | ||
|
|
||
| @staticmethod | ||
| def from_dict(d: dict) -> StateMachineEvent: | ||
| """Convert the output of ``recursive_to_dict`` back into the original object. | ||
| The output object is meaningful for the purpose of rebuilding the state machine, | ||
| but not necessarily identical to the original. | ||
| """ | ||
| kwargs = d.copy() | ||
| cls = StateMachineEvent._classes[kwargs.pop("cls")] | ||
| handled = kwargs.pop("handled") | ||
| inst = cls(**kwargs) | ||
| inst.handled = handled | ||
| inst._after_from_dict() | ||
| return inst | ||
|
|
||
| def _after_from_dict(self) -> None: | ||
| """Optional post-processing after an instance is created by ``from_dict``""" | ||
|
|
||
|
|
||
| @dataclass | ||
|
|
@@ -372,6 +426,16 @@ class ExecuteSuccessEvent(StateMachineEvent): | |
| type: type | None | ||
| __slots__ = tuple(__annotations__) # type: ignore | ||
|
|
||
| def to_loggable(self, *, handled: float) -> StateMachineEvent: | ||
| out = copy(self) | ||
| out.handled = handled | ||
| out.value = None | ||
|
Member
I understand the execution result is discarded because of the potentially large size of the result, and possibly the complexity of serialising/deserialising the result?
Collaborator
Author
Not discarding it would cause worker.stimulus_log to become effectively a copy of worker.data, except that it never loses any data!
Member
Indeed! |
||
| return out | ||
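The point made in the thread above (keeping `value` would turn the log into an ever-growing copy of the worker's data) can be illustrated with a minimal sketch. `ToyEvent` and `ToyExecuteSuccess` are hypothetical stand-ins, not the classes in this PR; they skip `__slots__` and only mirror the `to_loggable` logic shown in the diff.

```python
from copy import copy
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToyEvent:
    # Hypothetical stand-in for StateMachineEvent (no __slots__ here)
    stimulus_id: str
    handled: Optional[float] = None

    def to_loggable(self, *, handled: float) -> "ToyEvent":
        # Small events can be logged as-is: stamp the time and return self
        self.handled = handled
        return self


@dataclass
class ToyExecuteSuccess(ToyEvent):
    value: Any = None

    def to_loggable(self, *, handled: float) -> "ToyEvent":
        # Shallow-copy so the live event keeps its payload,
        # then drop the payload from the copy that goes into the log
        out = copy(self)
        out.handled = handled
        out.value = None
        return out


big = bytearray(10_000_000)  # stand-in for a large task result
ev = ToyExecuteSuccess(stimulus_id="s1", value=big)
logged = ev.to_loggable(handled=123.0)
```

In this sketch `logged` holds no reference to `big`, so appending it to a log does not keep the result alive, while `ev` itself is left untouched.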
|
|
||
| def _after_from_dict(self) -> None: | ||
| self.value = None | ||
| self.type = None | ||
|
Member
I guess the execution result type is discarded here because it's merely a string representation at this point and one would have to deal with serialising/unserialising types. In any case, I think reconstructing the result of execution is non-trivial. Out of interest, how does this impact replayability of events on the Worker?
Collaborator
Author
These fields that are discarded on a serialization round-trip should be inconsequential for the purpose of rebuilding the state. |
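To make the lossy round-trip concrete, here is a rough sketch of the registry-plus-hook pattern from the diff. All names (`ToyEvent`, `ToyExecuteSuccess`, `_CLASSES`, `to_dict`) are hypothetical; the real code goes through `recursive_to_dict` and keeps the registry on the class, which this toy version does not reproduce.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional

# Hypothetical module-level registry, playing the role of StateMachineEvent._classes
_CLASSES: Dict[str, type] = {}


@dataclass
class ToyEvent:
    stimulus_id: str
    handled: Optional[float] = None

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Register every subclass by name so from_dict can find it again
        _CLASSES[cls.__name__] = cls

    def to_dict(self) -> dict:
        # Simplified stand-in for _to_dict: class name plus all dataclass fields
        info = {"cls": type(self).__name__}
        info.update({f.name: getattr(self, f.name) for f in fields(self)})
        return info

    @staticmethod
    def from_dict(d: dict) -> "ToyEvent":
        kwargs = d.copy()
        cls = _CLASSES[kwargs.pop("cls")]
        handled = kwargs.pop("handled")
        inst = cls(**kwargs)
        inst.handled = handled
        inst._after_from_dict()
        return inst

    def _after_from_dict(self) -> None:
        """Optional hook; subclasses reset fields that cannot be restored."""


@dataclass
class ToyExecuteSuccess(ToyEvent):
    key: str = ""
    value: Any = None

    def _after_from_dict(self) -> None:
        # The payload is not meaningful after a round-trip, so drop it
        self.value = None


ev = ToyExecuteSuccess(stimulus_id="s1", key="x", value=[1, 2, 3])
ev.handled = 42.0
restored = ToyEvent.from_dict(ev.to_dict())
```

Here `restored` has the same class, `stimulus_id`, `key`, and `handled` as `ev`, but its `value` is `None`, which matches the "meaningful but not necessarily identical" wording of the `from_dict` docstring.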
||
|
|
||
|
|
||
| @dataclass | ||
| class ExecuteFailureEvent(StateMachineEvent): | ||
|
|
@@ -384,6 +448,10 @@ class ExecuteFailureEvent(StateMachineEvent): | |
| traceback_text: str | ||
| __slots__ = tuple(__annotations__) # type: ignore | ||
|
|
||
| def _after_from_dict(self) -> None: | ||
| self.exception = Serialize(Exception()) | ||
| self.traceback = None | ||
|
|
||
|
|
||
| @dataclass | ||
| class CancelComputeEvent(StateMachineEvent): | ||
|
|
||
I guess this is the reason for the `__new__` method containing the `self.handled = None` assignment
Yes, clarified in a comment.
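For readers following this thread, a small sketch of why a `__new__` that pre-sets `handled` is useful when the attribute lives in `__slots__` but is not a dataclass field. The class names below are hypothetical and only imitate the shape of `StateMachineEvent`.

```python
from dataclasses import dataclass


@dataclass
class WithoutNew:
    # "handled" has a slot but no annotation, so the generated __init__ never sets it
    __slots__ = ("stimulus_id", "handled")
    stimulus_id: str


@dataclass
class WithNew:
    __slots__ = ("stimulus_id", "handled")
    stimulus_id: str

    def __new__(cls, *args, **kwargs):
        # Give the slot a value before the generated __init__ runs
        self = object.__new__(cls)
        self.handled = None
        return self


@dataclass
class ToyFreeKeys(WithNew):
    # Subclasses inherit __new__, so they get handled=None as well
    __slots__ = ("keys",)
    keys: tuple


try:
    WithoutNew(stimulus_id="s1").handled
    unset_slot_raises = False
except AttributeError:
    # Reading a slot that was never assigned raises AttributeError
    unset_slot_raises = True

ev = ToyFreeKeys(stimulus_id="s2", keys=("x",))
initial_handled = ev.handled
ev.handled = 1.5
```

Because `handled` is not a dataclass field in this sketch, it stays out of the generated `__init__`, `__repr__`, and `__eq__`, so subclasses can keep required positional fields without running into default-ordering problems.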