Add a lock to distributed.profile for better concurrency control
#6421
Changes from all commits
a03792e
1044c95
2ed371c
a660bbc
c0313cc
1974e26
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -44,6 +44,9 @@ | |
| from distributed.metrics import time | ||
| from distributed.utils import color_of | ||
| #: This lock can be acquired to ensure that no instance of watch() is concurrently holding references to frames | ||
| lock = threading.Lock() | ||
| def identifier(frame: FrameType | None) -> str: | ||
| """A string identifier from a frame | ||
| @@ -314,18 +317,6 @@ def traverse(state, start, stop, height): | |
| } | ||
| _watch_running: set[int] = set() | ||
| def wait_profiler() -> None: | ||
| """Wait until a moment when no instances of watch() are sampling the frames. | ||
| You must call this function whenever you would otherwise expect an object to be | ||
| immediately released after it's descoped. | ||
| """ | ||
| while _watch_running: | ||
| sleep(0.0001) | ||
| def _watch( | ||
| thread_id: int, | ||
| log: deque[tuple[float, dict[str, Any]]], # [(timestamp, output of create()), ...] | ||
| @@ -337,24 +328,20 @@ def _watch( | |
| recent = create() | ||
| last = time() | ||
| watch_id = threading.get_ident() | ||
| while not stop(): | ||
| _watch_running.add(watch_id) | ||
| try: | ||
| if time() > last + cycle: | ||
| if time() > last + cycle: | ||
| recent = create() | ||
| with lock: | ||
Member
Should we move this down to just above the
Member
I believe that, for what we're looking for, we need to hold the lock for the entire time the frame exists, i.e. if external code has the lock, there is no frame. In the context of GC, the frame is the offender that holds references to some objects we want to ensure are collected.
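To illustrate why the frame is the offender, here is a minimal sketch (not code from this PR) showing that, on CPython, a held frame object keeps the locals of its function alive. `Payload`, `work`, and `demo` are hypothetical names, and `sys._getframe()` stands in for the `sys._current_frames()[thread_id]` lookup that `_watch` performs.

```python
import sys
import weakref


class Payload:
    """Hypothetical stand-in for an object we expect to be released on descope."""


def work():
    obj = Payload()
    ref = weakref.ref(obj)
    # Return the frame directly instead of binding it to a local,
    # so the frame does not end up referencing itself.
    return ref, sys._getframe()


def demo():
    ref, frame = work()
    # work() has returned, but its frame still references `obj`.
    alive_while_frame_held = ref() is not None
    del frame
    # Once the frame is dropped, CPython's reference counting can free `obj`.
    released_after_del = ref() is None
    return alive_while_frame_held, released_after_del


held, released = demo()
```

This is the situation the lock is meant to rule out: if a profiler thread holds such a frame at the moment external code checks for release, the object still appears to be alive.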
Member
Author
I think it's a trade-off between more accurate timestamps and less locking. My thought was that in case we have some thread hogging the lock for a bit (e.g., a particularly expensive run of garbage collection), this would ensure that timestamps are only created once we have actually acquired the lock. However, in practice, this should be an edge case. At the same time, the reduction in lock time that we gain from moving the lock should be marginal, so I think I'd prefer more accurate timestamps. What's your take on this?
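A minimal sketch of the timestamp trade-off, under stated assumptions: `lock` is a local stand-in for the module-level lock in the diff, `hog` and `compare_timestamps` are hypothetical names, and `time.monotonic` replaces `distributed.metrics.time`. One thread hogs the lock for a while; a timestamp taken before acquiring the lock predates the moment the protected work actually runs, whereas one taken inside the `with` block does not.

```python
import threading
from time import monotonic, sleep

lock = threading.Lock()  # stand-in for the lock added in this PR


def hog(acquired, hold=0.05):
    """Hold the lock for a while, e.g. like an expensive GC run would."""
    with lock:
        acquired.set()
        sleep(hold)


def compare_timestamps():
    acquired = threading.Event()
    hogger = threading.Thread(target=hog, args=(acquired,))
    hogger.start()
    acquired.wait()  # make sure the other thread really holds the lock

    before = monotonic()  # taken while we are still waiting for the lock
    with lock:
        inside = monotonic()  # taken once the lock has been acquired
    hogger.join()
    return before, inside


before, inside = compare_timestamps()
delay = inside - before  # roughly the time spent waiting for the lock
```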
Member
The issue I found in #6033 (comment) was not that the profiler was keeping hold of the
Member
Author
See #6364 (comment) for the motivation to do this. We essentially want a way to prevent a profiler from holding references to frames while
| log.append((time(), recent)) | ||
| recent = create() | ||
| last = time() | ||
| try: | ||
| frame = sys._current_frames()[thread_id] | ||
| except KeyError: | ||
| return | ||
| process(frame, None, recent, omit=omit) | ||
| del frame | ||
| finally: | ||
| _watch_running.remove(watch_id) | ||
| try: | ||
| frame = sys._current_frames()[thread_id] | ||
| except KeyError: | ||
| return | ||
| process(frame, None, recent, omit=omit) | ||
| del frame | ||
| sleep(interval) | ||
Removed in favor of the lock to avoid inconsistencies.
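For readers wondering what replaces `wait_profiler()` at call sites, here is a minimal sketch of the pattern. It assumes the sampler holds the lock for the whole time it has a frame, as discussed in the review. `lock` is a local stand-in for the module-level lock in the diff, and `sample` and `run_demo` are hypothetical names, not functions in `distributed.profile`.

```python
import sys
import threading
from time import sleep

lock = threading.Lock()  # stand-in for the lock added in this PR


def sample(thread_id, samples, stop, interval=0.001):
    """Simplified sampler: only touches frames while holding the lock."""
    while not stop.is_set():
        with lock:
            frame = sys._current_frames().get(thread_id)
            if frame is None:
                return
            samples.append(frame.f_code.co_name)
            del frame  # drop the reference before releasing the lock
        sleep(interval)


def run_demo():
    stop = threading.Event()
    samples = []
    sampler = threading.Thread(
        target=sample,
        args=(threading.get_ident(), samples, stop),
        daemon=True,
    )
    sampler.start()
    sleep(0.02)

    # Previously: wait_profiler(). Now: take the lock instead.
    with lock:
        # While we hold the lock, the sampler cannot be holding a frame,
        # so checks for object release can run here.
        held = lock.locked()

    stop.set()
    sampler.join()
    return samples, held


samples, held = run_demo()
```

Compared with polling `_watch_running` in a sleep loop, blocking on the lock gives the caller a well-defined window in which no sampler is holding a frame.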