Only collect async ID when it's safe to do so#197
Conversation
Overall package sizeSelf size: 9.57 MB Dependency sizes| name | version | self size | total size | |------|---------|-----------|------------| | source-map | 0.7.4 | 226 kB | 226 kB | | pprof-format | 2.1.0 | 111.69 kB | 111.69 kB | | p-limit | 3.1.0 | 7.75 kB | 13.78 kB | | delay | 5.0.0 | 11.17 kB | 11.17 kB | | node-gyp-build | 3.9.0 | 8.81 kB | 8.81 kB |🤖 This report was automatically generated by heaviest-objects-in-the-universe |
4d35050 to
62e041f
Compare
BenchmarksBenchmark execution time: 2025-03-06 17:10:57 Comparing candidate commit 79992a0 in PR branch Found 3 performance improvements and 0 performance regressions! Performance is the same for 90 metrics, 27 unstable metrics. scenario:profiler-idle-with-wall-profiler-20
scenario:profiler-light-load-no-wall-profiler-16
scenario:profiler-light-load-no-wall-profiler-18
|
62e041f to
a2384e5
Compare
| return; | ||
| } | ||
| auto isolate = Isolate::GetCurrent(); | ||
| if (!isolate || isolate->IsDead()) { |
There was a problem hiding this comment.
In which cases does GetCurrent returns null ?
There was a problem hiding this comment.
I honestly don't know – I'm just trying to be pedantically defensive and not unconditionally dereference a pointer. I can envision a race condition between signal delivery and an isolate going away? I know the logic is different, but even the V8 sampler manager checks for isolate on a sampler not being null.
There was a problem hiding this comment.
I could be more lenient and still invoke old_handler?
- don't sample a dead isolate - only try to retrieve an async ID when there's a context
a2384e5 to
bb0bc2d
Compare
Use relaxed load/store with appropriate signal fences for cheaper reading/writing of gcCount.
5be2a03 to
a694680
Compare
| withContexts?: boolean; | ||
| workaroundV8Bug?: boolean; | ||
| collectCpuTime?: boolean; | ||
| collectAsyncId?: boolean; |
There was a problem hiding this comment.
Well, that's new. I'm inclined to ignore it, seeing how existing boolean properties also don't follow this format.
3c86526 to
79992a0
Compare
| } | ||
| } | ||
|
|
||
| #define DD_WALL_PROFILER_GET_BOOLEAN_CONFIG(name) \ |
There was a problem hiding this comment.
No a big fan of macros, but here this nicely simplifies the code.
|
Overall this looks good ! |
I'd say I'm reasonably certain. In any case, I don't have an idea for other kinds of safeguards right now. All crashtracker reports (about 165) we have for the past month from people stuck on the unsafe version of the profiler were in GC, as well as 5 reports where the crashtracker failed to unwind so we can't tell. |
* Guard against collecting async ID while performing GC * Additional sanity checks: - don't sample a dead isolate - only try to retrieve an async ID when there's a context * Move GetAsyncId and GC callbacks out of the header file. Use relaxed load/store with appropriate signal fences for cheaper reading/writing of gcCount. * Have a config option for collecting async IDs
* Guard against collecting async ID while performing GC * Additional sanity checks: - don't sample a dead isolate - only try to retrieve an async ID when there's a context * Move GetAsyncId and GC callbacks out of the header file. Use relaxed load/store with appropriate signal fences for cheaper reading/writing of gcCount. * Have a config option for collecting async IDs
What does this PR do?:
Reenables async ID collection with guards. Prevents collecting async ID when either GC is ongoing (captures the ID at the GC prologue) or when the isolate has no current context.
Motivation:
We want to safely collect async ID.
Additional Notes:
Jira: PROF-11265