We refactor the native periodic thread implementation to be fork-safe:
all such threads that are running at the time of a fork are
automatically stopped before the fork, then restarted after it. We also
take care to avoid stopping and restarting threads in the parent process
if we detect an immediate call to fork again.
Some of the implications of this change are that there is no longer the
need for fork-safe synchronisation objects, such as Locks and Events.
This is because all (ddtrace) threads are guaranteed to be stopped
before a fork, and restarted afterwards. There is also no longer the
need to manually recreate threads, as currently done in many places. The
only thing that needs to be taken care of is to ensure that the state of
the periodic services is as expected after a fork. For example, many
products have the need to avoid sending duplicate values from different
processes. This can be achieved either by subclassing
`ForksafeAwakeablePeriodicService` and implementing the `reset` method,
so that the periodic service automatically triggers the reset logic on
forks (the preferred way), or by the current approach of registering a
fork-safe hook.
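The subclass-and-`reset` pattern could look roughly like the sketch
below. To stay self-contained it does not import `ddtrace`:
`ForksafeServiceSketch` is a hypothetical stand-in for
`ForksafeAwakeablePeriodicService`, and `MetricsFlusher`, `record` and
`buffer` are made-up names. It also assumes that the reset runs in the
child process only, which the description above does not spell out.

```python
import os
import weakref


class ForksafeServiceSketch:
    """Hypothetical stand-in for ForksafeAwakeablePeriodicService."""

    # Weak references so that tracking instances does not keep them alive.
    _instances = weakref.WeakSet()

    def __init__(self):
        ForksafeServiceSketch._instances.add(self)

    def reset(self):
        """Override to bring the service back to a clean state after a fork."""

    @classmethod
    def _reset_all(cls):
        for service in list(cls._instances):
            service.reset()


if hasattr(os, "register_at_fork"):  # not available on Windows
    # Assumption: the reset logic is only needed in the child process.
    os.register_at_fork(after_in_child=ForksafeServiceSketch._reset_all)


class MetricsFlusher(ForksafeServiceSketch):
    """Made-up service that buffers values before sending them."""

    def __init__(self):
        super().__init__()
        self.buffer = []

    def record(self, value):
        self.buffer.append(value)

    def reset(self):
        # Values buffered before the fork belong to the parent; dropping
        # them here avoids sending duplicates from the child.
        self.buffer.clear()
```

With this shape a service only has to describe what a clean post-fork
state looks like, and no per-service fork hook has to be registered.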
## Performance Analysis
Stopping and restarting threads can be expensive but gives good
fork-safety guarantees. Because threads have to be joined before the
fork can continue, a forked process might get delayed by a thread
currently busy on I/O. Because our periodic threads run on periods that
are O(1s), we can expect such unfortunate events to be quite rare.
Besides, it is quite likely that a process forks at the very beginning
of its execution, where it is unlikely that the periodic threads have
had a chance to trigger their first periodic action.
The timer set around closely spaced `fork` invocations should reduce
delays even more by preventing the parent process from stopping and
restarting threads in between `fork` calls. To give some numbers, the
invocation of 1000 forks in a tight loop with the tracer and the
profiler enabled requires ~1.2s to complete on an M1. In comparison, the
same loop without `ddtrace` takes 0.3s. The same loop with stop/restart
in between forks takes ~1.5s.
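A tight fork loop of this kind can be timed with a few lines of Python.
The exact script behind the numbers above is not part of this
description, so the following is only a plausible sketch; the function
name `time_forks` is made up, and whether the original loop waited for
each child is an assumption.

```python
import os
import time


def time_forks(n=1000):
    """Fork n short-lived children in a tight loop and return the wall time."""
    start = time.perf_counter()
    for _ in range(n):
        pid = os.fork()
        if pid == 0:
            # Child: exit immediately, skipping interpreter cleanup.
            os._exit(0)
        # Parent: reap the child so that no zombie processes accumulate.
        os.waitpid(pid, 0)
    return time.perf_counter() - start
```

Calling `time_forks()` once in a plain interpreter and once with the
tracer and profiler loaded (for instance under `ddtrace-run`) would give
a comparison of the same kind as the figures quoted above.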
The wall-time view through a profiler gives the following picture:
<img width="1490" height="350" alt="Screenshot 2025-07-29 at 11 27 49"
src="https://github.com/user-attachments/assets/8e336a9e-96b0-4c96-be62-f76cac8cd8e0"
/>
A good portion of the overhead comes from the tracing of `fork` itself,
which could be improved by moving it to the native layer in follow-up
work.
## Checklist
- [ ] PR author has checked that all the criteria below are met
- The PR description includes an overview of the change
- The PR description articulates the motivation for the change
- The change includes tests OR the PR description describes a testing
strategy
- The PR description notes risks associated with the change, if any
- Newly-added code is easy to change
- The change follows the [library release note
guidelines](https://ddtrace.readthedocs.io/en/stable/releasenotes.html)
- The change includes or references documentation updates if necessary
- Backport labels are set (if
[applicable](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting))
## Reviewer Checklist
- [ ] Reviewer has checked that all the criteria below are met
- Title is accurate
- All changes are related to the pull request's stated goal
- Avoids breaking
[API](https://ddtrace.readthedocs.io/en/stable/versioning.html#interfaces)
changes
- Testing strategy adequately addresses listed risks
- Newly-added code is easy to change
- Release note makes sense to a user of the library
- If necessary, author has acknowledged and discussed the performance
implications of this PR as reported in the benchmarks PR comment
- Backport labels are set in a manner that is consistent with the
[release branch maintenance
policy](https://ddtrace.readthedocs.io/en/latest/contributing.html#backporting)
---------
Co-authored-by: Alberto Vara <alberto.vara@datadoghq.com>
Co-authored-by: Munir Abdinur <munir.abdinur@datadoghq.com>
Co-authored-by: vianney <vianney.ruhlmann@datadoghq.com>
Co-authored-by: Brett Langdon <brett.langdon@datadoghq.com>
Co-authored-by: Gyuheon Oh <gyuheon.oh@datadoghq.com>
Co-authored-by: Gyuheon Oh <102937919+gyuheon0h@users.noreply.github.com>
Co-authored-by: Thomas Kowalski <thomas.kowalski@datadoghq.com>
Co-authored-by: Juanjo Alvarez Martinez <juanjo.alvarezmartinez@datadoghq.com>