feat(capablities)!: sleep & spawn capabilities#1873
feat(capablities)!: sleep & spawn capabilities#1873Aaalibaba42 wants to merge 6 commits intomainfrom
Conversation
📚 Documentation Check Results📦
|
🔒 Cargo Deny Results📦
|
Clippy Allow Annotation ReportComparing clippy allow annotations between branches:
Summary by Rule
Annotation Counts by File
Annotation Stats by Crate
About This ReportThis report tracks Clippy allow annotations for specific rules, showing how they've changed in this PR. Decreasing the number of these annotations generally improves code quality. |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #1873 +/- ##
==========================================
+ Coverage 71.78% 71.82% +0.04%
==========================================
Files 434 437 +3
Lines 69951 70156 +205
==========================================
+ Hits 50211 50391 +180
- Misses 19740 19765 +25
🚀 New features to boost your workflow:
|
This comment has been minimized.
This comment has been minimized.
Artifact Size Benchmark Reportaarch64-alpine-linux-musl
aarch64-unknown-linux-gnu
libdatadog-x64-windows
libdatadog-x86-windows
x86_64-alpine-linux-musl
x86_64-unknown-linux-gnu
|
0f2d27f to
628bd2c
Compare
628bd2c to
4145e42
Compare
| stats = StatsComputationStatus::DisabledByAgent { bucket_size }; | ||
| } | ||
|
|
||
| #[cfg(all(not(target_arch = "wasm32"), feature = "telemetry"))] |
There was a problem hiding this comment.
Can't this #[cfg...] mess be separated out in some way? This is barely readable.
There was a problem hiding this comment.
I'm not sure how without making spaghetti, do you have something in mind ?
There was a problem hiding this comment.
One possibility would be to accept that the two implementations are too different and make two separate versions of build. If there's really code duplication maybe some part can be moved into separate helpers?
There was a problem hiding this comment.
(or dually move the platform-dependents bit into bespoke functions?)
166e5c7 to
f4b19a7
Compare
59f8c03 to
f4b19a7
Compare
f4b19a7 to
be149e6
Compare
| { | ||
| RuntimeJoinHandle(ctx.spawn(future)) | ||
| } | ||
| } |
There was a problem hiding this comment.
Should this be replaced by NativeSpawnCapability
There was a problem hiding this comment.
I don't think it should:
- NativeSpawnCapability lives in libdd-capabilities-impl, which is a leaf crate, libdd-shared-runtime (a core crate) can't depend on it without inverting the layering
- libdd-capabilities-impl doesn't compile on wasm32, and this module needs to (the non-fork parts are wasm-compatible)
- We could make the fork methods generic and have the FFI caller pass a spawner in, but it adds API surface for no real benefit, the internal RuntimeSpawner is a 5-line ZST that avoids the dependency while keeping the fork hooks simple
There was a problem hiding this comment.
I might be missing something, but when you say:
NativeSpawnCapability lives in libdd-capabilities-impl, which is a leaf crate, libdd-shared-runtime (a core crate) can't depend on it without inverting the layering
Why should shared_runtime depend on it once we move everything to libdd-capabilities-impl? I think (but I'm not sure) @VianneyRuhlmann's question is, or at least the one I have as well, is why this code has to be in shared_runtime at all. It naively looks like it's entirely capability and native-specific, so should live in libdd-capabilities-impl. Do you need some private symbols from here?
There was a problem hiding this comment.
The fork methods are impl SharedRuntime and access private fields. They can't move to libdd-capabilities-impl without making those fields public. RuntimeSpawner is only consumed by these methods, so it's co-located with them. Moving just the spawner out wouldn't buy anything since the methods that use it must stay here regardless
There was a problem hiding this comment.
I see, thanks! Now I wonder if we shouldn't just move all of the shared runtime crate into libdd-capabilities 😛 but that's another debate for another day I guess. Current situation sounds reasonable for this PR.
Though, the new refactoring also removed shutdown and block_on from WASM interface. Is that ok?
There was a problem hiding this comment.
I think it is, we don't really want to do these since it would amount to blocking the EventLoop (and we can't shut it down). The point is mainly to have a "proxy" for shared runtime when throwing futures on the JS side.
Maybe moving more of the shared_runtime as a capability would make sens, but I don't think we can escape the fact that we would not have the same API on both platforms because they can't do the same things. So except having sub-capabilities for SharedRuntimeCapabilityCapabilities, which is a mouthful (amongst other problems), I don't know how we'd bridge that gap.
There was a problem hiding this comment.
That makes sense. I could see something like having a capability (AsyncCapability ? But JS does have async 🤔 ) that would only be implemented by native, but as you say, it doesn't change the state of affairs; it merely pushes the detail into the capability crate (which maybe is still nice to have, but not a fundamental improvement either)
be149e6 to
2bc0664
Compare
ce7a92b to
29583cd
Compare
There was a problem hiding this comment.
I have mixed feelings about the complexity and boilerplate added. Part of it is inherent to the capability approach. So the elephant in the room: is it acted and 100% decided that we want to go the WASM route and that we should support that? It has an impact of the whole of libdatadog, and it seems for now that the route for dd-trace-js is not entirely decided yet. It's obviously not one person's decision, but I wonder if we should move forward and merge things if this is not fully committed yet; since it's far-reaching, it might be harder to revert than just a git revert, as people will quickly build on it.
Another point is the shared runtime. My understanding was that the whole promise of capabilities (vs a more classical platform-specific swappable module implementation as we do for linux vs windows for example) was to push all of the WASM and node-specific concerns out of libdatadog. However the impression when looking at the shared runtime's changes is that it's full of WASM-specific concerns, which feels like we got the drawbacks of both approaches (complexity + wasm-specific code in libdatadogand cfg gates). Some functions are available on WASM, some others are not. I'm not sure yet what's the conclusion, but I think the interface of the shared runtime should be reworked if we want it to work with capabilities. Maybe the conclusion is that the whole shared runtime should be a capability, since it's offering an API that has fundamentally different implementation on WASM and native (and it seems impossible to delegate that to existing capabilities). Or maybe the answer is that we need to move some of the shared runtime into new capability.ies for async, but can still keep some code in libdd-shared-runtime. In any case, the current situation feels a bit wrong.
| /// `T` instead of `Result<T, JoinError>`. | ||
| /// | ||
| /// A `JoinError` means the spawned task panicked or was aborted. Workers use | ||
| /// `CancellationToken` for graceful shutdown, so `JoinError` indicates a bug. |
There was a problem hiding this comment.
It's really a detail, but I think they don't use cancellation tokens any more with the new shared runtime?
| { | ||
| RuntimeJoinHandle(ctx.spawn(future)) | ||
| } | ||
| } |
There was a problem hiding this comment.
I might be missing something, but when you say:
NativeSpawnCapability lives in libdd-capabilities-impl, which is a leaf crate, libdd-shared-runtime (a core crate) can't depend on it without inverting the layering
Why should shared_runtime depend on it once we move everything to libdd-capabilities-impl? I think (but I'm not sure) @VianneyRuhlmann's question is, or at least the one I have as well, is why this code has to be in shared_runtime at all. It naively looks like it's entirely capability and native-specific, so should live in libdd-capabilities-impl. Do you need some private symbols from here?
| assert!(agent_info::get_agent_info().is_none()); | ||
| let shared_runtime = SharedRuntime::new().unwrap(); | ||
| shared_runtime.spawn_worker(fetcher, true).unwrap(); | ||
| let spawner = NativeCapabilities::new(); |
There was a problem hiding this comment.
Can we remove the spawner and pass capabilities at shared_runtime init?
There was a problem hiding this comment.
Storing the spawner in SharedRuntime would make it generic (SharedRuntime<S>) or require type erasing a non object-safe trait. It's also not straightforward because the fork recovery paths use their own internal RuntimeSpawner, not the caller-provided one. In practice, production callers already have capabilities in scope so passing it to spawn_worker isn't a burden imho
| meta: StatsMetadata, | ||
| sequence_id: AtomicU64, | ||
| client: H, | ||
| capabilities: Cap, |
There was a problem hiding this comment.
ultra nit: Since we're going to pass the capabilities in a lot of functions it would be nice to have a "standardized" way of passing it e.g. "always the first param" to make it easier to recognize.
There was a problem hiding this comment.
Mmh I think this would be hard to enforce, and it can be conflicting with stuff etc. Generally I agree but I'm not sure it's realistic on a common codebase like this one. But clearly it's far from a hill I wish to die on.
c6789b6 to
e8146c9
Compare
fix: ZST spawner fix: docs and fixes fix(shared_runtime): implicit debug impl chore: format feat: change the Output to Result feat: mod native cfg gated for better readability chore: revert to main's version of thing
e8146c9 to
1e64b86
Compare
68919da to
a2db1c7
Compare
a2db1c7 to
6171c90
Compare
8b92663 to
1b99b14
Compare
What does this PR do?
Motivation
Before if we were to spawn tasks on wasm, we would have required a current thread tokio runtime blocking JS event loop until the end of the tokio runtime itself. Now in wasm we delegate tasks to the eventloop itself, making the whole thing non blocking, and enabling stuff like sleeping in a JS compatible way.
Additional Notes
/
How to test the change?
DataDog/libdatadog-nodejs#70