webrtc: Avoid future being dropped when channel is full #483
Conversation
Draft for two reasons:

@lexnv should I wait for #465 to get merged and then merge
The loop is unconditionally exited after one iteration and therefore pointless.
src/transport/webrtc/substream.rs
```diff
 fn poll_next(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
-    self.rx.poll_recv(cx)
+    let item = self.rx.poll_recv(cx);
+    self.write_waker.wake();
```
How could we tell from the `poll_next` rx channel that the `poll_write` tx channel has free capacity?

Isn't this introducing a slightly different variation of the same class of issue (CPU looping)?

What would happen if we save the waker in `poll_write`, but the channel has capacity (meaning we start sending right away)? Wouldn't we wake the context needlessly on every `poll_next`?
> How could we tell from the `poll_next` rx channel that the `poll_write` tx channel has free capacity?
`SubstreamHandle::rx` is the other side of `Substream::tx`.

When a protocol wants to send data, it calls `Substream::poll_write()`, which writes the data to `Substream::tx`. `WebRtcConnection::run()` polls the associated `SubstreamHandle`, which pulls the data from `SubstreamHandle::rx`.

BTW, when I double-checked this, I noticed that the doc comments on `SubstreamHandle::tx` and `SubstreamHandle::rx` are incorrect. On `rx` it says "RX channel for receiving messages from peer." It should probably say something like "RX channel for receiving outbound messages from the associated `Substream` instance."
> Isn't this introducing a slightly different variation of the same class of issue (CPU looping)?
I don't think so but don't know how to verify that. I'm out of my depth here on the inner workings of Rust async and tokio, so I'm relying on AI. 😐
> What would happen if we save the waker in `poll_write`, but the channel has capacity (meaning we start sending right away)? Wouldn't we wake the context needlessly on every `poll_next`?
Yes, there would be spurious wakes, although I'm not sure they would happen on every `poll_next()`. According to GPT 5.1:
> In almost all realistic workloads, the cost of occasional spurious wakeups is much lower than the cost of introducing a heap allocation via `reserve_owned()` + `BoxFuture`.
I've added a test in b88b11e that is supposed to verify that the AtomicWaker approach works.
Yep, it makes sense 🙏

After digging a bit into this issue, I believe we could use tokio-util's `PollSender` here, or something similar. It basically allocates the future once and reuses the allocated memory for subsequent polls.
I've replaced the `AtomicWaker` with `PollSender`. Unit tests pass. I'll manually test with litep2p-perf now.
> I'll manually test with litep2p-perf now.
works
```rust
}

#[tokio::test]
async fn backpressure_released_wakes_blocked_writer() {
```
Nice! 🚀
and fix doc comments on SubstreamHandle
This PR addresses part of #477. Specifically the issue described in this comment.