@haikoschol

This PR changes how multistream-select messages received on an opening outbound substream are handled on webrtc connections. For details please see the issue description.

The PR builds on top of #441 for the str0m update. But looking at it again I realized that it also made changes to multistream-select handling. I'll look into what those are and how they relate to my changes. In any case, I believe my changes bring the implementation closer to compliance with the multistream-select spec and enable interoperability with smoldot.

fixes #464

@haikoschol haikoschol force-pushed the haiko-webrtc-outbound-multistream-nego-fix branch from 93e6151 to 85fae21 Compare November 5, 2025 13:43
@haikoschol
Author

haikoschol commented Nov 13, 2025

@lexnv I've made some more changes and think this PR is now in a pretty good state.

I had to change a few tests, but hopefully only their implementation, not their meaning.

If I haven't missed anything, all changes should now be limited to the webrtc-specific code.

I've tested the following with a litep2p listener based on this PR:

With a smoldot-ish dialer running in Chrome:

  • connection establishment
  • multistream-select negotiation on outbound substreams
  • multistream-select negotiation on inbound substreams
  • outbound ping
  • inbound ping

With a rust-libp2p dialer using webrtc.rs 0.13.0:

  • connection establishment
  • multistream-select negotiation on inbound substreams
  • perf behaviour

bytes.put_u8((proto.as_ref().len() + 1) as u8); // + 1 for \n
let _ = Message::Protocol(proto).encode(&mut bytes).unwrap();

let expected_message = bytes.freeze().to_vec();
Collaborator

dq: This slightly changes the meaning of the messages.

Before we were expecting:

[Protocols(/multistream/1.0.0), Protocol(/13371338/proto/1)]

Now we are expecting:

// \n added
[Header(/multistream/1.0.0 /\n),

// \n counted but not checked
Protocol(/13371338/proto/1 [])

Hmm, one of those representations can't be spec compliant, right?

Also, are we ignoring malformed messages?

  • we decode the frame size correctly
  • but we then don't check whether the last character is \n, or whether enough bytes were provided?
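To make the two checks above concrete, here is a minimal std-only sketch of decoding a single multistream-select message. It assumes the framing from the spec: an unsigned-varint length prefix, then the message, with the trailing \n counted in the length. `FrameError`, `read_uvarint` and `decode_message` are hypothetical names for illustration, not litep2p's API.

```rust
/// Errors for this illustrative decoder (hypothetical, not litep2p's types).
#[derive(Debug, PartialEq)]
enum FrameError {
    /// The varint length prefix is missing, incomplete or too long.
    BadLength,
    /// Fewer bytes are available than the length prefix announces.
    Truncated,
    /// The message body does not end with `\n`.
    MissingNewline,
}

/// Read an unsigned varint, returning the value and the bytes consumed.
fn read_uvarint(buf: &[u8]) -> Result<(usize, usize), FrameError> {
    let mut value = 0usize;
    // Assumption: five bytes are plenty for a protocol-name length.
    for (i, &byte) in buf.iter().enumerate().take(5) {
        value |= ((byte & 0x7f) as usize) << (7 * i);
        if byte & 0x80 == 0 {
            return Ok((value, i + 1));
        }
    }
    Err(FrameError::BadLength)
}

/// Decode one message. Returns the body without `\n` and the unread remainder.
fn decode_message(buf: &[u8]) -> Result<(&[u8], &[u8]), FrameError> {
    let (len, prefix) = read_uvarint(buf)?;
    let rest = &buf[prefix..];
    // Check 1: enough bytes for the announced length.
    if rest.len() < len {
        return Err(FrameError::Truncated);
    }
    let (body, remainder) = rest.split_at(len);
    // Check 2: the counted last byte really is a newline.
    match body.split_last() {
        Some((&b'\n', line)) => Ok((line, remainder)),
        _ => Err(FrameError::MissingNewline),
    }
}
```

With this shape, a frame whose length is right but whose last byte is not \n is rejected instead of being silently accepted.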

Author

@haikoschol haikoschol Nov 13, 2025

Right now the test ensures that the wire format that WebRtcDialerState::propose() produces is what we expect. Before, it actually ensured that whatever propose() produces can be decoded with Message::decode(). So I did change the semantics of the test after all. But I would argue that the new meaning is more useful. We could also add decoding and assert that we get the expected strings back, but due to the changes made, this can no longer be done by just calling Message::decode().
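For reference, a std-only sketch of how the expected wire bytes for such a test could be built by hand. It assumes each message is an unsigned-varint length (counting the trailing \n), the message itself, then \n. The helper names (`write_uvarint`, `encode_message`, `expected_proposal`) are made up for illustration and are not what the test in this PR calls.

```rust
/// Append `value` as an unsigned varint.
fn write_uvarint(mut value: usize, out: &mut Vec<u8>) {
    while value >= 0x80 {
        // Low seven bits, with the continuation bit set.
        out.push(((value as u8) & 0x7f) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

/// Append one multistream-select message: length prefix, text, newline.
fn encode_message(line: &str, out: &mut Vec<u8>) {
    write_uvarint(line.len() + 1, out); // + 1 for \n
    out.extend_from_slice(line.as_bytes());
    out.push(b'\n');
}

/// Header followed by a single proposed protocol, in one buffer.
fn expected_proposal(protocol: &str) -> Vec<u8> {
    let mut out = Vec::new();
    encode_message("/multistream/1.0.0", &mut out);
    encode_message(protocol, &mut out);
    out
}
```

For "/13371338/proto/1" this yields the header prefixed with 0x13 and the protocol prefixed with 0x12, each terminated by \n.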

Collaborator

@lexnv lexnv left a comment

Great job so far! Feels like we are getting closer to figuring out the multistream select 🙏

My questions are about the spec correctness of the implementation: did we basically implement only a tiny subset of the spec in the past? And mainly, which implementation is causing us to extensively parse the frames (libp2p or smoldot or both?) 🤔

- add logging calls
- clean up imports
- add constant for Protocol::try_from(&b"/multistream/1.0.0"[..])
@haikoschol
Author

> And mainly, which implementation is causing us to extensively parse the frames (libp2p or smoldot or both?) 🤔

The litep2p implementation is causing us to extensively parse the frames. :D This is because we always deal with the entire content of a WebRTC protobuf frame, which may contain one or more multistream messages.

In rust-libp2p the WebRTC implementation uses MessageIO and LengthDelimited in a layer below the multistream-select logic. The multistream-select logic gets fed from a stream that yields one multistream_select::protocol::Message at a time, even if the header and the selected protocol are in the same WebRTC protobuf frame.
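To illustrate the difference, a rough std-only sketch of splitting the content of one frame into its length-delimited messages, so that the negotiation logic sees one message at a time. `split_messages` is a hypothetical helper; it is neither litep2p's nor rust-libp2p's code, and it assumes unsigned-varint length prefixes.

```rust
/// Split a payload into its length-prefixed messages (newline included).
fn split_messages(mut payload: &[u8]) -> Result<Vec<&[u8]>, &'static str> {
    let mut messages = Vec::new();
    while !payload.is_empty() {
        // Decode the unsigned-varint length prefix.
        let mut len = 0usize;
        let mut used = 0usize;
        loop {
            if used >= 5 {
                return Err("length prefix too long");
            }
            let byte = *payload.get(used).ok_or("truncated length prefix")?;
            len |= ((byte & 0x7f) as usize) << (7 * used);
            used += 1;
            if byte & 0x80 == 0 {
                break;
            }
        }
        // The announced length must fit into what is left of the payload.
        let end = used
            .checked_add(len)
            .filter(|&end| end <= payload.len())
            .ok_or("truncated message")?;
        messages.push(&payload[used..end]);
        payload = &payload[end..];
    }
    Ok(messages)
}
```

A payload carrying both the header and the selected protocol then comes back as two separate slices, which is roughly what the length-delimited layer in rust-libp2p provides to the layer above it.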

I hope this makes some sense. I'll go into detail in our call today.

@lexnv lexnv requested a review from dmitry-markin November 18, 2025 12:50
Collaborator

@lexnv lexnv left a comment

The logic looks good to me!

Great job on getting str0m updated and figuring out the multistream select implementation 🙏

Collaborator

@dmitry-markin dmitry-markin left a comment

Looks good code-wise, but I haven't checked the conformance to the spec.

The only concern is the use of ChannelBufferedAmountLow event.

supported_protocols: &'a mut impl Iterator<Item = &'a ProtocolName>,
payload: Bytes,
) -> crate::Result<ListenerSelectResult> {
let payload = if payload.len() > 2 && payload[0..payload.len() - 2] != b"\n\n"[..] {
Collaborator

Is there a way around this without allocation? As the comment in Message::decode says, the well-formed message is terminated with a newline. Also, why are we checking for two newlines, but are adding one?

Author

It can be avoided by passing a BytesMut to the function. While looking into this, I noticed that there are (failing) tests in this module that I didn't run when working on this PR and then I noticed that the tests are not run in GH Actions either. Is this intentional? (cc @lexnv)
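As a point of comparison, and not what this PR does, a std-only sketch of an alternative that only allocates when the terminator is actually missing. `with_trailing_newline` is a hypothetical helper, and it assumes a single trailing \n is all the decoder needs.

```rust
use std::borrow::Cow;

/// Return the payload terminated by `\n`, copying only if it has to be added.
fn with_trailing_newline(payload: &[u8]) -> Cow<'_, [u8]> {
    if payload.ends_with(b"\n") {
        // Well-formed input: borrow it as-is, no allocation.
        Cow::Borrowed(payload)
    } else {
        // Terminator missing: this is the only path that allocates.
        let mut owned = Vec::with_capacity(payload.len() + 1);
        owned.extend_from_slice(payload);
        owned.push(b'\n');
        Cow::Owned(owned)
    }
}
```

The well-formed case stays a plain borrow, so the cost is paid only for payloads that lack the terminator.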

Author

> Also, why are we checking for two newlines, but are adding one?

This is quite messy because we are misusing Message::Protocols to parse what is actually two distinct multistream-select messages that are often sent inside one WebRtcMessage. I wanted to keep the changes in this PR minimal, knowing that we need to rewrite the listener negotiation anyway to support the "back-and-forth" (as you pointed out in #75).

Author

There is a way to avoid appending a newline and using Message::Protocols to decode, by reusing drain_trailing_protocols(). I've implemented this in c0a44c0.

I have not retested this with libp2p and smoldot. Will do that tomorrow.

Author

> I have not retested this with libp2p and smoldot. Will do that tomorrow.

litep2p-perf with a libp2p dialer

works (download up to 512 KiB, upload up to 16 MiB)

litep2p-perf with a smoldot dialer

litep2p crashes apparently during the noise handshake:

2025-11-27T12:45:32.307986Z DEBUG litep2p::webrtc: received non-stun message source=127.0.0.1:63971

thread 'main' (15158228) panicked at /Users/haiko/code/polkadot/litep2p/src/transport/webrtc/opening.rs:432:70:
to succeed: ParseError(InvalidData)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
2025-11-27T12:45:32.308961Z  WARN litep2p::transport-service: transport service closed
2025-11-27T12:45:32.309016Z  WARN litep2p::transport-service: transport service closed
2025-11-27T12:45:32.309022Z  WARN litep2p::transport-service: transport service closed

This makes absolutely no sense to me yet. I know that I've tested this before and I haven't made any changes to the smoldot side since. I've tried a few previous commits for litep2p but always had the same result. Need to keep investigating...

webrtc-poc with a libp2p dialer

what works:

  • noise handshake
  • substream opening and multistream-select negotiation from libp2p to litep2p
  • ping from libp2p to litep2p

what doesn't work:

  • substream opening and multistream-select negotiation from litep2p to libp2p

I thought I had tested this before, but I'm not sure. Maybe I just focused on smoldot. My previous comment here doesn't mention this pairing either. I've also tried previous commits on this PR branch without success. Need to keep investigating here as well.

webrtc-poc with a smoldot dialer

works

Author

@haikoschol haikoschol Dec 4, 2025

> webrtc-poc with a libp2p dialer
>
> what works:
>
>   • noise handshake
>   • substream opening and multistream-select negotiation from libp2p to litep2p
>   • ping from libp2p to litep2p
>
> what doesn't work:
>
>   • substream opening and multistream-select negotiation from litep2p to libp2p

I figured out what is going on here and opened #494. The relevant insight in the context of this review is that the faulty behavior is not caused by a regression introduced in this PR.

Author

> litep2p-perf with a smoldot dialer
>
> litep2p crashes apparently during the noise handshake:
>
> 2025-11-27T12:45:32.307986Z DEBUG litep2p::webrtc: received non-stun message source=127.0.0.1:63971
>
> thread 'main' (15158228) panicked at /Users/haiko/code/polkadot/litep2p/src/transport/webrtc/opening.rs:432:70:
> to succeed: ParseError(InvalidData)
> note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
> 2025-11-27T12:45:32.308961Z  WARN litep2p::transport-service: transport service closed
> 2025-11-27T12:45:32.309016Z  WARN litep2p::transport-service: transport service closed
> 2025-11-27T12:45:32.309022Z  WARN litep2p::transport-service: transport service closed
>
> This makes absolutely no sense to me yet. I know that I've tested this before and I haven't made any changes to the smoldot side since.

I've looked into this issue again and figured out what is going on. I hadn't actually tested this specific combination before (litep2p listener, smoldot dialer, perf behavior). The combinations I tested were:

  • libp2p listener, smoldot dialer, perf behavior
  • litep2p listener, smoldot dialer, ping in both directions

I have two branches in the ChainSafe smoldot fork; one for the perf behavior (haiko-webrtc-perf) and one for ping (haiko-libp2p-webrtc). When I tested the second combination listed above (and found the issue this PR is addressing) I ran into this crash in the noise handshake on the branch haiko-libp2p-webrtc and fixed it (or rather avoided it) in this commit. Making the same change on the other branch also avoids this crash.

What happens is that, without this fix, smoldot sends a two-byte message (i.e. an incomplete WebRTC protobuf frame), which fails to decode here and panics due to an .expect() here.
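Independent of the smoldot-side fix, the listener could turn this kind of decode failure into an error instead of a panic. A minimal sketch follows, using a stand-in decoder (one length byte followed by the payload) rather than the real protobuf one. `decode_frame`, `on_datagram` and `Outcome` are hypothetical names and do not reflect the actual code in opening.rs.

```rust
/// Stand-in for the real frame decoder (assumption: one length byte, then payload).
fn decode_frame(buf: &[u8]) -> Result<&[u8], &'static str> {
    let (&len, rest) = buf.split_first().ok_or("empty frame")?;
    rest.get(..len as usize).ok_or("incomplete frame")
}

/// What the caller should do with an incoming message.
#[derive(Debug, PartialEq)]
enum Outcome<'a> {
    /// Frame decoded, hand the payload to the handshake.
    Payload(&'a [u8]),
    /// Frame rejected, the caller can log this and close the connection.
    Rejected(&'static str),
}

fn on_datagram(buf: &[u8]) -> Outcome<'_> {
    match decode_frame(buf) {
        Ok(payload) => Outcome::Payload(payload),
        // Instead of `.expect(...)`: surface the failure to the caller.
        Err(reason) => Outcome::Rejected(reason),
    }
}
```

A two-byte message that announces more data than it carries then ends up as `Rejected` rather than taking the whole process down.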

In short, this crash is not a regression caused by the changes in this PR.

Author

@dmitry-markin can you please go through my comments above and resolve this thread if you are happy with the change I made in c0a44c0?

Co-authored-by: Dmitry Markin <dmitry@markin.tech>
@haikoschol haikoschol force-pushed the haiko-webrtc-outbound-multistream-nego-fix branch from ead7704 to 90f0c86 Compare November 26, 2025 06:32
@haikoschol haikoschol force-pushed the haiko-webrtc-outbound-multistream-nego-fix branch from de599fc to c0a44c0 Compare November 26, 2025 14:51

Successfully merging this pull request may close these issues.

webrtc: multistream-select negotiation not fully spec compliant
