-
Notifications
You must be signed in to change notification settings - Fork 3k
Description
I'm running into an issue where sometimes my client will successfully connect to my server and sometimes it will fail to connect. In those clients that successfully connect, here's the sequence of (relevant) system calls that are emitted (per strace):
socket(PF_INET, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, IPPROTO_IP) = 10
connect(10, {sa_family=AF_INET, sin_port=htons(25004), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
getsockopt(10, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
write(10, "GET /socket.io/1/websocket/BajUhY_8xzo-_XRIhP8h HTTP/1.1\r\nConnection: Upgrade\r\nUpgrade: websocket\r\nSec-WebSocket-Version: 13\r\nSec-WebSocket-Key: MTMtMTM3MDY0MDI5NzgwNA==\r\nHost: localhost:25004\r\n\r\n", 196) = 196
read(10, "HTTP/1.1 101 Switching Protocols\r\nUpgrade: websocket\r\nConnection: Upgrade\r\nSec-WebSocket-Accept: 7U1sT9kzbqM2W9Vv5cBYvnj2bGw=\r\n\r\n", 65536) = 129
setsockopt(10, SOL_TCP, TCP_NODELAY, [1], 4) = 0
read(10, "\201\0031::", 65536) = 5
When the client fails to connect, here's the sequence of system calls:
connect(10, {sa_family=AF_INET, sin_port=htons(25004), sin_addr=inet_addr("127.0.0.1")}, 16) = -1 EINPROGRESS (Operation now in progress)
getsockopt(10, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
write(10, "GET /socket.io/1/websocket/DeS7QX6gRSkl-_DIhP84 HTTP/1.1\r\nConnection: Upgrade\r\nUpgrade: websocket\r\nSec-WebSocket-Version: 13\r\nSec-WebSocket-Key: MTMtMTM3MDY0MDU2NDg0MA==\r\nHost: localhost:25004\r\n\r\n", 196) = 196
read(10, "HTTP/1.1 101 Switching Protocols\r\nUpgrade: websocket\r\nConnection: Upgrade\r\nSec-WebSocket-Accept: kEaSyYQBCSitOAWCk9adbN7e5BY=\r\n\r\n\201\0031::", 65536) = 134
setsockopt(10, SOL_TCP, TCP_NODELAY, [1], 4) = 0
AFAICT, the key difference is in what the read syscall gets as a value. In the non-working case, it seems like what should take two reads to receive is being collapsed into a single read. Supporting this is that I see the most connection issues when the system is under load. My theory is that when the system isn't loaded, the client has time to read the first value before the server sends the second one, but when the machine is loaded, it's more likely that the server will send both packets before the client can read the first.
My tracing of the server shows that the server is always doing separate writes to the socket. I'll try to use tcpdump/wireshark to observe the actual packets in flight to determine whether the packet is being formed incorrectly on the server or read incorrectly on the client.
node version: v0.10.8
socket.io-client version: 0.9.11
socket.io version: 0.9.14