Latency/Load/Overhead caused by integer indexed arrays indexOf/splice/shift etc #2199

@selay

Recently I was debugging to find the reason for frequent server crashes, delays in transmission and server capacity issues.

I found several issues in socket.io and hope it is useful to report them here. I had to hack it to solve the issues, and that gave an unbelievable improvement. However, I only changed the parts I use, so my current changes are not mergeable and not compatible with existing modules.

Problem:
Room etc. use integer-indexed arrays to keep sockets and then call indexOf to find the index of a socket. indexOf is called many times during a single emit. In addition, splice is used to remove a socket, which means several large arrays are rearranged on every socket change.
I ran a small simulation with 10000 sockets connected and randomly leaving, emitting, connecting, etc.
(I compared socket.io against my hacked version, which uses an "associative array" where the key and the value are the same. It needs neither indexOf nor splice, because there are no array indexes to rearrange.)
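
To make the comparison above concrete, here is a minimal sketch of the two storage strategies. It is not socket.io's actual Room code and not the hacked version from this issue; `ArrayRoom` and `KeyedRoom` are hypothetical names used only for illustration.

```javascript
// Hypothetical sketch; neither type is taken from socket.io.

// Array-backed room: membership checks and removals scan the array.
function ArrayRoom() {
  this.ids = [];
}
ArrayRoom.prototype.add = function (id) {
  if (this.ids.indexOf(id) === -1) this.ids.push(id); // linear scan
};
ArrayRoom.prototype.remove = function (id) {
  var i = this.ids.indexOf(id);        // linear scan
  if (i !== -1) this.ids.splice(i, 1); // shifts every later element down
};
ArrayRoom.prototype.has = function (id) {
  return this.ids.indexOf(id) !== -1;  // linear scan
};

// Keyed room: the socket id is both the key and the value.
function KeyedRoom() {
  this.ids = Object.create(null); // no inherited keys such as "toString"
  this.size = 0;
}
KeyedRoom.prototype.add = function (id) {
  if (!(id in this.ids)) {
    this.ids[id] = id;
    this.size++;
  }
};
KeyedRoom.prototype.remove = function (id) {
  if (id in this.ids) {
    delete this.ids[id]; // no other entry has to move
    this.size--;
  }
};
KeyedRoom.prototype.has = function (id) {
  return id in this.ids; // direct key lookup, no scan
};
```

Both types expose the same `add`/`remove`/`has` surface, so the only difference is how the ids are stored. In a modern runtime a `Set` or `Map` would likely be a natural fit for the keyed variant.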

With 10000 sockets connected, emit is 2040 times faster in the hacked version than in the current socket.io implementation. Server load is at least halved, and memory usage is 30% lower.
The latency and delay issues disappeared.
(It apparently also solved another, seemingly unrelated issue: when the server is too busy to accept a new handshake quickly, the client keeps making new polling requests, and as a result multiple connections were being created with the same socket.id. When a message came in, it was echoed/duplicated many times.)

I don't understand the reason for using an integer-indexed array and then searching it every time. Because the position is not known, each search is a linear scan of the array: 10000 sockets, several times over, for a single emit. If 1000 sockets emit, that overhead is multiplied by 1000, and a single disconnect rearranges all 10000 entries, again multiplied by 1000.

Actually, you can run a simple test comparing direct [] access against indexOf and see the performance difference, which is over 3000 times.
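
One way such a test could look is sketched below. The function name `compareLookups` and the `socket-N` id format are assumptions made for the example, and the measured ratio will depend on the JavaScript engine, the array size and the machine, so no particular number should be expected from it.

```javascript
// Hypothetical micro-benchmark; names and id format are illustrative only.
function compareLookups(n, lookups) {
  var list = [];
  var keyed = Object.create(null);
  for (var i = 0; i < n; i++) {
    var id = 'socket-' + i;
    list.push(id);  // integer-indexed storage
    keyed[id] = id; // key and value are the same
  }

  // Choose the ids up front so both loops do identical work.
  var targets = [];
  for (var j = 0; j < lookups; j++) {
    targets.push('socket-' + Math.floor(Math.random() * n));
  }

  var arrayHits = 0;
  var t0 = Date.now();
  for (var a = 0; a < lookups; a++) {
    if (list.indexOf(targets[a]) !== -1) arrayHits++; // linear scan
  }
  var indexOfMs = Date.now() - t0;

  var keyedHits = 0;
  var t1 = Date.now();
  for (var b = 0; b < lookups; b++) {
    if (keyed[targets[b]] !== undefined) keyedHits++; // direct [] access
  }
  var keyedMs = Date.now() - t1;

  return {
    indexOfMs: indexOfMs,
    keyedMs: keyedMs,
    arrayHits: arrayHits,
    keyedHits: keyedHits
  };
}

// Example call (timings vary from run to run):
// console.log(compareLookups(10000, 100000));
```

`Date.now()` only has millisecond resolution, so the keyed loop may well report 0 ms for small inputs; raising `lookups` gives a more meaningful comparison.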

So the current socket.io implementation is only suitable for small-scale projects where you don't expect more than about 3000 sockets to connect. Otherwise you will needlessly end up with load balancing, multiple nodes, etc.
