Prevent panics by closing channels on unregister #31
daniellowtw wants to merge 2 commits into donovanhide:master
|
Thanks for the patch, looks good! Did you experience a panic in running code? Do you think you could add a test which recreates the scenario and fails with the old code and passes with the new? |
|
Yes, we sometimes experienced a panic when a client disconnects, which causes the buffer to fill up. I would be happy to write a test, but this bug is subtle, and the panic happens in If you're interested, this is the test code which will cause the panic, but doesn't fail the test |
|
Test panics for me :-) Maybe have a look at these ideas for catching a panic using https://stackoverflow.com/questions/31595791/how-to-test-panics Oops, just realised the panic happens in another goroutine.... Will have a think and get back! |
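For reference, a minimal sketch of why the linked `recover` approach does not carry over directly: `recover` only catches a panic raised in the goroutine that deferred it. The helper name `panicsInGoroutine` is hypothetical, and the sketch assumes the test itself starts the goroutine, which is not the case for a goroutine started inside the library.

```go
package main

import "fmt"

// panicsInGoroutine is a hypothetical helper: it runs f in a new goroutine
// and reports whether f panicked. The recover is deferred inside that
// goroutine, because a recover in the caller would never see the panic.
func panicsInGoroutine(f func()) bool {
	result := make(chan bool, 1)
	go func() {
		defer func() {
			result <- recover() != nil
		}()
		f()
	}()
	return <-result
}

func main() {
	ch := make(chan int, 1)
	close(ch)

	// Sending on a closed channel panics, even if the buffer has room.
	fmt.Println(panicsInGoroutine(func() { ch <- 1 }))

	// A function that returns normally reports no panic.
	fmt.Println(panicsInGoroutine(func() {}))
}
```

This only helps when the test controls the goroutine. A panic in a goroutine the test did not start still takes down the whole test binary.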
|
This is admittedly a hard thing to test.... https://github.com/daniellowtw/eventsource/blob/dbe5ca5ad8ebe78f9650de72bfb83d084c360e65/server.go#L145 might both be called when a connection closes and invokes: and the |
|
Yes, the wrapping of As to closing an already closed channel, checking whether the subscription still exists should suffice. The out channel is not closed iff it is in the subscription map. |
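A rough sketch of that invariant, with hypothetical `subscription` and `registry` types standing in for the real ones (they are not the library's actual definitions). It assumes the map is only touched from a single goroutine, so no locking is shown.

```go
package main

import "fmt"

// subscription and registry are hypothetical stand-ins for illustration.
type subscription struct {
	out chan string
}

type registry struct {
	subs map[*subscription]struct{}
}

// unregister closes sub.out only while sub is still in the map.
// Membership in the map therefore means "out is still open", and a
// second unregister for the same subscription is a harmless no-op.
func (r *registry) unregister(sub *subscription) bool {
	if _, ok := r.subs[sub]; !ok {
		return false // already removed, channel already closed
	}
	delete(r.subs, sub)
	close(sub.out)
	return true
}

func main() {
	sub := &subscription{out: make(chan string, 1)}
	r := &registry{subs: map[*subscription]struct{}{sub: {}}}

	fmt.Println(r.unregister(sub)) // first call removes and closes
	fmt.Println(r.unregister(sub)) // second call does nothing
}
```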
|
So, I tried to rewrite your tests to only use the exported methods of Server and Stream with BufferSize set to a small number, and quickly came to the conclusion that a misbehaving client can actually block all calls to Publish() on the server, which is perhaps an even bigger bug :-) A timeout mechanism is obviously needed, probably using I'd be interested in hearing your views on this and if it might affect your approach to this actual issue, given what sounds like high loads in your application. |
|
Apologies for the diversion, but related and interesting reads: |
Those are indeed interesting reads.
Is a misbehaving client here a slow reading client, or one that refuses to close the connection? I don't see how it can block calls to
Do you mean associating a 'write timeout' with each subscriber, so that if they take too long to read from the channel they are unregistered? This could certainly solve the inactive client problem, but the bookkeeping might be tricky. I think the current way of letting the buffer fill up and unregistering them as slow clients is a good compromise. |
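To make the two options concrete, here is a small sketch of both. The function names, `demoTimeout`, and the use of `time.NewTimer` are my assumptions for illustration and are not taken from the library. `trySend` models "buffer full means slow client", while `sendWithTimeout` models a per-send write timeout.

```go
package main

import (
	"fmt"
	"time"
)

// demoTimeout is an arbitrary value chosen for this sketch.
const demoTimeout = 20 * time.Millisecond

// trySend never blocks: it reports false when the buffer is full, at
// which point the caller could treat the subscriber as a slow client.
func trySend(out chan string, ev string) bool {
	select {
	case out <- ev:
		return true
	default:
		return false
	}
}

// sendWithTimeout waits up to d for room in the buffer before giving up.
func sendWithTimeout(out chan string, ev string, d time.Duration) bool {
	timer := time.NewTimer(d)
	defer timer.Stop()
	select {
	case out <- ev:
		return true
	case <-timer.C:
		return false
	}
}

func main() {
	out := make(chan string, 2) // small buffer, nobody is reading

	fmt.Println(trySend(out, "event 0"))
	fmt.Println(trySend(out, "event 1"))
	fmt.Println(trySend(out, "event 2")) // buffer is full here

	fmt.Println(sendWithTimeout(out, "event 3", demoTimeout)) // gives up after the timeout
}
```

The non-blocking version needs no extra state per subscriber, which matches the bookkeeping concern above. The timeout version tolerates short stalls, at the cost of blocking the sender for up to the timeout on every slow subscriber.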
|
Sorry, got confused by the |
There is a possibility of writing to a closed channel in the code.
The select block chooses a case at random when multiple cases are ready. There is a possible race in the code where case pub := <-srv.pub: is chosen over case sub := <-srv.unregister: and, as a result, the server writes to the closed channel.
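The random choice is easy to observe in isolation. The sketch below is not the server code: the two channels only borrow the names `pub` and `unregister`, and `countSelections` is a hypothetical helper. It makes both channels ready before every `select` and counts which case wins.

```go
package main

import "fmt"

// countSelections makes both channels ready before each select and
// counts how often each case is chosen.
func countSelections(trials int) (pubWins, unregisterWins int) {
	pub := make(chan struct{}, 1)
	unregister := make(chan struct{}, 1)

	for i := 0; i < trials; i++ {
		pub <- struct{}{}
		unregister <- struct{}{}

		select {
		case <-pub:
			pubWins++
			<-unregister // drain the other channel for the next trial
		case <-unregister:
			unregisterWins++
			<-pub // drain the other channel for the next trial
		}
	}
	return pubWins, unregisterWins
}

func main() {
	p, u := countSelections(1000)
	fmt.Printf("pub chosen %d times, unregister chosen %d times\n", p, u)
}
```

Neither case has priority, so a pending unregister gives no guarantee that a simultaneously pending publish is handled after it.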