Description
Prerequisite
- A server that stores tasks using tarantool/queue with a custom driver based on utubettl
- A client that takes tasks from the server through a net.box connection
Scenario
- The client takes a task through the net.box connection and starts processing it
- A network glitch (or another issue) occurs and net.box decides to reconnect
- The client finishes processing the task and decides to ack it
- The server responds to the ack command with the error: Task was not taken in the session
- After the ttr delay the task returns to the READY state
Problem description
This "ack error" happens because tarantool/queue locks a "taken" task to box.session.id(). So it is not enough for the same client to take and ack the task: both calls have to go through the same net.box connection, because box.session.id() changes after an implicit reconnect. As a result, the client has no way to handle this error from the server and retry the ack.
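To make the failure mode concrete, here is a toy model in Python. It is not tarantool/queue code: `SessionLockedQueue`, `ToyConnection`, and `demo_session_lock` are hypothetical names, and the only behavior modeled is the one described above (a taken task is bound to the session id, and a reconnect produces a new session id).

```python
import itertools


class SessionLockedQueue:
    """Toy queue that binds each taken task to the session that took it."""

    def __init__(self):
        self.state = {}     # task_id -> "READY" | "TAKEN" | "DONE"
        self.taken_by = {}  # task_id -> session id that took the task

    def put(self, task_id):
        self.state[task_id] = "READY"

    def take(self, session_id):
        for task_id, state in self.state.items():
            if state == "READY":
                self.state[task_id] = "TAKEN"
                self.taken_by[task_id] = session_id
                return task_id
        return None

    def ack(self, session_id, task_id):
        # The lock check: only the session that took the task may ack it.
        if self.taken_by.get(task_id) != session_id:
            raise RuntimeError("Task was not taken in the session")
        self.state[task_id] = "DONE"
        del self.taken_by[task_id]


class ToyConnection:
    """Stand-in for a client connection; every reconnect gets a new session id."""

    _session_ids = itertools.count(1)

    def __init__(self, queue):
        self.queue = queue
        self.reconnect()

    def reconnect(self):
        # Assumption of the model: the server assigns a fresh id per session.
        self.session_id = next(ToyConnection._session_ids)

    def take(self):
        return self.queue.take(self.session_id)

    def ack(self, task_id):
        self.queue.ack(self.session_id, task_id)


def demo_session_lock():
    queue = SessionLockedQueue()
    queue.put(1)
    conn = ToyConnection(queue)
    task_id = conn.take()
    conn.reconnect()          # simulated network glitch
    try:
        conn.ack(task_id)     # same client, but a different session
    except RuntimeError as err:
        return str(err)
    return "ok"
```

In this model `demo_session_lock()` returns the error text, because the ack arrives under a session id that never took the task. Retrying the ack from the client cannot help, since the old session id is gone for good.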
Possible solutions
One possible workaround to eliminate this kind of error is to implement a simple buffer (fiber.channel) on the server for client ack commands and let another fiber call the actual queue:ack. But this is not very convenient if you want to handle the ack response on the client.
Another approach is to use a different id to lock a task. This id should identify the same client even across implicit net.box reconnects. It could be set explicitly through some registration process (an API-breaking change) or implicitly from some client information available in Tarantool (if any exists).
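A minimal sketch of the second idea, again as a Python toy model and not as a proposal for the actual tarantool/queue API. It assumes the explicit-registration variant: the client generates an id once and sends it with every call. `ClientLockedQueue`, `RegisteredClient`, and `demo_client_lock` are hypothetical names.

```python
import uuid


class ClientLockedQueue:
    """Toy queue that binds each taken task to a client id, not a session id."""

    def __init__(self):
        self.state = {}     # task_id -> "READY" | "TAKEN" | "DONE"
        self.taken_by = {}  # task_id -> client id that took the task

    def put(self, task_id):
        self.state[task_id] = "READY"

    def take(self, client_id):
        for task_id, state in self.state.items():
            if state == "READY":
                self.state[task_id] = "TAKEN"
                self.taken_by[task_id] = client_id
                return task_id
        return None

    def ack(self, client_id, task_id):
        if self.taken_by.get(task_id) != client_id:
            raise RuntimeError("Task was not taken by this client")
        self.state[task_id] = "DONE"
        del self.taken_by[task_id]


class RegisteredClient:
    """Client whose identity survives reconnects."""

    def __init__(self, queue):
        self.queue = queue
        self.client_id = str(uuid.uuid4())  # chosen once, at "registration"
        self.session_id = 0
        self.reconnect()

    def reconnect(self):
        # The session id still changes, but the queue no longer looks at it.
        self.session_id += 1

    def take(self):
        return self.queue.take(self.client_id)

    def ack(self, task_id):
        self.queue.ack(self.client_id, task_id)


def demo_client_lock():
    queue = ClientLockedQueue()
    queue.put(1)
    client = RegisteredClient(queue)
    task_id = client.take()
    client.reconnect()       # simulated network glitch
    client.ack(task_id)      # succeeds: the client id is unchanged
    return queue.state[task_id]
```

Here the ack after a reconnect goes through, while a different client still cannot ack the task. The model leaves open the questions a real implementation would have to answer, such as how a client id is authenticated and when a lock held by a vanished client expires.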
Is this reasoning correct? Are there any different approaches/workarounds?