New zero-os client protocol (protocol v2) #75
Description
The current client protocol, which uses redis commands, causes issues with zero-os and makes it hard to avoid conflicts (command ids, wrong command status (running or not), etc.) because of the async nature of zero-os. This happens because two communication channels are needed: an uplink for commands and a downlink for results.
For example, currently:
- we use RPUSH to push a command to a redis queue
- once the command is pushed, you pull on a result queue that is unique per command id
Such a design makes it impossible to handle, for example, a duplicate id error. If you push two commands with the same id, there is no way to tell the client that the id is a duplicate, since it is expecting the result on the command's result channel, which uses the id in its name. If you reply with an error, you will actually ruin this queue for the original command.
Another related issue is that containers can die without setting the correct state on their jobs (the jobs that were running inside the container).
Streaming has many issues because of command state, and it's not very efficient.
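To make the duplicate id problem concrete, here is a minimal in-memory sketch of the push/pull flow described above. It does not talk to redis: the `queues` dict stands in for redis lists, and the key names (`commands`, `result:<id>`), the helper functions, and the error string are all assumptions made for illustration, not the actual zero-os implementation.

```python
from collections import defaultdict, deque

# In-memory stand-in for redis lists (hypothetical key names).
queues = defaultdict(deque)
# Jobs the simulated server currently considers running: id -> payload.
running = {}


def push_command(command_id, payload):
    # Uplink: the equivalent of RPUSH onto a shared command queue.
    queues["commands"].append({"id": command_id, "payload": payload})


def result_key(command_id):
    # Downlink: one result queue per command id.
    return "result:" + command_id


def server_accept():
    # Take one command off the uplink and start it, or reject a duplicate id.
    command = queues["commands"].popleft()
    if command["id"] in running:
        # The only way back to the sender is the queue named after the id,
        # which the client of the original command is reading as well.
        queues[result_key(command["id"])].append("error: duplicate id")
        return
    running[command["id"]] = command["payload"]


def server_finish(command_id):
    # The job exits and its result is pushed to the per-id result queue.
    payload = running.pop(command_id)
    queues[result_key(command_id)].append("output of " + payload)


push_command("job-1", "ls")      # client A
push_command("job-1", "reboot")  # client B reuses the same id
server_accept()                  # A's job starts
server_accept()                  # B's job is rejected
server_finish("job-1")           # A's job completes

# Client A pulls on its result queue and receives the error meant for B.
first_reply_seen_by_a = queues[result_key("job-1")].popleft()
print(first_reply_seen_by_a)
```

In this model the rejection meant for the second sender is the first thing the original client reads, which is the "ruined queue" situation described above.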
Suggested new protocol
We are still going to use redis since it's light and efficient, but zos will implement its own commands. For example, zos will expose the following commands (not a full list):
- RUN <command>: receives a command object and returns the command ID or an error. If no id is provided, a unique one will be generated.
- SUB <id>: returns a stream of job output. The stream terminates on job exit.
- INF <id>: returns information about a running job (status, memory, cpu, etc.) or an error. This could of course be implemented as a normal command that you execute with RUN, but it would be much more efficient to implement it as a separate low-level protocol command.
- GET <id> [timeout]: waits for the job result.
These are all the commands I can think of at the moment to replace the current protocol cleanly and fix the issues of the current implementation. The list may grow (or shrink).
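As a rough sketch of how the four proposed commands might behave, the following in-memory model maps RUN, SUB, INF and GET to methods on a single object, so that every reply (including a duplicate id error) comes back on the same call that made the request. Everything here is hypothetical: the `JobServer` class, the command fields (`id`, `name`), the `emit` and `finish` simulation helpers, and the choice to raise `TimeoutError` immediately instead of blocking for `timeout` are assumptions for illustration, not the zos implementation or wire format.

```python
import uuid


class DuplicateIdError(Exception):
    """Raised when RUN receives an id that is already known to the server."""


class JobServer:
    """Hypothetical in-memory model of the proposed RUN/SUB/INF/GET commands."""

    def __init__(self):
        self.jobs = {}

    def run(self, command):
        # RUN <command>: return the job id, generating one if none was given.
        job_id = command.get("id") or uuid.uuid4().hex
        if job_id in self.jobs:
            # The error goes straight back to the caller of RUN.
            raise DuplicateIdError(job_id)
        self.jobs[job_id] = {"status": "running", "output": [], "result": None}
        return job_id

    def emit(self, job_id, line):
        # Simulation helper: the job writes one line of output.
        self.jobs[job_id]["output"].append(line)

    def finish(self, job_id, result):
        # Simulation helper: the job exits with a result.
        job = self.jobs[job_id]
        job["status"] = "exited"
        job["result"] = result

    def sub(self, job_id):
        # SUB <id>: stream the job output. This sketch only replays buffered
        # lines; a real server would keep streaming until the job exits.
        yield from self.jobs[job_id]["output"]

    def inf(self, job_id):
        # INF <id>: information about the job (only the status in this sketch).
        return {"status": self.jobs[job_id]["status"]}

    def get(self, job_id, timeout=None):
        # GET <id> [timeout]: return the job result. A real server would wait
        # up to `timeout`; this sketch fails at once if the job is not done.
        job = self.jobs[job_id]
        if job["status"] != "exited":
            raise TimeoutError(job_id)
        return job["result"]


server = JobServer()
job_id = server.run({"id": "job-1", "name": "ls"})

try:
    server.run({"id": "job-1", "name": "reboot"})
    duplicate_rejected = False
except DuplicateIdError:
    duplicate_rejected = True

server.emit(job_id, "file.txt")
server.finish(job_id, 0)

print(duplicate_rejected, list(server.sub(job_id)), server.inf(job_id), server.get(job_id))
```

Because the duplicate is rejected in the reply to RUN itself, nothing is written to any channel the original job's client is reading.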