Specification
In the case of vaults clone we have a unary call. The call stays active for the duration of the cloning process, which in the background is a compound operation over complex streams. For a very large vault, the client-level unary call times out before the operation can complete.
To fix this we need to change the unary call to a server streaming call. We then need to stream progress updates back to the client so that each update resets the timeout timer and the call does not time out. In the case of vaults clone this means sending the cloning progress periodically, either every 5% or every few seconds. When the clone is complete, we send over details such as the vault name and the new vault ID for the cloned vault. In most cases we can't know the absolute progress, but we can still report an arbitrary measure of the progress made so far. This progress must be output on stderr like any other feedback output.
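The "every 5% or every few seconds" rule above could be sketched as a small throttle that the handler consults before sending an update. This is only an illustration: `ProgressThrottle` and its parameters are hypothetical names, not part of Polykey or js-rpc, and the percentage is optional because absolute progress is often unknown.

```typescript
// Hypothetical helper, not part of Polykey or js-rpc.
// Decides when a progress update should be sent: either the known
// percentage advanced by at least `stepPercent`, or `intervalMs` has
// elapsed since the last update (covers the unknown-progress case).
class ProgressThrottle {
  private lastPercent = 0;
  private lastTime: number;
  private readonly stepPercent: number;
  private readonly intervalMs: number;

  constructor(stepPercent: number, intervalMs: number, startTime: number) {
    this.stepPercent = stepPercent;
    this.intervalMs = intervalMs;
    this.lastTime = startTime;
  }

  // `now` is a millisecond timestamp; `percent` is omitted when the
  // absolute progress is not known.
  shouldEmit(now: number, percent?: number): boolean {
    const stepReached =
      percent !== undefined && percent - this.lastPercent >= this.stepPercent;
    const intervalElapsed = now - this.lastTime >= this.intervalMs;
    if (!stepReached && !intervalElapsed) return false;
    // Record what was just reported so the next decision is relative to it.
    if (percent !== undefined) this.lastPercent = percent;
    this.lastTime = now;
    return true;
  }
}
```

A handler would construct this with something like `new ProgressThrottle(5, 2000, Date.now())` and call `shouldEmit(Date.now(), percent)` after each unit of work. The time-based branch is what keeps the call alive when no percentage is available, so the interval would need to be comfortably shorter than the client timeout.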
This change to streaming progress updates needs to apply to any client RPC call that waits on a complex or long-running task. As far as I can tell this applies only to vault cloning, vault pulling and cross-signing claims, but we'll need to investigate this more deeply.
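A minimal sketch of the pattern, assuming the server stream can be modelled as an async iterable: the handler yields progress messages followed by a final result, and the client treats every received message as a reason to restart its idle timeout. The message shapes and the names `cloneVaultHandler`, `consumeWithIdleTimeout` and `formatProgress` are hypothetical and are not the js-rpc API.

```typescript
// Hypothetical message shapes, not taken from the Polykey codebase.
type ProgressMessage = { type: 'progress'; detail: string };
type ResultMessage = { type: 'result'; vaultName: string; vaultId: string };
type CloneMessage = ProgressMessage | ResultMessage;

// Stand-in for a server streaming handler: progress first, result last.
async function* cloneVaultHandler(chunks: number): AsyncGenerator<CloneMessage> {
  for (let i = 1; i <= chunks; i++) {
    await new Promise<void>((resolve) => setTimeout(resolve, 1)); // simulated work
    yield { type: 'progress', detail: `chunk ${i} of ${chunks}` };
  }
  yield { type: 'result', vaultName: 'example-vault', vaultId: 'example-id' };
}

// Feedback line intended for stderr.
function formatProgress(message: ProgressMessage): string {
  return `clone progress: ${message.detail}`;
}

// Reads the stream, failing only if no message arrives within `idleMs`.
// A fresh timer is started for every read, so each message resets the timeout.
async function consumeWithIdleTimeout(
  stream: AsyncIterable<CloneMessage>,
  idleMs: number,
  onProgress: (message: ProgressMessage) => void,
): Promise<ResultMessage> {
  const iterator = stream[Symbol.asyncIterator]();
  try {
    while (true) {
      let timer: ReturnType<typeof setTimeout> | undefined;
      const timeout = new Promise<never>((_, reject) => {
        timer = setTimeout(
          () => reject(new Error(`no message within ${idleMs}ms`)),
          idleMs,
        );
      });
      let next: IteratorResult<CloneMessage>;
      try {
        next = await Promise.race([iterator.next(), timeout]);
      } finally {
        clearTimeout(timer);
      }
      if (next.done) throw new Error('stream ended before a result was sent');
      if (next.value.type === 'result') return next.value;
      onProgress(next.value);
    }
  } finally {
    // Ask the producer to stop; not awaited, since a stalled producer
    // might never settle this promise.
    iterator.return?.()?.catch(() => {});
  }
}

async function main(): Promise<void> {
  const result = await consumeWithIdleTimeout(cloneVaultHandler(3), 1000, (m) =>
    console.error(formatProgress(m)), // progress goes to stderr
  );
  console.log(`${result.vaultName}\t${result.vaultId}`); // final details on stdout
}
```

Calling `main()` runs the simulated clone end to end. The same consumer shape would apply to any other long-running call that is converted, since it depends only on the stream ending with a result message and not on what the progress messages contain.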
Additional context
- Related: Polykey-CLI#185 - The client call timing out is very likely leading to an orphaned agent-agent RPC call and leaking a resource. This is likely what's causing the agent to lock up when shutting down.
- Related: Polykey-CLI#198 - This also later triggers the agent crashing when the QUIC connection times out.
- Polykey-CLI#74
- js-rpc#52
- js-rpc#57
Tasks
- Identify all client RPC calls that wrap complex, long-running operations
- Refactor these RPC calls into server streaming calls that send over progress updates. The details of the progress updates matter less than the fact that they reset the timeout for the call.