rpc: free buffer after client disconnect #7378
  switch (cmd) {
      case ALLOC_BUFFER: {
-         rpc_alloc_buffer(backend, input, output);
+         allocated_buffers.push_back(rpc_alloc_buffer(backend, input, output));
Add the allocated buffer to the list.
      }
      case FREE_BUFFER: {
-         rpc_free_buffer(input);
+         allocated_buffers.remove(rpc_free_buffer(input));
Remove the freed buffer from the list.
+ for (auto buff: allocated_buffers) {
+     ggml_backend_buffer_free(buff);
+ }
Free the remaining buffers.
I think a better approach would be to track allocated buffers with |
emm, just forgot the also, for the but agree that |
Closing this PR to wait for @rgerganov's fix. Related discussion: #7407
In PR #6829, @rgerganov added support for the RPC backend. After using it for several days, I noticed an issue:
Upon investigating the source code, I discovered that instead of releasing the memory, we simply exit the inner loop and immediately wait for a new connection (ggml-rpc.cpp#L1027).
This PR monitors the ALLOC_BUFFER and FREE_BUFFER commands, maintains a list of the buffers allocated during a connection, and frees any remaining buffers after the client disconnects.
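The bookkeeping described above can be sketched in isolation. This is a minimal illustration, not the code from this PR or from ggml-rpc.cpp: fake_buffer, fake_alloc, fake_free and the session class are all hypothetical stand-ins (the real server works with ggml backend buffers and ggml_backend_buffer_free), and the choice of std::unordered_set as the container is an assumption made for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Hypothetical stand-in for a backend buffer; not a ggml type.
struct fake_buffer {
    std::size_t size;
};

// Counts buffers that are currently allocated, so leaks are observable.
static int g_live_buffers = 0;

// Hypothetical allocator pair standing in for the backend alloc/free calls.
fake_buffer * fake_alloc(std::size_t size) {
    ++g_live_buffers;
    return new fake_buffer{size};
}

void fake_free(fake_buffer * buf) {
    assert(buf != nullptr);
    --g_live_buffers;
    delete buf;
}

// Tracks every buffer allocated on behalf of one client connection.
class session {
public:
    // Handles an ALLOC_BUFFER-style command: allocate and remember the buffer.
    fake_buffer * on_alloc(std::size_t size) {
        fake_buffer * buf = fake_alloc(size);
        buffers_.insert(buf);
        return buf;
    }

    // Handles a FREE_BUFFER-style command. Only buffers this session owns are
    // freed; an unknown handle is rejected instead of being passed to free.
    bool on_free(fake_buffer * buf) {
        if (buffers_.erase(buf) == 0) {
            return false;
        }
        fake_free(buf);
        return true;
    }

    // Called when the client disconnects: release whatever it left behind.
    void on_disconnect() {
        for (fake_buffer * buf : buffers_) {
            fake_free(buf);
        }
        buffers_.clear();
    }

    std::size_t tracked() const { return buffers_.size(); }

private:
    std::unordered_set<fake_buffer *> buffers_;
};
```

In this sketch a set is used rather than a list so that removal on free is a lookup by handle and so that a handle the session never allocated can be detected before it is freed.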