check FIM response for expected format #73
Which version of
FYI, these 2 changes from last week should have fixed the problem:
If you still spot the error with a build that includes these fixes, let me know.
Now that you mention it, I haven't seen that "sequence 0" error yet this week. I update the server almost daily, so that probably explains it. I was using Qwen 2.5 14B coder, by the way. However, this PR wasn't designed to fix only that specific issue, but to handle unexpected server responses more generally. I still get those almost daily when running over a network, as my home wifi is OK but not amazing. So it would still be useful to merge this, I think.
Partially fixes #61 by handling errors gracefully (though it will not help users discover that their entire server is set up incorrectly).
Also does not explain why llama-server sometimes has "sequence 0 does not start from the last position stored in the memory" errors. But it does handle unexpected server responses gracefully.
This approach handles unexpected issues upon server response in fim_on_response, avoiding taking up slots in the cache with invalid responses. You can test it with a couple of examples by entering these commands:
Not valid JSON (endpoint returns HTML)
Valid JSON missing the "content" key
It might be overkill to check the JSON string before decoding; I was just worried about small performance hits from unnecessarily doing full JSON decodes, especially since this can happen on every keypress if the responses are invalid, since there will never be cache hits.
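To make the idea concrete, here is a rough Python sketch of the same two-stage check: a cheap string test first, a full decode only when that passes, and a cache write only for usable responses. This is not the code in this PR (the plugin is not written in Python); the names extract_fim_content and store_if_valid, and the shape of the cache, are hypothetical and only illustrate the approach.

```python
import json
from typing import Dict, Optional


def extract_fim_content(raw: str) -> Optional[str]:
    """Return the completion text, or None if the response is unusable.

    Hypothetical helper: illustrates the check, not the plugin's actual code.
    """
    stripped = raw.lstrip()
    # Cheap pre-check: skip the full decode for anything that cannot be
    # a JSON object carrying a "content" key (e.g. an HTML error page).
    if not stripped.startswith("{") or '"content"' not in stripped:
        return None
    try:
        data = json.loads(stripped)
    except json.JSONDecodeError:
        # Looked like JSON but was truncated or otherwise malformed.
        return None
    if not isinstance(data, dict):
        return None
    content = data.get("content")
    return content if isinstance(content, str) else None


def store_if_valid(cache: Dict[str, str], key: str, raw: str) -> bool:
    """Cache the raw response only when it has the expected format."""
    if extract_fim_content(raw) is None:
        return False  # invalid responses never take up a cache slot
    cache[key] = raw
    return True
```

The pre-check is only a filter: a response that passes it can still fail the decode, so both stages are needed. Whether the string test saves a measurable amount of time over always decoding is exactly the open question raised above.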