feature: unbuffered token stream

This should now be fairly easy, at least for the llama.cpp backend, thanks to @noxer's contribution ( :heart: ) in https://github.com/go-skynet/go-llama.cpp/pull/28. It's now just a matter of wiring things up in the SSE callback here in the server.

- [x] go-llama.cpp
- [ ] gpt4all.cpp
- [ ] gpt2.cpp
- [x] rwkv.cpp