First off, thank you for the great work! I've been using koboldcpp for a while now and it works pretty well.
I have a few feature suggestions for the APIs though:
API keys: This would be great for when you want to expose the endpoint to the web. It would be good if multiple API keys could be used; perhaps they could be stored in a user-specified file.
Endpoint selection: The ability to choose which endpoint type(s) to expose: only the Kobold API, only the OpenAI-compatible API, or both.
Chat template config file (with an option to force it, ignoring the adapter object passed to the API and using only the config file): This would keep client-side code cleaner, since clients would not have to specify the adapter (OpenAI compat API adapter #466).
I know this probably isn't koboldcpp's intended use case, but these would be really nice to have. Together they would make koboldcpp a seamless replacement for the official OpenAI API.
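To make the API key suggestion more concrete, here is a rough sketch of what I have in mind. It is not based on anything in koboldcpp: the file layout (one key per line, `#` for comments), the function names `load_api_keys` and `is_authorized`, and the use of an `Authorization: Bearer <key>` header (which is how OpenAI clients send their key) are all my own assumptions.

```python
import hmac


def load_api_keys(path):
    """Read one API key per line from a user-specified file (hypothetical format).

    Blank lines and lines starting with '#' are ignored.
    """
    with open(path, encoding="utf-8") as f:
        return {
            line.strip()
            for line in f
            if line.strip() and not line.lstrip().startswith("#")
        }


def is_authorized(authorization_header, valid_keys):
    """Return True if the header carries 'Bearer <key>' with a known key."""
    prefix = "Bearer "
    if not authorization_header or not authorization_header.startswith(prefix):
        return False
    token = authorization_header[len(prefix):].strip()
    # compare_digest avoids leaking key contents through timing differences.
    return any(
        hmac.compare_digest(token.encode(), key.encode()) for key in valid_keys
    )
```

A server would call `load_api_keys` once at startup and run `is_authorized` on each request's `Authorization` header before handling it.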
Also, a small question: with that adapter, would it be possible to use LLaMA 2 Chat templating as shown below?
[INST] <<SYS>>
You are a helpful assistant.
<</SYS>>
{prompt 0} [/INST] {response 0} </s><s>[INST] {prompt 1} [/INST] {response 1} </s><s>...[INST] {prompt N} [/INST]
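For reference, this is roughly how I build that prompt on the client side at the moment. It is only a sketch that follows the layout above literally: the function name `build_llama2_prompt` and the `(prompt, response)` turn format are my own, not part of koboldcpp or its adapter.

```python
def build_llama2_prompt(system, turns):
    """Render a conversation in the LLaMA 2 Chat layout shown above.

    turns is a list of (prompt, response) pairs; the final response is
    None when the model is expected to generate it.
    """
    parts = []
    for i, (prompt, response) in enumerate(turns):
        if i == 0 and system:
            # The system prompt is wrapped in <<SYS>> tags inside the first [INST].
            prompt = f"<<SYS>>\n{system}\n<</SYS>>\n{prompt}"
        segment = f"[INST] {prompt} [/INST]"
        if response is not None:
            # Completed turns are closed with </s> and the next one opens with <s>.
            segment += f" {response} </s><s>"
        parts.append(segment)
    return "".join(parts)


print(build_llama2_prompt(
    "You are a helpful assistant.",
    [("Hi", "Hello!"), ("How are you?", None)],
))
```

Ideally the server would do this from the plain `messages` list so that clients would not need any of it.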