server : add default-model preset and fallback logic#19855
server : add default-model preset and fallback logic#19855mikhail-shevtsov-wiregate wants to merge 1 commit intoggml-org:masterfrom
Conversation
329368a to
23783ea
Compare
|
Is there anything I can do to merge this "Quality of Life" feature improvement? Thanks, |
ngxson
left a comment
There was a problem hiding this comment.
This option is unclear, how the config below behave?
default-model = model_a
[model_b]
default-model = model_c
Is it |
| ).set_env(COMMON_ARG_PRESET_LOAD_ON_STARTUP).set_preset_only()); | ||
|
|
||
| args.push_back(common_arg( | ||
| {"default-model"}, "NAME", |
There was a problem hiding this comment.
You mentioned in the description:
Start the server in router mode with default-model = true in required preset.
It's confusing: default-model = true or default-model = NAME ?
There was a problem hiding this comment.
multiple models can be load-on-startup, but only one single can be default-model
unless this is made clear, I don't think this proposal can be accepted as-is.
There was a problem hiding this comment.
Ah...! I see. Thanks for clarification. That makes a lot of sense. I will rewrite this one.
There was a problem hiding this comment.
@ngxson I've added a simple condition that will show a warning when multiple default-mode = true options are detected. I've tried to keep logic as simple as possible to avoid any confusion. I've updated PR description to include example. I've also updated README.md. I've re-tested this changed 3 times and rebased commit to latest upstream. I've forced pushed this commit. Please take a look and let me know if you would like to make additional changes.
There was a problem hiding this comment.
I've updated only tools/server/server-models.cpp and tools/server/README.md files. Everything else stayed the same.
@ngxson I've updated autogenerated PR description to avoid any confusion. |
d3635a5 to
0196e22
Compare
|
@ngxson is there anything else you would like to see in this feature to get merged? |
0196e22 to
be9837b
Compare
|
@ngxson @ggerganov I've rebased this tiny feature with latest commit from |
|
this would be super useful, hope it gets merged soon |
be9837b to
e2df42d
Compare
|
@ngxson @ggerganov I've rebased the code with latest commit. I would really appreciate it if you let me know whether this PR has the future. Or this feature is not required? If it has the future but missing something like unit tests - please tell me. As I don't want to waste time of the maintainers. |
|
|
||
| We also offer additional options that are exclusive to presets (these aren't treated as command-line arguments): | ||
| - `load-on-startup` (boolean): Controls whether the model loads automatically when the server starts | ||
| - `default-model` (boolean): The model to use when no model is specified in a request or the model is not found. |
There was a problem hiding this comment.
The description is honestly confusing and wrong. The model to use indicates you state a name as as string. I think Use this model when... would be better. Also of course When multiple default-model options are found, only the first one will be used is incorrect. The first true will probably be the one selected?
Summary
default-model = truethat can be used in server router mode to specify a default model.default-modeloptions are found only the first one will be used.without a model parameter or with an unknown model use the default.
Motivation
When the server runs in router mode, clients may omit the model field or request a model that isn’t loaded.
Previously this caused an error; the new preset allows the server to automatically use a pre‑selected default model, improving robustness and usability.
How to test:
default-model = truein required preset in presets file.<your-default-model>.<your-default-model>instead of returning an error.Example:
Impact