llama: max ctx by default, fix fit magic number #18567
JohannesGaessler wants to merge 3 commits into ggml-org:master
Conversation
Hmmm, what's with the sign disparity between common and llama params?
I think the biggest reason is simply that no one invested the effort to make things consistent. I would suggest that going forward, if we have non-negative quantities like the context size, we store them internally as unsigned values but use signed values in
Force-pushed from 60db254 to bf2ee7b
ggerganov left a comment:
An alternative approach would be to keep the changes entirely in the common API.
This seems like the better approach.
// https://github.com/ggml-org/llama.cpp/pull/7544
struct llama_context_params {
-    uint32_t n_ctx; // text context, 0 = from model
+    int32_t  n_ctx; // context size in tokens, use llama_model_n_ctx_train for values <= 0
Hm, I don't see why this change is needed. It looks like it allows negative values with the same functionality as n_ctx == 0, so user code can simply use params.n_ctx = max(0, n_ctx); to achieve the same.
Superseded by #19070.
Fixes #18376.
The intent with llama_params_fit is to only change those parameters that a user is not setting explicitly themself. However, if a user explicitly sets -c 0, this is currently being changed by llama_params_fit. I would propose the following changes to fix this:

- Change llama_context_params::n_ctx from uint32_t to int32_t, with all values <= 0 being interpreted to mean that the full context should be used.
- Change the default in llama_context_default_params from 512 to -1. This would mean that with the llama C API the full context of the model would be used by default, consistent with the common API.
- In llama_params_fit, check the context size against the new default value instead of vs. 0.
- This makes it possible to distinguish an explicit -c 0 from an unset --ctx-size: by default both use the full context but differ w.r.t. whether they should be changed by llama_params_fit.

An alternative approach would be to keep the changes entirely in the common API.