
llama-fit-params: keep explicit --ctx-size 0 #19070

Merged
JohannesGaessler merged 1 commit into ggml-org:master from JohannesGaessler:llama-fp-fix-ctx-magic-3
Jan 24, 2026

Conversation

@JohannesGaessler
Contributor

Fixes #18376, an alternative to #18567. If the user explicitly sets --ctx-size 0, set the minimum context size to UINT32_MAX to prevent llama_params_fit from reducing it.
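The idea can be sketched as follows. This is a minimal illustration, not the actual llama.cpp code: the struct and function names (`FitParams`, `resolve_ctx_min`) are hypothetical stand-ins for the real fitting logic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch of the --ctx-size 0 handling described above.
// In llama.cpp, n_ctx == 0 conventionally means "use the model's full
// training context"; the fitter must not shrink it in that case.
struct FitParams {
    uint32_t n_ctx;        // requested context size (0 = model maximum)
    bool     ctx_explicit; // true if the user passed --ctx-size on the CLI
};

// Returns the minimum context size the fitting logic may reduce to.
// An explicit --ctx-size 0 pins the minimum to UINT32_MAX, so no
// attainable reduction can go below it and the context stays untouched.
uint32_t resolve_ctx_min(const FitParams & p) {
    if (p.ctx_explicit && p.n_ctx == 0) {
        return UINT32_MAX; // never reduce: user asked for the full context
    }
    return 0; // fitter is free to reduce the context to make things fit
}
```

Using UINT32_MAX as the floor avoids a special-case branch in the fitting loop itself: the existing "don't go below the minimum" check handles it.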

Member

@ggerganov left a comment


@JohannesGaessler On a related note: for LlamaBarn we need a mechanism to query how much memory would be needed for a context of size -c N. Would you accept extending llama-fit-params to provide such information, and if so, can you recommend a way to implement it in terms of llama-fit-params UX and libllama API modifications (if needed)?

@JohannesGaessler
Contributor Author

As of right now, llama_params_fit internally sets llama_model_params::no_alloc = true, creates llama_model and llama_context instances, and then calls llama_context::memory_breakdown. I would suggest that, rather than touching llama_params_fit, an external project simply retrieve this information itself. The problem is that the return format of llama_context::memory_breakdown is currently std::map<ggml_backend_buffer_type_t, llama_memory_breakdown_data>, which is not C-compatible. That functionality therefore needs to be exposed either in a C-compatible way or in llama-cpp.h.
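One conventional way to expose a std::map through a C-compatible boundary is to flatten it into a caller-provided array using the usual two-call pattern. The sketch below is purely illustrative: the stand-in type definitions and the helper name (`flatten_breakdown`, `llama_memory_breakdown_entry`) are hypothetical, not part of the actual llama.cpp API.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Stand-ins for llama.cpp types; the real definitions live in the library.
typedef int ggml_backend_buffer_type_t; // opaque handle in the real API
struct llama_memory_breakdown_data {
    size_t model;   // bytes for model weights
    size_t context; // bytes for KV cache / context
    size_t compute; // bytes for compute buffers
};

// A hypothetical C-compatible row: one entry per backend buffer type.
struct llama_memory_breakdown_entry {
    ggml_backend_buffer_type_t buft;
    llama_memory_breakdown_data data;
};

// Hypothetical flattening helper: copies the map into a caller-provided
// array. Calling with out == nullptr returns the required entry count,
// so a C caller can size its buffer first, then fill it (two-call pattern).
size_t flatten_breakdown(
        const std::map<ggml_backend_buffer_type_t, llama_memory_breakdown_data> & mb,
        llama_memory_breakdown_entry * out, size_t cap) {
    if (out == nullptr) {
        return mb.size();
    }
    size_t n = 0;
    for (const auto & kv : mb) {
        if (n >= cap) break;
        out[n++] = { kv.first, kv.second };
    }
    return n;
}
```

A wrapper of this shape could live behind either llama.h (plain C) or llama-cpp.h, matching the two options mentioned above.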

@JohannesGaessler JohannesGaessler merged commit e9fd8dc into ggml-org:master Jan 24, 2026
77 of 78 checks passed
shaofeiqi pushed a commit to qualcomm/llama.cpp that referenced this pull request Feb 6, 2026
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026


Successfully merging this pull request may close these issues.

Misc. bug: Maximum Context Size
