
llama-fit-params: keep explicit --ctx-size 0 #19070

Merged
JohannesGaessler merged 1 commit into ggml-org:master from JohannesGaessler:llama-fp-fix-ctx-magic-3
Jan 24, 2026

Conversation

@JohannesGaessler
Contributor

Fixes #18376, an alternative to #18567. If the user explicitly sets --ctx-size 0, set the minimum context size to UINT32_MAX to prevent llama_params_fit from reducing it.
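The idea can be sketched as follows. This is a minimal illustration, not the actual llama.cpp code: the struct and function names (`FitParams`, `resolve_ctx_min`) are hypothetical stand-ins for the real fitting logic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch of the --ctx-size 0 handling described above.
// In llama.cpp, n_ctx == 0 conventionally means "use the model's full
// training context"; the fitter must not shrink it in that case.
struct FitParams {
    uint32_t n_ctx;        // requested context size (0 = model maximum)
    bool     ctx_explicit; // true if the user passed --ctx-size on the CLI
};

// Returns the minimum context size the fitting logic may reduce to.
// An explicit --ctx-size 0 pins the minimum to UINT32_MAX, so no
// attainable reduction can go below it and the context stays untouched.
uint32_t resolve_ctx_min(const FitParams & p) {
    if (p.ctx_explicit && p.n_ctx == 0) {
        return UINT32_MAX; // never reduce: user asked for the full context
    }
    return 0; // fitter is free to reduce the context to make things fit
}
```

Using UINT32_MAX as the floor avoids a special-case branch in the fitting loop itself: the existing "don't go below the minimum" check handles it.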

Member

@ggerganov left a comment


@JohannesGaessler On a related note: for LlamaBarn we need a mechanism to query how much memory would be needed for a context of size -c N. Would you accept extending llama-fit-params to provide such information, and if so, can you recommend a way to implement it in terms of llama-fit-params UX and libllama API modifications (if needed)?

@JohannesGaessler
Contributor Author

As of right now, llama_params_fit internally sets llama_model_params::no_alloc = true, creates llama_model and llama_context instances, and then calls llama_context::memory_breakdown. I would suggest that, rather than touching llama_params_fit, an external project simply retrieve this information itself. The problem is that the return format of llama_context::memory_breakdown is currently std::map<ggml_backend_buffer_type_t, llama_memory_breakdown_data>, which is not C-compatible. That functionality therefore needs to be exposed either in a C-compatible way or in llama-cpp.h.
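One conventional way to expose a std::map through a C-compatible boundary is to flatten it into a caller-provided array using the usual two-call pattern. The sketch below is purely illustrative: the stand-in type definitions and the helper name (`flatten_breakdown`, `llama_memory_breakdown_entry`) are hypothetical, not part of the actual llama.cpp API.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Stand-ins for llama.cpp types; the real definitions live in the library.
typedef int ggml_backend_buffer_type_t; // opaque handle in the real API
struct llama_memory_breakdown_data {
    size_t model;   // bytes for model weights
    size_t context; // bytes for KV cache / context
    size_t compute; // bytes for compute buffers
};

// A hypothetical C-compatible row: one entry per backend buffer type.
struct llama_memory_breakdown_entry {
    ggml_backend_buffer_type_t buft;
    llama_memory_breakdown_data data;
};

// Hypothetical flattening helper: copies the map into a caller-provided
// array. Calling with out == nullptr returns the required entry count,
// so a C caller can size its buffer first, then fill it (two-call pattern).
size_t flatten_breakdown(
        const std::map<ggml_backend_buffer_type_t, llama_memory_breakdown_data> & mb,
        llama_memory_breakdown_entry * out, size_t cap) {
    if (out == nullptr) {
        return mb.size();
    }
    size_t n = 0;
    for (const auto & kv : mb) {
        if (n >= cap) break;
        out[n++] = { kv.first, kv.second };
    }
    return n;
}
```

A wrapper of this shape could live behind either llama.h (plain C) or llama-cpp.h, matching the two options mentioned above.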

@JohannesGaessler JohannesGaessler merged commit e9fd8dc into ggml-org:master Jan 24, 2026
77 of 78 checks passed
shaofeiqi pushed a commit to qualcomm/llama.cpp that referenced this pull request Feb 6, 2026
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026


Successfully merging this pull request may close these issues.

Misc. bug: Maximum Context Size
