llama : add fd-based model loading via llama_model_load_from_fd (REWORK) #20402

Closed

Siddhesh2377 wants to merge 4 commits into ggml-org:master from Siddhesh2377:fd-loading

Conversation

@Siddhesh2377 Contributor

Adds llama_model_load_from_fd() to load GGUF models from a POSIX file descriptor instead of a file path.

On Android, apps accessing user files through SAF only get a file descriptor, not a path. The alternative is copying the model into app storage or requesting MANAGE_EXTERNAL_STORAGE, which gets rejected by Google Play. This happened with my app (ToolNeuron).

This is a reworked version of a previous PR that was closed over code-quality concerns.

Not supported on Windows. The fd is dup()'d internally, so the caller retains ownership of the original descriptor.

Tested locally with CI and a real model (vocab_only + mmap).

@Siddhesh2377 Siddhesh2377 changed the title llama : add fd-based model loading via llama_model_load_from_fd llama : add fd-based model loading via llama_model_load_from_fd ( REWORK ) Mar 11, 2026
@github-actions github-actions bot added the testing (Everything test related) and ggml (changes relating to the ggml tensor library for machine learning) labels Mar 11, 2026
Comment thread ggml/src/gguf.cpp Outdated
Comment thread ggml/include/gguf.h Outdated

GGML_API struct gguf_context * gguf_init_empty(void);
GGML_API struct gguf_context * gguf_init_from_file(const char * fname, struct gguf_init_params params);
GGML_API struct gguf_context * gguf_init_from_fd(int fd, struct gguf_init_params params);
Contributor

For your purposes, would it work to expose the current gguf_init_from_file_impl as gguf_init_from_file_ptr and to use that as the basis for the implementation instead? That way we would be able to also use this code on Windows in conjunction with ggml_fopen.

Contributor Author

@Siddhesh2377 Mar 13, 2026

Done: replaced gguf_init_from_fd with gguf_init_from_file_ptr(FILE *) and moved the dup+fdopen logic up to llama-model-loader.

Contributor

The llama C API should also use a file pointer if at all possible, the conversion from file descriptor to file pointer should be in your user code.

Contributor Author

Done, switched the llama C API to a FILE pointer as well. The test shows the fd-to-FILE* conversion on the caller side.

Contributor

@JohannesGaessler left a comment

Maybe I should have made my intent more clear: The GGUF and llama APIs should be using a file pointer rather than a file descriptor because that has Windows compatibility. But then that is also what should be used internally because conversions between the two add unnecessary complexity. Your PR should not be using file descriptors anywhere, please consistently use file pointers.

Comment thread ggml/src/gguf.cpp Outdated
Comment on lines +856 to +862
struct gguf_context * gguf_init_from_file_ptr(FILE * file, struct gguf_init_params params) {
if (!file) {
return nullptr;
}
return gguf_init_from_file_impl(file, params);
}

Contributor

Just rename gguf_init_from_file_impl to gguf_init_from_file_ptr and add the check there. Keep the check in gguf_init_from_file since it is associated with a warning.

Comment thread include/llama.h Outdated
struct llama_model_params params);

// Load a model from an open FILE pointer
LLAMA_API struct llama_model * llama_model_load_from_file_ptr(FILE * file, struct llama_model_params params);
Contributor

Please keep the formatting consistent with the surrounding code.

Contributor Author

Done, and sorry for the misunderstanding :)

Comment thread src/llama-mmap.h
Comment on lines -23 to +24
int file_id() const; // fileno overload
int file_id() const;
Contributor

What did "fileno" refer to here and why did you remove this comment?

Comment thread src/llama.cpp
Comment on lines -893 to +897
GGML_ASSERT((metadata == nullptr) != path_model.empty() && "exactly one out of metadata and path_model needs to be defined");
if (metadata == nullptr && path_model.empty() && !file) {
LLAMA_LOG_ERROR("%s: no model source provided\n", __func__);
return nullptr;
}
Contributor

The logic should remain that exactly one out of the three things needs to be defined. Something like this should work:

GGML_ASSERT(int(metadata != nullptr) + int(!path_model.empty()) + int(file != nullptr) == 1 && "exactly one out of metadata, path_model, and file needs to be defined");

Contributor

I am currently adding file saving/loading to the recently added end-to-end tests in test-llama-archs.cpp via #20503. This will provide test coverage for your newly added code, so I don't think we need this additional test.

@Siddhesh2377 Siddhesh2377 closed this by deleting the head repository May 1, 2026