Prerequisites
Feature Description
Add a way to check if a model is supported on a given hardware backend.
Motivation
Currently, when an unsupported quantization is encountered (at least on Metal), asserts fire and the process crashes. On my machine, when called from a Rust GUI frontend, this manages to lock up the whole machine: Finder becomes unresponsive, force quit doesn't work, and a cold reboot is actually necessary.
I don't believe there's a way to catch the assert, so I'm looking for a way to check for support up front and fail gracefully in such cases.
Possible Implementation
Here I'd be looking for advice, but my initial thought would be to use the existing ggml_supports functions to check whether the model's tensors are supported.
I see that llama_model has some ggml_tensor members and a std::vector of llama_layer, which also contains tensors. I figure there may be more to it than that, so I'd need guidance here.
The function might return a bool indicating whether the model is supported, or some status code.
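To make the idea concrete, here is a minimal sketch of what such a check might look like. All types and functions below are simplified stand-ins: the real llama_model and ggml_tensor are more involved, and backend_supports_type stands in for whatever ggml_supports-style per-backend predicate would actually be used.

```cpp
#include <vector>

// Stand-in types; the real llama_model/ggml_tensor definitions are richer.
enum ggml_type { GGML_TYPE_F32, GGML_TYPE_F16, GGML_TYPE_Q4_0, GGML_TYPE_IQ2_XS };

struct ggml_tensor { ggml_type type; };

struct llama_layer {
    std::vector<ggml_tensor> tensors;
};

struct llama_model {
    std::vector<ggml_tensor> tensors;   // e.g. token embeddings, output
    std::vector<llama_layer> layers;
};

// Hypothetical per-backend predicate, standing in for the existing
// ggml_supports-style functions. Here we pretend a Metal-like backend
// has no kernel for IQ2_XS.
static bool backend_supports_type(ggml_type t) {
    return t != GGML_TYPE_IQ2_XS;
}

// Proposed check: true iff every tensor in the model is supported
// by the backend. Could equally return a status code.
static bool llama_model_supported(const llama_model & model) {
    for (const auto & t : model.tensors) {
        if (!backend_supports_type(t.type)) return false;
    }
    for (const auto & layer : model.layers) {
        for (const auto & t : layer.tensors) {
            if (!backend_supports_type(t.type)) return false;
        }
    }
    return true;
}
```

A frontend could call this right after loading the model and refuse to run inference on that backend if it returns false.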
Alternatively, what would solve my immediate issue is for ggml_metal_graph_compute to return an error code in cases where a model is unsupported. I see it already returns a ggml_status, where GGML_STATUS_FAILED indicates failure. I'm assuming there's a good reason this behavior isn't already the case.
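For the alternative, caller-side handling could look roughly like the sketch below. The enum is a stand-in for the ggml_status in ggml.h, and compute_graph is a hypothetical stand-in for ggml_metal_graph_compute returning GGML_STATUS_FAILED on an unsupported graph instead of asserting.

```c
#include <stdio.h>

// Stand-in for the ggml_status enum declared in ggml.h.
enum ggml_status {
    GGML_STATUS_FAILED  = -1,
    GGML_STATUS_SUCCESS =  0,
};

// Hypothetical stand-in for ggml_metal_graph_compute: returns
// GGML_STATUS_FAILED for a graph containing an unsupported op
// rather than hitting an assert.
static enum ggml_status compute_graph(int has_unsupported_op) {
    return has_unsupported_op ? GGML_STATUS_FAILED : GGML_STATUS_SUCCESS;
}

// Caller-side handling: surface the error to the frontend and
// propagate it instead of crashing the process.
static int run_inference(int has_unsupported_op) {
    enum ggml_status st = compute_graph(has_unsupported_op);
    if (st != GGML_STATUS_SUCCESS) {
        fprintf(stderr, "graph compute failed on this backend\n");
        return 1;
    }
    return 0;
}
```

With this shape, a GUI frontend could show an error dialog on failure rather than taking down the whole machine.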
I'd be happy to work on and contribute such a feature if it would be of use.