Note: This issue was copied from ggml-org#16012
Original Author: @samteezy
Original Issue Number: ggml-org#16012
Created: 2025-09-15T13:55:42Z
Feature Description
After running into ggml-org#15804 and being directed to PR ggml-org#14236, I suggest adding --mmproj-device as an argument, to be consistent with how device selection works elsewhere in llama.cpp.
This should be available in both llama-cli and llama-server.
Motivation
Users may want to specify different devices for different configs, and managing an env var is not ideal, nor is it consistent with how similar features are already set up within llama.cpp. In my setup, my more powerful, newer GPU is recognized as Vulkan1 and an older one as Vulkan0.
I should add that I'm actually having trouble getting the existing environment variable MTMD_BACKEND_DEVICE to work. Whether I set it to Vulkan1 or 1, llama-server still uses the default Vulkan0. As a result, vision inference tk/s performance takes a dive, since Vulkan0 is a bottleneck.
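To illustrate the request, here is a sketch of the two invocation styles: the current env-var approach mentioned above, and the proposed flag. Model and mmproj file names are placeholders, and the --mmproj-device flag is the proposal, not an existing option:

```shell
# Current approach: device for the multimodal projector is picked via an env var
MTMD_BACKEND_DEVICE=Vulkan1 llama-server -m model.gguf --mmproj mmproj.gguf

# Proposed: an explicit argument, consistent with existing device-selection flags
llama-server -m model.gguf --mmproj mmproj.gguf --mmproj-device Vulkan1
```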
Possible Implementation
No response