Add info about CUDA_VISIBLE_DEVICES #1682

Merged
SlyEcho merged 1 commit into master from docs-update on Jun 3, 2023
Conversation

@SlyEcho (Contributor) commented Jun 3, 2023

Add a sentence about GPU selection on CUDA.

Relevant: #1546

SlyEcho merged commit d8bd001 into master Jun 3, 2023
SlyEcho deleted the docs-update branch June 3, 2023 13:35
@roperscrossroads commented

@SlyEcho

Does this actually work? I struggled with this a few days ago using a slightly older version of llama.cpp. It kept loading the model onto my internal mobile 1050 Ti and running out of memory instead of using my 3090 (eGPU). I was doing something like this:

CUDA_VISIBLE_DEVICES=1 ./main -ngl 60 -m models/model.bin

It always went to the internal GPU with id 0.

Tomorrow morning I will give it another try, but I do not think it works if you use the IDs reported by nvidia-smi (in my case 1050 Ti: 0, 3090: 1).
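
Note: the mismatch described above is likely a device-ordering issue rather than anything specific to llama.cpp. nvidia-smi numbers GPUs by PCI bus order, while the CUDA runtime defaults to FASTEST_FIRST ordering, so the 3090 may actually be CUDA device 0 even though nvidia-smi lists it as 1. Setting CUDA_DEVICE_ORDER=PCI_BUS_ID is standard CUDA runtime behavior that makes the two numbering schemes agree, e.g.:

CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES=1 ./main -ngl 60 -m models/model.bin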

@JohannesGaessler (Contributor) commented

CUDA_VISIBLE_DEVICES does work on my test machine using the master branch. In any case, it should be possible to control this via CLI arguments before long. My current plan is to add something like a --tensor-split argument for the compute-heavy matrix multiplication tensors and a --main-gpu argument for all other tensors where multi-GPU wouldn't be worthwhile.
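
A sketch of how those proposed flags might look once implemented; the flag names come from the comment above, but the exact syntax and values here are illustrative assumptions, not a final interface:

# Hypothetical invocation: split the matrix-multiplication work 3:1
# across GPUs 0 and 1, and keep all remaining tensors on GPU 1.
./main -m models/model.bin -ngl 60 --tensor-split 3,1 --main-gpu 1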
