
ggml-blas: report memory when calling ggml_backend_blas_device_get_memory. #20150

Closed
pwhelan wants to merge 2 commits into ggml-org:master from pwhelan:blas-device-get-memory

Conversation

@pwhelan

@pwhelan pwhelan commented Mar 6, 2026

Implement memory information retrieval for ggml_backend_blas_device_get_memory. It uses sysinfo on Linux, which I have tested.

I copied the Windows version from #18578. I have not tested it on Windows, but the implementation looks sane, and I assume the original author was more familiar with that platform.

My motivation for fixing this is to allow lightweight testing, on my laptop with the BLAS backend, of code that queries a device's current memory usage.
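For reference, here is a minimal sketch of the Linux path described above, not the exact patch; the helper name is illustrative, and it simply fills *free and *total from sysinfo():

#include <sys/sysinfo.h>
#include <cstddef>

// Illustrative helper: report host memory via sysinfo (Linux only).
static void blas_device_get_memory_sketch(size_t * free, size_t * total) {
    struct sysinfo info;
    if (sysinfo(&info) == 0) {
        // mem_unit is the size in bytes of one memory unit reported by sysinfo
        *free  = (size_t) info.freeram  * info.mem_unit;
        *total = (size_t) info.totalram * info.mem_unit;
    } else {
        *free  = 0;
        *total = 0;
    }
}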

@pwhelan pwhelan changed the title from "ggml-blas: report memory when calling ggml_backend_blas_device_get_memory using sysinfo" to "ggml-blas: report memory when calling ggml_backend_blas_device_get_memory." Mar 6, 2026
@github-actions github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) Mar 6, 2026
@taronaeo
Member

taronaeo commented Mar 6, 2026

Unfortunately, it's not a bug. The BLAS backend uses host memory. Setting *free = 0; *total = 0; simply lets the CPU backend report the memory information instead, so no code has to be duplicated across backend implementations.

See:

llama.cpp/src/llama.cpp

Lines 118 to 127 in f7db3f3

// devices can return 0 bytes for free and total memory if they do not
// have any to report. in this case, we will use the host memory as a fallback
// fixes: https://github.com/ggml-org/llama.cpp/issues/18577
if (free == 0 && total == 0) {
ggml_backend_dev_t cpu_dev = ggml_backend_dev_by_type(GGML_BACKEND_DEVICE_TYPE_CPU);
if (cpu_dev == nullptr) {
throw std::runtime_error(format("%s: no CPU backend found", __func__));
}
ggml_backend_dev_memory(cpu_dev, &free, &total);
}
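In other words, the zeroed values in the BLAS backend are intentional: reporting no memory of its own triggers the host-memory fallback above. A rough sketch of that behavior, assuming the usual device get_memory signature:

// Sketch only: the stub reports zeros so llama.cpp falls back to the CPU
// device, which reports host memory; no sysinfo code is needed here.
static void ggml_backend_blas_device_get_memory(ggml_backend_dev_t dev, size_t * free, size_t * total) {
    *free  = 0;
    *total = 0;
    (void) dev; // unused
}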

@taronaeo
Member

taronaeo commented Mar 7, 2026

Closing as not a bug. Feel free to reopen if otherwise.

@taronaeo taronaeo closed this Mar 7, 2026
