Adding LLMKube to Infrastructure list on README by Defilan · Pull Request #20212 · ggml-org/llama.cpp

Defilan · 2026-03-07T19:58:50Z

LLMKube is a Kubernetes operator for llama.cpp. It handles model downloads, GPU scheduling (NVIDIA CUDA and Apple Silicon Metal), health probes, and Prometheus metrics through Model and InferenceService CRDs.

Related: #6546

Adding LLMKube to Infrastructure list on README

5dee568

Defilan requested a review from ggerganov as a code owner March 7, 2026 19:58

ggerganov merged commit a950479 into ggml-org:master Mar 8, 2026
1 check passed

bartowski1182 pushed a commit to bartowski1182/llama.cpp that referenced this pull request Mar 10, 2026

readme : update infra list (ggml-org#20212)

64aa130

Ethan-a2 pushed a commit to Ethan-a2/llama.cpp that referenced this pull request Mar 20, 2026

readme : update infra list (ggml-org#20212)

6ee64fb

Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026

readme : update infra list (ggml-org#20212)

a3867e9

rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026

readme : update infra list (ggml-org#20212)

ea2a713

ggerganov merged 1 commit into ggml-org:master from Defilan:docs/adding_llmkube
