Skip to content
@defilantech

Defilan Technologies

All Things Automation

Defilan Technologies

Open-source AI infrastructure for teams that need to own their stack.

We're a software company in Washington State building tools that make self-hosted LLM deployment practical on Kubernetes. Our work is open source, Apache 2.0 licensed, and designed for production use.


Our Projects

LLMKube — Kubernetes Operator for LLM Inference

A Kubernetes operator that turns LLM deployment into a two-line YAML problem. Define a Model and an InferenceService, and the operator handles the rest — downloading, caching, GPU scheduling, health checks, and exposing an OpenAI-compatible API.

llmkube deploy llama-3.1-8b --gpu

What makes it different:

  • Heterogeneous GPU support — NVIDIA CUDA and Apple Silicon Metal in the same cluster, managed by the same CRDs. The Metal Agent runs inference natively on macOS while Kubernetes handles orchestration.
  • OpenAI-compatible API — Drop-in replacement for OpenAI endpoints. Works with LangChain, LlamaIndex, and any OpenAI SDK.
  • Full observability — Prometheus metrics, OpenTelemetry tracing, and Grafana dashboards included.
  • Air-gap ready — Built for environments where cloud APIs aren't an option.

How We Work

Everything we build is open source first. We believe the best infrastructure software gets built in the open, with input from the people who actually use it.

We welcome contributions at every level — from filing issues and improving docs to adding new features. If you're interested in Kubernetes, GPU orchestration, or LLM infrastructure, we'd love to work with you.


Get in Touch

Star LLMKube on GitHub · Join the Discussion

Popular repositories Loading

  1. LLMKube LLMKube Public

    Kubernetes operator for GPU-accelerated LLM inference - air-gapped, edge-native, production-ready

    Go 29 4

  2. .github .github Public

  3. homebrew-tap homebrew-tap Public

    Homebrew tap for LLMKube

    Ruby

  4. issueparser issueparser Public

    LLM-powered GitHub issue theme analyzer. Scan repositories for issues and use AI to identify common pain points, recurring themes, and actionable insights.

    Go

Repositories

Showing 4 of 4 repositories

Top languages

Loading…

Most used topics

Loading…