Merged
22 commits
630e8c7
feat(diffusers): implement diffusers backend for image generation
ilopezluna Jan 14, 2026
1993372
feat(diffusers): add support for DDUF (Diffusers Unified Format) file…
ilopezluna Jan 14, 2026
c886d00
feat(dduf): implement DDUF format support and enhance model loading
ilopezluna Jan 15, 2026
3f7394e
feat(dduf): calculate total size of files and add human-readable size…
ilopezluna Jan 15, 2026
4c22678
feat(platform): restrict Diffusers support to Linux only until macOS …
ilopezluna Jan 15, 2026
c89429a
feat(diffusers): add support for DDUF file type handling in repositor…
ilopezluna Jan 15, 2026
a749093
feat(diffusers): sanitize log output for Diffusers arguments
ilopezluna Jan 15, 2026
e916b40
feat(docker): streamline Python server code copying in Dockerfile
ilopezluna Jan 15, 2026
d54ea4a
feat(docker): specify exact versions for Python packages in Dockerfile
ilopezluna Jan 15, 2026
2daa296
feat(model): add DDUF file support to packaging command and documenta…
ilopezluna Jan 15, 2026
cd090d5
Update pkg/distribution/internal/bundle/unpack.go
ilopezluna Jan 15, 2026
400f1a1
refactor(dduf): replace formatDDUFSize with formatSize and clean up u…
ilopezluna Jan 15, 2026
f62e4e5
feat(docker): add support for building and running Diffusers Docker i…
ilopezluna Jan 15, 2026
09d7117
Merge remote-tracking branch 'origin/add-stable-diffusion-backend' in…
ilopezluna Jan 15, 2026
a4f76e9
feat(client): add support for Diffusers format in GetSupportedFormats…
ilopezluna Jan 15, 2026
237b9dd
feat(docker): enhance GPU support for additional Docker image variants
ilopezluna Jan 15, 2026
7d16258
Merge branch 'main' into add-stable-diffusion-backend
ilopezluna Jan 15, 2026
200f84b
feat: add support for image-generation mode in backend operations
ilopezluna Jan 15, 2026
3562f8a
feat(loader): support fallback for image-generation mode in runner co…
ilopezluna Jan 16, 2026
cf31231
feat(diffusers): initialize Diffusers backend in main.go
ilopezluna Jan 16, 2026
587cdad
feat(diffusers): add error transformation for Python output and enhan…
ilopezluna Jan 16, 2026
78d8bcd
fix(scripts/docker-run): conditionally add nvidia runtime flags
doringeman Jan 16, 2026
60 changes: 59 additions & 1 deletion Dockerfile
@@ -75,7 +75,7 @@ ENV MODEL_RUNNER_PORT=12434
ENV LLAMA_SERVER_PATH=/app/bin
ENV HOME=/home/modelrunner
ENV MODELS_PATH=/models
Contributor Author
I've removed $LD_LIBRARY_PATH because I didn't find any usage. Please let me know if I'm missing anything here 🙏

-ENV LD_LIBRARY_PATH=/app/lib:$LD_LIBRARY_PATH
+ENV LD_LIBRARY_PATH=/app/lib

# Label the image so that it's hidden on cloud engines.
LABEL com.docker.desktop.service="model-runner"
@@ -144,6 +144,60 @@ RUN curl -LsSf https://astral.sh/uv/install.sh | sh \
&& ~/.local/bin/uv pip install --python /opt/sglang-env/bin/python "sglang==${SGLANG_VERSION}"

RUN /opt/sglang-env/bin/python -c "import sglang; print(sglang.__version__)" > /opt/sglang-env/version

+# --- Diffusers variant ---
+FROM llamacpp AS diffusers
+
+# Python package versions for reproducible builds
+ARG DIFFUSERS_VERSION=0.36.0
+ARG TORCH_VERSION=2.9.1
+ARG TRANSFORMERS_VERSION=4.57.5
+ARG ACCELERATE_VERSION=1.3.0
+ARG SAFETENSORS_VERSION=0.5.2
+ARG HUGGINGFACE_HUB_VERSION=0.34.0
+ARG BITSANDBYTES_VERSION=0.49.1
+ARG FASTAPI_VERSION=0.115.12
+ARG UVICORN_VERSION=0.34.1
+ARG PILLOW_VERSION=11.2.1
+
+USER root
+
+RUN apt update && apt install -y \
+python3 python3-venv python3-dev \
+curl ca-certificates build-essential \
+&& rm -rf /var/lib/apt/lists/*
+
+RUN mkdir -p /opt/diffusers-env && chown -R modelrunner:modelrunner /opt/diffusers-env
+
+USER modelrunner
+
+# Install uv and diffusers as modelrunner user
+RUN curl -LsSf https://astral.sh/uv/install.sh | sh \
+&& ~/.local/bin/uv venv --python /usr/bin/python3 /opt/diffusers-env \
+&& ~/.local/bin/uv pip install --python /opt/diffusers-env/bin/python \
+"diffusers==${DIFFUSERS_VERSION}" \
+"torch==${TORCH_VERSION}" \
+"transformers==${TRANSFORMERS_VERSION}" \
+"accelerate==${ACCELERATE_VERSION}" \
+"safetensors==${SAFETENSORS_VERSION}" \
+"huggingface_hub==${HUGGINGFACE_HUB_VERSION}" \
+"bitsandbytes==${BITSANDBYTES_VERSION}" \
+"fastapi==${FASTAPI_VERSION}" \
+"uvicorn[standard]==${UVICORN_VERSION}" \
+"pillow==${PILLOW_VERSION}"
+
+# Copy Python server code
+USER root
+COPY python/diffusers_server /tmp/diffusers_server/
+RUN PYTHON_SITE_PACKAGES=$(/opt/diffusers-env/bin/python -c "import site; print(site.getsitepackages()[0])") && \
+mkdir -p "$PYTHON_SITE_PACKAGES/diffusers_server" && \
+cp -r /tmp/diffusers_server/* "$PYTHON_SITE_PACKAGES/diffusers_server/" && \
+chown -R modelrunner:modelrunner "$PYTHON_SITE_PACKAGES/diffusers_server/" && \
+rm -rf /tmp/diffusers_server
+USER modelrunner
+
+RUN /opt/diffusers-env/bin/python -c "import diffusers; print(diffusers.__version__)" > /opt/diffusers-env/version

FROM llamacpp AS final-llamacpp
# Copy the built binary from builder
COPY --from=builder /app/model-runner /app/model-runner
@@ -155,3 +209,7 @@ COPY --from=builder /app/model-runner /app/model-runner
FROM sglang AS final-sglang
# Copy the built binary from builder-sglang (without vLLM)
COPY --from=builder-sglang /app/model-runner /app/model-runner

+FROM diffusers AS final-diffusers
+# Copy the built binary from builder (with diffusers support)
+COPY --from=builder /app/model-runner /app/model-runner
15 changes: 14 additions & 1 deletion Makefile
@@ -8,6 +8,7 @@ VLLM_BASE_IMAGE := nvidia/cuda:13.0.2-runtime-ubuntu24.04
DOCKER_IMAGE := docker/model-runner:latest
DOCKER_IMAGE_VLLM := docker/model-runner:latest-vllm-cuda
DOCKER_IMAGE_SGLANG := docker/model-runner:latest-sglang
+DOCKER_IMAGE_DIFFUSERS := docker/model-runner:latest-diffusers
DOCKER_TARGET ?= final-llamacpp
PORT := 8080
MODELS_PATH := $(shell pwd)/models-store
@@ -25,7 +26,7 @@ DOCKER_BUILD_ARGS := \
BUILD_DMR ?= 1

# Main targets
-.PHONY: build run clean test integration-tests test-docker-ce-installation docker-build docker-build-multiplatform docker-run docker-build-vllm docker-run-vllm docker-build-sglang docker-run-sglang docker-run-impl help validate lint
+.PHONY: build run clean test integration-tests test-docker-ce-installation docker-build docker-build-multiplatform docker-run docker-build-vllm docker-run-vllm docker-build-sglang docker-run-sglang docker-run-impl help validate lint docker-build-diffusers docker-run-diffusers
# Default target
.DEFAULT_GOAL := build

@@ -117,6 +118,16 @@ docker-build-sglang:
docker-run-sglang: docker-build-sglang
@$(MAKE) -s docker-run-impl DOCKER_IMAGE=$(DOCKER_IMAGE_SGLANG)

+# Build Diffusers Docker image
+docker-build-diffusers:
+@$(MAKE) docker-build \
+DOCKER_TARGET=final-diffusers \
+DOCKER_IMAGE=$(DOCKER_IMAGE_DIFFUSERS)
+
+# Run Diffusers Docker container with TCP port access and mounted model storage
+docker-run-diffusers: docker-build-diffusers
+@$(MAKE) -s docker-run-impl DOCKER_IMAGE=$(DOCKER_IMAGE_DIFFUSERS)
+
# Common implementation for running Docker container
docker-run-impl:
@echo ""
@@ -151,6 +162,8 @@ help:
@echo " docker-run-vllm - Run vLLM Docker container"
@echo " docker-build-sglang - Build SGLang Docker image"
@echo " docker-run-sglang - Run SGLang Docker container"
+@echo " docker-build-diffusers - Build Diffusers Docker image"
+@echo " docker-run-diffusers - Run Diffusers Docker container"
@echo " help - Show this help message"
@echo ""
@echo "Backend configuration options:"
6 changes: 6 additions & 0 deletions cmd/cli/commands/compose_test.go
@@ -45,6 +45,12 @@ func TestParseBackendMode(t *testing.T) {
expected: inference.BackendModeReranking,
expectError: false,
},
+{
+name: "image-generation mode",
+input: "image-generation",
+expected: inference.BackendModeImageGeneration,
+expectError: false,
+},
{
name: "invalid mode",
input: "invalid",
6 changes: 4 additions & 2 deletions cmd/cli/commands/configure_flags.go
@@ -146,7 +146,7 @@ func (f *ConfigureFlags) RegisterFlags(cmd *cobra.Command) {
cmd.Flags().StringVar(&f.HFOverrides, "hf_overrides", "", "HuggingFace model config overrides (JSON) - vLLM only")
cmd.Flags().Var(NewFloat64PtrValue(&f.GPUMemoryUtilization), "gpu-memory-utilization", "fraction of GPU memory to use for the model executor (0.0-1.0) - vLLM only")
cmd.Flags().Var(NewBoolPtrValue(&f.Think), "think", "enable reasoning mode for thinking models")
-cmd.Flags().StringVar(&f.Mode, "mode", "", "backend operation mode (completion, embedding, reranking)")
+cmd.Flags().StringVar(&f.Mode, "mode", "", "backend operation mode (completion, embedding, reranking, image-generation)")
}

// BuildConfigureRequest builds a scheduling.ConfigureRequest from the flags.
@@ -243,7 +243,9 @@ func parseBackendMode(mode string) (inference.BackendMode, error) {
return inference.BackendModeEmbedding, nil
case "reranking":
return inference.BackendModeReranking, nil
+case "image-generation":
+return inference.BackendModeImageGeneration, nil
default:
-return inference.BackendModeCompletion, fmt.Errorf("invalid mode %q: must be one of completion, embedding, reranking", mode)
+return inference.BackendModeCompletion, fmt.Errorf("invalid mode %q: must be one of completion, embedding, reranking, image-generation", mode)
}
}
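
For illustration, the mode parsing in this file can also be expressed as a table lookup. The sketch below is self-contained and is not the code in this PR: `BackendMode`, the `Mode*` constants, `validModes`, and `parseMode` are hypothetical stand-ins for `inference.BackendMode` and `parseBackendMode`, and a map replaces the `switch` used in the diff. The accepted strings and the error text are taken from the diff.

```go
package main

import "fmt"

// BackendMode is a hypothetical stand-in for inference.BackendMode.
type BackendMode int

const (
	ModeCompletion BackendMode = iota
	ModeEmbedding
	ModeReranking
	ModeImageGeneration
)

// validModes maps the CLI strings listed in the --mode help text to modes.
var validModes = map[string]BackendMode{
	"completion":       ModeCompletion,
	"embedding":        ModeEmbedding,
	"reranking":        ModeReranking,
	"image-generation": ModeImageGeneration,
}

// parseMode returns the mode for s, or an error naming the accepted values.
// Like the diff, it falls back to the completion mode alongside the error.
func parseMode(s string) (BackendMode, error) {
	if m, ok := validModes[s]; ok {
		return m, nil
	}
	return ModeCompletion, fmt.Errorf("invalid mode %q: must be one of completion, embedding, reranking, image-generation", s)
}

func main() {
	for _, in := range []string{"image-generation", "invalid"} {
		m, err := parseMode(in)
		fmt.Println(in, m, err)
	}
}
```

With a lookup table, adding a mode touches one map entry and the error string, which mirrors the two edits this hunk makes to the `switch`.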
40 changes: 32 additions & 8 deletions cmd/cli/commands/package.go
@@ -38,39 +38,43 @@ func newPackagedCmd() *cobra.Command {
var opts packageOptions

c := &cobra.Command{
-Use: "package (--gguf <path> | --safetensors-dir <path> | --from <model>) [--license <path>...] [--mmproj <path>] [--context-size <tokens>] [--push] MODEL",
-Short: "Package a GGUF file, Safetensors directory, or existing model into a Docker model OCI artifact.",
-Long: "Package a GGUF file, Safetensors directory, or existing model into a Docker model OCI artifact, with optional licenses and multimodal projector. The package is sent to the model-runner, unless --push is specified.\n" +
+Use: "package (--gguf <path> | --safetensors-dir <path> | --dduf <path> | --from <model>) [--license <path>...] [--mmproj <path>] [--context-size <tokens>] [--push] MODEL",
+Short: "Package a GGUF file, Safetensors directory, DDUF file, or existing model into a Docker model OCI artifact.",
+Long: "Package a GGUF file, Safetensors directory, DDUF file, or existing model into a Docker model OCI artifact, with optional licenses and multimodal projector. The package is sent to the model-runner, unless --push is specified.\n" +
"When packaging a sharded GGUF model, --gguf should point to the first shard. All shard files should be siblings and should include the index in the file name (e.g. model-00001-of-00015.gguf).\n" +
"When packaging a Safetensors model, --safetensors-dir should point to a directory containing .safetensors files and config files (*.json, merges.txt). All files will be auto-discovered and config files will be packaged into a tar archive.\n" +
+"When packaging a DDUF file (Diffusers Unified Format), --dduf should point to a .dduf archive file.\n" +
"When packaging from an existing model using --from, you can modify properties like context size to create a variant of the original model.\n" +
"For multimodal models, use --mmproj to include a multimodal projector file.",
Args: func(cmd *cobra.Command, args []string) error {
if err := requireExactArgs(1, "package", "MODEL")(cmd, args); err != nil {
return err
}

-// Validate that exactly one of --gguf, --safetensors-dir, or --from is provided (mutually exclusive)
+// Validate that exactly one of --gguf, --safetensors-dir, --dduf, or --from is provided (mutually exclusive)
sourcesProvided := 0
if opts.ggufPath != "" {
sourcesProvided++
}
if opts.safetensorsDir != "" {
sourcesProvided++
}
+if opts.ddufPath != "" {
+sourcesProvided++
+}
if opts.fromModel != "" {
sourcesProvided++
}

if sourcesProvided == 0 {
return fmt.Errorf(
-"One of --gguf, --safetensors-dir, or --from is required.\n\n" +
+"One of --gguf, --safetensors-dir, --dduf, or --from is required.\n\n" +
"See 'docker model package --help' for more information",
)
}
if sourcesProvided > 1 {
return fmt.Errorf(
-"Cannot specify more than one of --gguf, --safetensors-dir, or --from. Please use only one source.\n\n" +
+"Cannot specify more than one of --gguf, --safetensors-dir, --dduf, or --from. Please use only one source.\n\n" +
"See 'docker model package --help' for more information",
)
}
@@ -141,6 +145,15 @@ func newPackagedCmd() *cobra.Command {
}
}

+// Validate DDUF path if provided
+if opts.ddufPath != "" {
+var err error
+opts.ddufPath, err = validateAbsolutePath(opts.ddufPath, "DDUF")
+if err != nil {
+return err
+}
+}
+
// Validate dir-tar paths are relative (not absolute)
for _, dirPath := range opts.dirTarPaths {
if filepath.IsAbs(dirPath) {
@@ -167,6 +180,7 @@ func newPackagedCmd() *cobra.Command {

c.Flags().StringVar(&opts.ggufPath, "gguf", "", "absolute path to gguf file")
c.Flags().StringVar(&opts.safetensorsDir, "safetensors-dir", "", "absolute path to directory containing safetensors files and config")
+c.Flags().StringVar(&opts.ddufPath, "dduf", "", "absolute path to DDUF archive file (Diffusers Unified Format)")
c.Flags().StringVar(&opts.fromModel, "from", "", "reference to an existing model to repackage")
c.Flags().StringVar(&opts.chatTemplatePath, "chat-template", "", "absolute path to chat template file (must be Jinja format)")
c.Flags().StringArrayVarP(&opts.licensePaths, "license", "l", nil, "absolute path to a license file")
@@ -182,6 +196,7 @@ type packageOptions struct {
contextSize uint64
ggufPath string
safetensorsDir string
+ddufPath string
fromModel string
licensePaths []string
dirTarPaths []string
@@ -197,7 +212,7 @@ type builderInitResult struct {
cleanupFunc func() // Optional cleanup function for temporary files
}

-// initializeBuilder creates a package builder from GGUF, Safetensors, or existing model
+// initializeBuilder creates a package builder from GGUF, Safetensors, DDUF, or existing model
func initializeBuilder(cmd *cobra.Command, opts packageOptions) (*builderInitResult, error) {
result := &builderInitResult{}

@@ -246,7 +261,14 @@ func initializeBuilder(cmd *cobra.Command, opts packageOptions) (*builderInitRes
return nil, fmt.Errorf("add gguf file: %w", err)
}
result.builder = pkg
-} else {
+} else if opts.ddufPath != "" {
+cmd.PrintErrf("Adding DDUF file from %q\n", opts.ddufPath)
+pkg, err := builder.FromPath(opts.ddufPath)
+if err != nil {
+return nil, fmt.Errorf("add dduf file: %w", err)
+}
+result.builder = pkg
+} else if opts.safetensorsDir != "" {
// Safetensors model from directory
cmd.PrintErrf("Scanning directory %q for safetensors model...\n", opts.safetensorsDir)
safetensorsPaths, tempConfigArchive, err := packaging.PackageFromDirectory(opts.safetensorsDir)
@@ -276,6 +298,8 @@ func initializeBuilder(cmd *cobra.Command, opts packageOptions) (*builderInitRes
}
}
result.builder = pkg
+} else {
+return nil, fmt.Errorf("no model source specified")
}

return result, nil
3 changes: 2 additions & 1 deletion cmd/cli/docs/reference/docker_model_configure.yaml
@@ -40,7 +40,8 @@ options:
swarm: false
- option: mode
value_type: string
-description: backend operation mode (completion, embedding, reranking)
+description: |
+backend operation mode (completion, embedding, reranking, image-generation)
deprecated: false
hidden: false
experimental: false
16 changes: 13 additions & 3 deletions cmd/cli/docs/reference/docker_model_package.yaml
@@ -1,13 +1,14 @@
command: docker model package
short: |
-Package a GGUF file, Safetensors directory, or existing model into a Docker model OCI artifact.
+Package a GGUF file, Safetensors directory, DDUF file, or existing model into a Docker model OCI artifact.
long: |-
-Package a GGUF file, Safetensors directory, or existing model into a Docker model OCI artifact, with optional licenses and multimodal projector. The package is sent to the model-runner, unless --push is specified.
+Package a GGUF file, Safetensors directory, DDUF file, or existing model into a Docker model OCI artifact, with optional licenses and multimodal projector. The package is sent to the model-runner, unless --push is specified.
When packaging a sharded GGUF model, --gguf should point to the first shard. All shard files should be siblings and should include the index in the file name (e.g. model-00001-of-00015.gguf).
When packaging a Safetensors model, --safetensors-dir should point to a directory containing .safetensors files and config files (*.json, merges.txt). All files will be auto-discovered and config files will be packaged into a tar archive.
+When packaging a DDUF file (Diffusers Unified Format), --dduf should point to a .dduf archive file.
When packaging from an existing model using --from, you can modify properties like context size to create a variant of the original model.
For multimodal models, use --mmproj to include a multimodal projector file.
-usage: docker model package (--gguf <path> | --safetensors-dir <path> | --from <model>) [--license <path>...] [--mmproj <path>] [--context-size <tokens>] [--push] MODEL
+usage: docker model package (--gguf <path> | --safetensors-dir <path> | --dduf <path> | --from <model>) [--license <path>...] [--mmproj <path>] [--context-size <tokens>] [--push] MODEL
pname: docker model
plink: docker_model.yaml
options:
@@ -30,6 +31,15 @@ options:
experimentalcli: false
kubernetes: false
swarm: false
+- option: dduf
+value_type: string
+description: absolute path to DDUF archive file (Diffusers Unified Format)
+deprecated: false
+hidden: false
+experimental: false
+experimentalcli: false
+kubernetes: false
+swarm: false
- option: dir-tar
value_type: stringArray
default_value: '[]'
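
The `long` description in this file states that sharded GGUF files are siblings and carry their index in the file name (e.g. `model-00001-of-00015.gguf`). As a rough sketch of what that convention implies, the Go snippet below derives the expected sibling names from the first shard. The regular expression and `shardNames` are assumptions made for illustration. The diff does not show how the packager actually discovers shards.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// shardRe matches names like model-00001-of-00015.gguf (assumed pattern).
var shardRe = regexp.MustCompile(`^(.+)-(\d+)-of-(\d+)\.gguf$`)

// shardNames returns every expected shard file name, given the first shard's name.
func shardNames(first string) ([]string, error) {
	m := shardRe.FindStringSubmatch(first)
	if m == nil {
		return nil, fmt.Errorf("%q does not look like a shard name", first)
	}
	total, err := strconv.Atoi(m[3])
	if err != nil {
		return nil, err
	}
	width := len(m[2]) // keep the zero padding used by the first shard
	names := make([]string, 0, total)
	for i := 1; i <= total; i++ {
		names = append(names, fmt.Sprintf("%s-%0*d-of-%s.gguf", m[1], width, i, m[3]))
	}
	return names, nil
}

func main() {
	names, err := shardNames("model-00001-of-00015.gguf")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(len(names), names[0], names[len(names)-1])
}
```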