gem-image

CLI tool for image generation and editing using Vertex AI Gemini 2.5 Flash (native image generation).

Generate images from text prompts or edit existing images via the command line. Designed for batch workflows through shell scripts and pipelines.

Prerequisites

  • Google Cloud project with the Vertex AI API enabled
  • Application Default Credentials configured (run gcloud auth application-default login)

Installation

git clone https://github.com/nlink-jp/gem-image.git
cd gem-image
make build
# Binary: dist/gem-image

Configuration

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| GEMIMAGE_PROJECT | Yes | | GCP project ID |
| GEMIMAGE_LOCATION | No | us-central1 | Vertex AI region |
| GEMIMAGE_MODEL | No | gemini-2.5-flash-image | Gemini model name |

If the tool-specific variables are not set, gem-image falls back to GOOGLE_CLOUD_PROJECT and GOOGLE_CLOUD_LOCATION.

Alternatively, create ~/.config/gem-image/config.toml:

[gcp]
project  = "your-project-id"
location = "us-central1"

[model]
name = "gemini-2.5-flash-image"

Priority: CLI flags > environment variables > config file > defaults.
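The priority order can be pictured as "first non-empty value wins". The sketch below is only an illustration in shell, not the tool's actual implementation: first_set and resolve_location are hypothetical helpers, and placing GOOGLE_CLOUD_LOCATION ahead of the config file is an assumption based on the "environment variables > config file" rule.

```shell
# first_set: print the first non-empty argument (hypothetical helper).
first_set() {
  for value in "$@"; do
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  done
  return 1
}

# resolve_location FLAG_VALUE CONFIG_VALUE
# Mirrors the documented order: CLI flag, environment, config file, default.
# Assumption: the generic GOOGLE_CLOUD_LOCATION variable ranks just below
# GEMIMAGE_LOCATION and above the config file.
resolve_location() {
  first_set "$1" "${GEMIMAGE_LOCATION:-}" "${GOOGLE_CLOUD_LOCATION:-}" "$2" "us-central1"
}
```

For example, with GEMIMAGE_LOCATION unset and GOOGLE_CLOUD_LOCATION=europe-west4, calling resolve_location "" "asia-northeast1" prints europe-west4, because the environment outranks the config file value.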

Usage

# Generate an image from a text prompt
gem-image -p "A cat sitting on a windowsill" -o cat.png

# Edit an existing image
gem-image -p "Add a rainbow in the sky" -i photo.png -o edited.png

# Multiple input images
gem-image -p "Combine these into a collage" -i a.png -i b.png -o collage.png

# JPEG output (auto-detected from extension)
gem-image -p "A sunset over the ocean" -o sunset.jpg

# Explicit format flag
gem-image -p "A mountain landscape" -o landscape.bin --format jpeg

# Stdin prompt (pipeline)
echo "A minimalist logo for a coffee shop" | gem-image -o logo.png

# Override model
gem-image -p "A watercolor painting" -o art.png -m gemini-2.5-flash-image
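Since the tool targets batch workflows, a loop over a prompt file may be a useful starting point. This is a sketch under stated assumptions: slugify and batch_generate are hypothetical helper names, the one-prompt-per-line file layout is invented for illustration, and the GEM_IMAGE variable exists only so that the binary path can be swapped. Only the documented -p and -o flags are used.

```shell
# slugify: turn a prompt into a lowercase, dash-separated file name stem.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -cs 'a-z0-9' '-' \
    | sed 's/^-*//; s/-*$//'
}

# batch_generate PROMPTS_FILE OUT_DIR
# Runs one generation per non-empty line and returns non-zero if any failed.
batch_generate() {
  prompts_file=$1
  out_dir=$2
  failed=0
  mkdir -p "$out_dir" || return 1
  while IFS= read -r prompt || [ -n "$prompt" ]; do
    [ -z "$prompt" ] && continue
    out="$out_dir/$(slugify "$prompt").png"
    # stdin is redirected so the child process cannot consume the prompt list.
    "${GEM_IMAGE:-gem-image}" -p "$prompt" -o "$out" </dev/null \
      || failed=$((failed + 1))
  done < "$prompts_file"
  [ "$failed" -eq 0 ]
}
```

A call such as batch_generate prompts.txt out/ would then write one PNG per prompt. Two prompts that slugify to the same name collide, and the second run is refused unless --force is added.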

Flags

| Flag | Short | Default | Description |
| --- | --- | --- | --- |
| --prompt | -p | | Image generation prompt (stdin if omitted) |
| --input | -i | | Input image path (repeatable) |
| --output | -o | | Output file path (required) |
| --format | | png | Output format: png or jpeg |
| --config | -c | | Config file path |
| --model | -m | | Model name override |
| --force | | false | Overwrite existing output file |
| --debug | | false | Enable debug output |

Output format resolution

  1. If the -o path ends in .png, .jpg, or .jpeg, that format is used
  2. Otherwise, the --format flag value is used
  3. If --format is not given, the default is png
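The three rules above can be sketched as a small shell function. resolve_format is a hypothetical name used for illustration, not part of gem-image. The sketch matches lowercase extensions only, because the README does not say whether the tool compares extensions case-insensitively.

```shell
# resolve_format OUTPUT_PATH [FORMAT_FLAG]
# Prints the format implied by the documented resolution rules.
resolve_format() {
  case $1 in
    *.png)        echo png ;;         # rule 1: extension decides
    *.jpg|*.jpeg) echo jpeg ;;        # rule 1: both spellings mean JPEG
    *)            echo "${2:-png}" ;; # rules 2 and 3: flag value, else png
  esac
}
```

Under these rules a recognized extension takes precedence over the flag, so resolve_format photo.jpg png prints jpeg, while resolve_format landscape.bin jpeg falls through to the flag value.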

Exit codes

| Code | Meaning |
| --- | --- |
| 0 | Success |
| 1 | General error |
| 2 | Input validation error |
| 3 | API error |
| 4 | Safety filter block |
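Distinct exit codes let a script react differently to each failure class. The sketch below retries only when the command exits with code 3. It assumes that API errors may be transient, which the README does not state. run_with_retry and the RETRY_DELAY variable are hypothetical names.

```shell
# run_with_retry MAX_ATTEMPTS COMMAND [ARGS...]
# Re-runs COMMAND while it exits with 3 (API error), up to MAX_ATTEMPTS times.
# Any other exit code, including 2 (validation) and 4 (safety block),
# is returned immediately because a retry would not change the outcome.
run_with_retry() {
  max=$1
  shift
  attempt=1
  while :; do
    rc=0
    "$@" || rc=$?
    if [ "$rc" -ne 3 ] || [ "$attempt" -ge "$max" ]; then
      return "$rc"
    fi
    attempt=$((attempt + 1))
    sleep "${RETRY_DELAY:-2}"
  done
}
```

It could be used as run_with_retry 3 gem-image -p "A cat sitting on a windowsill" -o cat.png, after which $? still carries the final exit code for further handling.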

Token usage

Token consumption is displayed on stderr after each request:

tokens: input=218 output=1290 total=1508
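Because the summary goes to stderr, it can be captured separately from the image output and totalled across a batch. The sketch below assumes every summary line has exactly the shape shown above. sum_tokens is a hypothetical helper.

```shell
# sum_tokens: read log lines on stdin and print the sum of all total= values
# found on lines that start with "tokens: ". Other lines are ignored.
sum_tokens() {
  awk '
    /^tokens: / {
      for (i = 2; i <= NF; i++) {
        split($i, kv, "=")
        if (kv[1] == "total") sum += kv[2]
      }
    }
    END { print sum + 0 }
  '
}
```

For instance, appending stderr to a log with gem-image -p "A sunset over the ocean" -o sunset.jpg 2>>run.log and later running sum_tokens < run.log gives the combined total for all logged requests.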

Security

  • Prompt injection protection: user prompts are wrapped in nonce-tagged XML using nlk/guard before they are sent to the API
  • Input validation: image files are verified by their magic bytes, not just by extension
  • Path traversal prevention: all file paths are normalized and validated
  • Overwrite protection: existing files are not overwritten unless --force is specified
  • Image bomb prevention: image dimensions are checked before decoding to prevent out-of-memory (OOM) attacks
  • No secrets in output: project IDs and tokens are never logged
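To illustrate what checking magic bytes means, the sketch below inspects the first bytes of a file instead of trusting its name. It is not the tool's implementation. image_kind is a hypothetical helper, and limiting the check to PNG and JPEG is an assumption, since the README does not list the accepted input formats.

```shell
# image_kind FILE
# Prints png, jpeg, or unknown based on the leading bytes of FILE.
# Returns non-zero when the signature is not recognized.
image_kind() {
  sig=$(head -c 8 "$1" | od -An -tx1 | tr -d ' \n')
  case $sig in
    89504e470d0a1a0a*) echo png ;;   # 8-byte PNG signature
    ffd8ff*)           echo jpeg ;;  # JPEG start-of-image marker
    *)                 echo unknown; return 1 ;;
  esac
}
```

A text file renamed to photo.png is reported as unknown here, which is the kind of mismatch an extension-only check would miss.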

License

MIT
