gem-image

CLI tool for image generation and editing using Vertex AI Gemini 2.5 Flash (native image generation).

Generate images from text prompts or edit existing images via the command line. Designed for batch workflows through shell scripts and pipelines.

Prerequisites

- Google Cloud project with the Vertex AI API enabled
- Application Default Credentials, set up by running `gcloud auth application-default login`

Installation

git clone https://github.com/nlink-jp/gem-image.git
cd gem-image
make build
# Binary: dist/gem-image

Configuration

| Variable | Required | Default | Description |
|---|---|---|---|
| `GEMIMAGE_PROJECT` | Yes | — | GCP project ID |
| `GEMIMAGE_LOCATION` | No | `us-central1` | Vertex AI region |
| `GEMIMAGE_MODEL` | No | `gemini-2.5-flash-image` | Gemini model name |

If the tool-specific variables are not set, gem-image falls back to `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_LOCATION`.
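
As a minimal sketch, the environment-based configuration could be set in a shell session like this. The project ID is a placeholder, and the location and model values simply repeat the documented defaults.

```shell
# Placeholder project ID; replace with your own GCP project.
export GEMIMAGE_PROJECT="your-project-id"
# Optional: these two lines repeat the documented defaults.
export GEMIMAGE_LOCATION="us-central1"
export GEMIMAGE_MODEL="gemini-2.5-flash-image"
```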

Alternatively, create ~/.config/gem-image/config.toml:

[gcp]
project  = "your-project-id"
location = "us-central1"

[model]
name = "gemini-2.5-flash-image"

Priority: CLI flags > environment variables > config file > defaults.
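
The precedence rule can be pictured as "the first non-empty value wins". The sketch below illustrates that idea only and is not gem-image's actual implementation: `first_nonempty` and the `flag_model` / `config_model` variables are hypothetical, and treating an empty string as "not set" is an assumption.

```shell
# first_nonempty prints the first argument that is not an empty string.
# Hypothetical helper: it illustrates the precedence order, not the tool's real code.
first_nonempty() {
  for candidate in "$@"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Resolve the model name: CLI flag, then environment, then config file, then default.
flag_model=""                        # hypothetical: no -m flag was given
config_model="model-from-config"     # hypothetical: value read from config.toml
model=$(first_nonempty "$flag_model" "${GEMIMAGE_MODEL:-}" "$config_model" "gemini-2.5-flash-image")
```

With no flag and no `GEMIMAGE_MODEL` in the environment, `model` ends up holding the config file value; the built-in default is only reached when every earlier source is empty.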

Usage

# Generate an image from a text prompt
gem-image -p "A cat sitting on a windowsill" -o cat.png

# Edit an existing image
gem-image -p "Add a rainbow in the sky" -i photo.png -o edited.png

# Multiple input images
gem-image -p "Combine these into a collage" -i a.png -i b.png -o collage.png

# JPEG output (auto-detected from extension)
gem-image -p "A sunset over the ocean" -o sunset.jpg

# Explicit format flag
gem-image -p "A mountain landscape" -o landscape.bin --format jpeg

# Stdin prompt (pipeline)
echo "A minimalist logo for a coffee shop" | gem-image -o logo.png

# Override model
gem-image -p "A watercolor painting" -o art.png -m gemini-2.5-flash-image
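
For batch workflows, a loop over a prompts file is one possible approach. The following is a sketch under stated assumptions: `batch_generate`, the `GEM_IMAGE_BIN` override variable, and the `image-NNN.png` naming scheme are all hypothetical and not part of gem-image. Only the `-p` and `-o` flags come from this README.

```shell
# batch_generate PROMPTS_FILE OUT_DIR
# Runs one generation per non-blank line of PROMPTS_FILE.
# GEM_IMAGE_BIN is a hypothetical override for the binary path.
batch_generate() {
  prompts_file=$1
  out_dir=$2
  count=0
  failures=0
  while IFS= read -r prompt || [ -n "$prompt" ]; do
    [ -n "$prompt" ] || continue    # skip blank lines
    count=$((count + 1))
    out=$(printf '%s/image-%03d.png' "$out_dir" "$count")
    # </dev/null keeps the tool from consuming the rest of the prompts file.
    if ! "${GEM_IMAGE_BIN:-gem-image}" -p "$prompt" -o "$out" </dev/null; then
      echo "failed: $prompt" >&2
      failures=$((failures + 1))
    fi
  done < "$prompts_file"
  # Succeed only if every prompt succeeded.
  [ "$failures" -eq 0 ]
}
```

A call such as `batch_generate prompts.txt out` would then write `out/image-001.png`, `out/image-002.png`, and so on. Because existing files are not overwritten without `--force`, re-running the loop against the same directory is expected to report failures rather than replace earlier results.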

Flags

| Flag | Short | Default | Description |
|---|---|---|---|
| `--prompt` | `-p` | — | Image generation prompt (stdin if omitted) |
| `--input` | `-i` | — | Input image path (repeatable) |
| `--output` | `-o` | — | Output file path (required) |
| `--format` | — | `png` | Output format: `png` or `jpeg` |
| `--config` | `-c` | — | Config file path |
| `--model` | `-m` | — | Model name override |
| `--force` | — | `false` | Overwrite existing output file |
| `--debug` | — | `false` | Enable debug output |

Output format resolution

1. If the `-o` path has a `.png`, `.jpg`, or `.jpeg` extension, that format is used.
2. Otherwise, the `--format` flag value is used.
3. If neither applies, the default is `png`.
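
The resolution order above can be sketched as a small shell function. This is an illustration rather than the tool's own code: `resolve_format` is a hypothetical name, and matching extensions case-sensitively is an assumption of the sketch.

```shell
# resolve_format OUTPUT_PATH [FORMAT_FLAG]
# Prints the format chosen by the documented rules.
# Hypothetical helper; extension matching here is case-sensitive by assumption.
resolve_format() {
  case "$1" in
    *.png)          echo "png" ;;          # rule 1: extension decides
    *.jpg | *.jpeg) echo "jpeg" ;;         # rule 1: extension decides
    *)              echo "${2:-png}" ;;    # rule 2: flag value, rule 3: default
  esac
}
```

Under these rules a recognized extension takes priority over `--format`, so `resolve_format cat.png jpeg` prints `png`, while `resolve_format landscape.bin jpeg` prints `jpeg`.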

Exit codes

| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | General error |
| 2 | Input validation error |
| 3 | API error |
| 4 | Safety filter block |
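
In a wrapper script, these codes could be turned into readable messages. The helper below is a sketch; `explain_exit_code` is a hypothetical name, and the messages simply restate the table.

```shell
# explain_exit_code CODE
# Maps a gem-image exit code to the meaning documented in the table.
# Hypothetical helper, not shipped with the tool.
explain_exit_code() {
  case "$1" in
    0) echo "success" ;;
    1) echo "general error" ;;
    2) echo "input validation error" ;;
    3) echo "API error" ;;
    4) echo "blocked by safety filter" ;;
    *) echo "unknown exit code $1" ;;
  esac
}

# Example use in a script (commented out so the snippet has no side effects):
#   gem-image -p "A cat sitting on a windowsill" -o cat.png
#   status=$?
#   [ "$status" -eq 0 ] || echo "gem-image: $(explain_exit_code "$status")" >&2
```

Distinguishing the codes lets a batch script react differently, for example by rewording a prompt after code 4 instead of retrying it unchanged.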

Token usage

Token consumption is displayed on stderr after each request:

tokens: input=218 output=1290 total=1508
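
Because the summary goes to stderr, it can be captured separately from any other output. The sketch below extracts the total, assuming the line always has exactly the shape shown above; `total_tokens` is a hypothetical helper.

```shell
# total_tokens reads text on stdin and prints the total= value of any
# line shaped like "tokens: input=N output=N total=N".
# Hypothetical helper; the exact line format is assumed from the example.
total_tokens() {
  sed -n 's/^tokens: input=[0-9][0-9]* output=[0-9][0-9]* total=\([0-9][0-9]*\)$/\1/p'
}

# Example use (commented out so the snippet has no side effects):
#   gem-image -p "A sunset over the ocean" -o sunset.jpg 2>stderr.log
#   total_tokens < stderr.log
```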

Security

- Prompt injection protection: user prompts are wrapped in nonce-tagged XML by nlk/guard before being sent to the API
- Input validation: image files are verified by magic bytes, not just by extension
- Path traversal prevention: all file paths are normalized and validated
- Overwrite protection: existing files are not overwritten unless `--force` is specified
- Image bomb prevention: image dimensions are checked before decoding to prevent out-of-memory (OOM) attacks
- No secrets in output: project IDs and tokens are never logged
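
To illustrate what a magic-byte check means, the sketch below inspects the first bytes of a file instead of trusting its name. It is not gem-image's implementation: `sniff_image_type` is a hypothetical helper, and covering only PNG and JPEG is an assumption, since this README does not list the accepted input formats. The signatures themselves are the standard ones (PNG starts with `89 50 4E 47 0D 0A 1A 0A`, JPEG with `FF D8 FF`).

```shell
# sniff_image_type FILE
# Prints "png" or "jpeg" based on the file's leading bytes, or "unknown"
# (with a non-zero status) if neither signature matches.
# Hypothetical pre-flight helper; the real tool performs its own validation.
sniff_image_type() {
  # Read the first 8 bytes and render them as one lowercase hex string.
  sig=$(head -c 8 "$1" | od -An -tx1 | tr -d ' \n')
  case "$sig" in
    89504e470d0a1a0a) echo "png" ;;
    ffd8ff*)          echo "jpeg" ;;
    *)                echo "unknown"; return 1 ;;
  esac
}
```

A batch script could call this before passing files to `-i`, so that a text file renamed to `photo.png` is caught locally rather than producing an input validation error later.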

License

MIT
