ComfyUI custom nodes for AI image generation via multiple API providers.
Generate and edit images directly in ComfyUI using Google Gemini, xAI Grok, OpenAI (DALL-E / GPT-Image), Alibaba Qwen, and ZhipuAI GLM — all through a unified, easy-to-use node interface.
- 5 AI Providers — Gemini, Grok, OpenAI-compatible, Qwen, and GLM in one plugin
- Text-to-Image — Generate images from text prompts with all providers
- Image Editing — Edit existing images with reference inputs (Gemini, Grok, OpenAI, Qwen)
- Flexible Image Input — Batch `ref_images` + 3 individual `image1`/`image2`/`image3` slots per node
- Inpainting — Mask-based inpainting support (Gemini, Grok, OpenAI, Qwen)
- Smart Image Encoding — JPEG compression to prevent payload bloat (see Image Encoding)
- Token Usage Tracking — Token consumption logged in console for all providers
- Persistent Configuration — API keys and settings saved across sessions in `api_config.json`
- Custom Models — Dynamically add/remove models through Config nodes
- Batch Generation — Generate up to 4 images per request
- Auto-Install — Missing Python packages are installed automatically on first load
| Node | Provider | Capabilities |
|---|---|---|
| Gemini Image Generate | Google Gemini | Text-to-image, editing (up to 14 ref images), inpainting, aspect ratio, resolution (1K/2K/4K) |
| Grok Image Generate | xAI Grok | Text-to-image, editing (1 ref image), inpainting, aspect ratio, resolution |
| OpenAI Image Generate | OpenAI / Compatible | Text-to-image (DALL-E 3, GPT-Image-1), inpainting, custom size, quality control |
| Qwen Image Generate | Alibaba Qwen | Text-to-image, editing (1-3 ref images), inpainting, negative prompt, watermark control |
| GLM Image Generate | ZhipuAI GLM | Text-to-image, quality (HD/standard), multiple size presets |
| API Config Loader | — | Load saved API configurations with optional overrides |
| API Config Saver | — | Save API keys, URLs, and manage custom model lists |
| API Image Save | — | Save generated images to output directory with metadata |
Search for "API Image Generator" in ComfyUI Manager and install, or clone the repository manually:

    cd ComfyUI/custom_nodes
    git clone https://github.com/AyinMostima/ComfyUI-APIimage.git

Dependencies are auto-installed on first load. To install manually:

    pip install -r requirements.txt

| Package | Version | Purpose |
|---|---|---|
| `google-genai` | >= 1.0.0 | Google Gemini SDK |
| `xai_sdk` | >= 1.5.0 | xAI Grok SDK |
| `dashscope` | >= 1.17.0 | Alibaba Qwen SDK |
| `Pillow` | >= 9.0.0 | Image processing |
| `requests` | >= 2.28.0 | HTTP requests (OpenAI, GLM) |
| Provider | Get Key |
|---|---|
| Gemini | Google AI Studio |
| Grok | xAI Console |
| OpenAI | OpenAI Platform |
| Qwen | DashScope Console |
| GLM | ZhipuAI Platform |
- Right-click canvas > Add Node > APIImage > Choose your provider
- Enter your API Key and prompt
- Connect the output `images` to a Preview Image or API Image Save node
- Click Queue Prompt

    [Gemini Image Generate] --> [Preview Image]
                            |
                            +--> [API Image Save]
Enter a prompt and your API key, then select a model and any optional parameters:
- Gemini: Aspect ratio (1:1, 16:9, 9:16, etc.), Resolution (1K, 2K, 4K)
- Grok: Aspect ratio (1:1, 16:9, 9:16, 4:3, 3:4), Resolution (1k, 2k)
- OpenAI: Size (256x256 to 1792x1024), Quality (auto, high, standard)
- Qwen: Size presets (928x1664 to 1664x928), Negative prompt, Watermark toggle
- GLM: Quality (HD, standard), Multiple size presets per model
Every generation node provides two ways to supply reference images:
| Input | Type | Description |
|---|---|---|
| `ref_images` | IMAGE (batch) | Standard ComfyUI batch — connect a Batch Images node to pass multiple images as one tensor |
| `image1` | IMAGE | Individual image slot #1 — connect a Load Image node directly |
| `image2` | IMAGE | Individual image slot #2 |
| `image3` | IMAGE | Individual image slot #3 |
All inputs are merged: ref_images batch + image1 + image2 + image3, in that order.
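
The merge order described above can be sketched in a few lines. This is an illustration only, not the plugin's actual code: the function name `merge_reference_images` is hypothetical, and plain strings stand in for ComfyUI image tensors.

```python
from typing import List, Optional, Sequence


def merge_reference_images(
    ref_images: Optional[Sequence[str]] = None,
    image1: Optional[str] = None,
    image2: Optional[str] = None,
    image3: Optional[str] = None,
) -> List[str]:
    """Combine the batch input and the individual slots in a fixed order.

    Hypothetical helper: strings stand in for image tensors.
    """
    # The batch comes first, in its original order.
    merged = list(ref_images) if ref_images is not None else []
    # Then image1, image2, image3; unconnected (None) slots are skipped.
    for slot in (image1, image2, image3):
        if slot is not None:
            merged.append(slot)
    return merged


merged = merge_reference_images(["batch_a", "batch_b"], image1="solo_1", image3="solo_3")
```

Here `merged` holds the two batch entries followed by `solo_1` and `solo_3`; the empty `image2` slot is simply skipped.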
IMPORTANT: ComfyUI's Batch Images node requires all images to have the same resolution. If your images have different sizes (e.g., 1390×1800 and 1024×1024), one will be forcibly resized to match the other. This distortion can trigger API content moderation errors (e.g., `PROHIBITED_CONTENT` on Gemini).

Using `image1`/`image2`/`image3` keeps each image at its original resolution, with no resizing and no distortion.

    [Load Image A] --> image1 --+
    [Load Image B] --> image2 --+--> [Gemini Image Generate] --> [Preview Image]
                      prompt ---+

    [Load Image A] --+
                     +--> [Batch Images] --> ref_images --> [Gemini Image Generate]
    [Load Image B] --+
Reference image limits per model:
| Model | Max Ref Images |
|---|---|
| `gemini-2.5-flash-image` | 3 |
| `gemini-3-pro-image-preview` | 14 |
| `grok-imagine-image` / `grok-imagine-image-pro` | 1 |
| `gpt-image-1` | 16 |
| `dall-e-2` | 1 |
| `qwen-image-edit` | 3 (min 1) |

All limits apply to the total combined count from `ref_images` + `image1`/`image2`/`image3`. Exceeding the limit triggers a warning and truncation.
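
The warn-and-truncate behaviour might look roughly like the sketch below. The limits are copied from the table above. The function name `enforce_ref_limit` is hypothetical. Keeping the first images in merge order is also an assumption, since the README does not say which images are dropped.

```python
import warnings
from typing import List, Sequence

# Limits copied from the table above.
MAX_REF_IMAGES = {
    "gemini-2.5-flash-image": 3,
    "gemini-3-pro-image-preview": 14,
    "grok-imagine-image": 1,
    "grok-imagine-image-pro": 1,
    "gpt-image-1": 16,
    "dall-e-2": 1,
    "qwen-image-edit": 3,
}


def enforce_ref_limit(model: str, images: Sequence[str]) -> List[str]:
    """Warn and truncate when the combined image count exceeds the model limit.

    Hypothetical helper. Assumption: the first `limit` images are kept.
    """
    limit = MAX_REF_IMAGES.get(model)
    # Models not in the table (e.g. custom ones) are passed through unchanged.
    if limit is None or len(images) <= limit:
        return list(images)
    warnings.warn(
        f"{model} accepts at most {limit} reference image(s); "
        f"got {len(images)}, truncating."
    )
    return list(images[:limit])


kept = enforce_ref_limit("dall-e-2", ["a", "b", "c"])
```

With three inputs and a limit of one, `kept` contains only `"a"` and a warning is emitted.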
Connect both a reference image and a mask input:

    [Load Image] ----> image1 ------+
                                    +--> [Generate Node] --> [Preview Image]
    [Mask Editor] ---> mask --------+
All reference images are automatically encoded as JPEG (quality 95) before transmission to reduce payload size:
| Without optimization | With optimization | Savings |
|---|---|---|
| ~7 MB per image (PNG) | ~500 KB per image (JPEG q95) | 93% smaller |
This prevents:
- Exceeding API payload limits (Gemini 20MB, Grok 10-20MB)
- Timeout errors from large uploads
- Base64 string bloat (~9.5MB per PNG vs ~700KB per JPEG)
Exceptions:
- Masks (Grok/OpenAI) remain PNG for lossless encoding
- OpenAI inpainting uses PNG as required by the API spec (RGBA alpha channel)
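
The size figures above follow from simple arithmetic, sketched here with the standard library only. Base64 emits 4 output characters for every 3 input bytes, so an encoded payload is about a third larger than the raw file. The 7 MB and 500 KB inputs are the README's approximate figures in decimal units, not measurements.

```python
import base64
import math


def base64_length(raw_bytes: int) -> int:
    """Length of the padded Base64 encoding of `raw_bytes` bytes."""
    return 4 * math.ceil(raw_bytes / 3)


PNG_BYTES = 7_000_000   # ~7 MB per image, as quoted above
JPEG_BYTES = 500_000    # ~500 KB per image, as quoted above

png_b64 = base64_length(PNG_BYTES)    # 9_333_336 characters
jpeg_b64 = base64_length(JPEG_BYTES)  # 666_668 characters
savings_percent = round((1 - JPEG_BYTES / PNG_BYTES) * 100)  # 93

# Sanity check of the formula against the real encoder on a small buffer.
formula_matches = len(base64.b64encode(bytes(1000))) == base64_length(1000)
```

These results are roughly in line with the ~9.5 MB and ~700 KB Base64 sizes quoted above.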
All nodes log token consumption to the ComfyUI console after each generation:

    [Gemini] Token usage | Attempt: official-primary | Prompt: 1856 | Output: 4231 | Total: 6087
    [Gemini] Success | Model: gemini-3-pro-image-preview | Images: 1 | Tokens(prompt/output/total): 1856/4231/6087

Note: Some image generation APIs may not return token usage data (displayed as `N/A`).
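
A log line in this shape, with the `N/A` fallback, could be produced along these lines. This is a sketch rather than the plugin's code: `format_token_usage` and its parameters are hypothetical names, and only the output format is taken from the example log above.

```python
from typing import Optional


def format_token_usage(
    provider: str,
    attempt: str,
    prompt: Optional[int] = None,
    output: Optional[int] = None,
    total: Optional[int] = None,
) -> str:
    """Build a token-usage log line; missing counts are shown as N/A."""

    def show(value: Optional[int]) -> str:
        return "N/A" if value is None else str(value)

    return (
        f"[{provider}] Token usage | Attempt: {attempt} | "
        f"Prompt: {show(prompt)} | Output: {show(output)} | Total: {show(total)}"
    )


with_usage = format_token_usage("Gemini", "official-primary", 1856, 4231, 6087)
without_usage = format_token_usage("GLM", "official-primary")
```

`with_usage` reproduces the first log line shown above. `without_usage` shows the case where the API returned no usage data.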
Use API Config Saver to persist your settings:

    [API Config Saver] -- save api_key, base_url, model
            |
    [API Config Loader] -- load saved config --> [Generate Node]
Add new models via the API Config Saver node:
- Set `api_type` to your provider
- Enter the model name in `add_custom_model_name`
- Queue the node — the model will appear in dropdowns
Or override any model by filling the custom_model field on generation nodes.
API configurations are stored in `api_config.json` within the plugin directory. You can edit this file directly or use the Config nodes.

    {
      "api_configs": {
        "Gemini Native": {
          "api_key": "YOUR_KEY",
          "base_url": "(SDK - Automatic)",
          "model_name": "gemini-2.5-flash-image",
          "custom_models": []
        },
        "Grok API": {
          "api_key": "YOUR_KEY",
          "base_url": "https://api.x.ai",
          "model_name": "grok-imagine-image-pro",
          "custom_models": []
        }
      }
    }

All nodes support a custom `base_url` for proxy or self-hosted endpoints. Enter a valid `https://` URL to override the default endpoint.
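
To inspect this file from a script, a minimal sketch along the following lines should work. It parses an inline copy of the sample configuration so that it runs without the plugin installed. The helper names `load_provider` and `custom_endpoint` are hypothetical. Treating any non-`https://` value, such as the `(SDK - Automatic)` placeholder, as "use the default endpoint" is an assumption about how the plugin behaves.

```python
import json
from typing import Optional
from urllib.parse import urlparse

# Inline copy of the sample configuration shown above.
SAMPLE_CONFIG = """
{
  "api_configs": {
    "Gemini Native": {
      "api_key": "YOUR_KEY",
      "base_url": "(SDK - Automatic)",
      "model_name": "gemini-2.5-flash-image",
      "custom_models": []
    },
    "Grok API": {
      "api_key": "YOUR_KEY",
      "base_url": "https://api.x.ai",
      "model_name": "grok-imagine-image-pro",
      "custom_models": []
    }
  }
}
"""


def load_provider(config_text: str, provider: str) -> dict:
    """Return the settings stored under one provider name."""
    return json.loads(config_text)["api_configs"][provider]


def custom_endpoint(entry: dict) -> Optional[str]:
    """Return base_url only if it is an https:// URL, otherwise None.

    Assumption: None means "fall back to the provider's default endpoint".
    """
    url = entry.get("base_url", "")
    parsed = urlparse(url)
    if parsed.scheme == "https" and parsed.netloc:
        return url
    return None


grok = load_provider(SAMPLE_CONFIG, "Grok API")
gemini = load_provider(SAMPLE_CONFIG, "Gemini Native")
```

To read the real file instead, pass `pathlib.Path("api_config.json").read_text(encoding="utf-8")` in place of `SAMPLE_CONFIG`, with the path adjusted to your plugin directory.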
| Provider | Models |
|---|---|
| Gemini | gemini-2.5-flash-image, gemini-3-pro-image-preview |
| Grok | grok-imagine-image, grok-imagine-image-pro |
| OpenAI | dall-e-3, dall-e-2, gpt-image-1 |
| Qwen | qwen-image-plus, qwen-image-edit |
| GLM | glm-image, cogview-4-250304 |

    comfyui-apiimage/
      __init__.py        # Node registration & auto-install
      config.py          # Persistent config management
      utils.py           # Image tensor/PIL/bytes conversion utilities
      nodes_gemini.py    # Google Gemini node
      nodes_grok.py      # xAI Grok node
      nodes_openai.py    # OpenAI-compatible node
      nodes_qwen.py      # Alibaba Qwen node
      nodes_glm.py       # ZhipuAI GLM node
      nodes_config.py    # Config Loader & Saver nodes
      nodes_save.py      # Image save node
      api_config.json    # Saved API configurations
      requirements.txt   # Python dependencies
      pyproject.toml     # Package metadata
| Error | Solution |
|---|---|
| `Authentication failed (401)` | Check that your API key is correct and active |
| `Rate limit exceeded (429)` | Wait a moment and retry, or switch models |
| `Model not found (404)` | Verify the model name; use `custom_model` for new models |
| `Request timed out` | Check your network; try a simpler prompt; HD/4K may be slow |
| `Content filtered` | Modify your prompt, as it may have triggered safety filters |
| `Missing package` | Run `pip install -r requirements.txt` manually |
Contributions are welcome! Please open an issue or pull request.
- Fork the repository
- Create a feature branch (`git checkout -b feature/new-provider`)
- Commit your changes
- Push to your branch and open a Pull Request
