Cloud Run Job for high-performance image processing using sharp and Material Color Utilities. Processes images in parallel across multiple tasks for the slideshow backend.
This processor is designed to run as a Cloud Run Job (not a service) that:
- Fetches pending images from the backend API (up to 50 per execution)
- Shards work across multiple parallel tasks using `CLOUD_RUN_TASK_INDEX` and `CLOUD_RUN_TASK_COUNT` (see the sketch after this list)
- Processes each image for all registered device sizes:
  - Downloads the original from Google Cloud Storage
  - Resizes using `sharp` with high-quality JPEG output
  - Extracts dominant colors using Material Color Utilities (256px longest-side proxy for better aspect-ratio representation)
  - Uploads processed images to GCS
- Dual metadata persistence:
  - Writes JSON sidecars to GCS (`images/metadata/*.json`) with 30-day auto-archive
  - Calls the backend API to persist results to the SQLite database
- Reports failures back to the backend for tracking and retry coordination
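A minimal sketch of the sharding step, assuming the backend exposes a pending-images endpoint; the route and response shape shown here are illustrative, not the project's actual API:

```typescript
// Sharding sketch: every task fetches the same pending list and keeps only its
// own slice, so the batch splits evenly across parallel containers.
const taskIndex = Number(Deno.env.get("CLOUD_RUN_TASK_INDEX") ?? "0");
const taskCount = Number(Deno.env.get("CLOUD_RUN_TASK_COUNT") ?? "1");

const backendUrl = Deno.env.get("BACKEND_API_URL")!;
const authToken = Deno.env.get("BACKEND_AUTH_TOKEN")!;

// Hypothetical route; the real backend route may differ.
const res = await fetch(`${backendUrl}/api/pending-images`, {
  headers: { Authorization: `Bearer ${authToken}` },
});
const pending: { id: string }[] = await res.json(); // up to 50 per execution

// Round-robin assignment keyed on task index.
const mine = pending.filter((_, i) => i % taskCount === taskIndex);
console.log(`Task ${taskIndex}/${taskCount}: ${mine.length} of ${pending.length} images`);
```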
- No idle costs - Only billed while processing
- Parallel execution - Process 50 images across 10 tasks simultaneously
- Isolated failures - Each task retries independently (max 3 attempts)
- Better resources - 2Gi RAM + 2 vCPU per task vs. 512Mi shared in the web service
We use sharp (Node.js native module) instead of ImageMagick for:
- Performance: 4-5x faster resizing with libvips
- Quality: Superior Lanczos3 resampling by default
- Memory efficiency: Streaming processing, lower memory footprint
- Deno 2.x support: Native npm modules with `nodeModulesDir: "auto"`
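For reference, a minimal `deno.json` shape that enables this; the excerpt is illustrative only, and the repository's actual entrypoint and task definitions may differ:

```jsonc
{
  // "auto" lets Deno materialize a node_modules directory for npm packages like sharp.
  "nodeModulesDir": "auto",
  "tasks": {
    "dev": "deno run -A --watch main.ts",
    "start": "deno run -A main.ts"
  }
}
```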
Colors are extracted from a 256px longest-side proxy (not cropped):

```typescript
await sharp(buffer)
  .resize(256, 256, { fit: "inside", withoutEnlargement: true })
  .ensureAlpha()
  .raw()
  .toBuffer({ resolveWithObject: true });
```

This preserves aspect ratio better than the original 128x128 crop, providing more representative color sampling for wide/tall images.
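To show how the raw RGBA output feeds into Material Color Utilities, here is a sketch under stated assumptions: the helper name, palette size, and number of returned colors are illustrative, not the project's actual module.

```typescript
import sharp from "npm:sharp";
import {
  argbFromRgb,
  hexFromArgb,
  QuantizerCelebi,
  Score,
} from "npm:@material/material-color-utilities";

// Illustrative helper: resize to a 256px longest-side proxy, quantize the
// pixels, and return the top-ranked colors as hex strings.
export async function extractDominantColors(input: Uint8Array): Promise<string[]> {
  const { data, info } = await sharp(input)
    .resize(256, 256, { fit: "inside", withoutEnlargement: true })
    .ensureAlpha()
    .raw()
    .toBuffer({ resolveWithObject: true });

  // Convert RGBA bytes to the ARGB integers the quantizer expects.
  const pixels: number[] = [];
  for (let i = 0; i < data.length; i += info.channels) {
    pixels.push(argbFromRgb(data[i], data[i + 1], data[i + 2]));
  }

  // Quantize to a small palette, then rank colors by suitability.
  const quantized = QuantizerCelebi.quantize(pixels, 128);
  const ranked = Score.score(quantized);
  return ranked.slice(0, 3).map(hexFromArgb);
}
```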
Dual persistence for resilience (see the sketch below):
- Primary: Direct API calls to the backend (`POST /api/processed-images`)
- Backup: JSON files in GCS, polled by the backend every 60s
- If API calls fail (network issue, backend restart), the metadata sync recovers missing records
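A sketch of that flow, assuming `@google-cloud/storage` for the sidecar write; the record shape and helper name are illustrative:

```typescript
import { Storage } from "npm:@google-cloud/storage";

// Illustrative record shape; the real payload carries whatever the backend expects.
interface ProcessedImage {
  id: string;
  deviceName: string;
  gcsPath: string;
  colors: string[];
}

const bucket = new Storage().bucket(Deno.env.get("GCS_BUCKET_NAME")!);

export async function persistResult(result: ProcessedImage): Promise<void> {
  // Backup path: the JSON sidecar that the backend's 60s poller can recover from.
  await bucket
    .file(`images/metadata/${result.id}.json`)
    .save(JSON.stringify(result), { contentType: "application/json" });

  // Primary path: direct API call. A failure here is tolerable because the
  // sidecar above will be picked up by the metadata sync.
  try {
    const res = await fetch(
      `${Deno.env.get("BACKEND_API_URL")}/api/processed-images`,
      {
        method: "POST",
        headers: {
          Authorization: `Bearer ${Deno.env.get("BACKEND_AUTH_TOKEN")}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify(result),
      },
    );
    if (!res.ok) throw new Error(`Backend responded ${res.status}`);
  } catch (err) {
    console.warn(`API persist failed, relying on GCS sidecar: ${err}`);
  }
}
```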
Required:
- `GCS_BUCKET_NAME` - Google Cloud Storage bucket for images
- `BACKEND_API_URL` - Backend service URL (e.g., `https://backend-xyz.run.app`)
- `BACKEND_AUTH_TOKEN` - Authentication token for processor API endpoints
Cloud Run provides automatically:
- `CLOUD_RUN_TASK_INDEX` - Current task index (0-based)
- `CLOUD_RUN_TASK_COUNT` - Total number of parallel tasks
- `CLOUD_RUN_TASK_ATTEMPT` - Retry attempt number (0-2)
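A small config-loading sketch using the variables listed above; the `config` object and helper are illustrative, not the project's actual code:

```typescript
// Read required variables eagerly so a misconfigured task fails fast.
function requireEnv(name: string): string {
  const value = Deno.env.get(name);
  if (!value) throw new Error(`Missing required environment variable: ${name}`);
  return value;
}

export const config = {
  bucketName: requireEnv("GCS_BUCKET_NAME"),
  backendUrl: requireEnv("BACKEND_API_URL"),
  authToken: requireEnv("BACKEND_AUTH_TOKEN"),
  // Injected by Cloud Run for jobs; default to a single local task.
  taskIndex: Number(Deno.env.get("CLOUD_RUN_TASK_INDEX") ?? "0"),
  taskCount: Number(Deno.env.get("CLOUD_RUN_TASK_COUNT") ?? "1"),
  taskAttempt: Number(Deno.env.get("CLOUD_RUN_TASK_ATTEMPT") ?? "0"),
};
```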
- Deno 2.6.4+
- Google Cloud credentials configured (`GOOGLE_APPLICATION_CREDENTIALS`)
```bash
deno install

export GCS_BUCKET_NAME="your-bucket"
export BACKEND_API_URL="http://localhost:8080"
export BACKEND_AUTH_TOKEN="your-token"

deno task dev
```

Simulate task sharding:

```bash
export CLOUD_RUN_TASK_INDEX=0
export CLOUD_RUN_TASK_COUNT=2

deno task start
```

Build and deploy with Cloud Build:

```bash
gcloud builds submit --config cloudbuild.yaml \
  --substitutions=_GCS_BUCKET_NAME=your-bucket,_BACKEND_API_URL=https://your-backend.run.app
```

Execute the job manually:

```bash
gcloud run jobs execute slideshow-processor --region=us-central1
```

The backend service triggers this job when (a hedged sketch of such a trigger follows this list):
- New images are uploaded (batches of up to 50)
- Images have been pending for 30 seconds
- Orphaned processing images are recovered on restart
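For illustration only, the backend could start an execution through the Cloud Run Admin API's `jobs.run` method; the project, region, and token handling below are assumptions, not the backend's actual code:

```typescript
// Placeholders: substitute your own project, region, and job name.
const PROJECT = "your-project";
const REGION = "us-central1";
const JOB = "slideshow-processor";

// Inside Cloud Run, an access token is available from the metadata server.
async function getAccessToken(): Promise<string> {
  const res = await fetch(
    "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token",
    { headers: { "Metadata-Flavor": "Google" } },
  );
  const { access_token } = await res.json();
  return access_token;
}

// POST to the v2 jobs.run endpoint to kick off a new execution.
export async function triggerProcessorJob(): Promise<void> {
  const token = await getAccessToken();
  const url =
    `https://run.googleapis.com/v2/projects/${PROJECT}/locations/${REGION}/jobs/${JOB}:run`;
  const res = await fetch(url, {
    method: "POST",
    headers: { Authorization: `Bearer ${token}`, "Content-Type": "application/json" },
    body: JSON.stringify({}),
  });
  if (!res.ok) {
    throw new Error(`Failed to start job: ${res.status} ${await res.text()}`);
  }
}
```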
Update `cloudbuild.yaml` or pass via `--substitutions`:
- `_GCS_BUCKET_NAME` - Your GCS bucket name
- `_BACKEND_API_URL` - Your backend service URL
- `_BACKEND_AUTH_SECRET` - Secret Manager path to the auth token
- `_SERVICE_ACCOUNT` - Service account email (needs `roles/storage.objectAdmin`)
Configured in cloudbuild.yaml:
- Tasks: 10 parallel containers
- Max Retries: 3 attempts per task
- Timeout: 15 minutes per task
- CPU: 2 vCPU (sharp benefits from multi-core)
- Memory: 2Gi RAM
Backend queues up to 50 images per job execution (configurable in job-queue.ts).
List executions:

```bash
gcloud run jobs executions list --job=slideshow-processor --region=us-central1
```

View logs for an execution:

```bash
gcloud run jobs executions logs <execution-id> --region=us-central1
```

Each task logs:
```text
Task 3/10 starting (attempt 0)
Total pending images: 50
Task 3 processing 5 images
Processing abc123 (1/5)
  Attempt 1, targeting 3 devices
  Downloading gs://bucket/images/originals/abc123.jpg
  Downloaded 2456789 bytes
  Extracting colors from 256px proxy...
  Extracted colors: #4A5D23, #8B7355, #D4C4B0
  Resizing for Kitchen Display (1920x1080)
  Uploaded to gs://bucket/processed/Kitchen Display/abc123.jpg
  Saved metadata to gs://bucket/images/metadata/abc123.json
  Submitted 3 processed images to backend
  Completed processing for all 3 devices
  Success
Task 3 complete: 5 processed, 0 failed
```
The Dockerfile pre-caches dependencies in a multi-stage build. If you see binding errors:

```bash
# Rebuild with --no-cache
docker build --no-cache -t processor .
```

Verify that `PROCESSOR_AUTH_TOKEN` matches the backend's expected value:

```bash
# Check secret in backend
kubectl get secret processor-auth-token -o jsonpath='{.data.token}' | base64 -d
```

Ensure the service account has `roles/storage.objectAdmin`:

```bash
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:image-processor@PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/storage.objectAdmin"
```

Increase the memory allocation in `cloudbuild.yaml`:

```yaml
- '--memory=4Gi'  # Increase from 2Gi
```

The processor replaces `worker-queue.ts` in the backend:
| Old (Worker Queue) | New (Cloud Run Jobs) |
|---|---|
| Web Workers in backend container | Separate Cloud Run Job |
| 4 concurrent workers | 10 parallel tasks |
| ImageMagick CLI (`magick` command) | sharp (Node.js native module) |
| 512Mi shared memory | 2Gi per task (20Gi total) |
| Runs in web service (idle billing) | Job-only (no idle costs) |
| Single-instance processing | Distributed across multiple containers |
| 128x128 cropped color proxy | 256px longest-side preserved ratio |
- Google Photos API resizing: Code preserved in `worker-queue.ts` for future use (commented out)
- Layout detection: Moved to the backend (pre-processing)
- Color palette similarity: Available in backend for pairing logic
MIT