A LangGraph-based system that coordinates multiple AI agents for image processing tasks using the Agent-Supervisor pattern.
This project implements a multi-agent system based on LangGraph's Agent-Supervisor pattern, where a supervisor agent coordinates multiple specialized image processing agents. The system demonstrates the use of LangGraph's edgeless graph architecture and Command construct for agent coordination.
The system consists of:
- **Supervisor Agent**
  - Coordinates the workflow
  - Makes intelligent decisions about task sequencing
  - Routes requests to appropriate agents using LangGraph's Command construct
- **Task Agents**
  - Image Generation Agent: Handles image creation requests
  - Text Overlay Agent: Adds text to images
  - Background Removal Agent: Removes image backgrounds
The graph visualization above shows:
- The initial entry point (START) connecting to the Supervisor
- The Supervisor node which coordinates all task agents
- Task agent nodes for specific image processing operations
- The potential paths through the system based on user requests
- **Edgeless Graph Architecture**
  - Instead of explicit edges between nodes, routing is handled by agent Commands
  - Each agent returns a Command that specifies the next agent to run
  - This simplifies the graph structure and makes it more flexible
- **Command Construct**

  ```python
  Command(
      goto="next_agent",
      update={
          "next_agent": "next_agent",
          "current_task": "current_task",
          "messages": [...],
      },
  )
  ```

  - `goto`: Specifies the next agent to execute
  - `update`: Updates the state that's passed between agents
- **StateGraph**

  ```python
  builder = StateGraph(AgentState)
  builder.add_node("supervisor", create_supervisor_agent())
  builder.add_edge(START, "supervisor")
  ```

  - Manages state transitions between agents
  - Only requires an initial edge from START to the supervisor
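The edgeless pattern above can be illustrated without LangGraph at all. The stdlib-only sketch below (hypothetical node names and state fields, not the project's actual code) mimics the routing loop: each node returns a Command-like dict with a `goto` target and a state `update`, and the only hard-wired edge is START to the supervisor.

```python
# Stdlib-only sketch of edgeless Command-style routing.
# Node names and state fields are illustrative, not the project's real ones.
from typing import Callable

END = "__end__"

def supervisor(state: dict) -> dict:
    # Dispatch the next pending task, or finish when none remain.
    pending = state["pending_tasks"]
    goto = pending[0] if pending else END
    return {"goto": goto, "update": {"pending_tasks": pending[1:]}}

def image_generation(state: dict) -> dict:
    # A task agent always routes back to the supervisor.
    return {"goto": "supervisor",
            "update": {"image_url": "https://example.com/img.png"}}

def text_overlay(state: dict) -> dict:
    return {"goto": "supervisor", "update": {"text_added": True}}

NODES: dict[str, Callable[[dict], dict]] = {
    "supervisor": supervisor,
    "image_generation": image_generation,
    "text_overlay": text_overlay,
}

def run(state: dict) -> dict:
    node = "supervisor"                 # only explicit edge: START -> supervisor
    while node != END:
        command = NODES[node](state)    # each node returns a Command-like dict
        state.update(command["update"])
        node = command["goto"]          # routing comes from the node itself
    return state

result = run({"pending_tasks": ["image_generation", "text_overlay"]})
```

Note how no edges between the supervisor and the task agents are declared anywhere; the route taken is decided at runtime by each node's return value, which is the flexibility the Command construct provides.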
- Create a virtual environment:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Create a `.env` file with your OpenAI API key:

  ```
  OPENAI_API_KEY=your-key-here
  ```
From the project root directory, run:

```bash
python -m src.main
```

Example inputs to try:
- "Generate an image of a sunset and add text 'Beautiful Evening' to it"
- "Create an image of a mountain landscape and remove its background"
- "Generate an image of a cat with 'Hello' text"
The system will:
- Take your input
- Use the supervisor to determine the sequence of tasks
- Route the request through appropriate agents
- Show the execution path and final result
The system includes an evaluation framework to assess the performance and correctness of the multi-agent workflow.
For detailed information about the evaluation framework, see Evaluation Documentation.
```
image_processing_agents/
├── src/
│   ├── agents/
│   │   ├── supervisor.py          # Supervisor agent implementation
│   │   ├── image_generation.py
│   │   ├── text_overlay.py
│   │   └── background_removal.py
│   ├── evaluation/                # Evaluation framework
│   │   ├── evaluators.py          # Evaluation functions
│   │   ├── create_dataset.py      # Test dataset creation
│   │   └── run_evaluation.py      # Main evaluation script
│   ├── agent_types/
│   │   └── state.py               # State type definitions
│   ├── config/
│   │   └── settings.py            # Configuration settings
│   └── main.py                    # Main execution script
├── .env                           # Environment variables
├── .gitignore
└── requirements.txt
```
- **State Management**
  - Uses TypedDict for type-safe state management
  - Tracks messages, current task, and image URLs
  - Maintains execution history
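The state described above might be modeled as follows. This is a hedged sketch of the shape, not the project's actual `src/agent_types/state.py`; the field names here are illustrative assumptions.

```python
# Hypothetical shape of the shared agent state (illustrative field names,
# not the project's real definitions).
from typing import Optional, TypedDict

class AgentState(TypedDict):
    messages: list[str]           # running message history (execution trail)
    current_task: Optional[str]   # task the supervisor last dispatched
    next_agent: Optional[str]     # routing target set by the last Command
    image_url: Optional[str]      # URL of the most recent image artifact

state: AgentState = {
    "messages": ["Generate an image of a sunset"],
    "current_task": None,
    "next_agent": None,
    "image_url": None,
}
```

Because `AgentState` is a `TypedDict`, static checkers can flag missing or misspelled keys in state updates while the runtime value remains a plain dict, which is what LangGraph passes between nodes.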
- **Agent Communication**
  - Agents communicate through state updates
  - Each agent adds its actions to the message history
  - The supervisor makes decisions based on the complete context
- **Routing Logic**
  - The supervisor analyzes both the original request and the current state
  - Makes sequential decisions about task execution
  - Uses an LLM to understand complex requests
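To make the routing decision concrete: the real supervisor delegates this judgment to an LLM, but a crude keyword heuristic (purely a stand-in, not the project's logic) shows the shape of the output, an ordered list of task agents derived from the request.

```python
# Keyword-based stand-in for the supervisor's LLM routing decision.
# The actual system prompts an LLM; this only illustrates the output shape.
def plan_tasks(request: str) -> list[str]:
    """Map a user request to an ordered list of task agents."""
    text = request.lower()
    tasks = []
    if "generate" in text or "create" in text or "image" in text:
        tasks.append("image_generation")
    if "text" in text:
        tasks.append("text_overlay")
    if "background" in text:
        tasks.append("background_removal")
    return tasks
```

Run against the example inputs above, this would plan `["image_generation", "text_overlay"]` for the sunset request and `["image_generation", "background_removal"]` for the mountain landscape; the LLM-based supervisor makes the same kind of decision but re-evaluates after every completed task rather than planning once up front.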
This implementation follows the LangGraph Agent-Supervisor tutorial: LangGraph Multi-Agent Tutorial
- **Test Dataset**
  - Predefined test cases with expected outcomes
  - Stored in LangSmith for tracking and analysis
- **LLM Judge (GPT-4)**
  - Evaluates task completion accuracy
  - Analyzes agent execution patterns
  - Provides detailed reasoning for scores
- **Metrics**
  - Task Completion Score (0.0 - 1.0)
  - Node Execution Score (0.0 - 1.0)
  - Execution Time
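One plausible way to compute a node-execution score in the 0.0-1.0 range is the fraction of the expected node sequence reproduced, in order, by the actual run. This is a hedged sketch of such a metric, not the project's exact scoring (which the LLM judge performs).

```python
# Illustrative node-execution score: fraction of the expected node sequence
# that appears, in order, in the actual execution trace.
def node_execution_score(expected: list[str], actual: list[str]) -> float:
    if not expected:
        return 1.0
    matched = 0
    trace = iter(actual)          # shared iterator enforces in-order matching
    for node in expected:
        for seen in trace:
            if seen == node:
                matched += 1
                break
    return matched / len(expected)
```

For example, an expected path of supervisor, image_generation, supervisor, text_overlay scores 1.0 when the trace contains those nodes in that order, and proportionally less when agents were skipped or run out of order.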
- **Results Storage**
  - Evaluation results are stored in LangSmith
  - Detailed logs of agent interactions
  - Performance metrics and analysis
