rohitrmd/multi-agent-supervisor-system

Multi-Agent Image Processing System

A LangGraph-based system that coordinates multiple AI agents for image processing tasks using the Agent-Supervisor pattern.

Overview

This project implements a multi-agent system based on LangGraph's Agent-Supervisor pattern, where a supervisor agent coordinates multiple specialized image processing agents. The system demonstrates the use of LangGraph's edgeless graph architecture and Command construct for agent coordination.

System Architecture

[Figure: Workflow Graph]

The system consists of:

  1. Supervisor Agent

    • Coordinates the workflow
    • Decides which task runs next and in what order
    • Routes requests to appropriate agents using LangGraph's Command construct
  2. Task Agents

    • Image Generation Agent: Handles image creation requests
    • Text Overlay Agent: Adds text to images
    • Background Removal Agent: Removes image backgrounds

The graph visualization above shows:

  • The initial entry point (START) connecting to the Supervisor
  • The Supervisor node which coordinates all task agents
  • Task agent nodes for specific image processing operations
  • The potential paths through the system based on user requests

Key LangGraph Features Used

  1. Edgeless Graph Architecture

    • Instead of explicit edges between nodes, routing is handled by agent Commands
    • Each agent returns a Command that specifies the next agent to run
    • Keeps the graph definition small, since routing can change at run time without rewiring edges
  2. Command Construct

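    from langgraph.types import Command

    # Returned by an agent node: route to the next agent and update shared state.
    Command(
        goto="next_agent",
        update={
            "next_agent": "next_agent",
            "current_task": "current_task",
            "messages": [...],
        }
    )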
    • goto: Specifies the next agent to execute
    • update: Updates the state that's passed between agents
  3. StateGraph

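    from langgraph.graph import START, StateGraph

    # AgentState and create_supervisor_agent are defined in this project.
    builder = StateGraph(AgentState)
    builder.add_node("supervisor", create_supervisor_agent())
    builder.add_edge(START, "supervisor")
    • Manages state transitions between agents
    • Requires only the initial edge from START to the supervisor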
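
To make the routing idea concrete, here is a minimal, library-free sketch of how Command-style routing can drive a workflow without explicit edges. It does not use LangGraph: the `Command` dataclass, the `END` constant, the `run` loop, and the `plan`/`done` state keys are simplified stand-ins invented for illustration, and the supervisor uses a fixed rule where the real one uses an LLM.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

END = "__end__"  # stand-in for a terminal marker (assumption, not LangGraph's END)


@dataclass
class Command:
    """Simplified stand-in for a routing command: where to go, what to update."""
    goto: str
    update: dict = field(default_factory=dict)


def supervisor(state: dict) -> Command:
    # Hypothetical rule: send the first unfinished task in the plan, else stop.
    pending = [task for task in state["plan"] if task not in state["done"]]
    if not pending:
        return Command(goto=END)
    return Command(goto=pending[0], update={"current_task": pending[0]})


def make_task_agent(name: str) -> Callable[[dict], Command]:
    def agent(state: dict) -> Command:
        # A real agent would call an image tool here; this one only logs.
        return Command(
            goto="supervisor",
            update={
                "done": state["done"] + [name],
                "messages": state["messages"] + [f"{name} finished"],
            },
        )
    return agent


def run(nodes: Dict[str, Callable[[dict], Command]], state: dict,
        start: str = "supervisor", max_steps: int = 20) -> Tuple[List[str], dict]:
    """Follow each node's Command until END or the step limit is reached."""
    path: List[str] = []
    current = start
    for _ in range(max_steps):
        if current == END:
            break
        path.append(current)
        command = nodes[current](state)
        state = {**state, **command.update}  # merge the update into the state
        current = command.goto
    return path, state


nodes = {
    "supervisor": supervisor,
    "image_generation": make_task_agent("image_generation"),
    "text_overlay": make_task_agent("text_overlay"),
}
path, final = run(
    nodes,
    {"plan": ["image_generation", "text_overlay"], "done": [], "messages": []},
)
print(" -> ".join(path))
```

Every task agent hands control back to the supervisor, so the only wiring needed up front is the entry point. The step limit is a safeguard against a supervisor that never routes to the end.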

Setup

  1. Create a virtual environment:

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  2. Install dependencies:

    pip install -r requirements.txt
  3. Create a .env file with your OpenAI API key:

    OPENAI_API_KEY=your-key-here
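
Before running the system, it can help to confirm that the key in .env is a real value and not the placeholder. The following is a small standard-library sketch; `parse_env_text` and `has_api_key` are hypothetical helpers written for this example, and the project itself may load .env through a dedicated library.

```python
def parse_env_text(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_api_key(env: dict) -> bool:
    """True if OPENAI_API_KEY is set to something other than the placeholder."""
    key = env.get("OPENAI_API_KEY", "")
    return bool(key) and key != "your-key-here"


sample = "# local secrets\nOPENAI_API_KEY=your-key-here\n"
print(has_api_key(parse_env_text(sample)))  # the placeholder is rejected
```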

Running the System

From the project root directory, run:

python -m src.main

Example inputs to try:

  • "Generate an image of a sunset and add text 'Beautiful Evening' to it"
  • "Create an image of a mountain landscape and remove its background"
  • "Generate an image of a cat with 'Hello' text"

The system will:

  1. Take your input
  2. Use the supervisor to determine the sequence of tasks
  3. Route the request through appropriate agents
  4. Show the execution path and final result

Evaluation Framework

The system includes an evaluation framework to assess the performance and correctness of the multi-agent workflow.

For detailed information about the evaluation framework, see Evaluation Documentation.

Project Structure

image_processing_agents/
├── src/
│   ├── agents/
│   │   ├── supervisor.py         # Supervisor agent implementation
│   │   ├── image_generation.py
│   │   ├── text_overlay.py
│   │   └── background_removal.py
│   ├── evaluation/               # Evaluation framework
│   │   ├── evaluators.py         # Evaluation functions
│   │   ├── create_dataset.py     # Test dataset creation
│   │   └── run_evaluation.py     # Main evaluation script
│   ├── agent_types/
│   │   └── state.py              # State type definitions
│   ├── config/
│   │   └── settings.py           # Configuration settings
│   └── main.py                   # Main execution script
├── .env                          # Environment variables
├── .gitignore
└── requirements.txt

Implementation Details

  1. State Management

    • Uses TypedDict for type-safe state management
    • Tracks messages, current task, and image URLs
    • Maintains execution history
  2. Agent Communication

    • Agents communicate through state updates
    • Each agent adds its actions to the message history
    • The supervisor bases each decision on the full message history
  3. Routing Logic

    • The supervisor analyzes both the original request and the current state
    • Chooses the next task one step at a time
    • Uses an LLM to interpret complex requests
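
The state handling described above can be sketched as follows. This is an illustration under assumptions, not the contents of `state.py`: the `messages`, `current_task`, and `next_agent` keys come from the Command example earlier, while the `image_url` field and the `apply_update` helper are hypothetical. In LangGraph itself, append-style merging of a list field is normally declared with a reducer on the state type, not written by hand.

```python
from typing import List, Optional, TypedDict


class AgentState(TypedDict):
    messages: List[str]          # running log of what each agent did
    current_task: Optional[str]  # task most recently assigned by the supervisor
    next_agent: Optional[str]    # agent the supervisor routed to
    image_url: Optional[str]     # hypothetical field for the latest image


def apply_update(state: AgentState, update: dict) -> AgentState:
    """Merge an agent's update: append to messages, overwrite other keys."""
    merged = dict(state)
    for key, value in update.items():
        if key == "messages":
            merged["messages"] = state["messages"] + list(value)
        else:
            merged[key] = value
    return merged  # type: ignore[return-value]


state: AgentState = {
    "messages": ["user: sunset with text"],
    "current_task": None,
    "next_agent": None,
    "image_url": None,
}
state = apply_update(state, {
    "messages": ["image_generation: created image"],
    "current_task": "image_generation",
    "image_url": "https://example.com/sunset.png",  # placeholder URL
})
print(state["messages"])
```

Appending to the history means the supervisor always sees the original request alongside every agent's report when it picks the next step.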

Based On

This implementation follows the LangGraph Agent-Supervisor tutorial: LangGraph Multi-Agent Tutorial

Evaluation Components

  1. Test Dataset

    • Predefined test cases with expected outcomes
    • Stored in LangSmith for tracking and analysis
  2. LLM Judge (GPT-4)

    • Evaluates task completion accuracy
    • Analyzes agent execution patterns
    • Provides detailed reasoning for scores
  3. Metrics

    • Task Completion Score (0.0 - 1.0)
    • Node Execution Score (0.0 - 1.0)
    • Execution Time
  4. Results Storage

    • Evaluation results are stored in LangSmith
    • Detailed logs of agent interactions
    • Performance metrics and analysis
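
As an illustration of what a score between 0.0 and 1.0 for node execution could look like, here is a deterministic sketch. The scoring rule is an assumption made for this example (the share of expected task nodes matched in order within the executed path), and `node_execution_score` is a hypothetical name; the project's own evaluators may score differently, for instance by using the LLM judge.

```python
from typing import Sequence


def node_execution_score(expected: Sequence[str], executed: Sequence[str]) -> float:
    """Share of expected nodes matched, in order, within the executed path.

    Matching is strict about order: an expected node only counts once all
    expected nodes before it have been seen.
    """
    if not expected:
        return 1.0
    matched = 0
    for node in executed:
        if matched < len(expected) and node == expected[matched]:
            matched += 1
    return matched / len(expected)


expected = ["image_generation", "text_overlay"]
full_run = ["supervisor", "image_generation", "supervisor", "text_overlay", "supervisor"]
partial_run = ["supervisor", "image_generation", "supervisor"]

print(node_execution_score(expected, full_run))
print(node_execution_score(expected, partial_run))
```

Supervisor visits are ignored here because only the expected task nodes are compared against the path.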
