A maintenance engineer at an industrial plant opens the TIA Portal diagnostic buffer and finds a cascade of cryptic hex event codes -- 16#08:0006, 16#06:4001, 16#01:002B -- across multiple modules and PROFINET stations. To make sense of these, they must cross-reference thousands of pages of German-language Siemens PDF manuals, correlate timestamps, identify which event triggered the cascade, and determine whether the fault involves safety-critical F-System components that require qualified personnel. The current workflow -- call the Siemens hotline, wait on hold, email screenshots, call back the next day, reach a different support agent, repeat the explanation from scratch -- costs hours of unplanned downtime per incident.
SAM is a diagnostic agent backend that reasons over Siemens technical documentation to deliver instant, source-cited diagnostic answers. It parses free-text queries, TIA Portal diagnostic buffer tables, and symptom descriptions (LED states, module types, PROFINET station names). A 14-node LangGraph DAG orchestrates hybrid retrieval -- parallel BM25 keyword search with a hex-code-preserving tokenizer and semantic search over 3072-dimensional embeddings, merged via Reciprocal Rank Fusion (k=60) and reranked by Cohere rerank-v4.0-pro. For multi-event inputs, causal chain analysis identifies root causes through temporal ordering and known S7-1500 fault cascade patterns. Safety-critical F-System events trigger deterministic hazard detection and fixed-template disclaimers. The entire diagnostic flow streams to the client in real time via Server-Sent Events with node-by-node progress visibility.
The system enforces multi-tenant isolation via plant_id across all incident and session data. LangGraph's checkpoint-based interrupt/resume enables human-in-the-loop clarification flows with a 2-round guard against infinite loops. Every diagnosis persists a structured incident record to PostgreSQL, enabling 180-day recurrence detection and SQL-based pattern analysis across stations, modules, event-IDs, and temporal dimensions. Citation enforcement is deterministic -- a post-processing node verifies that every response contains [Quelle: document, section, S. page] references mapping to source documents. Safety disclaimers for F-System and electrical hazard events use fixed templates in German, English, and French -- never LLM-generated text.
| Layer | Technology |
|---|---|
| Language | Python 3.11+ |
| API framework | FastAPI + Uvicorn (async ASGI) |
| Agent orchestration | LangGraph (14-node deterministic DAG, AsyncPostgresSaver checkpointer) |
| LLM | OpenAI GPT-4o (structured output via with_structured_output(), temperature=0) |
| Embeddings | OpenAI text-embedding-3-large (3072-dim) |
| Reranking | Cohere rerank-v4.0-pro (cross-encoder) |
| Keyword search | BM25 via rank-bm25 (hex-code-preserving tokenizer) |
| Vector store | ChromaDB (persistent, cosine similarity) |
| Database | PostgreSQL 15+ (async via asyncpg, SQLAlchemy 2.0 ORM) |
| Object storage | MinIO (S3-compatible, presigned PDF URLs) |
| Authentication | FastAPI-Users (JWT bearer, email verification via AWS SES, 8-digit codes) |
| Streaming | SSE-Starlette (Server-Sent Events, 8 event types) |
| State persistence | LangGraph-Checkpoint-PostgreSQL (interrupt/resume) |
| Admin | SQLAdmin dashboard (session auth, superuser-only) |
| Validation | Pydantic v2 (runtime schema enforcement) |
| Observability | LangSmith tracing (optional) |
| Capability | Description |
|---|---|
| Input Processing | |
| Structured extraction | GPT-4o parses event-IDs (16#XX:XXXX), LED states, module types, slots, PROFINET stations from free text, TIA Portal tables, and symptom descriptions |
| Bilingual support | Detects German/English input via heuristic analysis, responds in the detected language |
| Format detection | Distinguishes free_text, tia_table, symptom_only, and mixed input formats |
| Diagnostic Intelligence | |
| Intent classification | Routes to 6 intent types: diagnostic_single, diagnostic_multi, symptom_based, pattern_query, clarification_needed, general_knowledge |
| Safety classification | Deterministic (no LLM) detection of F-System events (16#06:XXXX) and electrical hazards via regex pattern matching |
| Causal chain analysis | Few-shot prompted root cause identification for multi-event diagnostic buffer dumps with temporal ordering across 8 known S7-1500 fault cascade mechanisms |
| Confidence evaluation | 3-tier threshold system (top-1 >= 0.25, top-3 >= 0.40, semantic-only >= 0.50) gates escalation vs. response paths |
| Retrieval Pipeline | |
| Hybrid search | Parallel BM25 (top-20) + semantic search (top-20), RRF fusion (k=60), Cohere rerank, top-10 chunks |
| Hex-preserving tokenizer | Custom BM25 tokenizer treats 16#08:0006 as atomic tokens via regex placeholder substitution |
| Per-event queries | Multi-event inputs generate one BM25 query per event-ID for targeted exact-match retrieval |
| Response Quality | |
| Citation enforcement | Deterministic post-processing appends [Quelle: document, section, S. page] references from retrieved chunks |
| Safety disclaimers | Fixed templates (DE/EN/FR) prepended for F-System and electrical hazard events -- never LLM-generated |
| Escalation honesty | Insufficient-confidence responses cite what was searched, explain why nothing matched, and suggest Siemens Support resources |
| Memory and Patterns | |
| Incident persistence | Every diagnosis writes a structured incident record to PostgreSQL with plant_id tenant isolation |
| Recurrence detection | 180-day sliding window queries for station, module, and event-ID matches across historical incidents |
| Pattern analysis | SQL aggregation by station, module/slot, event-ID, hour-of-day, and day-of-week dimensions |
| Conversation | |
| Human-in-the-loop | LangGraph interrupt/resume for clarification with 2-round guard against infinite loops |
| Chat sessions | Persistent sessions with safety-critical message flagging, auto-resolve with incident confirmation |
| Real-time streaming | SSE events for node transitions, token-by-token response, sources, clarification interrupts |
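Because the 3-tier confidence gate is deterministic, it reduces to a small pure function over reranker scores. A minimal sketch -- function name and the exact combination of the three thresholds are illustrative assumptions, not the actual implementation:

```python
def confidence_sufficient(scores: list[float], semantic_only: bool = False) -> bool:
    """Gate the response path on descending reranker scores (thresholds from the table)."""
    if not scores:
        return False
    if semantic_only:
        # Semantic-only fallback (no BM25 hits): demand a stronger top match.
        return scores[0] >= 0.50
    top3 = scores[:3]
    # Either the best chunk is clearly relevant, or the top-3 are uniformly strong.
    return scores[0] >= 0.25 or (len(top3) == 3 and all(s >= 0.40 for s in top3))
```

If the gate returns False, the graph routes to escalation rather than generating a speculative answer.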
SAM processes every diagnostic query through a 14-node deterministic DAG built on LangGraph, with conditional routing driven entirely by state field checks -- no LLM calls in edge decisions.
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#f8fafc', 'lineColor': '#64748b', 'edgeLabelBackground': '#ffffff'}}}%%
flowchart TD
START([START]) --> input_parse
input_parse["input_parse"] --> input_validate["input_validate"]
input_validate --> intent_classify["intent_classify"]
intent_classify -- "diagnostic_*\nsymptom_based" --> safety_check["safety_check"]
intent_classify -- "pattern_query" --> pattern_analysis["pattern_analysis"]
intent_classify -- "clarification_needed\n(count < 2)" --> clarification["clarification"]
intent_classify -- "clarification_needed\n(count >= 2)" --> escalation["escalation"]
intent_classify -- "general_knowledge" --> retrieval["retrieval"]
safety_check --> retrieval
retrieval --> confidence_eval["confidence_eval"]
confidence_eval -- "sufficient" --> memory_query["memory_query"]
confidence_eval -- "insufficient" --> escalation
memory_query -- "multi + 2+ events" --> causal_analysis["causal_analysis"]
memory_query -- "otherwise" --> response_gen["response_gen"]
causal_analysis --> response_gen
clarification -- "resume" --> input_parse
response_gen --> citation_enforce["citation_enforce"]
escalation --> citation_enforce
pattern_analysis --> citation_enforce
citation_enforce --> incident_write["incident_write"]
incident_write --> STOP([END])
style input_parse fill:#dbeafe,stroke:#93c5fd,color:#1e40af
style input_validate fill:#dbeafe,stroke:#93c5fd,color:#1e40af
style intent_classify fill:#f1f5f9,stroke:#cbd5e1,color:#334155
style confidence_eval fill:#f1f5f9,stroke:#cbd5e1,color:#334155
style safety_check fill:#fee2e2,stroke:#fca5a5,color:#991b1b
style retrieval fill:#fef3c7,stroke:#fcd34d,color:#92400e
style memory_query fill:#fef3c7,stroke:#fcd34d,color:#92400e
style pattern_analysis fill:#fef3c7,stroke:#fcd34d,color:#92400e
style clarification fill:#ffedd5,stroke:#fdba74,color:#9a3412
style escalation fill:#ffedd5,stroke:#fdba74,color:#9a3412
style causal_analysis fill:#ede9fe,stroke:#c4b5fd,color:#5b21b6
style response_gen fill:#d1fae5,stroke:#6ee7b7,color:#065f46
style citation_enforce fill:#d1fae5,stroke:#6ee7b7,color:#065f46
style incident_write fill:#d1fae5,stroke:#6ee7b7,color:#065f46
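Each conditional edge in the diagram is a plain function of the state. A hedged sketch of the intent edge -- state keys and the 2-round counter name are assumptions read off the diagram, not the codebase:

```python
def route_after_intent(state: dict) -> str:
    """Pick the next node from state fields alone -- no LLM call in the edge."""
    intent = state["intent"]
    if intent in ("diagnostic_single", "diagnostic_multi", "symptom_based"):
        return "safety_check"
    if intent == "pattern_query":
        return "pattern_analysis"
    if intent == "clarification_needed":
        # 2-round guard: after two clarification rounds, escalate instead of looping.
        return "clarification" if state.get("clarification_count", 0) < 2 else "escalation"
    return "retrieval"  # general_knowledge goes straight to retrieval
```

In LangGraph, a function of this shape would be registered via add_conditional_edges on the intent_classify node.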
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#f8fafc', 'lineColor': '#64748b', 'edgeLabelBackground': '#ffffff'}}}%%
flowchart LR
query["User Query"] --> bm25["BM25 Search\n(top-20)"]
query --> semantic["Semantic Search\nChromaDB (top-20)"]
bm25 --> rrf["RRF Fusion\n(k=60)"]
semantic --> rrf
rrf --> rerank["Cohere Rerank\nrerank-v4.0-pro\n(top-10)"]
rerank --> confidence["Confidence\nEvaluation"]
style query fill:#dbeafe,stroke:#93c5fd,color:#1e40af
style bm25 fill:#fef3c7,stroke:#fcd34d,color:#92400e
style semantic fill:#fef3c7,stroke:#fcd34d,color:#92400e
style rrf fill:#f1f5f9,stroke:#cbd5e1,color:#334155
style rerank fill:#fef3c7,stroke:#fcd34d,color:#92400e
style confidence fill:#d1fae5,stroke:#6ee7b7,color:#065f46
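Reciprocal Rank Fusion merges the two rankings by summing 1/(k + rank) per chunk across lists, with k=60. A self-contained sketch (chunk IDs stand in for retrieved chunks; the real pipeline carries full chunk objects):

```python
from collections import defaultdict

def rrf_merge(bm25_ids: list[str], semantic_ids: list[str], k: int = 60) -> list[str]:
    """Fuse two rankings: each list contributes 1 / (k + rank), rank being 1-based."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in (bm25_ids, semantic_ids):
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Chunks found by both searches accumulate two terms and float to the top.
    return sorted(scores, key=scores.get, reverse=True)
```

The fused list then goes to the Cohere cross-encoder, which rescores the candidates against the query text before the top-10 cut.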
Data architecture. PostgreSQL handles relational data -- users, incidents, chat sessions, verification codes -- and stores LangGraph checkpoint snapshots for interrupt/resume via AsyncPostgresSaver. The incidents table uses 11 indexes including a GIN index on event_ids (text array) for efficient overlap queries and a conditional composite index on (plant_id, station_name, created_at DESC) filtered to high/medium confidence incidents for recurrence detection. ChromaDB persists document chunk embeddings (3072-dim, cosine similarity) for semantic retrieval across 6,702 elements extracted from 6 German-language Siemens S7-1500 PDF manuals. MinIO stores source PDF documents and serves them via presigned URLs with 1-hour expiry. The LangGraph checkpointer stores graph state snapshots in PostgreSQL, enabling interrupt/resume for clarification flows and thread forking when a completed conversation receives follow-up queries.
| Event | Payload | When |
|---|---|---|
| metadata | {thread_id, session_id, timestamp} | First event -- identifies the conversation thread |
| node_transition | {node, status, timestamp} | On entry/exit of each graph node |
| token | {content} | Token-by-token streaming from response_gen LLM |
| sources | {sources: [{document, document_id, section, page, relevance_score}]} | After retrieval completes |
| content_replace | {content} | After citation_enforce if enforced text differs from streamed draft |
| clarification | {text, thread_id, requires_response} | When clarification interrupt fires |
| done | {incident_id, timestamp} | Graph execution complete |
| error | {message, code} | On execution failure |
| Method | Endpoint | Auth | Description |
|---|---|---|---|
| Authentication | |||
| POST | /api/auth/register | -- | Create account with email, display_name, plant_id |
| POST | /api/auth/login | -- | Obtain JWT access token (requires verified email) |
| POST | /api/auth/logout | JWT | Invalidate token |
| POST | /api/auth/request-verify-token | -- | Request email verification token |
| POST | /api/auth/verify | -- | Verify email with JWT token |
| POST | /api/auth/verify-code | -- | Verify email with 8-digit code |
| POST | /api/auth/forgot-password | -- | Request password reset email |
| POST | /api/auth/reset-password | -- | Reset password with JWT token |
| POST | /api/auth/reset-password-code | -- | Reset password with 8-digit code |
| GET | /api/auth/me | JWT | Current user profile |
| PATCH | /api/auth/me | JWT | Update user profile |
| Diagnosis | |||
| POST | /api/diagnose | JWT | Stream diagnostic analysis via SSE |
| POST | /api/diagnose/resume | JWT | Resume after clarification interrupt |
| Chat Sessions | |||
| GET | /api/chat/sessions | JWT | List sessions for current user |
| GET | /api/chat/sessions/{session_id} | JWT | Get session with all messages |
| PATCH | /api/chat/sessions/{session_id} | JWT | Update session status/title |
| DELETE | /api/chat/sessions/{session_id} | JWT | Delete session and messages |
| Incidents and Patterns | |||
| GET | /api/incidents/detail/{incident_id} | JWT | Fetch single incident |
| GET | /api/incidents/{plant_id} | JWT | List incidents with filters (station, event_id, time_range) |
| PATCH | /api/incidents/{incident_id}/feedback | JWT | Submit feedback (confirm/correct/reject) |
| POST | /api/patterns/{plant_id} | JWT | Trigger pattern aggregation (by station, event, module/slot) |
| Documents | |||
| GET | /api/documents/{document_id}/url | JWT | Presigned MinIO URL for source PDF |
| Health | |||
| GET | /api/health | -- | Dependency status (PostgreSQL, ChromaDB, MinIO, BM25 index) |
| Admin | |||
| GET | /admin | Session | SQLAdmin dashboard (superuser only) |
app/
├── __init__.py
├── main.py # FastAPI app factory, lifespan, CORS, admin setup
├── config/
│ └── settings.py # Pydantic Settings with all environment variables
├── agent/
│ ├── state.py # AgentState TypedDict + 8 Pydantic models
│ ├── graph.py # 14-node StateGraph with deterministic edge routing
│ ├── checkpointer.py # AsyncPostgresSaver initialization
│ ├── nodes/
│ │ ├── input_parse.py # GPT-4o structured extraction of events, LEDs, modules
│ │ ├── input_validate.py # Rule-based validation (regex, range checks)
│ │ ├── intent_classify.py # 6-intent classification with deterministic fallback
│ │ ├── safety_check.py # Regex-based F-System and electrical hazard detection
│ │ ├── retrieval.py # Hybrid retrieval orchestration (BM25 + semantic)
│ │ ├── confidence_eval.py # 3-tier threshold gating (no LLM)
│ │ ├── memory_query.py # 180-day incident history lookup
│ │ ├── causal_analysis.py # Few-shot causal chain identification
│ │ ├── pattern_analysis.py # 7 SQL aggregation queries for trend detection
│ │ ├── clarification.py # LangGraph interrupt for follow-up questions
│ │ ├── escalation.py # Honest "insufficient documentation" responses
│ │ ├── response_gen.py # 5-section diagnostic response synthesis
│ │ ├── citation_enforce.py # Deterministic citation and safety disclaimer injection
│ │ └── incident_write.py # Structured incident extraction and persistence
│ └── prompts/
│ ├── extraction.py # Input parsing and incident write prompts
│ ├── classification.py # Intent classification prompt
│ ├── analysis.py # Causal, memory summary, and pattern prompts
│ ├── clarification.py # Follow-up question generation prompt
│ ├── escalation.py # Escalation response prompt
│ ├── generation.py # 5-section response generation prompt
│ └── safety_templates.py # Fixed safety disclaimers (DE/EN/FR)
├── auth/
│ ├── models.py # User and VerificationCode ORM models
│ ├── schemas.py # UserRead, UserCreate, UserUpdate schemas
│ ├── router.py # Auth endpoint aggregation (fastapi-users)
│ ├── code_router.py # 8-digit code verification and password reset
│ ├── backend.py # JWT bearer transport configuration
│ ├── dependencies.py # current_active_user, current_superuser
│ ├── manager.py # UserManager with auto-verify on register
│ ├── email.py # AWS SES email dispatch
│ ├── verification.py # Verification code generation and validation
│ └── admin.py # SQLAdmin views and session auth
├── chat/
│ ├── models.py # ChatSession and ChatMessage ORM models
│ ├── schemas.py # Chat request/response schemas
│ ├── router.py # Session CRUD (list, detail, update, delete)
│ └── service.py # Message persistence and session management
├── diagnosis/
│ ├── router.py # POST /diagnose and /diagnose/resume endpoints
│ ├── schemas.py # DiagnoseRequest, ResumeRequest schemas
│ ├── service.py # Graph invocation, SSE stream generation
│ └── streaming.py # 8 SSE event factory functions
├── memory/
│ ├── models.py # Incident ORM model with 11 indexes
│ ├── schemas.py # Incident and pattern response schemas
│ ├── router.py # Incident CRUD and pattern analysis endpoints
│ └── service.py # Incident storage, querying, and aggregation
├── retrieval/
│ ├── pipeline.py # Hybrid retrieval orchestration (BM25 + semantic + rerank)
│ ├── bm25.py # Hex-preserving BM25 tokenizer and index
│ ├── semantic.py # ChromaDB query with distance-to-similarity conversion
│ ├── fusion.py # Reciprocal Rank Fusion (k=60) merge
│ └── rerank.py # Cohere rerank-v4.0-pro cross-encoder
├── documents/
│ └── router.py # Presigned MinIO URL generation
├── health/
│ ├── router.py # GET /api/health endpoint
│ └── service.py # PostgreSQL, ChromaDB, MinIO, BM25 checks
├── postgres/
│ ├── base.py # UUIDMixin, TimestampMixin, DeclarativeBase
│ └── client.py # Async session factory, init/close
├── chroma/
│ └── client.py # ChromaDB persistent client initialization
└── minio/
└── client.py # MinIO client initialization
alembic/
└── versions/
├── a6d2d1066a7e_create_incidents_table.py
├── c9e1c0c1fbb0_create_users_table.py
├── d3f8a2b1c4e5_auth_upgrade_verification_codes.py
├── e4a7b2c3d5f6_rename_codepurpose_enum.py
└── f5a8b3c4d6e7_create_chat_sessions_and_messages.py
scripts/
└── ingest.py # Document ingestion: PDF -> ChromaDB + BM25 + MinIO
production-data/
├── manifest.json # Document inventory with UUIDs
├── documents/ # 40 source PDFs (UUID-named)
└── chunks/ # 40 JSON chunk files (6,702 elements total)
requirements.txt # 22 pinned dependencies
alembic.ini # Alembic configuration
Production-Scenarios.md # 13 diagnostic scenarios driving architecture decisions
.env.example # All environment variables with defaults
The input_parse node invokes GPT-4o with with_structured_output(ParsedInput) to extract event-IDs, LED states, module types, slot numbers, PROFINET station names, firmware versions, and user context from any input format -- free text, TIA Portal diagnostic buffer tables, or bare symptom descriptions. The prompt includes 4 few-shot examples covering single-event, multi-event, symptom-only, and mixed-format inputs. A language heuristic detects German or English and sets the response language accordingly. The input_validate node then runs deterministic checks (regex pattern 16#[0-9a-fA-F]{1,4}(:[0-9a-fA-F]{1,4})? for event-ID format, range validation for slots 0-31, timestamp plausibility from 2000-01-01 to now) and flags issues without blocking the pipeline. intent_classify routes to one of six intent types using LLM classification with a deterministic fallback: if all events are invalid, route to clarification; if 2+ events exist, route to diagnostic_multi. safety_check executes regex-based detection for F-System event prefixes (16#06:) and electrical hazard keywords (Kurzschluss, short-circuit, Überlast, overload) -- no LLM involvement. Known F-System event-IDs (16#06:4000 through 16#06:4003) and F-System keywords (F-CPU, F-DI, F-DQ, Passivierung, Not-Aus, fehlersicher) trigger the safety flag.
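The deterministic safety check can be reproduced as a few compiled patterns over the raw input. The keyword lists below mirror the ones named in this section; the function itself is a sketch, not the actual node:

```python
import re

# F-System events carry the 16#06: prefix; keywords cover fail-safe hardware terms.
F_EVENT = re.compile(r"16#06:[0-9a-fA-F]{1,4}")
F_KEYWORDS = re.compile(r"\b(F-CPU|F-DI|F-DQ|Passivierung|Not-Aus|fehlersicher)\b", re.IGNORECASE)
ELECTRICAL = re.compile(r"\b(Kurzschluss|short-circuit|Überlast|overload)\b", re.IGNORECASE)

def classify_safety(text: str) -> dict:
    """Flag F-System and electrical hazards without any LLM involvement."""
    return {
        "f_system": bool(F_EVENT.search(text) or F_KEYWORDS.search(text)),
        "electrical": bool(ELECTRICAL.search(text)),
    }
```

Keeping this check regex-only means a hazardous input can never be missed because of model nondeterminism.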
The retrieval node builds per-event BM25 queries (one per event-ID) and a semantic query from event descriptions, module types, and user context, then executes BM25 and ChromaDB searches in parallel (top-20 each). Results merge via Reciprocal Rank Fusion (k=60), and Cohere rerank-v4.0-pro selects the top-10 chunks. The BM25 tokenizer preserves hex codes like 16#08:0006 as atomic tokens by substituting them with HEXTOKEN placeholders before whitespace splitting and restoring them after -- preventing codes from being split at colons and hash symbols into meaningless subwords. confidence_eval then applies a 3-tier threshold gate: top-1 score >= 0.25, all top-3 scores >= 0.40, or semantic-only fallback >= 0.50. If confidence is insufficient, the graph routes to escalation instead of generating a speculative response. For inputs with 2+ events, causal_analysis identifies root cause chains using temporal ordering and 3 few-shot examples of known S7-1500 fault cascades -- short-circuit propagation, PROFINET station loss, F-passivation sequences -- leveraging the S7-1500's chronological diagnostic buffer, where the earliest event typically marks the root cause. memory_query searches the 180-day incident history by event-ID, station name, IP address, and module type to detect recurrence patterns, summarizing related incidents via LLM when matches exist.
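The placeholder trick for the tokenizer fits in a few lines: hex codes are swapped for opaque tokens before splitting, then swapped back. A simplified sketch -- the real tokenizer presumably handles more punctuation and normalization:

```python
import re

HEX_CODE = re.compile(r"16#[0-9a-fA-F]{1,4}(?::[0-9a-fA-F]{1,4})?")

def tokenize(text: str) -> list[str]:
    """Split on word boundaries while keeping codes like 16#08:0006 atomic."""
    codes: list[str] = []
    def stash(m: re.Match) -> str:
        codes.append(m.group(0))
        return f" HEXTOKEN{len(codes) - 1} "  # opaque placeholder, restored below
    masked = HEX_CODE.sub(stash, text)
    tokens = re.findall(r"\w+", masked.lower())
    return [codes[int(t[8:])] if t.startswith("hextoken") else t for t in tokens]
```

Without the mask, a naive word split would shatter 16#08:0006 at the hash and colon, and an exact-match BM25 query for that event-ID would never hit.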
The citation_enforce node deterministically verifies every response contains [Quelle: ...] source references and appends them from retrieved chunks if missing. For safety-critical events, it prepends the appropriate fixed-template disclaimer -- F-System, electrical, or both -- selected by language (DE/EN/FR) and safety category. If the response mentions restart phrases (Neustart, restart, reboot) without qualifying context (Ursache, root cause, prüfen Sie), an additional restart warning is appended. If the enforced text differs from the streamed draft, a content_replace SSE event sends the corrected version to the client. incident_write extracts a structured incident record via LLM with IncidentRecord structured output (with a deterministic fallback using parsed events and causal chain) and persists it to PostgreSQL with plant_id, event-IDs, primary event-ID, causal chain (JSONB), confidence level, and safety-critical flag. The incident's feedback_status starts as unverified and progresses through confirmed, corrected, or rejected via the feedback API -- enabling closed-loop recurrence detection and pattern analysis across the plant's fault history.
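The core of citation enforcement is a regex presence check plus a deterministic append. A sketch assuming chunk dicts shaped like the sources SSE payload; the function and the top-3 cutoff are illustrative:

```python
import re

QUELLE = re.compile(r"\[Quelle:[^\]]+\]")

def enforce_citations(response: str, chunks: list[dict]) -> str:
    """Append [Quelle: document, section, S. page] lines when the draft has none."""
    if QUELLE.search(response):
        return response  # draft already cites its sources
    refs = "\n".join(
        f"[Quelle: {c['document']}, {c['section']}, S. {c['page']}]" for c in chunks[:3]
    )
    return f"{response}\n\n{refs}" if refs else response
```

When the enforced text differs from what was already streamed, the content_replace SSE event carries the corrected version to the client.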
# 1. Clone the repository
git clone https://github.com/Fawzi-AI/s7-1500-backend.git && cd s7-1500-backend
# 2. Create a Python 3.11+ virtual environment
python3.11 -m venv venv && source venv/bin/activate
# 3. Install dependencies
pip install -r requirements.txt
# 4. Configure environment variables
cp .env.example .env
# Edit .env: set OPENAI_API_KEY, COHERE_API_KEY, POSTGRES_*, MINIO_*, JWT_SECRET
# 5. Run database migrations
alembic upgrade head
# 6. Ingest Siemens documentation into ChromaDB + BM25 index
python -m scripts.ingest --data-dir production-data
# 7. Start the development server
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
# 8. Verify: Swagger UI at http://localhost:8000/docs, Admin at http://localhost:8000/admin

| PR | Title | What was built |
|---|---|---|
| #1 | Initial Setup | 14-node LangGraph diagnostic agent with SSE streaming API |
| #2 | README and .gitignore | Documentation and data protection |
| #3-#4 | Integration Validation | Closed 5 scenario-vs-implementation gaps in pipeline |
| #5 | API and Config | Fixed SSE double-wrapping, structured output schema, Alembic migration |
| #6 | Persistence and Checkpointing | PostgreSQL checkpointer for interrupt/resume, incident ORM fix |
| #7 | Human-in-the-Loop | Clarification interrupt detection, Python 3.11 upgrade |
| #8 | Auth v2 | JWT authentication, resume flow fix, incidents endpoint |
| #9 | Auth v3 | Email verification, password reset, admin panel (AWS SES) |
| #13 | Chat Flow | Auth startup fix, enum migration for password reset |
| #14 | Chat Flow v2 | Chat session persistence with DB models, CRUD API, diagnosis integration |
| #15 | UI Enhancements | Citation pipeline enhancement, document preview API, ingestion metadata fix |
Development follows a scenario-driven approach: Production-Scenarios.md defines 13 diagnostic scenarios -- from single event-ID lookups to a 47-error cascade failure with burnt insulation -- each with expected inputs, outputs, and edge cases specified before any code is written. Scenarios cover:
- Single and multi-event diagnostic buffer analysis
- Symptom-based diagnosis without event-IDs (LED states only)
- Safety-critical F-CPU passivation events
- Ambiguous input requiring clarification
- Copy-paste TIA Portal table parsing
- Firmware mismatch detection
- Hybrid retrieval (exact hex-code match + semantic context)
- Pattern detection across 6-week incident history
- Mixed-language inputs (English question + German diagnostic buffer)
- Escalation when knowledge base lacks coverage
All conditional edges in the graph use deterministic state field checks -- no LLM calls in routing decisions. Safety handling follows defense-in-depth: regex-based hazard detection feeds fixed disclaimer templates, ensuring safety text is never LLM-generated. Every diagnosis produces an auditable incident record with a feedback lifecycle (unverified -> confirmed / corrected / rejected). Each PR is self-contained and independently deployable.
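The feedback lifecycle amounts to a tiny state machine. A sketch of the transition table inferred from the lifecycle above -- treating the three feedback states as terminal is an assumption, not documented behavior:

```python
# Allowed feedback_status transitions (assumed: feedback is given once per incident).
ALLOWED = {
    "unverified": {"confirmed", "corrected", "rejected"},
    "confirmed": set(),
    "corrected": set(),
    "rejected": set(),
}

def apply_feedback(current: str, new: str) -> str:
    """Advance an incident's feedback_status, rejecting invalid transitions."""
    if new not in ALLOWED.get(current, set()):
        raise ValueError(f"invalid feedback transition: {current} -> {new}")
    return new
```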
%%{init: {'theme': 'base', 'themeVariables': {'primaryColor': '#f8fafc', 'lineColor': '#64748b', 'edgeLabelBackground': '#ffffff'}}}%%
erDiagram
users {
uuid id PK
string email UK
string hashed_password
string display_name
string plant_id
boolean is_active
boolean is_superuser
boolean is_verified
datetime created_at
datetime updated_at
}
verification_codes {
int id PK
string email
string code
text token
enum purpose
datetime created_at
datetime expires_at
boolean used
}
chat_sessions {
uuid id PK
string user_id FK
string plant_id
string thread_id
string title
string status
string station_name
string primary_event_id
string module_type
int slot
datetime last_active_at
datetime created_at
datetime updated_at
}
chat_messages {
uuid id PK
uuid session_id FK
string role
text content
string incident_id
jsonb sources
boolean is_safety_critical
datetime created_at
}
incidents {
uuid incident_id PK
string plant_id
text_array event_ids
string primary_event_id
string station_name
inet ip_address
string module_type
int slot
int channel
string line_area
text diagnosis_summary
text resolution_suggested
string confidence
boolean is_safety_critical
string intent
jsonb causal_chain
string user_id
string session_id
string thread_id
string feedback_status
text feedback_notes
datetime corrected_at
datetime created_at
}
chat_sessions ||--o{ chat_messages : "has"
Released under the MIT License.