A platform for testing Dialogflow CX agents with AI-powered evaluation, modern UI, and detailed analytics.
- GitHub: dialogflow-test-suite
- Clone:
git clone https://github.com/your-org/dialogflow-test-suite.git
- ✅ Dataset Management: Create, edit, and organize test datasets with direct route access
- ✅ Advanced Question Management: Add, edit, and bulk import questions with a dedicated full-screen interface
- ✅ Dynamic Parameter Evaluation: AI evaluation system with fully configurable parameters (Similarity Score, Empathy Level, No-Match Detection, and custom parameters)
- ✅ Legacy-Free Evaluation: New test runs use only dynamic parameter-based scoring, with no hardcoded similarity/empathy fields
- ✅ Enhanced CSV Exports: Parameter breakdown exports with unlimited parameters, including scores, weights, and reasoning
- ✅ Intelligent HTML Processing: Automatic detection and optional removal of HTML tags from CSV imports with user-controlled settings
- ✅ Dynamic Metadata Editing: Key-value pair editor for question metadata (no raw JSON editing)
- ✅ Advanced Search & Filtering: Real-time search across questions and test results with live filtering
- ✅ Table Management: Sorting, pagination, and filtering for large datasets
- ✅ Dialogflow Testing: Execute tests against your Dialogflow agents with user-specific access
- ✅ LLM Judge Integration: AI-powered response evaluation using Google Gemini 2.0 Flash with weighted parameter scoring
- ✅ Computed Analytics: Real-time score computation from parameter weights - no stored legacy scores, full backward compatibility
- ✅ Project Selection: Dynamic Google Cloud project selection based on user permissions
- ✅ Quick Test: Instantly test prompts against Dialogflow agents with flow/page selection
- ✅ Enhanced Bulk Import: Optimized CSV upload workflow with proper column mapping, HTML detection, and file handling
- ✅ Test Reporting: View detailed results and analytics with color-coded scoring and parameter visualization
- ✅ Session Parameters Management: Centralized management of quick-add session parameters with full CRUD operations
- ✅ Business Dashboard: Analytics dashboard with performance metrics, trends, and insights for stakeholders
- ✅ Questions Search: Full-text search across question text, expected answers, tags, and priority
- ✅ Test Results Search: Search across questions, answers, reasoning, and error messages
- ✅ Live Filtering: Real-time search results with instant feedback and smart pagination
- ✅ Advanced Sorting: Click-to-sort functionality for all data columns with visual indicators
- ✅ Configurable Pagination: 10, 25, 50, 100 results per page with proper result counting
- ✅ Empty State Handling: Contextual messages for no results vs no search matches
- ✅ Performance Optimization: Memoized filtering and sorting for smooth interactions
- ✅ Arrow-Back Navigation: Clean, intuitive ← back buttons replacing complex breadcrumbs
- ✅ Dark Theme Design: Professional #121212 dark theme with blue (#0066CC) accents
- ✅ Vertical Space Optimization: Maximized content area with consolidated navigation
- ✅ Consolidated Configuration Accordion: All test run configuration details (test config, timing, message sequence, session parameters) in a single collapsible section
- ✅ Two-Column Responsive Layout: Efficient use of horizontal space with side-by-side configuration display that adapts to screen size
- ✅ Horizontal Message Display: Pre/post-prompt messages shown as compact chips with wrapping instead of vertical lists
- ✅ Full-Screen Editing: Dedicated pages for complex forms instead of cramped modals
- ✅ Responsive Layout: Consistent spacing, padding, and mobile-friendly design
- ✅ Smart File Handling: Proper file input reset and state management for re-uploads
- ✅ Real-time Updates: Auto-refresh functionality for test run monitoring with live status and results
- ✅ Enhanced Tables: Full sorting, pagination, and data display with Material-UI components
- ✅ Intelligent Auto-Refresh: Background polling for running test runs with selective row updates
- ✅ Agent URL Navigation: Corrected Google Cloud Console links with proper location routing
- ✅ User Authentication: Google OAuth with individual IAM permission respect
- ✅ Security Model: Each user accesses only agents they have permissions for
- ✅ User Attribution: Full user tracking with creator information displayed across all test runs and dashboard activity
- ✅ Creator Visibility: "Created By" column in test runs showing full name and email of test creator
- ✅ Dashboard User Context: Recent activity feed shows user attribution for all test activities
- ✅ Multi-User Support: Proper user relationship management with real-time user information display
- ✅ Comprehensive Preferences: Both Quick Test and Create Test Run settings automatically saved and restored
- ✅ Dialogflow Configuration Memory: Project, agent, flow, page, and playbook selections preserved across sessions
- ✅ Session Parameter Persistence: Custom session parameters remembered for each screen independently
- ✅ Session Parameters Management: Centralized management interface for creating, editing, and organizing common session parameters
- ✅ Quick Add Functionality: Pre-configured parameter chips for instant addition to test configurations (no duplicates allowed)
- ✅ Quick Test Preferences: Project, agent, flow, page, playbook, model, and session parameters saved automatically
- ✅ Test Run Preferences: Separate preference system for Create Test Run screen with all Dialogflow Configuration fields
- ✅ API-Based Storage: RESTful endpoints for preference management with proper schema validation
- ✅ Duplicate Prevention: Smart validation prevents duplicate session parameter keys in both frontend and backend
- ✅ Generic Configuration: Flexible key-value session parameters for specialized agent behavior
- ✅ Type Safety: Full TypeScript integration with proper schema alignment between frontend and backend
First time setting up? See the comprehensive docs/setup/developer-setup.md guide for detailed step-by-step instructions.
TL;DR Minimal Setup:
git clone https://github.com/your-org/dialogflow-test-suite.git
cd dialogflow-test-suite
# Configure Google OAuth (required for login)
# Create .env in project root (NOT in backend/ or frontend/)
cp .env.example .env
# Edit .env and add GOOGLE_CLIENT_ID, GOOGLE_CLIENT_SECRET
# See docs/setup/oauth-setup.md for getting OAuth credentials
docker-compose up -d
# Wait 2-3 minutes for first build
# Access: http://localhost:3000
# Login: Use your Google account (all users get admin role by default)
- Docker Desktop installed and running
- PowerShell or Command Prompt
- Google Cloud Platform account with Dialogflow CX access (optional - for testing real agents)
cd "C:\Projects\your-workspace\Dialogflow Agent Tester"
docker-compose up -d
The application will automatically:
- ✅ Build all containers (backend, frontend, database, Redis for local caching)
- ✅ Initialize the database with all required tables and columns
- ✅ Run unified migration system to ensure schema consistency (column additions, complex operations, data backfills)
- ✅ Enable hot reload for instant code updates without rebuilds (see Development Workflow below)
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- API Documentation: http://localhost:8000/docs
- Authentication: Google OAuth via landing page (requires Google Cloud access for full functionality)
- Webhook Controls: Both Quick Test and Test Runs include webhook enable/disable toggles (defaults to enabled)
- Production: Google OAuth with individual user credentials managed via GitHub Actions
- Infrastructure: Fully managed via Terraform with automated deployments
- Project Access: Users see only Google Cloud projects they have access to
- Agent Access: Users see only Dialogflow agents they have IAM permissions for
- OAuth Configuration: Automatically configured via GitHub Actions environment variables
- Setup Guide: See docs/setup/ for comprehensive setup documentation
- Dashboard: http://localhost:3000/dashboard
- Dataset Management: http://localhost:3000/datasets
- Edit Dataset: http://localhost:3000/datasets/1/edit
- Manage Questions: http://localhost:3000/datasets/1/questions
- Session Parameters: http://localhost:3000/session-parameters
- Quick Test: http://localhost:3000/quick-test
- Test Runs: http://localhost:3000/test-runs
Quick Setup:
# From project root (dialogflow-test-suite/)
cp .env.example .env
# Edit .env and add your values (see below)
Minimal Configuration (Required for Login):
# Edit: /.env (project root - NOT /backend/.env or /frontend/.env.local)
# Required for Google OAuth login - YOU MUST HAVE THESE:
GOOGLE_CLIENT_ID=your-client-id.apps.googleusercontent.com
GOOGLE_CLIENT_SECRET=your-client-secret
GOOGLE_REDIRECT_URI=http://localhost:8000/api/v1/auth/google/callback
# Optional - for Dialogflow agent testing:
GOOGLE_CLOUD_PROJECT=your-gcp-project-id
GOOGLE_API_KEY=your-google-api-key-here
Important File Locations:
- /.env (project root) - Used by docker-compose - THIS IS THE ONE YOU NEED
- /.env.example (project root) - Template with all available variables
- /backend/.env - Only for direct Python development (not needed for Docker)
- /frontend/.env.local - Already configured for local Docker (no changes needed)
Detailed Setup Guides:
- docs/setup/oauth-setup.md - START HERE - How to get OAuth credentials (REQUIRED)
- docs/setup/developer-setup.md - Complete first-time setup walkthrough
- docs/setup/google-auth.md - Google Cloud project setup (optional - for Dialogflow testing)
- docs/oauth-environment-variables.md - Complete environment variable reference
- frontend/ENVIRONMENT_CONFIG.md - Frontend-specific environment configuration
Authentication Flow:
- ✅ All users log in via Google OAuth SSO
- ✅ First-time users are automatically created
- ✅ Other domains get viewer role
- ✅ No default accounts exist - OAuth setup is mandatory
docker-compose down
Rebuild containers only when dependencies change; ordinary code changes are picked up by hot reload:
# ONLY NEEDED for dependency changes (requirements.txt, package.json)
# Code changes now use hot reload - no rebuild required!
# Backend dependency changes
docker-compose build backend && docker-compose up -d backend
# Frontend dependency changes
docker-compose build frontend && docker-compose up -d frontend
Hot reload is enabled: code changes appear without Docker rebuilds.
- Backend (Python): Uvicorn watches .py files → auto-reloads in 1-3 seconds
- Frontend (React/Vite): Vite HMR watches source files → updates browser instantly (<1 sec)
- Volume Mounts: Your local code is mounted into containers, so saves take effect immediately
# Start containers ONCE (typically on first boot of the day)
docker-compose up -d
# Edit code in VS Code and save - changes appear automatically!
# No docker commands needed for code changes
# Check logs to see hot reload in action
docker-compose logs -f backend # Watch Python files reload
docker-compose logs -f frontend # Watch Vite HMR updates
You ONLY need docker-compose build when changing:
- ✅ Python dependencies (requirements.txt)
- ✅ npm packages (package.json)
- ✅ Dockerfiles (system packages, environment variables)
- ✅ docker-compose.yml configuration
- ❌ NOT for .py, .ts, .tsx, .css file changes - hot reload handles these!
# Backend test: Edit any .py file, save, and check logs
docker-compose logs -f backend
# You'll see: "WatchFiles detected changes... Reloading..."
# Frontend test: Edit any React component, save, and watch browser
# Browser updates instantly without refresh!
- User Authentication: JWT-based with role management and secure access
- Business Dashboard: Comprehensive analytics dashboard with performance metrics, trends, and stakeholder insights
- Dataset Management: Create, edit, upload and organize test datasets with direct navigation
- Question Management: Dedicated interface for adding, editing, and bulk importing questions
- Test Execution: Run comprehensive tests against Dialogflow agents
- Webhook Control: Enable/disable webhooks for both Quick Test and Test Runs with per-test configuration
- Results Analysis: View detailed test outcomes and performance metrics
- Project Filtering: Multi-project support with Google Cloud project-based data filtering
- Direct Routing: Navigate directly to dataset editing and question management
- Dark Theme UI: Modern Material-UI interface with responsive design
- Real-time Updates: Auto-refresh functionality for test runs with background polling
- Agent URL Navigation: Corrected Google Cloud Console agent links with proper global location
- API Documentation: Auto-generated with FastAPI
- Infrastructure as Code: Complete Terraform management with automated deployments via GitHub Actions
- OAuth Management: Automated OAuth secret management and environment variable handling
- ✅ TestRunDetailPage UI Space Optimization (Latest - Sept 30, 2025): Consolidated all configuration sections into a single collapsible accordion with a two-column responsive layout
  - Unified Configuration accordion combines Test Config, Timing, Message Sequence, and Session Parameters
  - Two-column Grid layout (50/50 split on desktop, stacks on mobile) for optimal horizontal space usage
  - Left column: Test Configuration and Timing information
  - Right column: Message Sequence (Pre/Post-Prompt chips) and Session Parameters table
  - Accordion collapsed by default for minimal screen real estate usage (~70% reduction in vertical scrolling)
  - Maintains horizontal chip display for pre/post prompt messages from previous optimization
  - Responsive design automatically adapts to screen size
- ✅ Preference System Bug Fixes (Sept 26, 2025): Fixed critical user preference restoration issues affecting dropdown loading and state persistence
- ✅ Page Dropdown Loading Fix: Resolved timing dependency issues where page dropdowns failed to load based on logged-in user preferences on both QuickTest and CreateTestRun pages
- ✅ Session ID Persistence: Fixed Session ID field not saving/loading properly on QuickTest page - now correctly saves all values including empty strings
- ✅ Duplicate API Call Prevention: Eliminated race conditions causing duplicate page loading API calls and 404 errors by removing conflicting manual loadPages() calls
- ✅ LLM Model Preference Restoration: Fixed LLM Model preferences not restoring properly when Playbook is selected on CreateTestRun page by implementing immediate save pattern
- ✅ Preference Restoration Consistency: Standardized preference saving across QuickTest and CreateTestRun pages to use immediate onChange saves instead of complex useEffect logic
- ✅ Debug Logging Cleanup: Removed all frontend debug console.log statements while preserving essential error handling for production readiness
- ✅ Duplicate Preference API Calls: Fixed duplicate PUT calls to preferences API by removing conflicting useEffect hooks that duplicated immediate onChange saves
- ✅ FastAPI Route Ordering Bug Fixes: Fixed critical routing issues where /export and /import endpoints were being interpreted as parameter IDs, causing 422 validation errors
- ✅ CSV Export Standardization: Created shared csv_utils.py module for consistent RFC 4180 compliant CSV escaping across all export functionality
- ✅ Test Run CSV Export API: Added dedicated backend endpoint for comprehensive test run CSV export with multi-parameter evaluation breakdown
- ✅ Authentication Token Standardization: Fixed frontend authentication to use access_token consistently across all export operations and API calls
- ✅ Route Collision Prevention: Moved /export and /import routes before parameterized routes (/{parameter_id}) in all parameter management endpoints
- ✅ Business Dashboard Implementation: Comprehensive analytics dashboard with overview metrics, performance trends, and agent breakdown
- ✅ Dashboard Analytics API: Complete backend API with 5 key endpoints for business insights and performance monitoring
- ✅ Project-Filtered Analytics: All dashboard components respect Google Cloud project selection for multi-project environments
- ✅ Performance Metrics: Total tests, average scores, success rates, and trend analysis with time-based filtering
- ✅ Agent Performance Breakdown: Individual agent scoring and test volume analytics with visual comparisons
- ✅ Recent Activity Feed: Real-time test execution tracking with user attribution and timestamp display
- ✅ Parameter Performance Analysis: Detailed breakdown of evaluation parameter effectiveness across test runs
- ✅ Data Scope Indicators: Clear user context display showing personal vs system-wide data access
- ✅ Modern Dashboard UI: Material-UI cards, charts, and responsive layout with dark theme consistency
- ✅ User Permission Integration: Dashboard respects user roles (admin, test_manager, viewer) for appropriate data visibility
- ✅ Webhook Control System: Implemented webhook enable/disable functionality for both Quick Test and Test Runs with default enabled state
- ✅ Dialogflow API Integration: Added QueryParameters.disable_webhook support to DialogflowService with comprehensive backend implementation
- ✅ UI Controls: Added Material-UI Switch components for webhook toggle in both QuickTestPage and CreateTestRunPage
- ✅ Database Schema: Enhanced TestRun model with enable_webhook column and proper migration support
- ✅ Pure Dynamic Evaluation System: Completely eliminated legacy evaluation fields - all scoring is computed from configurable parameters
- ✅ Enhanced CSV Exports: Added comprehensive parameter breakdown exports with unlimited parameters including individual scores, weights, and reasoning
- ✅ Computed Score Display: UI dynamically computes overall scores from parameter weights - backward compatible with legacy data but future-focused
- ✅ Backend Schema Updates: Enhanced API responses with overall_score field and proper parameter data structures
- ✅ Docker Deployment Improvements: Streamlined deployment process with full system prune and health checks
- ✅ Auto-Refresh Fixed: Test runs page now properly auto-refreshes running/pending tests every 5 seconds
- ✅ Agent URL Correction: Fixed agent links to use /locations/global/ instead of /locations/us-central1/
- ✅ Background Polling: Implemented efficient Redux action for status updates without full page refresh
- ✅ API Compatibility: Fixed backend API calls to handle single status filtering properly
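The route ordering fix listed above comes down to first-match routing: FastAPI checks path operations in the order they are registered, so a literal path such as /export has to be declared before a parameterized path such as /{parameter_id}. The sketch below does not use FastAPI; it is a simplified stand-in router written with the standard library to show the effect. The /parameters prefix, the Router class, and the handler names are hypothetical and are not taken from this project's code.

```python
import re


class Router:
    """Hypothetical first-match router; it only mimics registration-order matching."""

    def __init__(self):
        self.routes = []  # list of (compiled pattern, handler)

    def add(self, path, handler):
        # "{name}" segments become named capture groups; other text is literal.
        parts = re.split(r"(\{\w+\})", path)
        regex = "".join(
            r"(?P<%s>[^/]+)" % p[1:-1] if p.startswith("{") else re.escape(p)
            for p in parts
        )
        self.routes.append((re.compile("^" + regex + "$"), handler))

    def dispatch(self, path):
        # The first registered pattern that matches wins.
        for pattern, handler in self.routes:
            match = pattern.match(path)
            if match:
                return handler(match.groupdict())
        return 404, "not found"


def get_parameter(params):
    # Mirrors an int-typed path parameter: non-numeric values are rejected.
    try:
        return 200, int(params["parameter_id"])
    except ValueError:
        return 422, "parameter_id must be an integer"


def export_parameters(params):
    return 200, "csv export"


# Parameterized route registered first: "export" is captured as an ID.
broken = Router()
broken.add("/parameters/{parameter_id}", get_parameter)
broken.add("/parameters/export", export_parameters)

# Literal route registered first: both paths resolve as intended.
fixed = Router()
fixed.add("/parameters/export", export_parameters)
fixed.add("/parameters/{parameter_id}", get_parameter)

print(broken.dispatch("/parameters/export"))  # (422, 'parameter_id must be an integer')
print(fixed.dispatch("/parameters/export"))   # (200, 'csv export')
print(fixed.dispatch("/parameters/7"))        # (200, 7)
```

In the broken ordering the request never reaches the export handler, which matches the 422 validation errors described in the fix.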
graph TB
subgraph "Internet"
USER["User Browser"]
end
subgraph "Google Cloud Platform"
subgraph "Firebase Hosting"
FH["Firebase Hosting<br/>your-app.web.app<br/>(reverse proxy to Cloud Run)"]
end
subgraph "Cloud Run - Public (ingress=all)"
FE["Frontend Service<br/>nginx + React SPA<br/>Port 8080"]
end
subgraph "VPC Network"
VPC_CONN["VPC Connector"]
subgraph "Cloud Run - Internal (ingress=internal)"
BE["Backend Service<br/>FastAPI + Python 3.11<br/>Port 8080"]
end
subgraph "Private Services"
DB[("Cloud SQL PostgreSQL 15")]
end
end
AR["Artifact Registry<br/>Docker Images"]
end
subgraph "External APIs"
DFCX["Dialogflow CX API"]
GEMINI["Google Gemini<br/>LLM Judge"]
end
USER -->|"HTTPS"| FH
FH -->|"Cloud Run rewrite"| FE
USER -.->|"Direct access also works"| FE
FE -->|"nginx /api/* proxy<br/>via VPC Connector<br/>(egress=all-traffic)"| BE
BE -->|"VPC Connector"| DB
BE -->|"HTTPS"| DFCX
BE -->|"HTTPS"| GEMINI
AR -.->|"Image pull"| FE
AR -.->|"Image pull"| BE
style FH fill:#ff9800,color:#000
style FE fill:#2196f3,color:#fff
style BE fill:#9c27b0,color:#fff
style DB fill:#4caf50,color:#fff
style VPC_CONN fill:#607d8b,color:#fff
style AR fill:#795548,color:#fff
Key Security Design:
- The backend is not publicly accessible (ingress=internal): all API traffic flows through the frontend's nginx reverse proxy via the VPC connector
- Both frontend and backend Cloud Run services use the VPC connector (egress=all-traffic) so that frontend→backend traffic is treated as "internal" by Cloud Run
- Firebase Hosting provides a clean URL (*.web.app) and proxies all requests to the Cloud Run frontend
- The DNS resolver inside the VPC-connected frontend container uses 169.254.169.254 (GCE metadata server) since public DNS (8.8.8.8) is unreachable through the VPC connector
- Frontend: React 18 + TypeScript + Material-UI + Redux Toolkit
- Backend: FastAPI + Python 3.11 + SQLAlchemy + Celery
- Database: PostgreSQL 15
- Reverse Proxy: nginx (Cloud Run) + Firebase Hosting (proxy)
- Session Management: In-memory sessions (production), Redis (local development)
- Deployment: Docker + Docker Compose + GCP Cloud Run + Firebase Hosting
Local Development:
Frontend (React) → Port 3000 (nginx proxies /api/* to backend)
Backend (FastAPI) → Port 8000
Database (PostgreSQL) → Port 5432
Cache (Redis) → Port 6379
Production (GCP):
Firebase Hosting → your-app.web.app (proxy to Cloud Run)
Frontend (Cloud Run) → nginx + React SPA, port 8080 (public)
Backend (Cloud Run) → FastAPI, port 8080 (internal only, via VPC)
Database (Cloud SQL) → PostgreSQL 15 (VPC-connected)
The application features an evaluation architecture that eliminates hardcoded scoring fields in favor of a fully dynamic, parameter-driven system.
- Similarity Score (Default weight: 60%) - Semantic similarity between expected and actual responses
- Empathy Level (Default weight: 30%) - Empathetic tone evaluation for customer service contexts
- No-Match Detection (Default weight: 10%) - Validates appropriate "can't help" responses
- ✅ Unlimited Parameters: Add custom evaluation criteria (accuracy, completeness, relevance, etc.)
- ✅ Configurable Weights: Set parameter importance from 0-100%
- ✅ Custom Prompts: Define LLM evaluation instructions for specialized parameters
- ✅ User-Created Parameters: Each user can create organization-specific evaluation criteria
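As a worked example of how the default weights combine, the sketch below computes a weighted average over per-parameter scores. It is a minimal illustration, not the project's backend code: the ParameterScore class and the overall_score function are hypothetical names, the sample scores are made up, and returning None when no weights are present is an assumption about how an empty result could be handled.

```python
from dataclasses import dataclass


@dataclass
class ParameterScore:
    name: str
    score: int        # 0-100, as judged for one evaluation parameter
    weight_used: int  # 0-100, the weight applied to this parameter


def overall_score(scores):
    """Weighted average of parameter scores; None if there is nothing to weigh."""
    total_weight = sum(s.weight_used for s in scores)
    if total_weight == 0:
        return None  # assumption: avoid dividing by zero on empty or zero-weight input
    return sum(s.score * s.weight_used for s in scores) / total_weight


# Default parameters with their default weights (60/30/10) and made-up scores.
scores = [
    ParameterScore("Similarity Score", 80, 60),
    ParameterScore("Empathy Level", 70, 30),
    ParameterScore("No-Match Detection", 100, 10),
]

print(overall_score(scores))  # 79.0
```

Dividing by the sum of weights rather than by 100 keeps the result on a 0-100 scale even when custom parameters make the weights add up to something other than 100.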
-- Legacy (deprecated, nullable)
similarity_score: INTEGER NULL
empathy_score: INTEGER NULL
overall_score: INTEGER NULL
-- New dynamic system (primary)
TestResultParameterScore {
  parameter_id: INTEGER (FK to EvaluationParameter)
  score: INTEGER (0-100)
  weight_used: INTEGER (0-100)
  reasoning: TEXT
}
// Real-time score calculation
const overallScore = parameterScores.reduce((total, ps) =>
  total + (ps.score * ps.weight_used), 0
) / parameterScores.reduce((total, ps) => total + ps.weight_used, 0)
Dialogflow Agent Tester/
├── .agents/                   # AI agent context and handoff docs
├── .github/workflows/         # CI/CD pipeline configuration
├── backend/                   # FastAPI Python backend
│   ├── app/
│   │   ├── api/               # API route handlers
│   │   ├── core/              # Configuration and database
│   │   ├── models/            # SQLAlchemy models and schemas
│   │   ├── services/          # Business logic services
│   │   └── main.py            # FastAPI application entry
│   ├── sql/                   # Database scripts and migrations
│   ├── Dockerfile
│   └── requirements.txt
├── design/                    # Architecture and design documentation
├── docs/                      # User and setup documentation
│   ├── setup/                 # Setup guides (developer, GitHub, auth)
│   ├── guides/                # User guides and tutorials
│   └── README.md              # Documentation index
├── frontend/                  # React TypeScript frontend
│   ├── src/
│   │   ├── components/        # React components
│   │   ├── pages/             # Page components
│   │   ├── store/             # Redux store and slices
│   │   └── App.tsx            # Main React app
│   ├── Dockerfile
│   └── package.json
├── terraform/                 # Infrastructure as Code (GCP)
├── test-data/                 # CSV files for testing
├── docker-compose.yml         # Local development containers
├── PRODUCTION_DEPLOYMENT.md   # Live production infrastructure details
└── README.md                  # This file - project overview
docker-compose ps
# All services
docker-compose logs
# Specific service
docker-compose logs backend
docker-compose logs frontend
# Specific service
docker-compose build backend
docker-compose up -d backend
# All services
docker-compose build
docker-compose up -d
docker exec -it agent-evaluator-db psql -U postgres -d agent_evaluator
- ✅ Backend Unit Tests: 11 comprehensive tests covering CSV utilities and core functionality
- ✅ Frontend Unit Tests: Vitest-based testing for React components and utilities
- ✅ CI/CD Integration: Automated testing on every push to main and pull requests
- ✅ Quality Gates: Tests must pass before deployment to production
Backend Tests:
cd backend
python -m pytest tests/ --no-header -v
Frontend Tests:
cd frontend
npm test
All Tests:
# Run from the project root; the subshells keep the working directory unchanged
# Backend
(cd backend && python -m pytest tests/ --no-header -v)
# Frontend
(cd frontend && npm test)
- Backend: CSV utilities, mock infrastructure, data validation
- Frontend: Basic functionality, component rendering, utility functions
- Integration: API endpoints validated through CI/CD pipeline
- Documentation-only changes: Pipeline skips unnecessary builds (*.md, docs/, design/)
- Code changes: Full test suite runs before deployment
- Pull Requests: Tests run without deployment
- Main branch pushes: Tests run followed by automated deployment
- POST /auth/register - User registration
- POST /auth/login - User login
- GET /auth/me - Get current user
- GET /datasets/ - List datasets
- POST /datasets/ - Create dataset
- POST /datasets/{id}/upload - Upload dataset file
- GET /datasets/{id} - Get dataset details
- GET /test-runs/ - List test runs
- POST /test-runs/ - Create test run
- POST /test-runs/{id}/execute - Execute test run
- GET /test-runs/{id} - Get test run details
- GET /results/ - List test results
- GET /results/test-run/{id} - Get results for test run
- GET /results/{id} - Get specific result
- GET /health - Service health check
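The endpoint list follows a pattern: collection endpoints end with a trailing slash, while item and action endpoints do not. A small URL helper can keep client code consistent with that pattern. The sketch below is illustrative only; the function names are hypothetical, and the /api/v1 prefix is assumed from the OAuth callback URL and the curl example elsewhere in this README rather than confirmed for every endpoint.

```python
API_PREFIX = "/api/v1"  # assumption: same prefix as the OAuth callback URL


def collection_url(resource, base=""):
    """URL for a list/create endpoint; always keeps the trailing slash."""
    return f"{base.rstrip('/')}{API_PREFIX}/{resource.strip('/')}/"


def item_url(resource, item_id, action=None, base=""):
    """URL for a single item, optionally with a sub-action such as 'execute'."""
    url = f"{base.rstrip('/')}{API_PREFIX}/{resource.strip('/')}/{item_id}"
    return f"{url}/{action}" if action else url


print(collection_url("datasets"))                           # /api/v1/datasets/
print(item_url("test-runs", 42, "execute"))                 # /api/v1/test-runs/42/execute
print(collection_url("datasets", "http://localhost:8000"))  # http://localhost:8000/api/v1/datasets/
```

Leaving base empty produces relative URLs, which suits a browser client behind the nginx proxy; passing a base is only useful for scripts that call the backend directly.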
# Database
POSTGRES_SERVER=postgres
POSTGRES_USER=postgres
POSTGRES_PASSWORD=password
POSTGRES_DB=agent_evaluator
# Authentication
SECRET_KEY=your-super-secret-key-change-this-in-production
# Google Cloud (for production)
GOOGLE_CLOUD_PROJECT=your-gcp-project-id
# Redis (local development only - production uses in-memory sessions)
REDIS_URL=redis://redis:6379
Using Docker Compose for local development and testing.
✅ Active CI/CD Pipeline: Complete GitHub Actions workflow with Workload Identity Federation
✅ Infrastructure Deployed: Terraform-managed infrastructure on GCP
✅ Secure Authentication: No service account keys - uses WIF for GitHub Actions
✅ Database Operational: PostgreSQL with auto-generated secure passwords
✅ Redis Removed: Cost optimization - removed Redis cache (~$26/month savings)
✅ OAuth Integration: Google OAuth working with proper redirect URLs
✅ API Endpoints: All frontend API calls use centralized service pattern
- Firebase Hosting: https://your-frontend-url.web.app (proxy to Cloud Run frontend)
- Cloud Run Frontend: nginx + React SPA (ingress=all)
- Cloud Run Backend: FastAPI + Python (ingress=internal, not publicly accessible)
- Cloud SQL PostgreSQL: dialogflow-tester-postgres-dev with backup configuration
- VPC Networking: Private network with VPC connector on both frontend and backend Cloud Run services
- Workload Identity Federation: github-actions-dialogflow@your-gcp-project-id service account
- Multi-Environment: Dev environment operational
- ✅ Backend Security: Backend Cloud Run set to ingress=internal, so it is no longer publicly exposed on ports 80/443
- ✅ Frontend on Cloud Run: Moved frontend from Firebase static hosting to Cloud Run with nginx reverse proxy
- ✅ Firebase Hosting Proxy: Firebase Hosting now proxies to Cloud Run frontend (clean *.web.app URL preserved)
- ✅ VPC Connector on Frontend: Frontend uses VPC connector (egress=all-traffic) so proxy traffic to backend is "internal"
- ✅ Internal DNS Resolution: nginx uses 169.254.169.254 (GCE metadata DNS) since public DNS is unreachable through VPC
- ✅ Redis Removal: Eliminated Redis dependency for cost savings (~$26/month)
- ✅ Session Management: Backend now uses in-memory sessions (suitable for single-instance)
- ✅ OAuth Fixes: Resolved authentication redirects and token management
- ✅ API Consistency: Fixed "Failed to construct 'URL'" errors across frontend
- ✅ Terraform Updates: Infrastructure as code properly maintained and deployed
✅ Fully Operational:
- Project: your-gcp-project-id
- Backend Service: Healthy and responding
- Frontend Application: Deployed and accessible
- Database: Operational with secure connections
- OAuth: Working with Google authentication
graph TB
subgraph "CI/CD Pipeline"
DEV["Developer Push<br/>to main branch"] --> GHA["GitHub Actions"]
GHA --> WIF["Workload Identity Federation"]
WIF --> SA["Service Account"]
end
subgraph "Build"
SA --> BB["Backend Docker Build"]
SA --> FB["Frontend Docker Build"]
BB --> AR["Artifact Registry"]
FB --> AR
end
subgraph "Deploy"
SA --> TF["Terraform Apply"]
TF --> BE_DEPLOY["Cloud Run Backend<br/>(ingress=internal)"]
TF --> FE_DEPLOY["Cloud Run Frontend<br/>(ingress=all)"]
TF --> DB_DEPLOY["Cloud SQL PostgreSQL"]
TF --> VPC_DEPLOY["VPC + Connector"]
SA --> FBH["Firebase Hosting Deploy<br/>(proxy config only)"]
end
subgraph "Live Services"
FBH_LIVE["your-app.web.app"]
FE_LIVE["Cloud Run Frontend (nginx + React)"]
BE_LIVE["Cloud Run Backend (FastAPI)"]
DB_LIVE["Cloud SQL PostgreSQL"]
end
FBH --> FBH_LIVE
FBH_LIVE -->|proxy| FE_LIVE
FE_LIVE -->|nginx /api/* via VPC| BE_LIVE
BE_LIVE -->|VPC| DB_LIVE
style FBH_LIVE fill:#ff9800,color:#000
style FE_LIVE fill:#2196f3,color:#fff
style BE_LIVE fill:#9c27b0,color:#fff
style DB_LIVE fill:#4caf50,color:#fff
Option 1: Automated GitHub Actions (Recommended) ✅ READY
# Repository secrets configured:
# - WIF_PROVIDER
# - WIF_SERVICE_ACCOUNT
# - GCP_PROJECT_ID_DEV
git add . && git commit -m "Deploy infrastructure" && git push
# Monitor deployment: https://github.com/your-org/dialogflow-test-suite/actions
Option 2: Manual Terraform Deployment ✅ AVAILABLE
cd terraform
terraform plan -var-file="terraform.tfvars.dev"
terraform apply -var-file="terraform.tfvars.dev"
- WORKLOAD_IDENTITY_SETUP_COMPLETE.md - GitHub Actions authentication setup
- GCP_ADMIN_SETUP_GUIDE.md - GCP administrator configuration
- .agents/deployment-guide.md - Comprehensive deployment instructions
- GOOGLE_OAUTH_SETUP.md - OAuth application configuration
Development environment configured for <5 users with minimal resource allocation:
- Cloud SQL: db-f1-micro (shared CPU, 0.6GB RAM)
- Cloud Run: Pay-per-request with automatic scaling
- Session Management: In-memory sessions (no external cache required)
- Total Cost Savings: ~$26/month (Redis removal)
# Check Docker Desktop is running
docker-compose down
docker-compose up -d
# Check logs
docker-compose logs
If you encounter errors loading data or "Can't connect to API" messages:
- Check Frontend Container Logs:
  docker-compose logs frontend
- Check Backend Container Logs:
  docker-compose logs backend
- Verify API Endpoints:
  - API should be accessible at: http://localhost:8000
  - API docs should load at: http://localhost:8000/docs
  - Frontend should be at: http://localhost:3000
- Test Internal Container Communication:
  # Test if backend is accessible from frontend container
  docker-compose exec frontend curl http://backend:8000/api/v1/datasets/
- Common API Configuration Issues:
  - ❌ Wrong: Hardcoded http://localhost:8000 in frontend API calls
  - ✅ Correct: Relative URLs (empty baseURL in axios)
  - ❌ Wrong: Missing trailing slash /api/v1/datasets → causes 307 redirects
  - ✅ Correct: Proper trailing slash /api/v1/datasets/ → direct 200 response
# Reset database
docker-compose down -v
docker-compose up -d
Ensure ports 3000, 8000, 5432, and 6379 are available.
The application includes sample data structure for testing. The CSV bulk upload feature provides a dedicated page experience with column mapping capabilities:
CSV Bulk Upload Process:
- Navigate to any dataset's "Manage Questions" page
- Click "Bulk Add Questions" to open the dedicated upload page
- Select "Upload CSV File" mode
- Choose your CSV file and preview the data
- Map your CSV columns to question fields using the interactive interface
- Import questions with proper authentication and error handling
CSV File Format: Upload CSV files with the following format:
question,expected_intent,expected_entities
"What is my balance?",account.balance,"{""account_type"": ""checking""}"
"Transfer $100",money.transfer,"{""amount"": ""100""}"
HTML Content Processing: The application automatically detects HTML content in CSV files and provides intelligent processing options:
- Smart Detection: Analyzes rows to identify HTML tags in your data
- User Choice: Provides options to automatically strip HTML tags while preserving text content
- Safe Processing: Uses BeautifulSoup4 for reliable HTML parsing and tag removal
- Focused Interface: Shows HTML removal options only for selected Question and Answer columns
- Large Dataset Support: Optimized for handling CSV files with thousands of rows efficiently
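The detection and stripping steps described above can be sketched in a few lines of Python. The source states that the application uses BeautifulSoup4. This sketch swaps in the standard-library `html.parser` so that it runs without extra dependencies. The names `contains_html` and `strip_html` are hypothetical, and the tag pattern is a simple heuristic rather than the project's actual detection logic.

```python
import re
from html.parser import HTMLParser

# Heuristic: an opening or closing tag such as <p>, </div>, or <br/>.
TAG_PATTERN = re.compile(r"</?[a-zA-Z][^<>]*>")


class _TextExtractor(HTMLParser):
    """Collects text nodes and ignores the tags around them."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data: str) -> None:
        self.chunks.append(data)


def contains_html(value: str) -> bool:
    """Return True if the cell appears to contain at least one HTML tag."""
    return bool(TAG_PATTERN.search(value))


def strip_html(value: str) -> str:
    """Remove tags, keep the text content, and collapse runs of whitespace."""
    parser = _TextExtractor()
    parser.feed(value)
    parser.close()
    return " ".join("".join(parser.chunks).split())
```

For example, `strip_html("<p>Hello <b>world</b></p>")` returns `"Hello world"`, while a plain cell such as `"Transfer $100"` is not flagged by `contains_html` and can be left untouched.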
- Documentation Index - Complete documentation navigation
- Developer Setup - Complete setup guide for new developers
- GitHub Setup - Repository and CI/CD configuration
- Google Cloud Auth - Authentication configuration
- OAuth Setup - OAuth configuration
- Quick Testing - How to quickly test features
- Scripts Documentation - Automation and utility scripts
- Architecture & Design - System architecture and design documents
- Production Deployment - Live production infrastructure details
- AI Agent Context - Agent handoff and technical context
- Follow the existing code structure and patterns
- Maintain TypeScript types and Python type hints
- Use the Material-UI dark theme for UI consistency
- Test changes locally with Docker before deployment
- Update documentation when adding new features
Need Help? Check the comprehensive documentation in the .agents/ folder for detailed setup and troubleshooting guides.
docker-compose up -d
- Frontend: http://localhost:3000
- Backend API: http://localhost:8000
- API Docs: http://localhost:8000/docs
cd backend
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Set up environment
cp .env.example .env
# Edit .env with your settings
# Initialize database
python app/init_db.py
# Start development server
uvicorn app.main:app --reload --port 8000
cd frontend
# Install dependencies
npm install
# Start development server
npm run dev
app/
├── api/                  # FastAPI route handlers
├── core/                 # Configuration, database, security, migrations
│   ├── migrations.py     # MigrationManager - unified orchestrator
│   └── migration_files/  # Complex migration operations
│       ├── add_quick_add_parameters_table.py
│       └── make_evaluation_model_required.py
├── models/               # SQLAlchemy models and Pydantic schemas
├── services/             # Business logic (Dialogflow, LLM, testing)
└── main.py               # FastAPI application entry point
The application uses a unified migration system orchestrated by MigrationManager in backend/app/core/migration_manager.py:
Architecture:
- Single Entry Point: All migrations run automatically on application startup via MigrationManager.run_migrations()
- Three Migration Types:
  - Column Additions: Inline tuples in MigrationManager for simple ALTER TABLE ADD COLUMN operations
  - Function Handlers: Complex migrations in migration_files/ (CREATE TABLE, constraints, indexes)
  - Data Migrations: Inline SQL lists for UPDATE queries with automatic row count logging
Key Features:
- Idempotent: Safe to run multiple times, automatically skips already-applied changes
- Error Handling: Gracefully handles "already exists", permission errors, missing tables
- Timeout Support: Optional timeout for long-running migrations using threading
- Fresh Deployment: Individual migration files preserved for new environment initialization
- Automatic Execution: Runs on every application startup, no manual intervention needed
Example Migration Patterns:
# Column Addition (inline in MigrationManager.migrations list)
{
    'name': 'add_new_columns',
    'description': 'Add new feature columns',
    'columns': [
        ('table_name', 'column_name', 'TEXT')
    ]
}
# Complex Operation (function handler from migration_files/)
{
    'name': 'create_new_table',
    'description': 'Create table with indexes',
    'type': 'function',
    'handler': create_new_table_handler,
    'timeout': 60
}
# Data Backfill (inline SQL in MigrationManager.migrations list)
{
    'name': 'backfill_field',
    'description': 'Set default values',
    'type': 'data',
    'sql': [
        "UPDATE table_name SET field = 0 WHERE field IS NULL"
    ]
}
Location: backend/app/core/migration_manager.py (orchestrator) + backend/app/core/migration_files/ (complex operations)
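To make the idempotency claim concrete, here is a minimal sketch of how a column-addition entry and a data-backfill entry could be applied safely more than once. This is not MigrationManager's actual code. The functions `apply_column_migration` and `apply_data_migration` are hypothetical, and the sketch uses an in-memory SQLite database so that it is self-contained, whereas the project itself targets PostgreSQL.

```python
import sqlite3


def column_exists(conn, table, column):
    """Check the table's current columns (SQLite-specific PRAGMA)."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def apply_column_migration(conn, migration):
    """Add each (table, column, type) tuple unless the column already exists."""
    added = []
    # Identifiers come from trusted migration definitions, not from user input.
    for table, column, col_type in migration["columns"]:
        if column_exists(conn, table, column):
            continue  # already applied, so skip it
        conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {col_type}")
        added.append(f"{table}.{column}")
    conn.commit()
    return added


def apply_data_migration(conn, migration):
    """Run each UPDATE statement and return the affected row counts."""
    counts = [conn.execute(statement).rowcount for statement in migration["sql"]]
    conn.commit()
    return counts


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE questions (id INTEGER PRIMARY KEY)")
conn.execute("INSERT INTO questions (id) VALUES (1), (2)")

add_priority = {"name": "add_priority", "columns": [("questions", "priority", "TEXT")]}
backfill = {
    "name": "backfill_priority",
    "type": "data",
    "sql": ["UPDATE questions SET priority = 'low' WHERE priority IS NULL"],
}

first_run = apply_column_migration(conn, add_priority)
second_run = apply_column_migration(conn, add_priority)
first_counts = apply_data_migration(conn, backfill)
second_counts = apply_data_migration(conn, backfill)
```

The second run of each migration changes nothing. The column check skips the ALTER TABLE, and the `WHERE priority IS NULL` guard leaves no rows to update. That behavior is what makes running migrations on every startup safe.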
src/
├── components/   # Reusable UI components
├── pages/        # Page components
├── store/        # Redux store and slices
├── services/     # API client and utilities
├── types/        # TypeScript type definitions
└── hooks/        # Custom React hooks
- POST /api/v1/auth/login - User login
- GET /api/v1/auth/me - Get current user
- POST /api/v1/auth/register - Register new user

- GET /api/v1/datasets - List datasets
- POST /api/v1/datasets - Create dataset
- GET /api/v1/datasets/{id} - Get dataset details
- POST /api/v1/datasets/{id}/import - Import questions from file

- GET /api/v1/tests - List test runs
- POST /api/v1/tests - Create and start test run
- GET /api/v1/tests/{id} - Get test run details
- GET /api/v1/tests/{id}/results - Get test results

- GET /api/v1/dialogflow/agents - List available agents
- GET /api/v1/dialogflow/agents/{agent}/flows - List flows
- GET /api/v1/dialogflow/flows/{flow}/pages - List pages
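As an illustration of calling these endpoints from a script, the sketch below builds requests with the Python standard library. Several details are assumptions that the source does not confirm: that the JWT is sent as a `Bearer` token in the `Authorization` header, and that a dataset can be created from a JSON body with a `name` field. `build_request` is a hypothetical helper.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # backend address from the local setup


def build_request(method, path, token=None, payload=None):
    """Build (but do not send) a JSON request for the backend API."""
    headers = {"Accept": "application/json"}
    data = None
    if token is not None:
        headers["Authorization"] = f"Bearer {token}"  # assumed auth scheme
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


# List datasets; the trailing slash avoids the 307 redirect.
list_request = build_request("GET", "/api/v1/datasets/", token="example-token")
# Create a dataset; the payload shape is hypothetical.
create_request = build_request(
    "POST", "/api/v1/datasets/", token="example-token", payload={"name": "Smoke test"}
)
```

Once the backend is running, either request can be sent with `urllib.request.urlopen(list_request)`.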
# Security
SECRET_KEY=your-super-secret-key
# Database
POSTGRES_SERVER=localhost
POSTGRES_USER=postgres
POSTGRES_PASSWORD=password
POSTGRES_DB=agent_evaluator
# Redis (local development only)
REDIS_URL=redis://localhost:6379
# Google Cloud
GOOGLE_CLOUD_PROJECT=your-project-id
# File Upload
UPLOAD_DIR=uploads
MAX_FILE_SIZE=52428800
# CORS
BACKEND_CORS_ORIGINS=http://localhost:3000,http://localhost:5173
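The sketch below shows one way these variables could be read and combined. It is an illustration only: the project's real settings code is not shown in this document, `load_settings` is a hypothetical function, and the PostgreSQL URL layout is an assumed convention.

```python
import os
from urllib.parse import quote


def load_settings(env=None):
    """Derive a few runtime settings from environment-style key/value pairs."""
    env = os.environ if env is None else env
    user = env.get("POSTGRES_USER", "postgres")
    password = quote(env.get("POSTGRES_PASSWORD", ""), safe="")  # escape special characters
    server = env.get("POSTGRES_SERVER", "localhost")
    database = env.get("POSTGRES_DB", "agent_evaluator")
    origins = env.get("BACKEND_CORS_ORIGINS", "")
    return {
        "database_url": f"postgresql://{user}:{password}@{server}/{database}",
        "cors_origins": [o.strip() for o in origins.split(",") if o.strip()],
        "max_file_size_mib": int(env.get("MAX_FILE_SIZE", "52428800")) / (1024 * 1024),
    }


settings = load_settings({
    "POSTGRES_SERVER": "localhost",
    "POSTGRES_USER": "postgres",
    "POSTGRES_PASSWORD": "password",
    "POSTGRES_DB": "agent_evaluator",
    "MAX_FILE_SIZE": "52428800",
    "BACKEND_CORS_ORIGINS": "http://localhost:3000,http://localhost:5173",
})
```

With the example values, `MAX_FILE_SIZE=52428800` works out to 50 MiB, and the CORS setting splits into two origins.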
- Enable APIs:
  - Dialogflow CX API
  - AI Platform API
  - Cloud Storage API (optional)
- Create Service Account:
gcloud iam service-accounts create dialogflow-tester \
  --display-name="Dialogflow Agent Tester"
- Grant Permissions:
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:dialogflow-tester@PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/dialogflow.reader"
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:dialogflow-tester@PROJECT_ID.iam.gserviceaccount.com" \
  --role="roles/aiplatform.user"
- Download Key:
gcloud iam service-accounts keys create service-account.json \
  --iam-account=dialogflow-tester@PROJECT_ID.iam.gserviceaccount.com
The new CSV bulk upload feature provides an intuitive column mapping interface. Your CSV can have any column names - the application will let you map them to the required fields during import:
question,answer,detect_empathy,no_match,priority,tags
"How do I reset my password?","You can reset your password by...",false,false,high,"password,security"
"What is the weather like?","I can't help with weather information",false,true,low,"weather,no-match"
Column Mapping Support:
- Required: Question and Answer columns
- Optional: Empathy detection, No-match flag, Priority level, Tags
- Flexible: Any CSV column names can be mapped during the import process
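The mapping step can be sketched as a function that takes the raw CSV text plus a field-to-column mapping and emits question records. This is an illustrative sketch, not the application's import code. `map_rows`, the set of accepted boolean strings, and the choice to skip rows missing a required field are all assumptions.

```python
import csv
import io

TRUE_VALUES = {"true", "1", "yes"}  # assumed spellings of a true flag


def map_rows(csv_text, mapping):
    """Convert CSV rows to question dicts.

    mapping maps a target field name to the CSV column that supplies it.
    """
    questions = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        question = (row.get(mapping["question"]) or "").strip()
        answer = (row.get(mapping["answer"]) or "").strip()
        if not question or not answer:
            continue  # both required fields must be present
        item = {"question": question, "answer": answer}
        for flag in ("detect_empathy", "no_match"):
            if flag in mapping:
                raw = (row.get(mapping[flag]) or "").strip().lower()
                item[flag] = raw in TRUE_VALUES
        if "priority" in mapping:
            item["priority"] = (row.get(mapping["priority"]) or "").strip()
        if "tags" in mapping:
            raw_tags = row.get(mapping["tags"]) or ""
            item["tags"] = [t.strip() for t in raw_tags.split(",") if t.strip()]
        questions.append(item)
    return questions


sample = (
    "Prompt,Reply,Empathy,Labels\n"
    '"How do I reset my password?","You can reset your password by...",false,"password,security"\n'
    '"",ignored,false,""\n'
)
mapping = {"question": "Prompt", "answer": "Reply", "detect_empathy": "Empathy", "tags": "Labels"}
questions = map_rows(sample, mapping)
```

Here the CSV uses its own column names (`Prompt`, `Reply`, `Empathy`, `Labels`), and the mapping decides which target field each one feeds. The second row is dropped because its question is empty.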
[
{
"question": "How do I reset my password?",
"answer": "You can reset your password by...",
"detect_empathy": false,
"no_match": false,
"priority": "high",
"tags": ["password", "security"]
}
]
cd backend
pytest
cd frontend
npm test
# Start services
docker-compose up -d
# Run integration tests
# (Add your integration test commands here)
- Backend: GET /health
- Database: Automated health checks in Docker Compose
- Session Management: In-memory (production) / Redis monitoring (local development)
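A script that waits for the backend to come up before running tests could poll the health endpoint. The sketch below is an assumption-laden example. It treats any HTTP 200 from `GET /health` as healthy because the response body format is not documented here, and `is_healthy` and `wait_until_healthy` are hypothetical helpers.

```python
import time
import urllib.error
import urllib.request


def is_healthy(url, timeout=2.0):
    """Return True if the URL answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, or an HTTP error status


def wait_until_healthy(check, attempts=10, delay=1.0, sleep=time.sleep):
    """Call check() until it succeeds; return the attempt number, or None."""
    for attempt in range(1, attempts + 1):
        if check():
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None
```

A typical call would be `wait_until_healthy(lambda: is_healthy("http://localhost:8000/health"))` right after `docker-compose up -d`. Because the check is passed in as a callable, the retry logic can be tested without a running server.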
- Application logs: stdout/stderr
- Error tracking: Built-in FastAPI error handling
- Request logging: Configurable via FastAPI middleware
- Change Default Credentials: Update admin password immediately
- Environment Variables: Use secure secret management
- HTTPS: Configure SSL/TLS certificates
- Database Security: Restrict database access
- API Rate Limiting: Implement rate limiting middleware
- CORS: Configure appropriate CORS origins
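For the rate limiting item, the core of such a middleware is a per-client counter over a time window. The following is a minimal in-process sketch, not part of the project. `SlidingWindowLimiter` is a hypothetical class, and wiring it into a FastAPI middleware (for example, returning HTTP 429 when `allow` is False) is left out.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within the last window_seconds."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        self._hits = defaultdict(deque)  # client id -> timestamps of recent requests

    def allow(self, client_id):
        now = self._clock()
        hits = self._hits[client_id]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


# Demonstration with a controllable clock instead of real time.
fake_now = [0.0]
limiter = SlidingWindowLimiter(max_requests=2, window_seconds=60, clock=lambda: fake_now[0])
decisions = [limiter.allow("10.0.0.1") for _ in range(3)]
fake_now[0] = 61.0
after_window = limiter.allow("10.0.0.1")
```

This state lives in a single process, so a deployment with several workers would need a shared store to enforce one global limit.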
- JWT tokens with configurable expiration
- Secure password hashing with bcrypt
- Role-based access control
- Session management: In-memory sessions (production), Redis sessions (local development)
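To illustrate what "JWT tokens with configurable expiration" involves, the sketch below signs and verifies an HS256 token using only the standard library. It is for explanation only. The source does not say which JWT library or algorithm the backend uses, `create_token` and `verify_token` are hypothetical names, and production code should rely on a maintained JWT library rather than this hand-rolled version.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64(data: bytes) -> str:
    """URL-safe base64 without padding, as used by JWT."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _sign(signing_input: str, secret: str) -> str:
    digest = hmac.new(secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256).digest()
    return _b64(digest)


def create_token(subject, secret, expires_in, now=None):
    """Create an HS256 token whose exp claim is expires_in seconds from now."""
    now = time.time() if now is None else now
    header = _b64(json.dumps({"alg": "HS256", "typ": "JWT"}, separators=(",", ":")).encode("utf-8"))
    claims = {"sub": subject, "exp": int(now) + expires_in}
    payload = _b64(json.dumps(claims, separators=(",", ":")).encode("utf-8"))
    return f"{header}.{payload}.{_sign(header + '.' + payload, secret)}"


def verify_token(token, secret, now=None):
    """Return the claims if the signature is valid and the token is unexpired, else None."""
    now = time.time() if now is None else now
    try:
        header, payload, signature = token.split(".")
        expected = _sign(f"{header}.{payload}", secret)
    except (ValueError, UnicodeEncodeError):
        return None  # malformed token
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    claims = json.loads(base64.urlsafe_b64decode(payload + "=" * (-len(payload) % 4)))
    if claims.get("exp", 0) <= now:
        return None
    return claims


token = create_token("admin", "your-super-secret-key", expires_in=1800, now=1000)
valid = verify_token(token, "your-super-secret-key", now=1000)
expired = verify_token(token, "your-super-secret-key", now=2801)
forged = verify_token(token, "wrong-key", now=1000)
```

The expiration is enforced at verification time. The same token is accepted while its `exp` claim lies in the future and rejected once that time has passed or if the signing key does not match.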
- Google Cloud Authentication:
# Verify service account key
gcloud auth activate-service-account --key-file=service-account.json
- Database Connection:
# Check PostgreSQL is running
docker-compose ps postgres
- Redis Connection (Local Development):
# Test Redis connectivity (local environment only)
docker-compose exec redis redis-cli ping
- Frontend Build Issues:
# Clear node modules and reinstall (PowerShell syntax; on macOS/Linux use: rm -rf node_modules package-lock.json)
cd frontend
Remove-Item -Recurse -Force node_modules, package-lock.json
npm install
# View all logs
docker-compose logs -f
# View specific service logs
docker-compose logs -f backend
docker-compose logs -f frontend
We are currently working on improving user experience and preference persistence:
- Flows API Fixes: Resolved 500 errors in Dialogflow flows endpoint
- Batch Size Preferences: Complete persistence implementation for test run batch sizes
- Infinite Loop Fixes: Eliminated useEffect dependency cycles causing re-rendering issues
- Enhanced Debugging: Comprehensive logging system for preference management
- Create feature branch
- Make changes with tests
- Submit pull request
- Backend: Black formatter, Pylint, type hints
- Frontend: ESLint, Prettier, TypeScript strict mode
- Commits: Conventional commit messages