shopRAG: Product Review Chatbot

MIS 547 – Group 4, University of Arizona

Team Members

  • Esai Flores
  • Kyle deGuzman
  • Kyler Nats
  • Pragnya Narasimha
  • Sharanya Neelam

Project Overview

ShopRAG is an AI-powered chatbot that helps customers make informed purchasing decisions by answering questions about products based on real customer reviews. Using Retrieval-Augmented Generation (RAG), the system provides accurate, grounded responses without hallucinating information.

Key Features

  • Real-time Q&A - Answer product questions instantly using customer reviews
  • Semantic Search - Find relevant reviews using vector embeddings
  • Guardrails - Input validation, PII removal, and hallucination detection
  • Scalable - PostgreSQL + pgvector for production-grade vector search
  • Interactive UI - Gradio-based chat interface with product filtering
  • Containerized - Full Docker setup with monitoring (Prometheus + Grafana)
  • Production-Ready - FastAPI backend with health checks and metrics

Live Demo

Deployed on Digital Ocean

  • UI: Gradio Share Link (generated on startup)
  • Database: 30,000 products with 191,849 reviews
  • Category: Cell Phones & Accessories (Amazon Reviews 2023)

Architecture

Docker Containerized Setup

┌─────────────────────────────────────────────────────────────┐
│                  Docker Network: shoprag                     │
│                                                              │
│  ┌──────────┐    ┌──────────┐    ┌───────────┐            │
│  │ Gradio   │───>│ FastAPI  │───>│ Digital   │ (external) │
│  │ Frontend │    │ Backend  │    │ Ocean PG  │            │
│  │  :7860   │    │  :8000   │    │           │            │
│  └──────────┘    └────┬─────┘    └───────────┘            │
│                       │ /metrics                            │
│                  ┌────▼──────┐                              │
│                  │Prometheus │                              │
│                  │  :9090    │                              │
│                  └────┬──────┘                              │
│                  ┌────▼──────┐                              │
│                  │ Grafana   │                              │
│                  │  :3000    │                              │
│                  └───────────┘                              │
│                                                              │
│  ┌──────────────────────┐                                   │
│  │ Data Ingestion       │ (on-demand)                       │
│  └──────────────────────┘                                   │
└─────────────────────────────────────────────────────────────┘

System Components

┌─────────────┐
│   User      │
└──────┬──────┘
       │
       ▼
┌─────────────────────────────────┐
│   Gradio UI (Port 7860)         │
│   - Chat interface              │
│   - Product selection           │
│   - Review count slider         │
└──────────┬──────────────────────┘
           │
           ▼
┌─────────────────────────────────┐
│   RAG Pipeline                  │
│   ┌───────────────────────┐    │
│   │ Input Guardrails      │    │
│   │ - Length validation   │    │
│   │ - Prompt injection    │    │
│   │ - Rate limiting       │    │
│   └───────────┬───────────┘    │
│               ▼                 │
│   ┌───────────────────────┐    │
│   │ Embedder              │    │
│   │ BAAI/bge-small-en-v1.5│    │
│   │ 384-dim vectors       │    │
│   └───────────┬───────────┘    │
│               ▼                 │
│   ┌───────────────────────┐    │
│   │ Retriever (pgvector)  │    │
│   │ - Cosine similarity   │    │
│   │ - Quality filters     │    │
│   │ - Top-k results       │    │
│   └───────────┬───────────┘    │
│               ▼                 │
│   ┌───────────────────────┐    │
│   │ LLM Client            │    │
│   │ - PII removal         │    │
│   │ - Response generation │    │
│   │ - Hallucination check │    │
│   └───────────────────────┘    │
└─────────────────────────────────┘
           │
           ▼
┌─────────────────────────────────┐
│   PostgreSQL + pgvector         │
│   (Digital Ocean Managed DB)    │
│   - products table (30k)        │
│   - reviews table (191k+)       │
│   - Vector similarity index     │
└─────────────────────────────────┘
           │
           ▼
┌─────────────────────────────────┐
│   OpenAI API                    │
│   Model: gpt-4o-mini            │
│   - Natural language generation │
└─────────────────────────────────┘

Tech Stack

Backend

  • Python 3.10 - Core application
  • PostgreSQL + pgvector - Vector database for semantic search
  • psycopg2 - Database adapter
  • sentence-transformers - Embedding generation
  • OpenAI API - LLM for response generation

Frontend

  • Gradio - Interactive web UI
  • Markdown - Response formatting

Infrastructure

  • Docker & Docker Compose - Containerized deployment
  • Digital Ocean Droplet - Application hosting
  • Digital Ocean Managed PostgreSQL - Database hosting
  • Prometheus & Grafana - Monitoring and metrics
  • uv - Fast Python package manager

MLOps

  • Git/GitHub - Version control
  • Datasets (HuggingFace) - Data loading
  • dotenv - Environment management

Dataset

Amazon Reviews 2023 (McAuley Lab)

Data Schema

Products Table:

- asin (TEXT PRIMARY KEY)
- title (TEXT)
- main_category (VARCHAR(255))
- average_rating (REAL)
- rating_number (INTEGER)
- price (TEXT)
- features (TEXT)  -- JSON array
- description (TEXT)
- store (VARCHAR(255))

Reviews Table:

- id (SERIAL PRIMARY KEY)
- asin (TEXT)  -- Foreign key to products
- product_name (TEXT)
- category (TEXT)
- review_rating (REAL)
- verified_purchase (BOOLEAN)
- helpful_vote (INTEGER)
- timestamp (BIGINT)
- review_text (TEXT)
- embedding (VECTOR(384))  -- pgvector

Data Pipeline

1. Ingestion

uv run python backend/scripts/ingest_reviews_postgres.py

Process:

  1. Load product metadata from HuggingFace
  2. Filter to the 30,000 most recent products
  3. Load reviews for those products
  4. Filter low-quality reviews (length < 10 chars)
  5. Combine review with product context
  6. Generate embeddings (batch size: 512)
  7. Store in PostgreSQL with pgvector

Performance:

  • Products/hour: ~15,000
  • Reviews/hour: ~80,000
  • Embedding batch: 512 reviews at once
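Step 6 embeds reviews in batches of 512 before each database round trip. A minimal sketch of that chunking step (the helper name is an assumption; the real logic lives in backend/scripts/ingest_reviews_postgres.py alongside sentence-transformers and psycopg2):

```python
def batched(items, batch_size=512):
    """Yield fixed-size chunks of a list, e.g. reviews awaiting embedding."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

# Illustrative usage: each chunk would be passed to the embedder, then
# written to PostgreSQL in a single bulk insert.
for batch in batched(["review one", "review two", "review three"], batch_size=2):
    pass  # embed(batch); insert(batch)
```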

2. Preprocessing

Quality Filters:

  • Minimum review length: 30 characters
  • Valid ratings only (rating > 0)
  • Remove duplicate reviews

Text Processing:

combined_text = f"""
Product: {product_name}
Category: {category}
Rating: {rating}/5

Review: {review_text}
"""

PII Removal:

  • Emails → [EMAIL]
  • Phone numbers → [PHONE]
  • URLs → [URL]
  • Credit cards → [CARD]
  • SSNs → [SSN]
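This sanitization step can be sketched as regex substitution. The patterns below are illustrative stand-ins, not necessarily the exact ones used in backend/utils/text_processor.py:

```python
import re

# Illustrative PII patterns; the production regexes may differ.
PII_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
    (re.compile(r"\(?\d{3}\)?[-.\s]\d{3}[-.\s]\d{4}\b"), "[PHONE]"),
    (re.compile(r"https?://\S+"), "[URL]"),
    (re.compile(r"\b\d(?:[ -]?\d){12,15}\b"), "[CARD]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
]

def remove_pii(text: str) -> str:
    """Replace each detected PII span with its placeholder token."""
    for pattern, token in PII_PATTERNS:
        text = pattern.sub(token, text)
    return text
```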

RAG System

Retrieval-Augmented Generation (RAG)

Why RAG?

  • Grounded in real customer reviews
  • No model fine-tuning required
  • Easy to update with new data
  • Reduced hallucinations
  • Cost-effective (no GPU training)

Query Flow

  1. Input Validation

    • Check query length (3-500 chars)
    • Detect prompt injection attempts
    • Rate limit: 20 requests/minute
  2. Embedding

    • Convert query to 384-dim vector
    • Model: BAAI/bge-small-en-v1.5
  3. Retrieval

    • Search PostgreSQL using cosine similarity
    • Apply quality filters
    • Return top-k reviews (default: 5)
  4. Generation

    • Build context from product + reviews
    • Remove PII from review text
    • Generate response using OpenAI GPT-4o-mini
    • Check for hallucinations (word overlap)
  5. Response

    • Return concise answer (2-3 sentences)
    • Show number of reviews used
    • Display product information
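Steps 2-3 boil down to ranking stored review vectors by cosine similarity to the query vector, which pgvector's <=> operator computes server-side as cosine distance. A tiny pure-Python illustration of the same ranking (2-dim vectors standing in for the 384-dim embeddings):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top_k_by_cosine(query_vec, review_vecs, k=5):
    """Return indices of the k reviews closest to the query."""
    ranked = sorted(range(len(review_vecs)),
                    key=lambda i: cosine(query_vec, review_vecs[i]),
                    reverse=True)
    return ranked[:k]
```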

Guardrails

1. Input Guardrails

- Length: 3-500 characters
- Prompt injection detection
- Rate limiting (20/min per user)

2. Retrieval Quality

WHERE LENGTH(review_text) >= 30
ORDER BY embedding <=> query_vector
LIMIT 5

3. PII Removal

All review text is sanitized before sending to LLM.

4. Hallucination Detection

# Lightweight word-overlap check: flag responses that share too few
# words with the retrieved review text
response_words = set(response.lower().split())
review_words = set(review_text.lower().split())
overlap_ratio = len(response_words & review_words) / len(response_words)

if overlap_ratio < 0.3:
    log("[HALLUCINATION WARNING]")

5. System Prompt Engineering

CRITICAL RULES:
1. ONLY use information directly stated in the provided context
2. DO NOT make assumptions or add information not in the reviews
3. Keep responses short (2-3 sentences maximum)
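One plausible way these rules reach the model is as a system message assembled together with the retrieved reviews. The sketch below is an assumption about the prompt assembly, not the exact code in backend/services/llm_client.py:

```python
# Hypothetical prompt-assembly helper; names and exact wording are assumed.
SYSTEM_PROMPT = """You answer product questions using ONLY the reviews provided.
CRITICAL RULES:
1. ONLY use information directly stated in the provided context
2. DO NOT make assumptions or add information not in the reviews
3. Keep responses short (2-3 sentences maximum)"""

def build_messages(question: str, reviews: list[str]) -> list[dict]:
    """Assemble chat messages from the system prompt and retrieved reviews."""
    context = "\n\n".join(f"Review {i + 1}: {r}" for i, r in enumerate(reviews))
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]
```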

Installation

Recommended: Use Docker deployment for the quickest setup with monitoring included.

For local development without Docker, follow the steps below.

Prerequisites

  • Python 3.10+
  • PostgreSQL with pgvector extension
  • OpenAI API key
  • uv package manager (recommended)

Local Setup

# 1. Clone repository
git clone https://github.com/Praagnya/shopRAG.git
cd shopRAG

# 2. Install uv (if not installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# 3. Install dependencies
uv sync

# 4. Create .env file
cp .env.example .env
nano .env  # Add your API keys

# 5. Run data ingestion (optional - data already in DB)
uv run python backend/scripts/ingest_reviews_postgres.py

# 6. Start the UI
uv run python frontend/gradio_app.py

Environment Variables

# .env
DATABASE_URL=postgresql://user:pass@host:port/dbname
OPENAI_API_KEY=your-openai-api-key
OPENAI_MODEL=gpt-4o-mini

Usage

Start the Application

cd shopRAG
uv run python frontend/gradio_app.py

Access at: http://localhost:7860 or the Gradio share link.

Example Queries

Product-Specific (with ASIN):

ASIN: B07SKQZSN6
Query: "Is it durable?"

Global Search (no ASIN):

ASIN: [blank]
Query: "What are the best iPhone cases under $20?"

API Usage (Optional)

If FastAPI layer is added:

curl -X POST http://localhost:8000/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "How is the battery life?",
    "product_asin": "B07SKQZSN6",
    "top_k": 5
  }'

Project Structure

shopRAG/
├── backend/
│   ├── api/
│   │   ├── main.py              # FastAPI application
│   │   └── routes/
│   │       └── rag.py           # RAG endpoints
│   ├── config/
│   │   └── settings.py          # Configuration
│   ├── scripts/
│   │   └── ingest_reviews_postgres.py  # Data ingestion
│   ├── services/
│   │   ├── embedder.py          # Embedding generation
│   │   ├── retriever_postgres.py # Vector search
│   │   ├── llm_client.py        # OpenAI integration
│   │   ├── rag_pipeline.py      # RAG orchestration
│   │   └── guardrails.py        # Input validation
│   └── utils/
│       └── text_processor.py    # Text utilities
├── frontend/
│   └── gradio_app.py            # Gradio UI
├── data/
│   └── product_cache.json       # Product metadata
├── monitoring/
│   ├── prometheus.yml           # Prometheus config
│   └── grafana/
│       ├── provisioning/        # Grafana datasources
│       └── dashboards/          # Pre-built dashboards
├── scripts/
│   └── docker-dev.sh            # Docker helper script
├── Dockerfile.backend           # Backend container
├── Dockerfile.frontend          # Frontend container
├── Dockerfile.ingest            # Ingestion container
├── docker-compose.yml           # Service orchestration
├── docker-compose.override.yml  # Development overrides
├── Makefile                     # Quick commands
├── .dockerignore                # Docker build exclusions
├── .env.docker                  # Environment template
├── .env                         # Environment variables
├── pyproject.toml               # Dependencies
├── uv.lock                      # Lock file
├── README.md                    # This file
└── DOCKER_README.md             # Docker setup guide

Deployment

Digital Ocean Droplet

Current Setup:

  • Droplet: 4 GB RAM, 2 vCPUs
  • Database: Managed PostgreSQL with pgvector
  • Access: SSH + Gradio share link

Deploy Steps:

# 1. SSH to droplet
ssh root@<droplet-ip>

# 2. Clone and setup
cd /root
git clone https://github.com/Praagnya/shopRAG.git
cd shopRAG

# 3. Install dependencies
curl -LsSf https://astral.sh/uv/install.sh | sh
uv sync

# 4. Configure environment
cp .env.example .env
nano .env  # Add DATABASE_URL and OPENAI_API_KEY

# 5. Run application
uv run python frontend/gradio_app.py

Docker Deployment (Recommended)

Full containerized setup with monitoring stack

Prerequisites

  • Docker Desktop (or Docker Engine + Docker Compose)
  • Digital Ocean PostgreSQL database
  • OpenAI API key
  • At least 8GB RAM allocated to Docker

Quick Start

# 1. Configure environment
cp .env.docker .env
nano .env  # Add OPENAI_API_KEY and DATABASE_URL

# 2. Build and start all services
make build
make up

# 3. Access applications
# - Frontend (Gradio UI): http://localhost:7860
# - Backend API: http://localhost:8000
# - API Documentation: http://localhost:8000/docs
# - Prometheus: http://localhost:9090
# - Grafana: http://localhost:3000 (admin/admin)

# 4. Run data ingestion (optional)
make ingest

Docker Services

The Docker Compose setup includes:

  • Backend (FastAPI) - Port 8000
  • Frontend (Gradio) - Port 7860
  • Prometheus - Metrics collection on port 9090
  • Grafana - Dashboards on port 3000
  • Ingest - On-demand data ingestion job

Common Commands

make build       # Build all Docker images
make up          # Start all services
make down        # Stop all services
make logs        # View all logs
make health      # Check service health
make restart     # Restart all services
make clean       # Remove containers and volumes

Alternative: Using Docker Compose Directly

# Build and start
docker-compose build
docker-compose up -d

# View logs
docker-compose logs -f

# Stop services
docker-compose down

# Run data ingestion
docker-compose --profile ingest run --rm ingest

For detailed Docker setup instructions, see DOCKER_README.md.


Performance Metrics

Current Stats

  • Total Products: 30,000
  • Total Reviews: 191,849
  • Embedding Dimension: 384
  • Average Query Time: ~2-3 seconds
  • Database Size: ~500 MB

Query Performance

Embedding: ~50ms
Retrieval: ~100-200ms
LLM Generation: ~1-2s
Total: ~2-3s per query

Monitoring

Prometheus Metrics (Docker Setup)

Access Prometheus at http://localhost:9090

Available metrics:

  • rag_queries_total - Total RAG queries
  • rag_llm_calls_total - Total LLM API calls
  • rag_pipeline_latency_ms - RAG pipeline latency
  • rag_embedding_latency_ms - Embedding latency
  • rag_retrieval_latency_ms - Retrieval latency
  • rag_llm_latency_ms - LLM latency
  • rag_errors_total - Total errors
  • rag_guardrail_failures_total - Guardrail rejections
  • rag_active_requests - Active requests
  • llm_tokens_used_total - Token usage

Grafana Dashboards (Docker Setup)

  1. Access Grafana at http://localhost:3000
  2. Login with admin / admin
  3. Navigate to "shopRAG Overview Dashboard"

Dashboard shows:

  • Request rate and latency trends
  • LLM call rate and error rate
  • Active requests and products loaded
  • Component-level latency breakdown
  • Token usage tracking

Application Logs

Docker Deployment:

# View all logs
make logs

# Backend logs only
make logs-backend

# Frontend logs only
make logs-frontend

# Follow logs in real-time
docker-compose logs -f

Local Deployment:

# Application logs
tail -f /var/log/gradio.log

# Ingestion logs
tail -f /var/log/shoprag_ingest.log

# System logs
grep CRON /var/log/syslog

Database Stats

-- Review count
SELECT COUNT(*) FROM reviews;

-- Products with reviews
SELECT COUNT(DISTINCT asin) FROM reviews;

-- Average reviews per product
SELECT AVG(review_count) FROM (
    SELECT COUNT(*) as review_count
    FROM reviews
    GROUP BY asin
) subq;

Automation

Cron Jobs (Weekly Updates)

# Add to crontab
crontab -e

# Run ingestion every Sunday at 2 AM
0 2 * * 0 /root/shopRAG/scripts/auto_ingest.sh

Ingestion Script

#!/bin/bash
cd /root/shopRAG
git pull
uv run python backend/scripts/ingest_reviews_postgres.py
pkill -f gradio_app.py
nohup uv run python frontend/gradio_app.py > /var/log/gradio.log 2>&1 &

Testing

Run Tests

# Unit tests
uv run pytest tests/

# Integration tests
uv run pytest tests/integration/

# Test specific component
uv run pytest tests/test_retriever.py

Manual Testing

# Test RAG pipeline
from backend.services.rag_pipeline import get_rag_pipeline

rag = get_rag_pipeline()
result = rag.query("Is it durable?", product_asin="B07SKQZSN6")
print(result['response'])

Troubleshooting

Common Issues

1. Database Connection Timeout

# Add droplet IP to PostgreSQL trusted sources
# Digital Ocean → Databases → Settings → Trusted Sources

2. Empty Responses

# Check OpenAI model name
cat backend/config/settings.py | grep OPENAI_MODEL
# Should be: gpt-4o-mini (not gpt-5-mini)

3. No Reviews Retrieved

# Check if reviews exist for ASIN
uv run python -c "
import psycopg2, os
from dotenv import load_dotenv
load_dotenv()
conn = psycopg2.connect(os.getenv('DATABASE_URL'))
cur = conn.cursor()
cur.execute('SELECT COUNT(*) FROM reviews WHERE asin=%s', ('B07SKQZSN6',))
print(f'Reviews: {cur.fetchone()[0]}')
"

4. Port Already in Use

# Kill existing Gradio process
lsof -ti:7860 | xargs kill -9

Future Enhancements

Completed ✓

  • Docker containerization with monitoring
  • Prometheus metrics collection
  • Grafana dashboards
  • FastAPI backend with health checks
  • Production-ready deployment setup

Planned Features

  • Conversation history/memory
  • Multi-turn dialogue support
  • User authentication
  • Product recommendations
  • Sentiment analysis dashboard
  • A/B testing for prompts
  • Elasticsearch for hybrid search
  • Real-time review streaming

Scaling

  • Horizontal scaling with load balancer
  • Redis caching for frequent queries
  • Kubernetes deployment
  • CI/CD pipeline with automated testing
  • Auto-scaling based on metrics

Contributing

This is an academic project. For questions or collaboration:

  1. Create an issue on GitHub
  2. Submit a pull request
  3. Contact team members

References

  • Dataset: McAuley-Lab. (2023). Amazon Reviews 2023. HuggingFace.
  • RAG: Lewis et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. arXiv.
  • pgvector: Ankane. pgvector: Open-source vector similarity search for Postgres.
  • Embeddings: BAAI. BGE: BAAI General Embedding.

License

Academic project developed as part of MIS 547 – University of Arizona coursework.


Contact

Team Members:

  • Esai Flores
  • Kyle deGuzman
  • Kyler Nats
  • Pragnya Narasimha
  • Sharanya Neelam

Course: MIS 547 – Cloud Computing
Institution: University of Arizona
Semester: Fall 2024


Last Updated: December 2025
