RagXGen: Retrieval-Augmented Generation Explained by Building It


A complete, production-quality implementation of Retrieval-Augmented Generation

Live Demo • Architecture • Quick Start • Documentation


🎯 What is RagXGen?

RagXGen is a portfolio project that demonstrates deep understanding of Retrieval-Augmented Generation by:

  1. Explaining it: Clear, engineer-friendly documentation of how RAG works
  2. Visualizing it: Interactive pipeline diagrams showing each step
  3. Implementing it: Real, working code with production patterns
  4. Evaluating it: Honest discussion of trade-offs and failure modes

This isn't a mock demo. It's a fully functional RAG system you can test with your own documents.


✨ Features

  • 📄 Document Upload: Upload PDFs and TXT files for processing
  • 🔍 Semantic Search: FAISS-powered vector similarity search
  • 💬 Chat Interface: Natural language Q&A with your documents
  • 📊 Transparency: See retrieved chunks, similarity scores, and sources
  • ⚙️ Configurable: Adjust chunk size, overlap, and top-K retrieval
  • 🎨 Modern UI: Dark mode, smooth animations, responsive design

๐Ÿ— Architecture

┌──────────────────────────────────────────────────────────────────┐
│                        FRONTEND (Next.js)                        │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────────────┐  │
│  │ Landing  │  │ What is  │  │Architect-│  │    Live Demo     │  │
│  │  Page    │  │   RAG?   │  │   ure    │  │  (Chat + Upload) │  │
│  └──────────┘  └──────────┘  └──────────┘  └──────────────────┘  │
└───────────────────────────────┬──────────────────────────────────┘
                                │ REST API
┌───────────────────────────────┴──────────────────────────────────┐
│                        BACKEND (FastAPI)                         │
│  ┌────────────────────────────────────────────────────────────┐  │
│  │                        RAG Pipeline                        │  │
│  │  ┌─────────┐  ┌────────────┐  ┌─────────┐  ┌────────────┐  │  │
│  │  │ Document│→ │  Chunking  │→ │Embedding│→ │FAISS Store │  │  │
│  │  │ Upload  │  │(1000 char) │  │(OpenAI) │  │(Similarity)│  │  │
│  │  └─────────┘  └────────────┘  └─────────┘  └────────────┘  │  │
│  │                                                            │  │
│  │  ┌─────────┐  ┌────────────┐  ┌────────┐  ┌─────────────┐  │  │
│  │  │  Query  │→ │  Retrieve  │→ │ Augment│→ │  Generate   │  │  │
│  │  │  Input  │  │  Top-K (4) │  │ Prompt │  │(GPT-4o-mini)│  │  │
│  │  └─────────┘  └────────────┘  └────────┘  └─────────────┘  │  │
│  └────────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────────┘

🚀 Quick Start

Prerequisites

  • Node.js 18+
  • Python 3.10+
  • OpenAI API Key

1. Clone the Repository

git clone https://github.com/yourusername/ragxgen.git
cd ragxgen

2. Set Up Backend

# Navigate to backend
cd backend

# Create virtual environment
python -m venv venv

# Activate virtual environment
# Windows:
.\venv\Scripts\activate
# macOS/Linux:
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# Create .env file
cp .env.example .env

# Edit .env and add your OpenAI API key
# OPENAI_API_KEY=sk-your-key-here

# Start the backend server
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

The backend will be available at http://localhost:8000, with interactive API documentation at http://localhost:8000/docs.

3. Set Up Frontend

# Open new terminal and navigate to frontend
cd frontend

# Install dependencies
npm install

# Create env file
cp .env.example .env.local

# Start the development server
npm run dev

The frontend will be available at http://localhost:3000.


๐Ÿ“ Project Structure

ragxgen/
├── backend/                      # Python FastAPI Backend
│   ├── app/
│   │   ├── __init__.py
│   │   ├── main.py               # FastAPI application entry
│   │   ├── config.py             # Configuration management
│   │   ├── models/
│   │   │   ├── __init__.py
│   │   │   └── schemas.py        # Pydantic models
│   │   ├── routers/
│   │   │   ├── __init__.py
│   │   │   ├── documents.py      # Document upload endpoints
│   │   │   └── query.py          # RAG query endpoint
│   │   └── services/
│   │       ├── __init__.py
│   │       ├── embeddings.py     # OpenAI embeddings service
│   │       ├── vector_store.py   # FAISS vector store
│   │       └── rag_pipeline.py   # Core RAG implementation
│   ├── requirements.txt
│   └── .env.example
│
├── frontend/                     # Next.js Frontend
│   ├── src/
│   │   ├── app/
│   │   │   ├── layout.tsx        # Root layout
│   │   │   ├── page.tsx          # Landing page
│   │   │   ├── globals.css       # Global styles
│   │   │   ├── what-is-rag/
│   │   │   │   └── page.tsx      # What is RAG explanation
│   │   │   ├── architecture/
│   │   │   │   └── page.tsx      # Interactive architecture
│   │   │   ├── demo/
│   │   │   │   └── page.tsx      # Live RAG demo
│   │   │   ├── evaluation/
│   │   │   │   └── page.tsx      # Trade-offs & failure cases
│   │   │   └── about/
│   │   │       └── page.tsx      # Case study
│   │   ├── components/
│   │   │   └── layout/
│   │   │       ├── Navigation.tsx
│   │   │       └── Footer.tsx
│   │   └── lib/
│   │       ├── api.ts            # API client
│   │       └── utils.ts          # Utility functions
│   ├── package.json
│   ├── tailwind.config.ts
│   └── tsconfig.json
│
└── README.md

🔧 Configuration

Backend Environment Variables

Variable         Description                     Default
OPENAI_API_KEY   Your OpenAI API key             (required)
CHUNK_SIZE       Characters per document chunk   1000
CHUNK_OVERLAP    Overlap between chunks          200
TOP_K            Default chunks to retrieve      4
MODEL_NAME       LLM model for generation        gpt-4o-mini
EMBEDDING_MODEL  Embedding model                 text-embedding-3-small
CORS_ORIGINS     Allowed CORS origins            http://localhost:3000
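Put together, a backend/.env built from the variables above might look like this (the API key is a placeholder):

```
# backend/.env
OPENAI_API_KEY=sk-your-key-here
CHUNK_SIZE=1000
CHUNK_OVERLAP=200
TOP_K=4
MODEL_NAME=gpt-4o-mini
EMBEDDING_MODEL=text-embedding-3-small
CORS_ORIGINS=http://localhost:3000
```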

Frontend Environment Variables

Variable             Description      Default
NEXT_PUBLIC_API_URL  Backend API URL  http://localhost:8000

📚 API Endpoints

Documents

Method  Endpoint                    Description
POST    /documents/upload           Upload and process a document
GET     /documents/session/{id}     Get session information
DELETE  /documents/session/{id}     Delete a session
POST    /documents/session/create   Create new session

Query

Method  Endpoint       Description
POST    /query/        Execute RAG query
GET     /query/config  Get RAG configuration

Health

Method  Endpoint  Description
GET     /health   Health check
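As a sketch of calling the query endpoint from Python: the field names session_id, question, and top_k are assumptions about the request schema (check the generated docs at http://localhost:8000/docs for the real one). The request is constructed but not sent, since sending requires a running backend:

```python
import json
import urllib.request

def build_query_request(api_url: str, session_id: str, question: str, top_k: int = 4):
    """Construct (but do not send) a POST request for the /query/ endpoint.

    The payload field names are illustrative; the authoritative schema
    lives in the backend's Pydantic models.
    """
    payload = {"session_id": session_id, "question": question, "top_k": top_k}
    return urllib.request.Request(
        f"{api_url}/query/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_query_request("http://localhost:8000", "demo-session", "What is RAG?")
# Send with urllib.request.urlopen(req) once the backend is running.
```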

🧠 How RAG Works

The Problem

Traditional LLMs have three key limitations:

  1. Hallucinations: They confidently make up information
  2. Stale Knowledge: Training cutoff means outdated information
  3. No Source Attribution: Can't verify where answers come from

The Solution

RAG addresses these by retrieving relevant context from your documents:

User Question
     ↓
Embed question → Search vector store → Retrieve top-K chunks
     ↓
Inject chunks into prompt → Generate grounded answer
     ↓
Answer with sources
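The "inject chunks into prompt" step is mostly string templating: retrieved chunks are placed ahead of the question so the model answers from them. The exact template in rag_pipeline.py isn't reproduced here, so this wording is illustrative:

```python
def build_augmented_prompt(question: str, chunks: list[str]) -> str:
    """Inject retrieved chunks into the prompt so the answer is grounded.

    Numbering the chunks ([1], [2], ...) lets the model (and the UI)
    refer back to sources.
    """
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using ONLY the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_augmented_prompt(
    "What does RAG stand for?",
    ["RAG stands for Retrieval-Augmented Generation."],
)
```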

Key Components

Component          Purpose                            This Project
Document Chunking  Split docs into searchable pieces  RecursiveCharacterTextSplitter
Embeddings         Convert text to vectors            OpenAI text-embedding-3-small
Vector Store       Store and search vectors           FAISS (in-memory)
Retrieval          Find relevant chunks               Cosine similarity, Top-K
Generation         Produce final answer               GPT-4o-mini
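To make the retrieval row concrete, here is a toy version of embed + cosine-similarity top-K. Word-count vectors stand in for real embeddings (the project uses text-embedding-3-small and FAISS; this sketch only shows the ranking math):

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    """Toy embedding: bag-of-words counts (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], top_k: int = 4) -> list[tuple[float, str]]:
    """Rank chunks by similarity to the query and keep the top_k."""
    q = embed(query)
    scored = sorted(((cosine(q, embed(c)), c) for c in chunks), reverse=True)
    return scored[:top_k]

chunks = [
    "FAISS is a library for fast similarity search.",
    "Next.js renders the frontend.",
    "Vector similarity search finds semantically close chunks.",
]
hits = retrieve("how does similarity search work", chunks, top_k=2)
```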

โš–๏ธ Trade-offs & Limitations

Chunk Size

  • Smaller (200-500): Higher precision, less context per chunk
  • Larger (1000-2000): More context, may include irrelevant info
  • Recommendation: Start with 1000, adjust based on results
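The size/overlap mechanics behind these numbers can be shown with a simplified splitter. The project itself uses LangChain's RecursiveCharacterTextSplitter, which additionally prefers to break at separators such as paragraph boundaries; this sketch only demonstrates the sliding window:

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks; each chunk repeats the last
    `overlap` characters of the previous one, so sentences at a seam
    still appear whole in at least one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i : i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

text = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(text, chunk_size=1000, overlap=200)
```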

Top-K Retrieval

  • Lower (1-2): Focused, but may miss relevant info
  • Higher (6-10): Comprehensive, but noisier
  • Recommendation: Default to 4, adjust per use case

Known Limitations

  1. Session data is stored in memory (lost on restart)
  2. No persistent storage of vector indices
  3. Limited to PDF and TXT files
  4. Single-turn conversations (no memory)

🔮 Production Improvements

If deploying to production, consider:

Retrieval Quality

  • Add reranking with cross-encoder
  • Implement hybrid search (semantic + keyword)
  • Use HyDE for better retrieval
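One common way to implement the hybrid-search suggestion is reciprocal rank fusion (RRF), which merges a semantic ranking and a keyword ranking by rank position rather than by raw score. A minimal sketch (the chunk ids and the conventional k=60 constant are illustrative):

```python
from collections import defaultdict

def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of chunk ids with reciprocal rank fusion:
    each chunk scores sum(1 / (k + rank)) over the lists it appears in,
    so items ranked high by multiple retrievers rise to the top."""
    scores: defaultdict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

semantic = ["c3", "c1", "c2"]  # e.g. from vector search
keyword = ["c1", "c4", "c3"]   # e.g. from a keyword index
fused = rrf_fuse([semantic, keyword])
```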

Scalability

  • Use managed vector DB (Pinecone, Weaviate)
  • Add Redis caching
  • Implement connection pooling

Reliability

  • Add comprehensive error handling
  • Implement retry with exponential backoff
  • Set up monitoring and alerting
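The retry-with-exponential-backoff suggestion can be sketched as a decorator (delays are kept tiny here for illustration; real values would be on the order of seconds):

```python
import time
from functools import wraps

def retry(attempts: int = 3, base_delay: float = 0.01):
    """Retry a flaky call, sleeping base_delay * 2**attempt between tries
    and re-raising after the final attempt fails."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            for attempt in range(attempts):
                try:
                    return fn(*args, **kwargs)
                except Exception:
                    if attempt == attempts - 1:
                        raise
                    time.sleep(base_delay * (2 ** attempt))
        return wrapper
    return decorator

calls = {"n": 0}

@retry(attempts=3)
def flaky():
    """Simulated transient failure: succeeds on the third call."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

result = flaky()
```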

User Experience

  • Stream responses for faster perceived latency
  • Add conversation history
  • Support more file formats

🛠 Technology Choices

Why LangChain?

  • Mature ecosystem with good documentation
  • Built-in text splitters optimized for RAG
  • Easy integration with various LLMs and vector stores

Why FAISS?

  • No external dependencies (runs locally)
  • Fast similarity search
  • Good enough for demo/prototype scale

Why FastAPI?

  • High performance async Python
  • Automatic OpenAPI documentation
  • Excellent Pydantic integration

Why Next.js App Router?

  • Modern React patterns (Server Components)
  • Built-in routing and layouts
  • Great developer experience

📄 License

This project is for educational and portfolio purposes.


๐Ÿค Connect

Built by a software engineer passionate about AI/ML systems.

โญ Star this repo if you found it helpful!
