
LLM-Powered Intelligent Query-Retrieval System


A compact Retrieval-Augmented Generation (RAG) platform that processes PDFs, builds semantic vector indexes, and answers user queries using LLMs. Built for the Bajaj HackRX hackathon.

Quick Overview

  • Processes PDF documents and extracts meaningful text
  • Converts text chunks to embeddings and indexes with FAISS
  • Answers queries using context retrieved from the index and an LLM (Gemini)
  • Production features: bearer-token auth, caching, health endpoints, and OpenAPI docs

File Structure

app/
├── api/          # FastAPI routes and endpoints
├── core/         # config, security, directories
├── models/       # Pydantic request/response models
├── services/     # document processing, embeddings, vector store, RAG
└── utils/        # helpers

documents/           # uploaded/processed documents
vector_store/         # FAISS indexes and embeddings
parsed_documents/     # optional saved parsed text
clauses/              # extracted clauses and logs

Installation (short)

  1. Clone and create a venv:

     git clone <repo-url>
     cd bajaj_hackrx
     python -m venv venv
     source venv/bin/activate
  2. Install dependencies:

     pip install -r requirements.txt
  3. Configure .env (copy from .env.example) and add your GEMINI_API_KEY.
  4. Run:

     python run.py
     # or
     uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

Configuration Notes

  • Add GEMINI_API_KEY and BEARER_TOKEN to your .env.
  • MAX_FILE_SIZE and timeouts are configurable in .env.

Tech Stack

  • Python, FastAPI, Uvicorn, Pydantic
  • FlagEmbedding (BGE) for embeddings, FAISS for vector search
  • Google Gemini (Generative AI) for answer generation
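
To make the retrieval step concrete, here is a minimal, dependency-free sketch of what the FAISS index does conceptually: score stored chunk vectors against a query vector and return the closest matches. The toy 3-d vectors, chunk ids, and the `top_k` helper are illustrative only; the real system uses high-dimensional BGE embeddings and FAISS's optimized index structures.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, index, k=2):
    """Return the k most similar (score, chunk_id) pairs, best first."""
    scored = [(cosine(query_vec, v), cid) for cid, v in index.items()]
    return sorted(scored, reverse=True)[:k]

# Toy 3-d "embeddings"; a real index would hold BGE vectors per chunk.
index = {
    "chunk-0": [0.9, 0.1, 0.0],
    "chunk-1": [0.0, 1.0, 0.0],
    "chunk-2": [0.7, 0.7, 0.1],
}
print(top_k([1.0, 0.0, 0.0], index, k=1))
```

FAISS replaces the brute-force loop above with approximate nearest-neighbor structures, which is what keeps search under 100ms even for large indexes.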

Usage

  • API docs available at /docs when running locally.
  • Main endpoint: POST /hackrx/run (requires bearer token).

Core Components

  • Embedding Manager: vector embedding generation
  • Vector Store: efficient similarity search with FAISS
  • Answer Generator: LLM-powered response generation

License

MIT

🛠️ Installation

Prerequisites

  • Python 3.12 or higher
  • Git

Quick Start

  1. Clone the repository

    git clone <repository-url>
    cd bajaj_hackrx
  2. Set up virtual environment

    python -m venv venv
    source venv/bin/activate  # On Windows: venv\Scripts\activate
  3. Install dependencies

    pip install -r requirements.txt
  4. Configure environment

    cp .env.example .env
    # Edit .env with your configuration
    # IMPORTANT: Add your GEMINI_API_KEY to the .env file
  5. Run the application

    # Using the run script
    python run.py
    
    # Or using uvicorn directly
    uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

The API will be available at http://localhost:8000

⚙️ Configuration

Configure the application by editing the .env file:

# Application Configuration
APP_NAME="LLM-Powered Intelligent Query-Retrieval System"
VERSION="1.0.0"
DEBUG=true

# Server Configuration
HOST="0.0.0.0"
PORT=8000

# Security
BEARER_TOKEN="your-secure-token-here"

# AI Configuration
GEMINI_API_KEY="your-gemini-api-key-here"  # REQUIRED: Get from Google AI Studio

# Document Processing
MAX_FILE_SIZE=52428800  # 50MB in bytes
TEMP_DIR=""
SAVE_PARSED_TEXT=false
PARSED_TEXT_DIR="parsed_documents"

# Request Timeouts
DOWNLOAD_TIMEOUT=60
PROCESSING_TIMEOUT=300

⚠️ Important: You must add your GEMINI_API_KEY to the .env file for the system to work. Get your API key from Google AI Studio.
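
These settings are loaded into the application at startup. A minimal sketch of that loading step using only the standard library (the actual app/core/config.py may use Pydantic settings instead; the field names here simply mirror the .env keys above):

```python
import os
from dataclasses import dataclass

@dataclass
class Settings:
    """Illustrative stand-in for the app's config object."""
    gemini_api_key: str
    bearer_token: str
    max_file_size: int = 52428800   # 50MB default, as in .env
    download_timeout: int = 60

def load_settings() -> Settings:
    # Fail fast on the required key, mirroring the warning above.
    key = os.environ.get("GEMINI_API_KEY")
    if not key:
        raise RuntimeError("GEMINI_API_KEY is required; get one from Google AI Studio")
    return Settings(
        gemini_api_key=key,
        bearer_token=os.environ.get("BEARER_TOKEN", ""),
        max_file_size=int(os.environ.get("MAX_FILE_SIZE", 52428800)),
        download_timeout=int(os.environ.get("DOWNLOAD_TIMEOUT", 60)),
    )
```

Failing at startup when GEMINI_API_KEY is missing is preferable to failing on the first request, since the health endpoint can then report a clear configuration error.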

📚 API Documentation

Interactive Documentation

  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc

Main Endpoints

1. Health Check

GET /health

Returns system health status.

2. Process Queries

POST /hackrx/run
Authorization: Bearer your-token-here
Content-Type: application/json

{
  "documents": "https://example.com/document.pdf",
  "questions": [
    "What is the main topic of this document?",
    "Can you summarize the key findings?"
  ]
}

Response:

{
  "answers": [
    "The main topic is...",
    "The key findings include..."
  ],
  "processing_time": 2.45,
  "document_id": "doc_20250801_143234_042f627c",
  "metadata": {
    "chunks_processed": 15,
    "questions_answered": 2,
    "cache_hit": false
  }
}
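
The same request can be issued from Python with the standard library alone. `API_URL` and `TOKEN` are placeholders for your deployment's address and bearer token:

```python
import json
from urllib import request

API_URL = "http://localhost:8000/hackrx/run"  # adjust host/port as needed
TOKEN = "your-token-here"

def build_run_payload(document_url: str, questions: list[str]) -> dict:
    """Assemble the request body shown above."""
    return {"documents": document_url, "questions": questions}

def ask(document_url: str, questions: list[str]) -> dict:
    """POST the payload with bearer auth and return the parsed JSON response."""
    body = json.dumps(build_run_payload(document_url, questions)).encode()
    req = request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
    )
    with request.urlopen(req, timeout=60) as resp:
        return json.load(resp)
```

For example, `ask("https://example.com/document.pdf", ["What is the main topic?"])` returns a dict with the `answers`, `processing_time`, `document_id`, and `metadata` fields shown in the response above.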

🔧 Development

Project Structure

├── app/
│   ├── api/routes.py          # API endpoints
│   ├── core/
│   │   ├── config.py          # Configuration management
│   │   ├── security.py        # Authentication
│   │   └── directories.py     # Directory management
│   ├── models/requests.py     # Request/response models
│   ├── services/              # Core business logic
│   │   ├── rag_coordinator.py
│   │   ├── document_processor.py
│   │   ├── text_chunker.py
│   │   ├── embedding_manager.py
│   │   ├── vector_store.py
│   │   └── enhanced_answer_generator.py
│   └── utils/                 # Utility functions
├── documents/                 # Processed documents storage
├── vector_store/             # Vector embeddings storage
├── clauses/                  # Extracted clauses
└── requirements.txt          # Dependencies

Running in Development Mode

# With auto-reload
python run.py

# Or using uvicorn directly
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000

Testing

# Run health check
curl http://localhost:8000/health

# Test with sample document
curl -X POST "http://localhost:8000/hackrx/run" \
  -H "Authorization: Bearer your-token-here" \
  -H "Content-Type: application/json" \
  -d '{
    "documents": "file:///path/to/your/document.pdf",
    "questions": ["What is this document about?"]
  }'

🔒 Security

  • Authentication: Bearer token authentication for all protected endpoints
  • Input Validation: Comprehensive request validation using Pydantic
  • File Security: Safe file handling with size limits and type validation
  • CORS: Configurable cross-origin resource sharing

📈 Performance

The system is optimized for performance:

  • Document Caching: Automatic caching prevents reprocessing of identical documents
  • Vector Indexing: FAISS-based efficient similarity search
  • Async Processing: Non-blocking I/O operations
  • Chunking Strategy: Intelligent text segmentation preserves context
  • Memory Management: Efficient memory usage for large documents
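
The chunking and caching ideas above can be sketched as follows. The size/overlap values and function names are illustrative, not the repo's actual text_chunker.py implementation; overlapping windows are one common way to keep context intact across chunk boundaries, and a content hash makes the cache key deterministic:

```python
import hashlib

def chunk_text(text: str, size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into fixed-size chunks that overlap, so a sentence cut at
    one chunk's end also appears at the start of the next chunk."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than chunk size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def document_id(raw: bytes) -> str:
    """Content hash used as a cache key: identical bytes yield the same id,
    so an already-processed document is served from cache, not reprocessed."""
    return hashlib.sha256(raw).hexdigest()[:16]
```

Because the id depends only on the document bytes, re-submitting the same PDF under a different URL still hits the cache.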

Typical Performance Metrics

  • Document processing: 2-10 seconds (depending on size)
  • Question answering: 1-3 seconds per question
  • Vector search: <100ms for most queries

🛡️ Error Handling

The system provides comprehensive error handling:

  • Validation Errors: Clear messages for invalid inputs
  • Processing Errors: Graceful handling of document processing failures
  • Timeout Management: Configurable timeouts for external requests
  • Health Monitoring: Automatic health checks and status reporting

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Built for Bajaj HackRX hackathon
  • Powered by Google's Generative AI
  • Uses FlagEmbedding for high-quality embeddings
  • Built with FastAPI for high-performance APIs
  • Vector search powered by FAISS

📞 Support

For support and questions:

  • Create an issue in the repository
  • Check the documentation when running locally
  • Review the health endpoint for system status

Made with ❤️ for Bajaj HackRX
