Case Study for Data Structures and Algorithms (23CSE203)
This project combines a Segment Tree with an AI-powered Retrieval-Augmented Generation (RAG) pipeline to answer stock-related queries efficiently. It demonstrates how a classical data structure and AI models can work together for data retrieval, analysis, and user interaction.
Backend (Python + Flask)
- Implements Segment Trees for efficient stock query operations (min, max, sum, average over a date range).
- Integrates with Pinecone Vector DB to store embeddings.
- Uses Google Gemini embeddings + LLM for natural language queries.
- Provides APIs that combine AI + data structure outputs.
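The range operations listed above can be illustrated with a minimal sketch. This is not the contents of segment_tree.py; the names `Summary`, `SegmentTree`, and `query` are assumptions chosen for illustration. Each node stores sum, min, max, and count, so the average falls out of a single query.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Summary:
    """Aggregate of a contiguous run of prices (hypothetical helper type)."""
    total: float
    low: float
    high: float
    count: int

    @property
    def average(self) -> float:
        return self.total / self.count


def _merge(a: Summary, b: Summary) -> Summary:
    # Combine the aggregates of two disjoint ranges.
    return Summary(a.total + b.total, min(a.low, b.low),
                   max(a.high, b.high), a.count + b.count)


class SegmentTree:
    """Bottom-up segment tree over a fixed list of prices."""

    def __init__(self, prices: List[float]) -> None:
        if not prices:
            raise ValueError("prices must not be empty")
        self.n = len(prices)
        self.tree = [None] * (2 * self.n)
        # Leaves live at positions n .. 2n-1.
        for i, price in enumerate(prices):
            self.tree[self.n + i] = Summary(price, price, price, 1)
        # Each internal node i summarises its children 2i and 2i+1.
        for i in range(self.n - 1, 0, -1):
            self.tree[i] = _merge(self.tree[2 * i], self.tree[2 * i + 1])

    def query(self, left: int, right: int) -> Summary:
        """Aggregate prices[left..right], both ends inclusive, in O(log n)."""
        if not 0 <= left <= right < self.n:
            raise IndexError("invalid range")
        lo, hi = left + self.n, right + self.n + 1  # half-open [lo, hi)
        result = None
        while lo < hi:
            if lo & 1:  # lo is a right child: take it and step right
                result = self.tree[lo] if result is None else _merge(result, self.tree[lo])
                lo += 1
            if hi & 1:  # hi is a right boundary: step left and take it
                hi -= 1
                result = self.tree[hi] if result is None else _merge(result, self.tree[hi])
            lo >>= 1
            hi >>= 1
        return result
```

With the made-up prices `[10, 12, 9, 15, 11]`, `SegmentTree(prices).query(1, 3)` covers 12, 9, and 15, giving a sum of 36, a minimum of 9, a maximum of 15, and an average of 12.0.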
Frontend (React + Vite)
- Interactive UI for querying stock data.
- Displays AI-enhanced insights along with structured results from the segment tree.
- User-friendly interface to test queries.
Segment Tree + RAG Pipeline Flow
User Query
↓
Flask API (/chat endpoint)
↓
chat_bot.py (Query Handler)
↓
RAG Pipeline (rag_setup.py)
| ├─ Pinecone Vector DB (document retrieval)
| ├─ Google Gemini Embeddings (text → vectors)
| └─ Gemini LLM (natural language understanding)
| ↓
| Generates JSON Command OR Text Answer
↓
Decision Point:
├─ [JSON Command Found] → Segment Tree (segment_tree.py)
│ ↓
│ Numerical Result
└─ [No Command] → Direct RAG Answer
↓
Final Answer
Description
- User Query: Natural language input from the frontend UI.
- Flask API: Receives user query and routes to backend query handler.
- chat_bot.py: Handles preprocessing and determines if Segment Tree is needed.
- RAG Pipeline:
  - Retrieves relevant documents using Pinecone Vector DB.
  - Converts text to embeddings using Google Gemini.
  - Uses Gemini LLM for understanding and generating a JSON command or direct text answer.
- Decision Point:
  - If a JSON command is generated (e.g., range query for min/max/sum/avg), it is processed by the Segment Tree module for efficient computation.
  - If no JSON command is generated, the response is taken directly from the RAG output.
- Segment Tree (segment_tree.py): Performs efficient numerical operations for the query.
- Final Response: Sent back to frontend for display.
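The decision point described above could look roughly like the following sketch. The command schema (`op`, `start`, `end`) and the function names are assumptions for illustration; the actual format produced by rag_setup.py and consumed by chat_bot.py may differ.

```python
import json
from typing import Callable, Optional

# Assumed set of operations the Segment Tree side understands.
ALLOWED_OPS = {"min", "max", "sum", "avg"}


def extract_command(llm_output: str) -> Optional[dict]:
    """Return a JSON command embedded in the LLM output, or None if there is none."""
    start = llm_output.find("{")
    if start == -1:
        return None
    try:
        # raw_decode parses one JSON value and ignores any trailing text.
        command, _ = json.JSONDecoder().raw_decode(llm_output[start:])
    except json.JSONDecodeError:
        return None
    if not isinstance(command, dict) or command.get("op") not in ALLOWED_OPS:
        return None
    if not all(isinstance(command.get(key), int) for key in ("start", "end")):
        return None
    return command


def answer(llm_output: str,
           run_range_query: Callable[[str, int, int], float]) -> str:
    """Route to the numerical path if a command is found, else return the RAG text."""
    command = extract_command(llm_output)
    if command is None:
        return llm_output  # [No Command] -> direct RAG answer
    value = run_range_query(command["op"], command["start"], command["end"])
    return f"{command['op']} over [{command['start']}, {command['end']}] = {value}"
```

Passing the range-query function in as a parameter keeps the routing logic independent of how the Segment Tree is implemented, which also makes it easy to test with a stand-in.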
├── backend/
├── frontend/ # React + Vite app
├── .gitignore
├── README.md
└── LICENSE
# Navigate to backend
cd backend
# Create a virtual environment
python -m venv venv
# Activate the environment
# On Windows
venv\Scripts\activate
# On Mac/Linux
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Run backend server
python app.py
The backend will start running (default: http://127.0.0.1:5000/).
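Once the server is up, the /chat endpoint can be exercised from a short script. The sketch below assumes the endpoint accepts a POST with a JSON body containing a `query` field; that field name and the helper names are guesses, so check app.py for the actual request shape.

```python
import json
import urllib.request


def build_chat_request(question: str,
                       base_url: str = "http://127.0.0.1:5000") -> urllib.request.Request:
    """Build a POST request for the /chat endpoint (payload shape is an assumption)."""
    body = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/chat",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `send(build_chat_request("What is the average price of AAPL between Jan 1 and Jan 15?"))` would return whatever the backend responds with, provided the server is running locally.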
# Navigate to frontend
cd frontend
# Install dependencies
npm install
# Start development server
npm run dev
The frontend will run on http://localhost:5173/ by default.
1. User Query → User enters a stock-related query in natural language (e.g., "What is the average price of AAPL between Jan 1 and Jan 15?").
2. RAG Retrieval → Gemini embeddings + Pinecone fetch relevant historical stock data.
3. Segment Tree Processing → If the query involves numerical operations (min, max, avg, sum), a Segment Tree processes it efficiently.
4. AI + Data Structure Response → Final response is generated by combining retrieved AI context and Segment Tree calculations.
5. Frontend Display → Results are displayed neatly in the React UI.
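A Segment Tree works on array positions, so a date range like the one in the example query has to be translated into indices first. One way to do that, assuming the trading days are kept in a sorted list, is a binary search at each end. The function name and the sample dates below are made up for illustration and are not taken from the project.

```python
from bisect import bisect_left, bisect_right
from datetime import date
from typing import List, Optional, Tuple


def date_range_to_indices(trading_days: List[date],
                          start: date,
                          end: date) -> Optional[Tuple[int, int]]:
    """Map an inclusive calendar range to an inclusive index range.

    trading_days must be sorted ascending. Returns None when no trading
    day falls inside [start, end], e.g. a range covering only a weekend.
    """
    left = bisect_left(trading_days, start)      # first day >= start
    right = bisect_right(trading_days, end) - 1  # last day <= end
    if left > right:
        return None
    return left, right


# Hypothetical calendar: five trading days with a weekend gap.
days = [date(2024, 1, 2), date(2024, 1, 3), date(2024, 1, 4),
        date(2024, 1, 5), date(2024, 1, 8)]
```

Here `date_range_to_indices(days, date(2024, 1, 1), date(2024, 1, 5))` returns `(0, 3)`, which can then be passed to the range query, while a weekend-only range such as Jan 6 to Jan 7 returns `None`.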
- 🔍 Hybrid Querying → Combines AI retrieval with Segment Tree efficiency.
- 📈 Efficient Stock Queries → Range queries (min, max, avg, sum) each run in O(log n) time, where n is the number of stored data points.
- 🤖 AI-Powered Assistant → Understands user queries in natural language.
- 🌐 Full-Stack Integration → Backend (Python) + Frontend (React + Vite).