(EDA PROJECT)
An end-to-end data analytics project that analyzes e-commerce customer behaviour — from raw CSV data to an interactive Tableau dashboard powered by a PostgreSQL database.
This project explores customer purchasing patterns and behaviour on an e-commerce platform. The pipeline covers data cleaning, exploratory data analysis (EDA), storage, SQL-based analysis, and visualization in a single cohesive workflow.
Raw CSV Data
↓
Data Cleaning (Jupyter Notebook)
↓
Exploratory Data Analysis (Jupyter Notebook)
↓
PostgreSQL Database
↓
SQL Queries & Analysis
↓
Tableau Dashboard

| Layer | Tool |
|---|---|
| Data Cleaning | Python, Pandas (Jupyter Notebook) |
| Database | PostgreSQL |
| Analysis | SQL |
| Visualization | Tableau |
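The cleaning and loading steps in the pipeline above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the project uses Pandas and PostgreSQL, whereas this sketch uses only the standard library, with the `csv` module for cleaning and an in-memory `sqlite3` database standing in for PostgreSQL. The column names, the `customer` table, the helper functions, and the sample rows are all assumptions made up for the example.

```python
import csv
import io
import re
import sqlite3

# Made-up sample rows standing in for the raw CSV (not the project's data).
# The column names are assumptions.
RAW_CSV = """Customer ID,Category,Purchase Amount (USD),Review Rating
1,Clothing,53,3.1
2,Footwear,,4.0
3,Accessories,73,
"""


def snake_case(name):
    """Turn a header such as 'Purchase Amount (USD)' into 'purchase_amount_usd'."""
    return re.sub(r"[^0-9a-z]+", "_", name.strip().lower()).strip("_")


def clean_rows(text):
    """Normalize headers, drop rows with no purchase amount, and cast types."""
    cleaned = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {snake_case(key): value.strip() for key, value in raw.items()}
        if not row["purchase_amount_usd"]:
            continue  # skip rows that have no purchase amount
        cleaned.append((
            int(row["customer_id"]),
            row["category"],
            float(row["purchase_amount_usd"]),
            # A missing rating is kept as NULL rather than dropped
            float(row["review_rating"]) if row["review_rating"] else None,
        ))
    return cleaned


def load_rows(conn, rows):
    """Create the hypothetical customer table and insert the cleaned rows."""
    conn.execute(
        """
        CREATE TABLE customer (
            customer_id INTEGER PRIMARY KEY,
            category TEXT,
            purchase_amount_usd REAL,
            review_rating REAL
        )
        """
    )
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?, ?)", rows)
    conn.commit()


rows = clean_rows(RAW_CSV)
conn = sqlite3.connect(":memory:")  # stand-in for a PostgreSQL connection
load_rows(conn, rows)
```

Against a real PostgreSQL instance the same idea applies, but the connection, placeholder syntax, and column types would differ depending on the driver used.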

Key metrics:

| Metric | Value |
|---|---|
| Number of Customers | 3,900 |
| Average Purchase Amount | $59.76 |
| Total Revenue | $233.1K |
| Average Review Rating | 3.8 ⭐ |
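As a quick consistency check on the figures above, total revenue should roughly equal the number of customers multiplied by the average purchase amount. The sketch below assumes one purchase record per customer, which the reported numbers are consistent with but which is not stated explicitly.

```python
customers = 3_900
avg_purchase = 59.76  # USD, as reported in the table

# Assumption: one purchase record per customer
total_revenue = customers * avg_purchase
label = f"${total_revenue / 1000:.1f}K"
print(label)  # $233.1K
```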
- % of Customers by Subscription Status (Donut Chart) — 2,847 subscribed vs. 1,053 non-subscribed customers
- Revenue by Category (Bar Chart) — Breakdown across Clothing, Accessories, Footwear, and Outerwear; Clothing leads in revenue
- Top 10 Items Purchased (Line Chart) — Ranks items from Shorts at the lower end up to Blouse at the top, including Dress, Shirt, Pants, Jewelry, Belt, Scarf, Sweater, and Sunglasses
- Category Preference by Gender (Stacked Bar Chart) — Compares Male vs. Female purchasing behaviour across all four product categories
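A chart such as Revenue by Category is typically backed by a grouped aggregate query. The following is a sketch of what such a query could look like, not the project's actual SQL. It runs against an in-memory `sqlite3` database as a stand-in for PostgreSQL, and the `customer` table, its column names, and the sample rows are made up for illustration.

```python
import sqlite3

# Made-up rows for illustration only; these are not the project's data.
SAMPLE = [
    ("Clothing", 60.0),
    ("Clothing", 40.0),
    ("Footwear", 75.0),
    ("Accessories", 30.0),
    ("Outerwear", 55.0),
]

# Hypothetical table and column names
QUERY = """
SELECT category, SUM(purchase_amount_usd) AS revenue
FROM customer
GROUP BY category
ORDER BY revenue DESC
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (category TEXT, purchase_amount_usd REAL)")
conn.executemany("INSERT INTO customer VALUES (?, ?)", SAMPLE)

# One (category, revenue) row per category, highest revenue first
revenue_by_category = conn.execute(QUERY).fetchall()
for category, revenue in revenue_by_category:
    print(f"{category}: {revenue:.2f}")
```

The query text itself uses only standard SQL (`SUM`, `GROUP BY`, `ORDER BY`), so it should carry over to PostgreSQL with at most minor changes.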

Exploratory data analysis was performed in DATA_PROJECT_CLEANING.ipynb before the data was loaded into PostgreSQL, and covered:
- Distribution of purchase amounts across customer segments
- Subscription vs. non-subscription spending patterns
- Seasonal purchase trends across product categories
- Gender-based buying behaviour analysis
- Identification of outliers in purchase amount and review ratings
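The list above does not say which method was used to identify outliers. One common choice is the interquartile range (IQR) rule, sketched below as an assumption rather than the notebook's confirmed approach. The function name, the multiplier `k`, and the sample amounts are hypothetical.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)  # quartile cut points (Python 3.8+)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Made-up purchase amounts for illustration only
amounts = [20, 35, 48, 52, 55, 60, 61, 64, 70, 400]
print(iqr_outliers(amounts))  # [400]
```

The same rule can be applied to review ratings, although with a bounded rating scale it may flag few or no values.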

The dashboard supports dynamic filtering by:

| Filter | Options |
|---|---|
| Category | Accessories, Clothing, Footwear, Outerwear |
| Subscription Status | Yes, No |
| Gender | Male, Female |
| Season | Fall, Spring, Summer, Winter |
| Shipping Type | 2-Day, Express, Free Shipping, Next Day Air, Standard, Store Pickup |

Key insights:
- 73% of customers (2,847 of 3,900) hold an active subscription, which may indicate strong retention
- Clothing is the highest revenue-generating category across all segments
- Blouse, Dress, and Shirt are the top 3 most purchased items
- Gender-based category preferences show notable differences, particularly in Clothing and Accessories
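The subscription share quoted above follows directly from the two customer counts shown in the donut chart. A short check, using only numbers already given in this README:

```python
subscribed = 2_847
non_subscribed = 1_053

total = subscribed + non_subscribed
share = subscribed / total
summary = f"{total} customers, {share:.0%} subscribed"
print(summary)  # 3900 customers, 73% subscribed
```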

Prerequisites:
- Python 3.8+
- PostgreSQL 13+
- Tableau Desktop (or Tableau Public for .twbx files)
- Jupyter Notebook