End-to-End Analytics Engineering & BI Pipeline
- Problem: High early-stage churn (Weeks 1-4) in the tutoring SaaS platform leads to diminished Customer Lifetime Value (LTV).
- Goal: Build a scalable 3-tier (RAW-CORE-MART) data warehouse to monitor weekly retention and identify "at-risk" students based on engagement density.
- Output: A production-style Operational Command Center that generates a "weekly risk list" for targeted re-engagement campaigns.
- Data Warehousing: Snowflake (SQL) - 3-Tier Architecture
- Statistical Modeling: R (OLS Regression) - Baseline Behavioral Analysis
- Visualization: Tableau Public - Operational Dashboards
The pipeline implements SaaS-standard data governance to ensure reliability and performance:
- 01_Setup (Ingestion): Environment initialization and raw data auditing with Data Quality (DQ) guards (null-rate and duplicate detection).
- 02_Core (Normalization): Event-level
FACT_SOLVESwith incremental processing logic. Data grain is set at theuser-event-timestamplevel. - 03_Mart (Metric Layer): Aggregated tables optimized for BI performance, including Weekly Retention Cohorts and Student Feature Marts.
Engineering Highlights:
- Cost Optimization: Configured X-Small warehouse with auto-suspend and utilized Query Pruning on event dates to minimize compute credits.
- Reliability: Integrated row-count drift monitoring and freshness checks within the SQL workflow.
Instead of a static study, the R model serves as an Explainable Baseline for business intervention:
- Core Insight: Quantified that Active Days per Week is the primary driver of student productivity (Total Questions Solved).
- Actionability: The model identifies "low-frequency" user segments, pushing
Risk_Flagsback to the Snowflake Mart for automated CRM triggers.
The dashboard provides a Macro-to-Micro drill-down for operations teams:
- Weekly Retention Heatmap: Tracks 35-week user decay to pinpoint critical drop-off windows.
- Engagement Scatter: Linked visuals that allow operators to filter engagement behavior by acquisition cohort.
- π View Live Operational Dashboard
βββ sql/ # Production SQL: 01_Setup, 02_Core, 03_Mart
βββ r_analysis/ # R scripts for OLS baseline & risk segmentation
βββ tableau/ # Dashboard documentation & screenshots
βββ data_samples/ # Schema definitions & top-10 row samples for reference