Welcome to the SQL Data Warehouse Project repository!
This project demonstrates a complete data engineering and analytics workflow, from building a modern data warehouse to generating actionable insights. It is designed as a portfolio project.
Key components of this project include:
- Data Architecture: Designing a modern warehouse using Bronze, Silver, and Gold layers.
- ETL Pipelines: Extracting, transforming, and loading data from source systems.
- Data Modeling: Creating fact and dimension tables optimized for analytics.
- Analytics & Reporting: Building SQL-based reports and dashboards for business insights.
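To make the data modeling and reporting components concrete, here is a minimal sketch of a star schema and one report query. It uses Python's built-in sqlite3 module as a stand-in for SQL Server so that it runs anywhere. The table names, column names, and sample rows (dim_customers, dim_products, fact_sales) are illustrative assumptions, not the project's actual model.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")

# Hypothetical star schema: one fact table referencing two dimensions.
conn.executescript("""
CREATE TABLE dim_customers (customer_key INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE dim_products  (product_key  INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE fact_sales (
    order_number TEXT,
    customer_key INTEGER REFERENCES dim_customers(customer_key),
    product_key  INTEGER REFERENCES dim_products(product_key),
    sales_amount REAL
);
INSERT INTO dim_customers VALUES (1, 'Germany'), (2, 'France');
INSERT INTO dim_products  VALUES (10, 'Bikes'), (20, 'Accessories');
INSERT INTO fact_sales VALUES
    ('SO1', 1, 10, 1200.0),
    ('SO2', 1, 20, 50.0),
    ('SO3', 2, 10, 900.0);
""")


def sales_by_country_and_category(connection):
    """Aggregate the fact table by attributes taken from both dimensions."""
    return connection.execute("""
        SELECT c.country, p.category, SUM(f.sales_amount) AS total_sales
        FROM fact_sales AS f
        JOIN dim_customers AS c ON c.customer_key = f.customer_key
        JOIN dim_products  AS p ON p.product_key  = f.product_key
        GROUP BY c.country, p.category
        ORDER BY total_sales DESC
    """).fetchall()


report = sales_by_country_and_category(conn)
for country, category, total in report:
    print(country, category, total)
```

The fact table holds only keys and measures, while descriptive attributes live in the dimensions, so a report is a join followed by a GROUP BY.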
Objective
Develop a scalable data warehouse using SQL Server to consolidate sales data and support informed decision-making.
Specifications
- Data Sources: The project integrates data from two primary source systems—ERP and CRM—provided as CSV files.
- Data Quality: All ingested data undergoes cleansing and validation to resolve inconsistencies and ensure accuracy prior to analysis.
- Integration: Data from both sources is consolidated into a unified, query-optimized model designed to support analytical workloads.
- Scope: The warehouse focuses exclusively on the most recent dataset. Historical data retention or historization is not required.
- Documentation: Comprehensive documentation of the data model is provided to facilitate understanding and usage by both business stakeholders and analytics teams.
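The data quality and integration requirements above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the project's actual pipeline: the CSV contents, the column names (cst_id, cst_firstname, cst_gndr, cid, cntry), and the "n/a" placeholder are all hypothetical.

```python
import csv
import io

# Hypothetical CRM and ERP extracts; the real files and columns may differ.
CRM_CSV = """cst_id,cst_firstname,cst_gndr
1, Anna ,F
2,Ben,M
2,Ben,M
,Ghost,
"""
ERP_CSV = """cid,cntry
1,DE
2,
"""

GENDER_LABELS = {"F": "Female", "M": "Male"}


def clean_crm(text):
    """Trim whitespace, drop rows without a key, remove duplicates, decode gender."""
    seen = set()
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        key = row["cst_id"].strip()
        if not key or key in seen:
            continue  # skip rows with a missing or repeated customer id
        seen.add(key)
        rows.append({
            "customer_id": int(key),
            "first_name": row["cst_firstname"].strip(),
            "gender": GENDER_LABELS.get(row["cst_gndr"].strip().upper(), "n/a"),
        })
    return rows


def clean_erp(text):
    """Map customer id to country, replacing blank countries with 'n/a'."""
    countries = {}
    for row in csv.DictReader(io.StringIO(text)):
        countries[int(row["cid"])] = row["cntry"].strip() or "n/a"
    return countries


def integrate(crm_rows, erp_countries):
    """Enrich each cleaned CRM customer with the country from ERP."""
    return [
        {**row, "country": erp_countries.get(row["customer_id"], "n/a")}
        for row in crm_rows
    ]


customers = integrate(clean_crm(CRM_CSV), clean_erp(ERP_CSV))
for customer in customers:
    print(customer)
```

Each source is cleansed on its own before the two are joined on the shared customer key, which keeps validation rules separate from integration logic.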
This project follows the Medallion Architecture, structured into three layers:
1. Bronze Layer: Stores raw data as-is from the source systems. Data is ingested from CSV files into a SQL Server database.
2. Silver Layer: Applies data cleansing, standardization, and normalization to prepare data for analysis.
3. Gold Layer: Houses business-ready data modeled as a star schema for reporting and analytics.
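The flow through the three layers can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module in place of SQL Server. The object names (bronze_sales, silver_sales, gold_sales_by_product) and the sample rows are assumptions made for the example. Each load is a full reload, which fits the stated scope of keeping only the most recent dataset; in SQL Server the equivalent steps would more likely use TRUNCATE TABLE and BULK INSERT.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")

# Hypothetical objects, one per layer.
conn.executescript("""
CREATE TABLE bronze_sales (order_number TEXT, product TEXT, amount TEXT);
CREATE TABLE silver_sales (order_number TEXT, product TEXT, amount REAL);
CREATE VIEW gold_sales_by_product AS
    SELECT product, SUM(amount) AS total_sales
    FROM silver_sales
    GROUP BY product;
""")


def load_bronze(connection, raw_rows):
    """Full reload: empty the table, then store the source rows unchanged."""
    connection.execute("DELETE FROM bronze_sales")
    connection.executemany("INSERT INTO bronze_sales VALUES (?, ?, ?)", raw_rows)


def load_silver(connection):
    """Full reload: trim text, standardize product names, cast amounts to numbers."""
    connection.execute("DELETE FROM silver_sales")
    connection.execute("""
        INSERT INTO silver_sales
        SELECT TRIM(order_number), UPPER(TRIM(product)), CAST(amount AS REAL)
        FROM bronze_sales
        WHERE TRIM(amount) <> ''
    """)


# Raw rows as they might arrive from a CSV file: stray spaces, mixed case, a blank amount.
raw = [("SO1 ", " bike", "100.5"), ("SO2", "Bike ", "50"), ("SO3", "helmet", "")]
load_bronze(conn, raw)
load_silver(conn)

gold = conn.execute(
    "SELECT product, total_sales FROM gold_sales_by_product ORDER BY product"
).fetchall()
print(gold)
```

Bronze keeps every column as text so nothing is lost on ingestion, Silver enforces types and consistent values, and Gold is exposed as a view so it always reflects the current Silver data.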