Hrishikesh Gawde hrishithub

🧑‍💻 About Me

I’m Hrishikesh Gawde, a recent Computer Science graduate with a strong interest in data engineering. I became curious about distributed systems in college after coming across an MIT OpenCourseWare playlist on YouTube. While exploring career paths related to distributed systems, I discovered data engineering and became interested in how data moves and transforms across systems. That interest led me to build several end-to-end data engineering projects modeled on real industry practices.

I’ve worked on projects that cover both batch and streaming data pipelines, with use cases such as change data capture (CDC) ingestion, slowly changing dimension type 2 (SCD2) merges, event-driven loads, and lakehouse patterns such as the Medallion Architecture. I’ve also focused on testing, automation, and workflow orchestration. These projects are built with modern tools and frameworks commonly used in the data industry.

🎯 Why I Built These Projects

My goal was to go beyond basic tutorials and build projects that show an understanding of data engineering workflows in different business domains. I wanted to cover multiple industry use cases, explore modern tools, and create pipelines that include monitoring, testing, and automation layers. The projects cover batch and streaming processing, distributed computing, data warehousing, real-time analytics, CDC ingestion, SCD2 merges, data lakehouse architectures, and workflow orchestration. They run on AWS, GCP, and Databricks, and use open table formats such as Iceberg, Hudi, and Delta Lake.

Each project reflects a specific learning goal: working with streaming data, handling change data capture, performing SCD2 merges or implementing data lakehouse patterns.
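To illustrate one of these goals, here is a minimal pure-Python sketch of what an SCD2 merge does: when a tracked attribute changes, the current row is closed out and a new current row is appended, so history is preserved. This is not code from any of the repositories; the function name `scd2_merge` and the columns `valid_from`, `valid_to`, and `is_current` are assumptions chosen for illustration. The projects themselves use engines such as Delta Lake and Snowflake for this step.

```python
from datetime import date


def scd2_merge(dimension, updates, key, tracked, as_of):
    """Return a new dimension table (list of dicts) after an SCD2 merge.

    Hypothetical helper for illustration only. Column names
    valid_from / valid_to / is_current are assumed, not standard.
    """
    # Copy rows so the caller's table is left untouched.
    result = [dict(row) for row in dimension]
    # Index the currently active row for each business key.
    current = {row[key]: row for row in result if row["is_current"]}

    for upd in updates:
        existing = current.get(upd[key])
        # Unchanged record: nothing to do.
        if existing is not None and all(existing[c] == upd[c] for c in tracked):
            continue
        # Changed record: close out the old version.
        if existing is not None:
            existing["valid_to"] = as_of
            existing["is_current"] = False
        # New or changed record: append a fresh current version.
        new_row = {key: upd[key]}
        new_row.update({c: upd[c] for c in tracked})
        new_row.update({"valid_from": as_of, "valid_to": None, "is_current": True})
        result.append(new_row)
        current[upd[key]] = new_row
    return result


dimension = [
    {"customer_id": 1, "city": "Mumbai",
     "valid_from": date(2024, 1, 1), "valid_to": None, "is_current": True},
]
updates = [
    {"customer_id": 1, "city": "Pune"},   # changed attribute
    {"customer_id": 2, "city": "Delhi"},  # brand-new key
]
merged = scd2_merge(dimension, updates, key="customer_id",
                    tracked=["city"], as_of=date(2024, 6, 1))
```

After the merge, the Mumbai row for customer 1 is kept but marked as no longer current, and the Pune and Delhi rows are the active versions.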

📂 Projects

Below are my data engineering projects. Each project has its own repository with source code and documentation.

✅ 1. Flight Booking Data Pipeline with Airflow and CI/CD

Tech Stack: GitHub, GitHub Actions, Google Cloud Storage, PySpark, Dataproc Serverless, Airflow, BigQuery

✅ 2. Event Driven Incremental Ingestion Pipeline for Order Tracking

Tech Stack: Google Cloud Storage, PySpark, Databricks, Delta Lake, Databricks Workflows, GitHub

✅ 3. UPI Transactions Real Time CDC Feed Processing

Tech Stack: Databricks, Spark Structured Streaming, Delta Lake

✅ 4. Travel Bookings Data Ingestion Pipeline With SCD2 Merge

Tech Stack: Databricks, PySpark, Delta Lake, Delta Live Tables

✅ 5. Healthcare Delta Live Table Pipeline with Medallion Architecture

Tech Stack: Databricks, PySpark, Delta Lake, Delta Live Tables

✅ 6. News Data Analysis with Event-Driven Incremental Load in Snowflake Table

Tech Stack: Airflow, Google Cloud Storage, Python, Snowflake

✅ 7. Movie Bookings Real Time CDC Data Pipeline with Medallion Architecture in Snowflake

Tech Stack: Python, Snowflake Dynamic Tables, Snowflake Streams, Snowflake Tasks, Streamlit

✅ 8. Car Rental Data Batch Ingestion with SCD2 Merge in Snowflake Table

Tech Stack: Python, PySpark, GCP Dataproc, Airflow, Snowflake

✅ 9. IRCTC Streaming Data Ingestion into BigQuery

Tech Stack: Python, Google Cloud Storage, Pub/Sub, BigQuery, Dataflow

✅ 10. Walmart Data Ingestion in BigQuery

Tech Stack: Python, Airflow, Google Cloud Storage, BigQuery

✅ 11. Ad Tech Real Time Data Analysis

Tech Stack: Python, AWS Kinesis, AWS Managed Flink, AWS Glue, Spark Streaming, Apache Iceberg, AWS S3, Glue Catalog, AWS Athena

✅ 12. Credit Card Transaction Analysis for Fraud Risk

Tech Stack: Python, PySpark, Google Cloud Storage, Dataproc Serverless, BigQuery, Cloud Composer (Airflow), pytest, GitHub, GitHub Actions

Feel free to explore my project repositories. Each one includes source code and detailed documentation. For any opportunities or discussions related to data engineering roles, you can reach me at:

📍 Mumbai, Maharashtra

📞 +91 9309268556

📧 hrishikesh.workmail@gmail.com

🔗 LinkedIn | GitHub
