Empowering Data Intelligence with Distributed SQL for Sharding, Scalability, and Security Across All Databases.
Change data capture for a variety of databases. Please log issues at https://github.com/debezium/dbz/issues.
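Debezium is usually run as a set of Kafka Connect connectors, but its embedded engine lets change events be consumed in-process, which is a quick way to see what change data capture actually emits. The following is only a minimal sketch, assuming a local MySQL source with placeholder credentials and file-based offset and schema-history storage; the configuration is trimmed to the essentials.

    import io.debezium.engine.ChangeEvent;
    import io.debezium.engine.DebeziumEngine;
    import io.debezium.engine.format.Json;

    import java.util.Properties;
    import java.util.concurrent.Executors;

    public class CdcSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.setProperty("name", "cdc-sketch");
            // Assumed source: a local MySQL instance with placeholder credentials.
            props.setProperty("connector.class", "io.debezium.connector.mysql.MySqlConnector");
            props.setProperty("database.hostname", "localhost");
            props.setProperty("database.port", "3306");
            props.setProperty("database.user", "debezium");
            props.setProperty("database.password", "secret");
            props.setProperty("database.server.id", "5400");
            props.setProperty("topic.prefix", "inventory");
            // File-based offset and schema-history storage keeps the sketch self-contained.
            props.setProperty("offset.storage", "org.apache.kafka.connect.storage.FileOffsetBackingStore");
            props.setProperty("offset.storage.file.filename", "/tmp/offsets.dat");
            props.setProperty("schema.history.internal", "io.debezium.storage.file.history.FileSchemaHistory");
            props.setProperty("schema.history.internal.file.filename", "/tmp/schema-history.dat");

            // Every captured row change is delivered as a JSON ChangeEvent.
            DebeziumEngine<ChangeEvent<String, String>> engine = DebeziumEngine.create(Json.class)
                    .using(props)
                    .notifying(event -> System.out.println(event.value()))
                    .build();
            Executors.newSingleThreadExecutor().execute(engine);  // runs until the engine is closed
        }
    }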
Flink CDC is a streaming data integration tool
BitSail is a distributed, high-performance data integration engine that supports batch, streaming, and incremental scenarios. It is widely used to synchronize hundreds of trillions of records every day.
SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).
Data Pipeline Automation Framework to build MCP servers, data APIs, and data lakes with SQL.
By Smart Shaped s.r.l. (https://www.smartshaped.com/)
Kafka Streams made easy with a YAML file
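Wrappers like this generate an ordinary Kafka Streams topology from the YAML description. As a rough illustration of what a simple filter-and-transform pipeline might expand to, here is a minimal hand-written topology; the topic names, the uppercase transform, and the broker address are made-up examples rather than anything taken from the project above.

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;

    import java.util.Properties;

    public class StreamsSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "yaml-pipeline-sketch");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            StreamsBuilder builder = new StreamsBuilder();
            // Read from an input topic, drop empty records, normalize the value,
            // and write the result to an output topic.
            KStream<String, String> source = builder.stream("events-in");
            source.filter((key, value) -> value != null && !value.isEmpty())
                  .mapValues(value -> value.toUpperCase())
                  .to("events-out");

            KafkaStreams streams = new KafkaStreams(builder.build(), props);
            streams.start();
            Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
        }
    }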
cron replacement to schedule complex data workflows
Data pipeline using Apache Kafka, Apache Spark and HDFS
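A common shape for such a pipeline is Spark Structured Streaming consuming a Kafka topic and appending the records to HDFS. Below is a minimal sketch in Java, assuming placeholder broker, topic, and HDFS paths and that the spark-sql-kafka connector is on the classpath; it is not the specific project's implementation.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.streaming.StreamingQuery;

    public class KafkaToHdfsSketch {
        public static void main(String[] args) throws Exception {
            SparkSession spark = SparkSession.builder()
                    .appName("kafka-to-hdfs-sketch")
                    .getOrCreate();

            // Read raw records from a Kafka topic as a streaming DataFrame.
            Dataset<Row> events = spark.readStream()
                    .format("kafka")
                    .option("kafka.bootstrap.servers", "localhost:9092")
                    .option("subscribe", "events")
                    .load()
                    .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)");

            // Continuously append the records to HDFS as Parquet files.
            StreamingQuery query = events.writeStream()
                    .format("parquet")
                    .option("path", "hdfs:///data/events")
                    .option("checkpointLocation", "hdfs:///checkpoints/events")
                    .start();

            query.awaitTermination();
        }
    }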
Toolkit for describing data transformation pipelines by composing simple reusable components.
⚡ Data integration | DataLink is a lightweight data integration framework built on top of DataX, Spark, and Flink.
A real-time data pipeline using Kafka, Spark, and Cassandra for processing and storing credit card expenses. Includes a Spring Boot application for retrieving personnel data from MySQL, storing images in S3, and displaying employee details with expense reports on a web interface.
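The ingestion side of a pipeline like this is typically a plain Kafka producer publishing expense events for Spark to consume and Cassandra to store. A minimal sketch follows, with the broker address, topic name, key choice, and JSON payload shape assumed purely for illustration.

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    import java.util.Properties;

    public class ExpenseProducerSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");  // placeholder broker
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());

            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                // Keying by card number keeps all expenses for one card in the same partition.
                String payload = "{\"card\":\"4111-XXXX\",\"amount\":42.50,\"merchant\":\"coffee\"}";
                producer.send(new ProducerRecord<>("card-expenses", "4111-XXXX", payload));
                producer.flush();
            }
        }
    }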
An end-to-end data pipeline with Kafka and Spark Streaming integration.
Data-processing and common libraries used in the main project, all available under Apache 2.0.
A real-time cryptocurrency data streaming pipeline.
Real-time data streaming pipeline.
This is the graduation project for the DEPI internship.