[Feature](Streaming Job) Extend streaming job to support MySQL synchronization #58898
# Conflicts: # build.sh
Pull request overview
This PR extends streaming jobs to support MySQL synchronization via CDC (Change Data Capture), enabling users to sync data from MySQL databases to Doris in real-time. The implementation includes a new CDC client service and modifications to the streaming job framework.
Key Changes:
- Introduces a CDC client Spring Boot application that interfaces with MySQL using Flink CDC connectors
- Adds support for FROM MySQL TO Database syntax in job creation
- Implements split-based data reading for both snapshot and binlog phases
- Adds RPC endpoints for BE-FE communication to handle CDC operations
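The split-based reading mentioned in the list above can be pictured roughly as follows. This is only a minimal sketch and not the PR's actual code: it assumes an integer primary key and a fixed chunk size, and every name in it (`SnapshotSplit`, `BinlogSplit`, `plan_snapshot_splits`, `plan_splits`) is hypothetical. It shows the general idea of cutting each table into bounded snapshot splits and then handing over to a single binlog split.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Optional, Tuple, Union


@dataclass(frozen=True)
class SnapshotSplit:
    """Hypothetical: one bounded chunk of a table, read during the snapshot phase."""
    table: str
    start_key: int  # inclusive
    end_key: int    # exclusive


@dataclass(frozen=True)
class BinlogSplit:
    """Hypothetical: the single unbounded split read during the binlog phase."""
    start_offset: int


def plan_snapshot_splits(
    table: str,
    min_key: Optional[int],
    max_key: Optional[int],
    chunk_size: int,
) -> List[SnapshotSplit]:
    """Cut the key range [min_key, max_key] into chunks of chunk_size keys."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # Assumption: an empty table reports no key bounds and yields no snapshot splits.
    if min_key is None or max_key is None:
        return []
    splits = []
    start = min_key
    while start <= max_key:
        end = start + chunk_size
        splits.append(SnapshotSplit(table, start, end))
        start = end
    return splits


def plan_splits(
    tables: Dict[str, Tuple[Optional[int], Optional[int]]],
    chunk_size: int,
    binlog_offset: int,
) -> Iterator[Union[SnapshotSplit, BinlogSplit]]:
    """Yield every snapshot split first, then one binlog split."""
    for table, (min_key, max_key) in tables.items():
        yield from plan_snapshot_splits(table, min_key, max_key, chunk_size)
    # Assumption: the binlog phase resumes from an offset recorded before the snapshot.
    yield BinlogSplit(binlog_offset)
```

In this sketch a reader would consume the splits in order, so the binlog phase begins only after every snapshot chunk has been handed out.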
Reviewed changes
Copilot reviewed 85 out of 85 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| regression-test/suites/job_p0/streaming_job/cdc/test_streaming_mysql_job.groovy | Regression test for MySQL streaming job with CDC |
| gensrc/proto/internal_service.proto | Adds RPC interface for CDC client communication |
| fs_brokers/cdc_client/** | Complete CDC client implementation using Spring Boot |
| fe/fe-core/.../streaming/** | Extends streaming job framework with multi-table task support |
| fe/fe-core/.../offset/jdbc/** | JDBC offset provider for tracking MySQL binlog positions |
| fe/fe-core/.../util/StreamingJobUtils.java | Utility functions for streaming job management |
| docker/thirdparties/docker-compose/mysql/my.cnf | Enables MySQL binlog for CDC |
### What problem does this PR solve? Fix the error info shown for a task when the task times out. Related PR: #58898
…pache#59705) ### What problem does this PR solve? Issue Number: close #xxx Related PR: apache#58898
…pty tables (apache#59735) ### What problem does this PR solve? Fix synchronization failure on empty tables. Related PR: apache#58898
…e#59784) ### What problem does this PR solve? Fix the error info shown for a task when the task times out. Related PR: apache#58898
…pache#59760) ### What problem does this PR solve? Fix the failure to pause a streaming job when getting remote meta fails. Related PR: apache#58898
… remainsplit relay problem (#59883) ### What problem does this PR solve? Related PR: #58898 After a job is first created, it starts from the initial offset and the task for the first split is scheduled. If FE restarts while that task is running or has failed, the split needs to be restored from the meta again.
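The restart behavior described in the last commit message can be sketched as below. This is a hedged illustration rather than the Doris FE implementation: `TaskStatus`, `JobMeta`, and `restore_pending_splits` are hypothetical names, and it assumes the persisted meta records both the splits not yet scheduled and the split that was in flight when FE stopped.

```python
from collections import deque
from dataclasses import dataclass, field
from enum import Enum
from typing import Deque, List, Optional


class TaskStatus(Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    FAILED = "FAILED"
    SUCCESS = "SUCCESS"


@dataclass
class JobMeta:
    """Hypothetical persisted job state that survives an FE restart."""
    remaining_splits: List[str] = field(default_factory=list)
    in_flight_split: Optional[str] = None
    last_task_status: Optional[TaskStatus] = None


def restore_pending_splits(meta: JobMeta) -> Deque[str]:
    """Rebuild the split queue after a restart.

    A split whose task was still running or had failed is not known to be
    committed, so it is put back at the head of the queue and read again.
    """
    queue: Deque[str] = deque(meta.remaining_splits)
    needs_retry = meta.last_task_status in (TaskStatus.RUNNING, TaskStatus.FAILED)
    if meta.in_flight_split is not None and needs_retry:
        queue.appendleft(meta.in_flight_split)
    return queue
```

Putting the unfinished split back at the head keeps the original split order, which matters if later splits assume earlier ones have been applied.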
What problem does this PR solve?
Issue Number: close #58896
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)