[BUG] Catch retry submit exception #4796
fe/fe-core/src/main/java/org/apache/doris/load/loadv2/BulkLoadJob.java
morningman
left a comment
EmmyMiao87
left a comment
Another concern: I don't think loading needs this much concurrency control.
The db itself already limits the number of transactions.
So the number of concurrent tasks here would best be set to twice the number of transactions.
Also, if submitting a task fails, I think the load job can simply be cancelled.
The reason is that once load tasks have piled up in the system, it generally does not recover quickly, so there is no telling how long a retry would have to wait before it succeeds.
morningman
left a comment
LGTM
EmmyMiao87
left a comment
LGTM

Proposed changes
When the Load Job Task Queue is full, submitting more jobs to the queue causes a `RejectedExecutionException`. But the `callback.onTaskFailed` function does not catch this exception, so re-submitting the job fails and the job status is never updated to failed.
issue: #4795
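The failure mode can be reproduced outside Doris with a plain bounded `ThreadPoolExecutor`. The sketch below is only an illustration, not the code in this PR: `RetrySubmitSketch`, `Job`, `State`, and `resubmit` are hypothetical names, and marking the job `CANCELLED` on rejection is an assumption that follows the review suggestion to cancel the job directly rather than wait for another retry.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a load job; this is not the Doris BulkLoadJob class.
public class RetrySubmitSketch {
    enum State { PENDING, CANCELLED }

    static class Job {
        volatile State state = State.PENDING;
        volatile String failMsg = null;
    }

    // Hypothetical retry path: resubmit the task, and if the queue is full,
    // record the failure on the job instead of letting the exception escape.
    static boolean resubmit(ThreadPoolExecutor pool, Job job, Runnable task) {
        try {
            pool.execute(task);
            return true;
        } catch (RejectedExecutionException e) {
            job.state = State.CANCELLED;
            job.failMsg = "task queue is full, job cancelled";
            return false;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // One worker thread and one queue slot, so a third submission is rejected
        // by the default AbortPolicy.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(1));
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        pool.execute(blocker); // occupies the only worker
        pool.execute(blocker); // fills the only queue slot

        Job job = new Job();
        boolean accepted = resubmit(pool, job, () -> { });
        System.out.println(accepted + " " + job.state); // prints "false CANCELLED"

        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

Without the `try`/`catch` in `resubmit`, the `RejectedExecutionException` would propagate out of the failure callback and the job would stay in `PENDING`, which mirrors the symptom described above.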
Types of changes
What types of changes does your code introduce to Doris?
Put an x in the boxes that apply
Checklist
Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your code.
Further comments
If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...