[Fix](delete command) Mark delete sign when do delete command in MoW table #35917
|
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website. |
|
run buildall |
TPC-H: Total hot run time: 42255 ms |
TPC-DS: Total hot run time: 170489 ms |
ClickBench: Total hot run time: 30.43 s |
d8665f4 to 8387e79
|
run buildall |
TPC-H: Total hot run time: 39676 ms |
TPC-DS: Total hot run time: 173875 ms |
ClickBench: Total hot run time: 30.32 s |
8387e79 to 485badc
|
run buildall |
|
run buildall |
|
run buildall |
f92a1a3 to 4df1572
|
run buildall |
|
run buildall |
TPC-H: Total hot run time: 39823 ms |
TPC-DS: Total hot run time: 170324 ms |
ClickBench: Total hot run time: 30.92 s |
|
run buildall |
TPC-H: Total hot run time: 40628 ms |
TPC-DS: Total hot run time: 170936 ms |
ClickBench: Total hot run time: 30.3 s |
d2405ba to ab893b2
|
run buildall |
fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/commands/DeleteFromCommand.java
TPC-H: Total hot run time: 39657 ms |
TPC-DS: Total hot run time: 172396 ms |
dataroaring
left a comment
LGTM
|
Which release will include the fix for this issue? |
|
@weilai201 2.1.5 |
…table (#35917) ## Proposed changes close #34551 Problem: As shown in the issue above, if a key deleted by a delete statement is written to by updating only certain columns, the data will not display correctly. Reason: The delete statement deletes the data by writing a delete predicate, which is stored in the rowset meta and applied during data retrieval to filter the data. However, partial column updates do not consider the effect of the delete predicate when reading the original data. The imported key should be considered as a new key (since it has already been deleted), but it is actually treated as an old key. Therefore, only some columns are updated, leading to incorrect results. Solution: Consider the delete predicate during partial column updates, but this method will result in reading more columns, as shown in #35766. Thus, in this PR, we change the delete operation in the mow table from writing a delete predicate to writing a delete sign, which effectively resolves the issue.
…d mode (#37151) Problem: The `test_new_partial_update` case fails to run in cloud mode, but it passes in local mode. Reason: In PR #35917, we introduced a new table attribute `enable_light_delete`. When executing schema changes with `alter table set xxx` statements, the local and cloud modes process the logic differently. The cloud mode has its unique processing logic, which was not addressed in the mentioned PR, leading to failures in the cloud environment. Solution: To resolve the issue, we need to complete the missing schema change logic for the cloud mode. Once this is implemented, the problem should be resolved.
In #35917 and #37151, we changed the default behavior of the delete command on MoW tables from writing a delete predicate to writing a delete sign. This guarantees correctness under partial updates but makes deletes slower. If a table never receives partial updates, deleting by predicate does not produce incorrect data. The table property "enable_light_delete" selects the method: if "enable_light_delete=true", the delete command writes a delete predicate; otherwise, it writes a delete sign. Many p2 cases delete large amounts of data and never perform partial column updates, so to speed them up we change the default create table clause of some of these cases.
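The selection rule described above can be sketched in a few lines of Python. This is a toy model, not Doris code: `plan_delete` is a hypothetical helper, and the cost model (a delete predicate writes no rows, while a delete sign writes one marked row per matching key) is an assumption used only to illustrate why the delete-sign path is slower for large deletes.

```python
def plan_delete(table_properties, matching_keys):
    """Pick the delete method for a toy MoW table.

    Returns (method, rows_written). The rows_written figure is an assumed,
    simplified cost: a predicate is metadata only, a delete sign needs one
    new row per key that matches the delete condition.
    """
    value = str(table_properties.get("enable_light_delete", "false"))
    if value.lower() == "true":
        # Light delete: record the predicate, touch no rows.
        return ("delete_predicate", 0)
    # Property absent or false: mark every matching key with a delete sign.
    return ("delete_sign", len(matching_keys))


# A large delete on a table that opted in to light delete writes no rows.
light = plan_delete({"enable_light_delete": "true"}, range(1000))

# Without the property, the same kind of delete writes one row per key.
signed = plan_delete({}, [1, 2, 3])
```

Under these assumptions, a test case that deletes many rows and never uses partial updates benefits from setting the property in its create table clause.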
Proposed changes
close #34551
Problem: As shown in the issue above, if a key that was deleted by a delete statement is later written by a partial column update, the resulting data is displayed incorrectly.
Reason: The delete statement deletes data by writing a delete predicate, which is stored in the rowset meta and applied at read time to filter the data. However, a partial column update does not take the delete predicate into account when it reads the original row. The imported key should be treated as a new key (since it has already been deleted), but it is actually treated as an existing key. As a result, only the specified columns are updated and the remaining columns keep the deleted row's values, which produces incorrect results.
Solution: One option is to consider the delete predicate during partial column updates, but that requires reading more columns, as shown in #35766. Instead, this PR changes the delete operation on MoW tables from writing a delete predicate to writing a delete sign, which resolves the issue.
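The failure mode and the fix can be reproduced with a small in-memory model. The sketch below is a simplification under stated assumptions, not the Doris implementation: `ToyMowTable`, its methods, and the `delete_sign` column name are all hypothetical. A delete predicate is modeled as metadata that filters only rows written before it, and a partial update is modeled as checking the delete sign but not the predicates when it reads the old row.

```python
from dataclasses import dataclass, field

DELETE_SIGN = "delete_sign"  # illustrative name for the hidden marker column


@dataclass
class ToyMowTable:
    """Toy merge-on-write table: one versioned row per key plus predicates."""
    defaults: dict
    rows: dict = field(default_factory=dict)        # key -> (version, columns)
    predicates: list = field(default_factory=list)  # (version, predicate)
    version: int = 0

    def _next_version(self):
        self.version += 1
        return self.version

    def insert(self, key, cols):
        self.rows[key] = (self._next_version(), {**cols, DELETE_SIGN: 0})

    def delete_by_predicate(self, pred):
        # Only metadata is recorded; stored rows are left untouched.
        self.predicates.append((self._next_version(), pred))

    def delete_by_sign(self, pred):
        # Every matching key gets a new row version marked as deleted.
        v = self._next_version()
        for key, (_, cols) in list(self.rows.items()):
            if pred(cols):
                self.rows[key] = (v, {**cols, DELETE_SIGN: 1})

    def partial_update(self, key, cols):
        stored = self.rows.get(key)
        # The write path sees the delete sign but ignores the predicates.
        if stored is None or stored[1][DELETE_SIGN]:
            base = dict(self.defaults)  # new key: missing columns get defaults
        else:
            base = dict(stored[1])      # old key: missing columns are copied
        base.update(cols)
        base[DELETE_SIGN] = 0
        self.rows[key] = (self._next_version(), base)

    def select(self, key):
        stored = self.rows.get(key)
        if stored is None:
            return None
        row_version, cols = stored
        if cols[DELETE_SIGN]:
            return None
        # A predicate filters only rows written before the predicate.
        if any(pv > row_version and pred(cols) for pv, pred in self.predicates):
            return None
        return {c: v for c, v in cols.items() if c != DELETE_SIGN}


def run(delete_method):
    t = ToyMowTable(defaults={"city": None, "score": 0})
    t.insert(1, {"city": "Paris", "score": 10})
    getattr(t, delete_method)(lambda cols: cols["score"] == 10)
    t.partial_update(1, {"score": 99})  # update only the score column
    return t.select(1)


buggy = run("delete_by_predicate")  # {'city': 'Paris', 'score': 99}
fixed = run("delete_by_sign")       # {'city': None, 'score': 99}
```

With the predicate, the deleted row's `city` value reappears because the partial update copied it from the stale row. With the delete sign, the key is recognized as new and the missing column falls back to its default.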