Skip to content

Conversation

@nextdreamblue
Copy link
Contributor

Proposed changes

Issue Number: close #17145

Problem summary

Describe your changes.

Checklist(Required)

  • Does it affect the original behavior
  • Has unit tests been added
  • Has document been added or modified
  • Does it need to update dependencies
  • Is this PR support rollback (If NO, please explain WHY)

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

1 similar comment
@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@nextdreamblue
Copy link
Contributor Author

run buildall

RETURN_IF_ERROR(_column_iterators[_schema.unique_id(cid)]->get_row_ranges_by_zone_map(
_opts.col_id_to_predicates[cid].get(),
_opts.col_id_to_del_predicates.count(cid) > 0
_opts.col_id_to_del_predicates.size() == 1 &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If user send to delete predicates:

  1. delete where a = 1
  2. delete where a = 2

In this pr, the delete predicate will take no effect?

Copy link
Contributor Author

@nextdreamblue nextdreamblue Feb 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

理论上,如果是两次独立的删除,比如delete from table where a=1; delete from table where a=2;其实这个地方应该可以使用的,但是目前的代码,是把所有不同版本的delete predicates和不同列的delete predicates都放到一起了,失去了版本信息、失去了谓词间可能是and的关系,统一弱化成了delete predicates都是独立的,有一个delete predicates满足条件,就把page都去掉。
这个pr的修改方式,就是在当前代码的基础上,当只有一个delete predicate的时候才能保证后续淘汰page的正确性,所以这里一律加了 == 1的判断才传递delete predicates。
如果要把不同版本的delete predicates和不同列的delete predicates作为完整和严谨的逻辑去判断page,需要修改的设计就有点多了,目前的方案算是一种优先解决bug的思路,后续可以进一步把delete predicates这块加速zone判断进行page淘汰的逻辑完善,提高delete predicates使用的场景。

"in_memory" = "false",
"storage_format" = "V2",
"light_schema_change" = "true",
"disable_auto_compaction" = "false"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think should disable auto compaction, then p0 could produce stable result. If disable auto compaction == false, the rowset maybe merged.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok,let me fix this test

RETURN_IF_ERROR(_column_iterators[_schema.unique_id(cid)]->get_row_ranges_by_zone_map(
_opts.col_id_to_predicates[cid].get(),
_opts.col_id_to_del_predicates.count(cid) > 0
_opts.col_id_to_del_predicates.size() == 1 &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment here to explain these conditions. For example, why check size == 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, i add a comment

@yiguolei yiguolei added the usercase Important user case type label label Feb 27, 2023
@nextdreamblue
Copy link
Contributor Author

run buildall

@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

// 统一弱化成了delete predicates都是独立的,有一个delete predicates满足条件,就把page都去掉。
// 这个pr的修改方式,就是在当前代码的基础上,当只有一个delete predicate的时候才能保证后续淘汰page的正确性,所以这里一律加了 == 1的判断才传递delete predicates。
// 如果要把不同版本的delete predicates和不同列的delete predicates作为完整和严谨的逻辑去判断page,需要修改的设计就有点多了,
// 目前的方案算是一种优先解决bug的思路,后续可以进一步把delete predicates这块加速zone判断进行page淘汰的逻辑完善,提高delete predicates使用的场景。
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please compact the comment and in English.
For example:

Currently, one transaction can only support one delete condition for page filtering.
Following case is allowed:
delete from table1 where a=1;
delete from table1 where a=2;
Following case is not allowed, because one transaction has more than on delete condition:
delete from table1 where a=1 and b = 2;
This may cause bug when you query table1 with a=1;
You can refer #17145 for more details.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please compact the comment and in English. For example:

Currently, one transaction can only support one delete condition for page filtering.
Following case is allowed:
delete from table1 where a=1;
delete from table1 where a=2;
Following case is not allowed, because one transaction has more than on delete condition:
delete from table1 where a=1 and b = 2;
This may cause bug when you query table1 with a=1;
You can refer #17145 for more details.

thanks bro, you are my hero

@nextdreamblue
Copy link
Contributor Author

run buildall

@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

RETURN_IF_ERROR(_column_iterators[_schema.unique_id(cid)]->get_row_ranges_by_zone_map(
_opts.col_id_to_predicates[cid].get(),
_opts.col_id_to_del_predicates.count(cid) > 0
// Currently, one transaction can only support one delete condition for page filtering.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that when delete condition only contains OR predicates, page filter can work correctly here.
So this pr is a quick fix, it may cause some case performance degradation.
I think it's better to add a todo here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, i add a todo


sql """delete from ${tableName} where k2 is not null and k3=11;"""

qt_sql """select k2,k3 from ${tableName};"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'd better add order by in query sql.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is better, lei fix it

wangbo
wangbo previously approved these changes Feb 27, 2023
Copy link
Contributor

@wangbo wangbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@nextdreamblue
Copy link
Contributor Author

run buildall

@github-actions
Copy link
Contributor

PR approved by anyone and no changes requested.

@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@nextdreamblue
Copy link
Contributor Author

run COMPILE

@nextdreamblue
Copy link
Contributor Author

run buildall

Signed-off-by: nextdreamblue <zxw520blue1@163.com>
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

Signed-off-by: nextdreamblue <zxw520blue1@163.com>
Signed-off-by: nextdreamblue <zxw520blue1@163.com>
@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

1 similar comment
@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@nextdreamblue
Copy link
Contributor Author

run buildall

Copy link
Contributor

@yiguolei yiguolei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Feb 28, 2023
@github-actions
Copy link
Contributor

PR approved by at least one committer and no changes requested.

@yiguolei yiguolei merged commit e0cd859 into apache:master Feb 28, 2023
morningman pushed a commit that referenced this pull request Feb 28, 2023
理论上,如果是两次独立的删除,比如delete from table where a=1; delete from table where a=2;其实这个地方应该可以使用的,但是目前的代码,是把所有不同版本的delete predicates和不同列的delete predicates都放到一起了,失去了版本信息、失去了谓词间可能是and的关系,统一弱化成了delete predicates都是独立的,有一个delete predicates满足条件,就把page都去掉。
这个pr的修改方式,就是在当前代码的基础上,当只有一个delete predicate的时候才能保证后续淘汰page的正确性,所以这里一律加了 == 1的判断才传递delete predicates。
如果要把不同版本的delete predicates和不同列的delete predicates作为完整和严谨的逻辑去判断page,需要修改的设计就有点多了,目前的方案算是一种优先解决bug的思路,后续可以进一步把delete predicates这块加速zone判断进行page淘汰的逻辑完善,提高delete predicates使用的场景。
morningman pushed a commit that referenced this pull request Feb 28, 2023
yagagagaga pushed a commit to yagagagaga/doris that referenced this pull request Mar 9, 2023
…17146)

理论上,如果是两次独立的删除,比如delete from table where a=1; delete from table where a=2;其实这个地方应该可以使用的,但是目前的代码,是把所有不同版本的delete predicates和不同列的delete predicates都放到一起了,失去了版本信息、失去了谓词间可能是and的关系,统一弱化成了delete predicates都是独立的,有一个delete predicates满足条件,就把page都去掉。
这个pr的修改方式,就是在当前代码的基础上,当只有一个delete predicate的时候才能保证后续淘汰page的正确性,所以这里一律加了 == 1的判断才传递delete predicates。
如果要把不同版本的delete predicates和不同列的delete predicates作为完整和严谨的逻辑去判断page,需要修改的设计就有点多了,目前的方案算是一种优先解决bug的思路,后续可以进一步把delete predicates这块加速zone判断进行page淘汰的逻辑完善,提高delete predicates使用的场景。
luwei16 pushed a commit to luwei16/Doris that referenced this pull request Apr 7, 2023
* [fix](delete) fix delete from bug which can get wrong result (apache#17146)

理论上,如果是两次独立的删除,比如delete from table where a=1; delete from table where a=2;其实这个地方应该可以使用的,但是目前的代码,是把所有不同版本的delete predicates和不同列的delete predicates都放到一起了,失去了版本信息、失去了谓词间可能是and的关系,统一弱化成了delete predicates都是独立的,有一个delete predicates满足条件,就把page都去掉。
这个pr的修改方式,就是在当前代码的基础上,当只有一个delete predicate的时候才能保证后续淘汰page的正确性,所以这里一律加了 == 1的判断才传递delete predicates。
如果要把不同版本的delete predicates和不同列的delete predicates作为完整和严谨的逻辑去判断page,需要修改的设计就有点多了,目前的方案算是一种优先解决bug的思路,后续可以进一步把delete predicates这块加速zone判断进行page淘汰的逻辑完善,提高delete predicates使用的场景。

* [fix](delete) fix 'is null' or 'is not null' delete predicate will get wrong result (apache#17190)

fix 'is null' or 'is not null' delete predicate will get wrong result

Signed-off-by: nextdreamblue <zxw520blue1@163.com>

---------

Signed-off-by: nextdreamblue <zxw520blue1@163.com>
Co-authored-by: xueweizhang <zxw520blue1@163.com>
swjtu-zhanglei pushed a commit to swjtu-zhanglei/incubator-doris that referenced this pull request Jul 25, 2023
* [fix](delete) fix delete from bug which can get wrong result (apache#17146)

理论上,如果是两次独立的删除,比如delete from table where a=1; delete from table where a=2;其实这个地方应该可以使用的,但是目前的代码,是把所有不同版本的delete predicates和不同列的delete predicates都放到一起了,失去了版本信息、失去了谓词间可能是and的关系,统一弱化成了delete predicates都是独立的,有一个delete predicates满足条件,就把page都去掉。
这个pr的修改方式,就是在当前代码的基础上,当只有一个delete predicate的时候才能保证后续淘汰page的正确性,所以这里一律加了 == 1的判断才传递delete predicates。
如果要把不同版本的delete predicates和不同列的delete predicates作为完整和严谨的逻辑去判断page,需要修改的设计就有点多了,目前的方案算是一种优先解决bug的思路,后续可以进一步把delete predicates这块加速zone判断进行page淘汰的逻辑完善,提高delete predicates使用的场景。

* [fix](delete) fix 'is null' or 'is not null' delete predicate will get wrong result (apache#17190)

fix 'is null' or 'is not null' delete predicate will get wrong result

Signed-off-by: nextdreamblue <zxw520blue1@163.com>

---------

Signed-off-by: nextdreamblue <zxw520blue1@163.com>
Co-authored-by: xueweizhang <zxw520blue1@163.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by one committer. dev/1.1.6 dev/1.2.3-merged kind/test reviewed usercase Important user case type label

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] will get wrong result when have delete predicates

4 participants