@suxiaogang223
Contributor

What problem does this PR solve?

Issue Number: close #41460
Problem Summary:
When reading an Iceberg table, DeleteRows that have already been read should not be released immediately, because an Iceberg data file is split into multiple IcebergSplits for execution. These IcebergSplits belong to the same data file, so they share the same DeleteRows, and the DeleteRows held in a DeleteFile must not be released as soon as one split has used them. Instead, they should be released when the shared_kv is reset, at which point all DeleteRows are freed together with the cached DeleteFile.
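
The ownership rule described above can be sketched roughly as follows. This is a minimal, single-threaded illustration and not the actual Doris code: `SharedDeleteCache`, `rows_for_split`, the `DeleteFile` layout, and the file paths are all hypothetical stand-ins for the shared_kv cache and the reader logic, and a real cache shared between scanners would also need locking.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins; these are not the actual Doris types.
using DeleteRows = std::vector<int64_t>;  // deleted row positions of one data file

struct DeleteFile {
    // data file path -> positions deleted in that data file
    std::unordered_map<std::string, DeleteRows> rows_by_data_file;
};

// Assumed model of the shared cache: it owns every parsed DeleteFile
// until reset() is called.
class SharedDeleteCache {
public:
    using Loader = std::function<std::unique_ptr<DeleteFile>()>;

    // Parse the delete file on first access, then reuse the cached copy.
    DeleteFile* get(const std::string& delete_file_path, const Loader& load) {
        auto it = _files.find(delete_file_path);
        if (it == _files.end()) {
            std::unique_ptr<DeleteFile> file = load();
            assert(file != nullptr);
            it = _files.emplace(delete_file_path, std::move(file)).first;
        }
        return it->second.get();
    }

    // Frees every cached DeleteFile together with all of its DeleteRows.
    void reset() { _files.clear(); }

    std::size_t size() const { return _files.size(); }

private:
    std::unordered_map<std::string, std::unique_ptr<DeleteFile>> _files;
};

// Called once per split. Several splits of the same data file reach the same
// entry, so the entry is only borrowed here and never erased.
const DeleteRows* rows_for_split(SharedDeleteCache& cache,
                                 const std::string& delete_file_path,
                                 const std::string& data_file_path,
                                 const SharedDeleteCache::Loader& load) {
    DeleteFile* file = cache.get(delete_file_path, load);
    auto it = file->rows_by_data_file.find(data_file_path);
    if (it == file->rows_by_data_file.end()) {
        return nullptr;  // this delete file has no rows for the data file
    }
    // The premature release would be an erase(it) at this point: the next
    // split of the same data file would then find no DeleteRows.
    return &it->second;
}
```

In this sketch, two splits of the same data file receive the same pointer and the delete file is parsed only once; the rows stay valid until `reset()` drops the whole cache.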

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

…table with position delete (apache#47977)

@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@suxiaogang223
Contributor Author

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.50% (9609/26328)
Line Coverage: 28.09% (79679/283614)
Region Coverage: 26.75% (40866/152790)
Branch Coverage: 23.50% (20712/88144)
Coverage Report: http://coverage.selectdb-in.cc/coverage/b15a10f66c1214ee586f6287ef5b968170ae3e48_b15a10f66c1214ee586f6287ef5b968170ae3e48/report/index.html
Increment Report: http://coverage.selectdb-in.cc/coverage/b15a10f66c1214ee586f6287ef5b968170ae3e48_b15a10f66c1214ee586f6287ef5b968170ae3e48/increment_report/index.html

@suxiaogang223 suxiaogang223 changed the title from "[cherry-pick](branch-2.1) Don't prematurely erase DeleteRows in reading iceberg table with position delete #47977" to "[cherry-pick](branch-2.1) Don't prematurely erase DeleteRows in reading iceberg table with position delete (#47977)" Feb 26, 2025
@yiguolei yiguolei merged commit dae9d9d into apache:branch-2.1 Feb 27, 2025
18 of 20 checks passed
@yiguolei yiguolei added usercase Important user case type label p0_w labels Feb 27, 2025
@suxiaogang223 suxiaogang223 deleted the fix_iceberg_postition_bug branch July 10, 2025 09:03