@hubgeter
Contributor

@hubgeter hubgeter commented Nov 3, 2025

Backport of #57208.
Problem Summary:
When decoding RLE_DICTIONARY-encoded data, the parquet reader uses memcpy for every value, regardless of type. For fixed-width types such as INT32 and INT64, direct assignment is faster than memcpy.

With Parquet dictionary encoding, the values to be copied are not stored contiguously, so each memcpy call copies only a few bytes. Looking at how memcpy is implemented, copies this small are handled by __builtin_memcpy, which essentially behaves like a series of simple assignments. The corresponding assembly code can be seen here: https://godbolt.org/z/r9Ma1ozvd.
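To make the difference concrete, here is a minimal sketch of the two copy strategies. It is not the code changed by this PR: `decode_generic` and `decode_typed` are hypothetical names, and passing the value width at run time in the generic path is an assumption made to mimic a reader that pushes every fixed-width type through one memcpy-based code path.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical generic path (illustration only): the value width is known
// only at run time, so each dictionary lookup issues one small memcpy.
// No bounds checking: every index is assumed to be valid for the dictionary.
std::vector<unsigned char> decode_generic(const unsigned char* dict,
                                          size_t type_length,
                                          const std::vector<uint32_t>& indices) {
    std::vector<unsigned char> out(indices.size() * type_length);
    for (size_t i = 0; i < indices.size(); ++i) {
        std::memcpy(out.data() + i * type_length,
                    dict + indices[i] * type_length,
                    type_length);
    }
    return out;
}

// Hypothetical typed path (illustration only): the width is fixed by T at
// compile time, so each lookup is a plain load and store.
// No bounds checking: every index is assumed to be valid for the dictionary.
template <typename T>
std::vector<T> decode_typed(const T* dict,
                            const std::vector<uint32_t>& indices) {
    std::vector<T> out(indices.size());
    for (size_t i = 0; i < indices.size(); ++i) {
        out[i] = dict[indices[i]];
    }
    return out;
}
```

Both functions produce the same bytes for the same dictionary and indices; only the way each value is copied differs. Any speed difference depends on the compiler, flags, and target, so it should be measured rather than assumed.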

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

… decode RLE_DICTIONARY encoding (apache#57208)

@hubgeter hubgeter requested a review from morrySnow as a code owner November 3, 2025 03:15
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@hubgeter
Contributor Author

hubgeter commented Nov 3, 2025

run buildall

@hubgeter hubgeter changed the title from "[enhancement](parquet)Optimize the performance of parquet reader when decode RLE_DICTIONARY encoding (#57208)" to "branch-3.1:[enhancement](parquet)Optimize the performance of parquet reader when decode RLE_DICTIONARY encoding (#57208)" Nov 3, 2025
@morningman morningman merged commit 873d39e into apache:branch-3.1 Nov 4, 2025
22 of 23 checks passed
@morrySnow morrySnow mentioned this pull request Nov 13, 2025
