@BePPPower BePPPower commented Dec 6, 2024

What problem does this PR solve

Related PR: picked from #44041

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

…o the ORC and Parquet file formats. (apache#44041)

Problem Summary:

Before this PR, Doris exported complex data types as follows:
Doris type | ORC type | Parquet type | CSV
-- | -- | -- | --
bitmap | string | Not Supported | Not Supported
quantile_state | Not Supported | Not Supported | Not Supported
hll | string | string | invisible string
jsonb | Not Supported | string | string
variant | Not Supported | string | string

In addition, exporting complex data types to the ORC file format had
some issues.

This PR does two things:
1. Fixes the problems with exporting complex data types from Doris.
2. Supports exporting the three complex types bitmap, quantile_state, and
hll to both the ORC and Parquet file formats.

The export behavior after this PR is as follows:

Doris type | ORC type | Parquet type | CSV
-- | -- | -- | --
bitmap | binary | binary | "NULL"
quantile_state | binary | binary | "NULL"
hll | binary | binary | "NULL"
jsonb | string | string | string
variant | string | string | string
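
For readers who want to compare the two tables programmatically, here is a minimal Python sketch. It only transcribes the tables above into dictionaries; the names `OLD_MAPPING`, `NEW_MAPPING`, `export_type`, and `changed_cells` are hypothetical and are not taken from the Doris code base, which implements this mapping elsewhere.

```python
# Hypothetical lookup tables transcribed from the two tables in this PR
# description. This is an illustration only, not Doris source code.
NOT_SUPPORTED = "Not Supported"

# Behavior before this PR.
OLD_MAPPING = {
    "bitmap": {"orc": "string", "parquet": NOT_SUPPORTED, "csv": NOT_SUPPORTED},
    "quantile_state": {"orc": NOT_SUPPORTED, "parquet": NOT_SUPPORTED, "csv": NOT_SUPPORTED},
    "hll": {"orc": "string", "parquet": "string", "csv": "invisible string"},
    "jsonb": {"orc": NOT_SUPPORTED, "parquet": "string", "csv": "string"},
    "variant": {"orc": NOT_SUPPORTED, "parquet": "string", "csv": "string"},
}

# Behavior after this PR. The CSV column holds the exported value as
# written in the table, including the quotes around NULL.
NEW_MAPPING = {
    "bitmap": {"orc": "binary", "parquet": "binary", "csv": '"NULL"'},
    "quantile_state": {"orc": "binary", "parquet": "binary", "csv": '"NULL"'},
    "hll": {"orc": "binary", "parquet": "binary", "csv": '"NULL"'},
    "jsonb": {"orc": "string", "parquet": "string", "csv": "string"},
    "variant": {"orc": "string", "parquet": "string", "csv": "string"},
}


def export_type(doris_type, fmt, mapping=NEW_MAPPING):
    """Look up one table cell; type and format names are case-insensitive."""
    return mapping[doris_type.lower()][fmt.lower()]


def changed_cells(old, new):
    """Return {(doris_type, fmt): (before, after)} for every cell that differs."""
    return {
        (doris_type, fmt): (old[doris_type][fmt], new[doris_type][fmt])
        for doris_type in old
        for fmt in old[doris_type]
        if old[doris_type][fmt] != new[doris_type][fmt]
    }


if __name__ == "__main__":
    for (doris_type, fmt), (before, after) in changed_cells(OLD_MAPPING, NEW_MAPPING).items():
        print(f"{doris_type:15} {fmt:8} {before} -> {after}")
```

Running the script lists each type and format combination whose behavior differs between the two tables, which may help when checking downstream readers of exported files.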

[fix](Outfile) Fix the data type mapping for complex types in Doris to the ORC and Parquet file formats.
@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@BePPPower
Contributor Author

run buildall

@BePPPower BePPPower changed the title from "[fix](Outfile) Fix the data type mapping for complex types in Doris to the ORC and Parquet file formats" to "[fix](branch-3.0) Fix the data type mapping for complex types in Doris to the ORC and Parquet file formats" on Dec 6, 2024
@morningman morningman merged commit fcd2ee7 into apache:branch-3.0 Dec 8, 2024
10 of 14 checks passed
3 participants