[GLUTEN-9335][VL] Support iceberg partition write#10497

Merged
jinchengchenghh merged 10 commits into apache:main from jinchengchenghh:iceberg-partition
Sep 3, 2025

Contributor

@jinchengchenghh jinchengchenghh commented Aug 20, 2025

Add the Protobuf message IcebergPartitionField to carry the Iceberg ID information, and add IcebergPartitionSpec to carry the partition information.
Build with tests and benchmarks enabled in CI, and fix the IcebergWriteTest build.
Set the file format to ORC for the partitioned TPC-H Iceberg suite to bypass native Parquet write. Once facebookincubator/velox#14670, which supports fanout-false mode, is merged, we can relax this restriction.

Relevant PR: facebookincubator/velox#13874
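
For readers unfamiliar with the shape of this data, here is a minimal sketch of what the two messages might carry, written as plain C++ structs. The names PartitionFieldSketch and PartitionSpecSketch and all of their fields are hypothetical, loosely modeled on the partition-field concept in the Iceberg table spec (source column id, name, transform). They are not the actual Protobuf definitions added in this PR.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical mirror of a partition field message. Field names are
// assumptions based on the Iceberg table spec, not this PR's .proto file.
struct PartitionFieldSketch {
  int32_t sourceId;       // id of the source column in the table schema
  std::string name;       // partition field name
  std::string transform;  // e.g. "identity", "day"
};

// Hypothetical mirror of a partition spec message: an id plus its fields.
struct PartitionSpecSketch {
  int32_t specId;
  std::vector<PartitionFieldSketch> fields;
};

// A spec with no fields describes an unpartitioned table.
bool isPartitioned(const PartitionSpecSketch& spec) {
  return !spec.fields.empty();
}

// Renders the spec as "name=transform(sourceId)" entries joined by ", ".
std::string describe(const PartitionSpecSketch& spec) {
  std::string out;
  for (const auto& field : spec.fields) {
    if (!out.empty()) {
      out += ", ";
    }
    out += field.name + "=" + field.transform + "(" +
        std::to_string(field.sourceId) + ")";
  }
  return out;
}
```

The point of carrying the source id alongside the name is that Iceberg resolves columns by id rather than by name, so the native writer needs both to map partition values back to schema columns.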

@github-actions

#9335

@github-actions

Run Gluten Clickhouse CI on x86

auto veloxCfg =
    std::make_shared<facebook::velox::config::ConfigBase>(std::unordered_map<std::string, std::string>(sparkConfs));
auto connectorSessionProperties_ = getHiveConfig(veloxCfg);
connectorSessionProperties_->set(

Contributor Author

This may need to be refactored together with WholeStageResultIterator::createConnectorConfig.

Member

@zhouyuan zhouyuan left a comment

@jinchengchenghh can you please do a rebase?

  hiveConfMap[facebook::velox::connector::hive::HiveConfig::kOrcUseColumnNames] = "true";

- return std::make_shared<facebook::velox::config::ConfigBase>(std::move(hiveConfMap));
+ return std::make_shared<facebook::velox::config::ConfigBase>(std::move(hiveConfMap), true);

Member

@zhouyuan zhouyuan Sep 1, 2025

This config may introduce an issue on the latest main, as the Azure client expects some read-only config.
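
To make the concern concrete, here is a small self-contained sketch of the read-only versus mutable distinction. As I understand it, the extra `true` argument marks the config as mutable so that later `set` calls are accepted. SketchConfig and rejectsWrites below are hypothetical stand-ins written for illustration only. They are not the Velox ConfigBase API.

```cpp
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a config object with a mutability flag.
// It only mirrors the idea under discussion, not the real Velox class.
class SketchConfig {
 public:
  explicit SketchConfig(
      std::unordered_map<std::string, std::string> values,
      bool isMutable = false)
      : values_(std::move(values)), mutable_(isMutable) {}

  // Adds or overwrites a key. Throws when the config is read-only.
  void set(const std::string& key, const std::string& value) {
    if (!mutable_) {
      throw std::logic_error("cannot set '" + key + "' on a read-only config");
    }
    values_[key] = value;
  }

  // Returns the stored value, or the fallback when the key is absent.
  std::string get(const std::string& key, const std::string& fallback) const {
    auto it = values_.find(key);
    return it == values_.end() ? fallback : it->second;
  }

 private:
  std::unordered_map<std::string, std::string> values_;
  bool mutable_;
};

// Returns true when a set() call is rejected. Note that on a mutable
// config the probe key is actually written.
bool rejectsWrites(SketchConfig& config) {
  try {
    config.set("probe.key", "1");
  } catch (const std::logic_error&) {
    return true;
  }
  return false;
}
```

Code that receives a config and assumes it is read-only would see different behavior once the same object is constructed as mutable, which is the kind of mismatch worth checking against the Azure client.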

@github-actions github-actions bot added the BUILD label Sep 2, 2025

@jinchengchenghh
Contributor Author

Could you help review again? Thanks! @zhouyuan

Member

@zhouyuan zhouyuan left a comment

👍

Could we also create a sub-issue for the remaining fanout fix?

@jinchengchenghh jinchengchenghh merged commit d4bad42 into apache:main Sep 3, 2025
57 checks passed
@jinchengchenghh
Contributor Author

Yes, added #10617
