Skip to content

[GLUTEN-9205][CH] Support deletion vector native write#9248

Merged
loneylee merged 9 commits intoapache:mainfrom
loneylee:9205
Apr 9, 2025
Merged

[GLUTEN-9205][CH] Support deletion vector native write#9248
loneylee merged 9 commits intoapache:mainfrom
loneylee:9205

Conversation

@loneylee
Copy link
Copy Markdown
Member

@loneylee loneylee commented Apr 7, 2025

What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

(Fixes: #9205)

How was this patch tested?

test by ut

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 7, 2025

#9205

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 7, 2025

Run Gluten Clickhouse CI on x86

@loneylee
Copy link
Copy Markdown
Member Author

loneylee commented Apr 7, 2025

@CodiumAI-Agent /improve

@QodoAI-Agent
Copy link
Copy Markdown

PR Code Suggestions ✨

No code suggestions found for the PR.

@loneylee loneylee requested a review from Copilot April 7, 2025 13:02
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 11 out of 15 changed files in this pull request and generated no comments.

Files not reviewed (4)
  • backends-clickhouse/src-delta-32/main/scala/org/apache/gluten/sql/shims/delta32/Delta32Shims.scala: Language not supported
  • backends-clickhouse/src-delta-32/main/scala/org/apache/spark/sql/execution/DeletionVectorWriteTransformer.scala: Language not supported
  • backends-clickhouse/src-delta-32/test/scala/org/apache/spark/gluten/delta/GlutenDeltaParquetDeletionVectorSuite.scala: Language not supported
  • backends-clickhouse/src-delta/main/scala/org/apache/gluten/component/CHDeltaComponent.scala: Language not supported
Comments suppressed due to low confidence (1)

cpp-ch/local-engine/Storages/SubstraitSource/Delta/DeltaWriter.cpp:95

  • Ensure that 'get64(0)' is intended to always fetch the first row's value; if each row should have its own cardinality, consider using the current row index (e.g., get64(row_idx)).
auto cardinality = cardinality_src_columns.column->get64(0); // alisa deletedRowIndexCount

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 8, 2025

Run Gluten Clickhouse CI on x86

@QodoAI-Agent
Copy link
Copy Markdown

PR Code Suggestions ✨

No code suggestions found for the PR.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 8, 2025

Run Gluten Clickhouse CI on x86

/**
* Contains utility classes and method for performing DML operations with Deletion Vectors.
*/
object DMLWithDeletionVectorsHelper extends DeltaCommand {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This scala class is copied from the delta source, please ref to the object VacuumCommand, add some comments for the contents of the changes which is different from the source, and add the comments // --- modified start and // --- modified end to specify the changes.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix

public static native long deletionVectorWriteFinalize(long writer_address);

public static String encodeUUID(String uuid, String randomPrefix) {
return DeletionVectorWriteTransformer.encodeUUID(uuid, randomPrefix);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

call the java api from the native to generate the uuid ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, comment has been added

@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 8, 2025

Run Gluten Clickhouse CI on x86

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 11 out of 16 changed files in this pull request and generated 1 comment.

Files not reviewed (5)
  • backends-clickhouse/src-delta-32/main/scala/org/apache/gluten/sql/shims/delta32/Delta32Shims.scala: Language not supported
  • backends-clickhouse/src-delta-32/main/scala/org/apache/spark/sql/execution/DeletionVectorWriteTransformer.scala: Language not supported
  • backends-clickhouse/src-delta-32/test/scala/org/apache/spark/gluten/delta/GlutenDeltaParquetDeletionVectorSuite.scala: Language not supported
  • backends-clickhouse/src-delta/main/scala/org/apache/gluten/component/CHDeltaComponent.scala: Language not supported
  • backends-clickhouse/src-delta/main/scala/org/apache/gluten/sql/shims/DeltaShims.scala: Language not supported
Comments suppressed due to low confidence (1)

cpp-ch/local-engine/Storages/SubstraitSource/Delta/DeltaWriter.cpp:247

  • Verify that passing a std::byte value to writeIntBinary is intended; an explicit cast to an integer type might be required to ensure correct binary output.
DB::writeIntBinary(DV_FILE_FORMAT_VERSION_ID_V1, *write_buffer);

Comment on lines +95 to +96
auto cardinality = cardinality_src_columns.column->get64(row_idx); // alisa deletedRowIndexCount

Copy link

Copilot AI Apr 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Consider correcting the comment from 'alisa' to 'alias' for clarity.

Suggested change
auto cardinality = cardinality_src_columns.column->get64(row_idx); // alisa deletedRowIndexCount
auto cardinality = cardinality_src_columns.column->get64(row_idx); // alias deletedRowIndexCount

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor

@zzcclp zzcclp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@loneylee loneylee merged commit a4b230b into apache:main Apr 9, 2025
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[CH] Support deletion vector native write

4 participants