@flyrain
Contributor

flyrain commented May 5, 2022

Multiple people asked me to share the tool to rewrite the metadata files when moving tables from one location to another. Here you are. Three Spark actions are added.

  1. The copy-table action runs on the source table and rewrites its metadata files.
  2. The checkSnapshotIntegrity and removeExpiredFiles actions run on the target table to apply changes, check integrity, and clean up.

@rdblue
Contributor

rdblue commented May 10, 2022

@flyrain, can you open these in separate PRs? They seem like different features to me.

@flyrain
Contributor Author

flyrain commented May 10, 2022

@rdblue, I filed this mainly to share it. I will close it for now, and we can always file separate PRs if we want to merge this into master.

flyrain closed this May 10, 2022

@laithalzyoud
Contributor

Hello @flyrain @rdblue, are there any plans to file separate PRs and merge them into master? We are already using these changes in a forked version of Iceberg and made some changes to get them working with 1.4.x and Spark 3.3. We want to get these changes merged so that we do not have to update the fork on every Iceberg release. I can also help push this through if needed!

@flyrain
Contributor Author

flyrain commented Jan 3, 2024

Hi @laithalzyoud, glad you found this useful. Would you like to take the lead on this task? I could be the co-author if that makes sense to you. I can help with the review, but since I would be a co-author, we will still need at least one more committer besides me to review and approve. cc @rdblue @RussellSpitzer @aokolnychyi @szehon-ho

@szehon-ho
Member

+1 on adding it. I also think it's useful, but I was not involved at the time, so I don't know why it was not added originally.

@laithalzyoud
Contributor

@flyrain Yes, I would be glad to take the lead, with you as co-author. I will dedicate some time next week and start working on this 👍

@amogh-jahagirdar
Contributor

@flyrain @laithalzyoud I'm also happy to help with reviews on these PRs!
