Skip to content

Conversation

@zhangbutao
Copy link
Contributor

After #5601 and #6220, there is inconsistent TimeTransform Type in some codes. This causes an exception when remove and then add a same time partition(year, month, day, hour).

I added a UT in this pr, and you can use the UT to reproduce this issue. This issue existed in iceberg1.1.0 and it blocked me upgrade iceberg from 1.0.0 to 1.1.0 in Hive repo. apache/hive#3851 and failed tests http://ci.hive.apache.org/blue/organizations/jenkins/hive-precommit/detail/PR-3851/1/tests

@zhangbutao
Copy link
Contributor Author

I am new to iceberg. Can anyone give approval to run the CI?

Copy link
Contributor

@hililiwei hililiwei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution. I left a few comments.

table.spec());

// remove and add a field with TimeTransform(Years, Months, Days, Hours)
table.updateSpec().removeField("year_field_year").addField(year("year_field")).commit();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to add a case that remove year_field_year first and then add year_field_year after the deletion is successful?

table.updateSpec().removeField("year_field_year").commit();
assert……
table.updateSpec().addField(year("year_field")).commit();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need this case. Other tests have include the case, eg. testAddAfterLastFieldRemoved

.year("year_field")
.build(),
table.spec());
V2Assert.assertEquals(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, these two tests are essentially the same for me, right? Please correct me if I missed something.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. test1: adding a year field to the spec(table.updateSpec().addField(year("year_field")).commit();)
    I want to verify if the actual table transform type is same as PartitionSpec.builderFor and the transform type class should be org.apache.iceberg.transforms.Years. If not same, the test2 will also faill.

  2. test2: table.updateSpec().removeField("year_field_year").addField(year("year_field")).commit();
    Because i have encountered exception when removed and then add a same field, i want to check if this could work after the fix. To be exact, test2 is a e2e test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, You mean V1Assert.assertEquals and V2Assert.assertEquals? I just follow other tests pattern and i think there maybe have some difference bettewn v2 and v1 formation, but i have not make more research.

Copy link

@PraveenNanda124 PraveenNanda124 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good

@nastra
Copy link
Contributor

nastra commented Jan 9, 2023

Note that this PR is essentially reverting #6220, which I don't think we can do atm. /cc @Fokko

@zhangbutao
Copy link
Contributor Author

Sorry for long time to come back. Will address comments as soon as :) @hililiwei

@zhangbutao zhangbutao force-pushed the fix_timetranform_inconsistency branch from 1ac0018 to 891c820 Compare January 28, 2023 09:13
@zhangbutao
Copy link
Contributor Author

Note that this PR is essentially reverting #6220, which I don't think we can do atm. /cc @Fokko

@rdblue @Fokko @nastra Could you please take a look? I am not sure if this fix will revert #6220. In addition, another PR #6653 is aim for same issue.

sourceColumn.fieldId(),
nextFieldId(),
targetName,
Transforms.year(sourceColumn.type()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hesitant to remove those because then there is a situation where we don't know the sourceColumn.

Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being late to the party here, I was traveling the last few days. I would be in favor of #6653 if that also solves your problem.

@zhangbutao
Copy link
Contributor Author

Sorry for being late to the party here, I was traveling the last few days. I would be in favor of #6653 if that also solves your problem.

Never mind :). Yes, #6653 can also fix this issue. Feel free to close this PR if not needed. Thank you. @Fokko

@Fokko
Copy link
Contributor

Fokko commented Feb 2, 2023

Thanks for letting me know and creating the PR in the first place, much appreciated 👍🏻

@Fokko Fokko closed this Feb 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants