@kbendick
Contributor

This backports #4364 to Flink 1.12.

See also #4417 for Flink 1.13.

Co-authored-by: liliwei <hililiwei@gmail.com>

@github-actions github-actions bot added the flink label Mar 28, 2022
@kbendick
Contributor Author

kbendick commented Mar 28, 2022

It seems that the new tests aren't passing.

I'll go through the changes tomorrow to see whether I missed anything when backporting, and to check whether an earlier change was never backported.

This also opens the wider discussion of how much longer we'll actively support Flink 1.12 once Flink 1.15 is released. There have been several major API changes to areas that directly impact our work. Once supporting Flink 1.12 becomes a high burden, we'll have to really consider how important it is for us and our users.

Co-authored-by: liliwei <hililiwei@gmail.com>
@kbendick kbendick force-pushed the backport-flink-eq-delete-to-112 branch from 6c9bfac to c569366 Compare March 28, 2022 19:58
@rdblue
Contributor

rdblue commented Mar 31, 2022

@kbendick, should we replace this with a PR that disables UPSERT in 1.12?

@kbendick
Contributor Author

kbendick commented Apr 6, 2022

Opened a PR to simply throw UnsupportedOperationException in FlinkSink if it's determined that the program is running in upsert mode: #4519
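
For readers following along, a minimal sketch of what such a fail-fast guard could look like. This is not the code from #4519: the class name `UpsertGuard`, the method `checkUpsertSupported`, and its parameters are hypothetical and only illustrate the idea of rejecting upsert mode before any data is written.

```java
// Hypothetical sketch: names are illustrative, not the actual FlinkSink code from #4519.
public class UpsertGuard {

  // Rejects upsert mode up front so the job fails fast instead of writing incorrect results.
  static void checkUpsertSupported(boolean upsertEnabled, String flinkVersion) {
    if (upsertEnabled) {
      throw new UnsupportedOperationException(
          "Upsert mode is not supported in Flink " + flinkVersion
              + "; please upgrade to Flink 1.13 or later");
    }
  }

  public static void main(String[] args) {
    // Append mode passes through without an exception.
    checkUpsertSupported(false, "1.12");

    // Upsert mode is rejected with a message that points users to the upgrade path.
    try {
      checkUpsertSupported(true, "1.12");
    } catch (UnsupportedOperationException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Throwing at sink construction time, rather than silently falling back to append behavior, would make the limitation visible to users as soon as the job is submitted.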

@kbendick
Contributor Author

kbendick commented Apr 8, 2022

Closing in favor of #4519

@kbendick kbendick closed this Apr 8, 2022
@kbendick kbendick deleted the backport-flink-eq-delete-to-112 branch April 8, 2022 17:39
@kbendick kbendick restored the backport-flink-eq-delete-to-112 branch April 15, 2022 16:42
@kbendick kbendick reopened this Apr 15, 2022
@kbendick
Contributor Author

I updated this PR to use DATE instead of TO_DATE, as we discovered in #4532 that TO_DATE was causing tests to process rows out of order during INSERT VALUES.

However, the tests all still fail in 1.12.

So I'm still not able to get 1.12 to correctly process upserts. I think we should disable upsert mode in Flink 1.12 in the next patch, with a recommendation to upgrade to 1.13, as proposed here: #4519

> Task :iceberg-flink:iceberg-flink-1.12:test

org.apache.iceberg.flink.TestFlinkUpsert > testPrimaryKeyEqualToPartitionKey[catalogName=testhadoop, baseNamespace=default, format=PARQUET, isStreaming=true] FAILED
    java.lang.AssertionError: expected:<[2,aaa, 3,bbb]> but was:<[1,aaa, 3,bbb]>
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.failNotEquals(Assert.java:835)
        at org.junit.Assert.assertEquals(Assert.java:120)
        at org.junit.Assert.assertEquals(Assert.java:146)
        at org.apache.iceberg.flink.TestHelpers.assertRows(TestHelpers.java:140)
        at org.apache.iceberg.flink.TestFlinkUpsert.testPrimaryKeyEqualToPartitionKey(TestFlinkUpsert.java:178)

org.apache.iceberg.flink.TestFlinkUpsert > testPrimaryKeyFieldsAtBeginningOfSchema[catalogName=testhadoop, baseNamespace=default, format=PARQUET, isStreaming=true] FAILED
    java.lang.AssertionError: expected:<[aaa,2022-03-01,2, bbb,2022-03-01,3]> but was:<[aaa,2022-03-01,1, bbb,2022-03-01,3]>
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.failNotEquals(Assert.java:835)
        at org.junit.Assert.assertEquals(Assert.java:120)
        at org.junit.Assert.assertEquals(Assert.java:146)
        at org.apache.iceberg.flink.TestHelpers.assertRows(TestHelpers.java:140)
        at org.apache.iceberg.flink.TestFlinkUpsert.testPrimaryKeyFieldsAtBeginningOfSchema(TestFlinkUpsert.java:219)

org.apache.iceberg.flink.TestFlinkUpsert > testPrimaryKeyFieldsAtEndOfTableSchema[catalogName=testhadoop, baseNamespace=default, format=PARQUET, isStreaming=false] FAILED
    java.lang.AssertionError: expected:<[2,aaa,2022-03-01, 3,bbb,2022-03-01]> but was:<[1,aaa,2022-03-01, 3,bbb,2022-03-01]>
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.failNotEquals(Assert.java:835)
        at org.junit.Assert.assertEquals(Assert.java:120)
        at org.junit.Assert.assertEquals(Assert.java:146)
        at org.apache.iceberg.flink.TestHelpers.assertRows(TestHelpers.java:140)
        at org.apache.iceberg.flink.TestFlinkUpsert.testPrimaryKeyFieldsAtEndOfTableSchema(TestFlinkUpsert.java:262)

org.apache.iceberg.flink.TestFlinkUpsert > testPrimaryKeyEqualToPartitionKey[catalogName=testhadoop, baseNamespace=default, format=PARQUET, isStreaming=false] FAILED
    java.lang.AssertionError: expected:<[2,aaa, 3,bbb]> but was:<[1,aaa, 3,bbb]>
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.failNotEquals(Assert.java:835)
        at org.junit.Assert.assertEquals(Assert.java:120)
        at org.junit.Assert.assertEquals(Assert.java:146)
        at org.apache.iceberg.flink.TestHelpers.assertRows(TestHelpers.java:140)
        at org.apache.iceberg.flink.TestFlinkUpsert.testPrimaryKeyEqualToPartitionKey(TestFlinkUpsert.java:178)

org.apache.iceberg.flink.TestFlinkUpsert > testPrimaryKeyFieldsAtEndOfTableSchema[catalogName=testhadoop, baseNamespace=default, format=AVRO, isStreaming=true] FAILED
    java.lang.AssertionError: expected:<[2,aaa,2022-03-01, 3,bbb,2022-03-01]> but was:<[1,aaa,2022-03-01, 3,bbb,2022-03-01]>
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.failNotEquals(Assert.java:835)
        at org.junit.Assert.assertEquals(Assert.java:120)
        at org.junit.Assert.assertEquals(Assert.java:146)
        at org.apache.iceberg.flink.TestHelpers.assertRows(TestHelpers.java:140)
        at org.apache.iceberg.flink.TestFlinkUpsert.testPrimaryKeyFieldsAtEndOfTableSchema(TestFlinkUpsert.java:262)

org.apache.iceberg.flink.TestFlinkUpsert > testPrimaryKeyEqualToPartitionKey[catalogName=testhadoop, baseNamespace=default, format=AVRO, isStreaming=true] FAILED
    java.lang.AssertionError: expected:<[2,aaa, 3,bbb]> but was:<[1,aaa, 3,bbb]>
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.failNotEquals(Assert.java:835)
        at org.junit.Assert.assertEquals(Assert.java:120)
        at org.junit.Assert.assertEquals(Assert.java:146)
        at org.apache.iceberg.flink.TestHelpers.assertRows(TestHelpers.java:140)
        at org.apache.iceberg.flink.TestFlinkUpsert.testPrimaryKeyEqualToPartitionKey(TestFlinkUpsert.java:178)

org.apache.iceberg.flink.TestFlinkUpsert > testPrimaryKeyFieldsAtEndOfTableSchema[catalogName=testhadoop, baseNamespace=default, format=AVRO, isStreaming=false] FAILED
    java.lang.AssertionError: expected:<[2,aaa,2022-03-01, 3,bbb,2022-03-01]> but was:<[1,aaa,2022-03-01, 3,bbb,2022-03-01]>
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.failNotEquals(Assert.java:835)
        at org.junit.Assert.assertEquals(Assert.java:120)
        at org.junit.Assert.assertEquals(Assert.java:146)
        at org.apache.iceberg.flink.TestHelpers.assertRows(TestHelpers.java:140)
        at org.apache.iceberg.flink.TestFlinkUpsert.testPrimaryKeyFieldsAtEndOfTableSchema(TestFlinkUpsert.java:262)

org.apache.iceberg.flink.TestFlinkUpsert > testPrimaryKeyEqualToPartitionKey[catalogName=testhadoop, baseNamespace=default, format=AVRO, isStreaming=false] FAILED
    java.lang.AssertionError: expected:<[2,aaa, 3,bbb]> but was:<[1,aaa, 3,bbb]>
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.failNotEquals(Assert.java:835)
        at org.junit.Assert.assertEquals(Assert.java:120)
        at org.junit.Assert.assertEquals(Assert.java:146)
        at org.apache.iceberg.flink.TestHelpers.assertRows(TestHelpers.java:140)
        at org.apache.iceberg.flink.TestFlinkUpsert.testPrimaryKeyEqualToPartitionKey(TestFlinkUpsert.java:178)

org.apache.iceberg.flink.TestFlinkUpsert > testPrimaryKeyEqualToPartitionKey[catalogName=testhadoop, baseNamespace=default, format=ORC, isStreaming=false] FAILED
    java.lang.AssertionError: expected:<[2,aaa, 3,bbb]> but was:<[1,aaa, 3,bbb]>
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.failNotEquals(Assert.java:835)
        at org.junit.Assert.assertEquals(Assert.java:120)
        at org.junit.Assert.assertEquals(Assert.java:146)
        at org.apache.iceberg.flink.TestHelpers.assertRows(TestHelpers.java:140)
        at org.apache.iceberg.flink.TestFlinkUpsert.testPrimaryKeyEqualToPartitionKey(TestFlinkUpsert.java:178)
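
One way to read these assertions, as a sketch only: under upsert semantics, a later row with the same key should replace the earlier one, so the expected result holds `2,aaa` while the actual result still holds the older `1,aaa`. The input rows and the choice of the string column as the key below are assumptions made for illustration; they are not taken from `TestFlinkUpsert`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of keyed upsert semantics; not the Iceberg or Flink implementation.
public class UpsertSemantics {

  // Applies rows in arrival order. Each row is {id, data}, and data is assumed to be the key.
  // A later row with the same key replaces the value written by the earlier one.
  static List<String> upsert(List<String[]> rows) {
    Map<String, Integer> latestIdByKey = new LinkedHashMap<>();
    for (String[] row : rows) {
      latestIdByKey.put(row[1], Integer.parseInt(row[0]));
    }
    List<String> result = new ArrayList<>();
    latestIdByKey.forEach((data, id) -> result.add(id + "," + data));
    return result;
  }

  public static void main(String[] args) {
    // Assumed input: two writes for key "aaa", then one write for key "bbb".
    List<String[]> rows = Arrays.asList(
        new String[] {"1", "aaa"},
        new String[] {"2", "aaa"},
        new String[] {"3", "bbb"});

    System.out.println(upsert(rows)); // prints [2,aaa, 3,bbb]
  }
}
```

If the rows arrive out of order, or the delete for the earlier row is not applied, a result like `[1,aaa, 3,bbb]` is what you would see instead.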

@kbendick
Contributor Author

Also, GitHub reports all checks as passing because Flink 1.12 has been removed from master. Checking out this branch and running the tests locally shows that the issue is still present.

@openinx
Member

openinx commented Apr 18, 2022

If this failure can be reproduced locally, then we can use the IntelliJ IDEA debugger to analyze the written columnar files and see what's wrong there.

@rdblue
Contributor

rdblue commented May 2, 2022

@openinx have you had a chance to look into the failure? It would be great to fix this instead of disabling in 1.12.

@github-actions

github-actions bot commented Aug 8, 2024

This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the dev@iceberg.apache.org list. Thank you for your contributions.

@github-actions github-actions bot added the stale label Aug 8, 2024
@github-actions

This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.

@github-actions github-actions bot closed this Aug 15, 2024