@varun-lakhyani
Contributor

varun-lakhyani commented Dec 29, 2025

Description

This resolves the TODO noted in spark/v3.5/spark-runtime/src/integration/java/org/apache/iceberg/spark/TestRoundTrip.java:41.

Fixes the broken MERGE INTO example in the Spark Getting Started documentation, which referenced a non-existent count column.

What was wrong

The original example attempted to update a count column:

MERGE INTO local.db.target t USING (SELECT * FROM updates) u ON t.id = u.id
WHEN MATCHED THEN UPDATE SET t.count = t.count + u.count
WHEN NOT MATCHED THEN INSERT *;

However, the table schema defined earlier in the guide is (id bigint, data string), so the example is invalid: the table has no count column.
The guide also defines the table as local.db.table, not local.db.target, so the MERGE INTO target was updated to match.

What changed

Updated the example to use the correct data column:

MERGE INTO local.db.table t USING (SELECT * FROM updates) u ON t.id = u.id
WHEN MATCHED THEN UPDATE SET t.data = u.data
WHEN NOT MATCHED THEN INSERT *;
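
As a rough illustration of what the corrected statement does to rows with the guide's (id, data) schema, here is a minimal plain-Python model. It is not Spark or Iceberg code; the function name `merge_into` and the sample rows are hypothetical, and the sketch assumes each id appears at most once in the updates.

```python
# Plain-Python model of the corrected MERGE; this is NOT Spark or Iceberg code.
# Rows follow the guide's schema (id bigint, data string).
# `merge_into` and the sample rows below are hypothetical, for illustration only.

def merge_into(target, updates):
    """Return the rows that result from applying `updates` to `target`.

    WHEN MATCHED (same id):  SET t.data = u.data
    WHEN NOT MATCHED:        INSERT * (add the update row unchanged)
    Assumes each id appears at most once in `updates`.
    """
    # Copy each row so the caller's input lists are left untouched.
    by_id = {row["id"]: dict(row) for row in target}
    for u in updates:
        if u["id"] in by_id:
            by_id[u["id"]]["data"] = u["data"]  # matched: overwrite data
        else:
            by_id[u["id"]] = dict(u)  # not matched: insert the row
    return sorted(by_id.values(), key=lambda r: r["id"])


target = [{"id": 1, "data": "a"}, {"id": 2, "data": "b"}]
updates = [{"id": 2, "data": "B"}, {"id": 3, "data": "c"}]
merged = merge_into(target, updates)
print(merged)
```

In this model, id 2 is matched and its data is replaced, id 3 is inserted, and id 1 is left as it was. The original `t.count = t.count + u.count` assignment has no equivalent here because neither table has a count column.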

```diff
 MERGE INTO local.db.target t USING (SELECT * FROM updates) u ON t.id = u.id
-WHEN MATCHED THEN UPDATE SET t.count = t.count + u.count
+WHEN MATCHED THEN UPDATE SET t.data = u.data
```

Contributor

To make this truly runnable (and align with TestRoundTrip’s TODO about the getting-started doc not being runnable), should we change local.db.target → local.db.table and define updates before the MERGE?

Contributor Author

Thanks. Updated the Getting Started doc to use local.db.table instead of local.db.target and added explicit CREATE TABLE and INSERT INTO statements for the source and updates tables to make the example runnable.


// Run through our Doc's Getting Started Example
// TODO Update doc example so that it can actually be run, modifications were required for this
// test suite to run
Contributor

Shall we also update the 3.4, 4.0, and 4.1 versions?

Contributor Author

Updated. Removed the TODO comments from all Spark versions (3.4, 3.5, 4.0, 4.1).

huaxingao merged commit 9c3bed6 into apache:main on Dec 30, 2025
33 checks passed
@huaxingao
Contributor

Thanks @varun-lakhyani for the PR!

@varun-lakhyani
Contributor Author

varun-lakhyani commented Dec 30, 2025

> Thanks @varun-lakhyani for the PR!

Thanks for the review and for merging this.
Much appreciated.
