
Spark: Remove Apache DataFusion Comet integration #15674

Merged
huaxingao merged 3 commits into apache:main from andygrove:remove-comet-dependency
Mar 18, 2026


@andygrove
Member

Rationale

The current Comet integration is broken due to shading issues and it depends on code in Comet that is deprecated.

There was an effort to fix these issues, but Comet is now focussing on integrating with the Rust implementation of Iceberg, so we would like to remove the integration from the Java implementation.

Changes

Removes the datafusion-comet dependency and all related code across Spark versions 3.4, 3.5, 4.0, and 4.1. This includes the Comet-specific vectorized reader classes, ParquetReaderType enum, the spark.sql.iceberg.parquet.reader-type configuration property, and associated build/test infrastructure.
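Since the `spark.sql.iceberg.parquet.reader-type` property is removed, deployments may still carry it in their Spark configuration. The following is a minimal sketch, not part of this PR, of how one might scan a properties-style config file's text for leftover settings. The helper name `find_removed_property` and the assumption that keys are separated from values by whitespace or `=` are illustrative; how Spark or Iceberg treats a leftover setting after this change is not stated here.

```python
import re
from typing import List

# Property name taken from the PR description; the rest of this helper is hypothetical.
REMOVED_KEY = "spark.sql.iceberg.parquet.reader-type"


def find_removed_property(conf_text: str) -> List[int]:
    """Return 1-based line numbers of non-comment lines that set REMOVED_KEY."""
    hits = []
    for lineno, line in enumerate(conf_text.splitlines(), start=1):
        stripped = line.strip()
        # Skip blank lines and comment lines.
        if not stripped or stripped.startswith("#"):
            continue
        # Assumption: the key is separated from its value by whitespace or '='.
        key = re.split(r"[\s=]+", stripped, maxsplit=1)[0]
        if key == REMOVED_KEY:
            hits.append(lineno)
    return hits
```

Calling `find_removed_property(open(path).read())` on a config file would list the lines to clean up. Matching on the exact key, not a substring, avoids flagging unrelated properties that merely share a prefix.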

@andygrove andygrove force-pushed the remove-comet-dependency branch from c8b5546 to 0c24239 Compare March 18, 2026 15:16
@andygrove andygrove marked this pull request as ready for review March 18, 2026 17:26
Contributor

@huaxingao huaxingao left a comment


LGTM. Thanks @andygrove

Contributor

@anuragmantri anuragmantri left a comment


LGTM, thanks @andygrove

@huaxingao huaxingao changed the title from "chore: Remove Apache DataFusion Comet integration" to "Spark: Remove Apache DataFusion Comet integration" Mar 18, 2026
@huaxingao huaxingao merged commit 0d35440 into apache:main Mar 18, 2026
36 checks passed
@huaxingao
Contributor

Thanks @andygrove and @anuragmantri

manuzhang pushed a commit to manuzhang/iceberg that referenced this pull request Mar 30, 2026
* Spark: Remove Apache DataFusion Comet integration

Removes the datafusion-comet dependency and all related code across
Spark versions 3.4, 3.5, 4.0, and 4.1. This includes the Comet-specific
vectorized reader classes, ParquetReaderType enum, the spark.sql.iceberg.parquet.reader-type
configuration property, and associated build/test infrastructure.

* spotless

* update test