feat(spark): Add `SessionStateBuilderSpark` to datafusion-spark by cht42 · Pull Request #19865 · apache/datafusion

cht42 · 2026-01-17T09:23:01Z

Which issue does this PR close?

Closes [datafusion-spark] Add method to register udf and expr planner in one go #19843

Rationale for this change

Currently, combining DataFusion's default features with Spark features is awkward because:

1. Expression planners must be registered **before** calling `with_default_features().build()` to take precedence
2. UDFs must be registered **after** the state is built (if using `register_all`)

This requires splitting the setup into multiple phases, which is verbose and error-prone.
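The ordering problem described above can be sketched with a toy model. Everything here (`ToyBuilder`, `ToyState`, `spark_planner`, `default_planner`, and the rule that the first planner returning `Some` wins) is a hypothetical stand-in written for illustration. It is not DataFusion's real API or internals, and only mirrors the two-phase setup this PR describes.

```rust
// Toy stand-ins only: none of these types or functions are DataFusion's real API.

/// A hypothetical planner: returns `Some(description)` if it handles `expr`.
type Planner = fn(&str) -> Option<String>;

/// Pretend the Spark planner only knows how to plan `substring`.
fn spark_planner(expr: &str) -> Option<String> {
    (expr == "substring").then(|| "spark::substring".to_string())
}

/// Pretend the default planner accepts every expression.
fn default_planner(expr: &str) -> Option<String> {
    Some(format!("default::{expr}"))
}

#[derive(Default)]
struct ToyBuilder {
    planners: Vec<Planner>,
}

impl ToyBuilder {
    fn with_planner(mut self, planner: Planner) -> Self {
        self.planners.push(planner);
        self
    }

    /// Assumption: defaults are appended after anything already registered.
    fn with_default_features(self) -> Self {
        self.with_planner(default_planner)
    }

    fn build(self) -> ToyState {
        ToyState { planners: self.planners, udfs: Vec::new() }
    }
}

struct ToyState {
    planners: Vec<Planner>,
    udfs: Vec<String>,
}

impl ToyState {
    /// Assumption: the first planner that returns `Some` wins.
    fn plan(&self, expr: &str) -> Option<String> {
        self.planners.iter().find_map(|planner| planner(expr))
    }

    fn register_udf(&mut self, name: &str) {
        self.udfs.push(name.to_string());
    }
}

/// The two-phase setup: planner before the defaults, UDFs after `build()`.
fn two_phase_setup() -> ToyState {
    let mut state = ToyBuilder::default()
        .with_planner(spark_planner) // phase 1: must come before the defaults
        .with_default_features()
        .build();
    state.register_udf("spark_udf"); // phase 2: only possible on the built state
    state
}

/// Registering the Spark planner after the defaults silently loses precedence.
fn wrong_order() -> ToyState {
    ToyBuilder::default()
        .with_default_features()
        .with_planner(spark_planner)
        .build()
}

fn main() {
    println!("{:?}", two_phase_setup().plan("substring"));
    println!("{:?}", wrong_order().plan("substring"));
}
```

In this model, swapping two builder calls changes which planner handles `substring` without any compile-time or runtime error, which is the kind of mistake the rationale calls error-prone.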

What changes are included in this PR?

- Added `SessionStateBuilderSpark` extension trait in `datafusion-spark` that provides a `with_spark_features()` method to register both the Spark expression planner (with correct precedence) and all Spark UDFs in one call
- Added `core` feature flag to `datafusion-spark` with `datafusion` as an optional dependency (this avoids having `datafusion-core` depend on `datafusion-spark`)
- Updated `datafusion-spark` crate documentation with usage example
- Simplified test context setup in `datafusion-sqllogictest` to use the new extension trait
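A minimal sketch of the extension-trait approach, using stand-in types so that it compiles on its own. The trait and method names (`SessionStateBuilderSpark`, `with_spark_features`) come from this PR; the struct fields, the string placeholders, and the choice to insert the planner at the front of the list are assumptions made for illustration, not the actual implementation.

```rust
// --- Stand-in for the core crate (hypothetical, not DataFusion's real builder) ---

/// Records planner and UDF names instead of real objects.
#[derive(Default, Debug)]
struct SessionStateBuilder {
    planners: Vec<&'static str>,
    udfs: Vec<&'static str>,
}

impl SessionStateBuilder {
    fn new() -> Self {
        Self::default()
    }

    fn with_default_features(mut self) -> Self {
        self.planners.push("default_planner");
        self.udfs.push("default_udf");
        self
    }
}

// --- Stand-in for the Spark crate: it depends on the core type, never the reverse ---

/// Extension trait that adds a Spark-specific method to the core builder.
trait SessionStateBuilderSpark {
    fn with_spark_features(self) -> Self;
}

impl SessionStateBuilderSpark for SessionStateBuilder {
    fn with_spark_features(mut self) -> Self {
        // Assumption: placing the planner first gives it precedence
        // regardless of when this method is called in the chain.
        self.planners.insert(0, "spark_planner");
        self.udfs.push("spark_udf");
        self
    }
}

/// One builder chain replaces the earlier two-phase setup.
fn build_with_spark() -> SessionStateBuilder {
    SessionStateBuilder::new()
        .with_default_features()
        .with_spark_features()
}

fn main() {
    println!("{:?}", build_with_spark());
}
```

Because the trait is defined and implemented next to the Spark code, the core builder needs no knowledge of Spark; callers opt in by bringing the trait into scope.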

Are these changes tested?

Yes. There is a unit test in `datafusion-spark/src/session_state.rs`, and the existing Spark SQLLogicTest suite validates that all Spark functions work correctly. The test context in `datafusion-sqllogictest` now uses the `SessionStateBuilderSpark` extension trait, serving as both a usage example and an integration test.

Are there any user-facing changes?

Yes, this adds a new public API: the `SessionStateBuilderSpark` extension trait (behind the `core` feature flag in `datafusion-spark`).

Jefffrey

Maybe it's better to introduce a new trait (e.g. `SessionStateBuilderSparkExt`, though with a better name) to `datafusion-spark` containing the new `with_spark_features` method, and impl it on `SessionStateBuilder` to avoid having core depend on `datafusion-spark`.

cht42 · 2026-01-17T15:31:52Z

> Maybe it's better to introduce a new trait (e.g. `SessionStateBuilderSparkExt`, though with a better name) to `datafusion-spark` containing the new `with_spark_features` method, and impl it on `SessionStateBuilder` to avoid having core depend on `datafusion-spark`.

Sounds good, updated the code to use that approach.

Jefffrey

fyi @comphead

Jefffrey · 2026-01-19T07:47:08Z

+//! ```
+//!
+//! Then use the extension trait:
+//! ```ignore


Would prefer to avoid `ignore` here if possible.

alamb

FYI @linhr @Omega359 and @comphead -- as you might be interested in this API

Jefffrey · 2026-01-27T06:56:14Z

Thanks @cht42

cht42 added 3 commits January 17, 2026 13:01

feat: Add with_spark_features to SessionStateBuilder

e2a2861

test: Add unit test for SessionState with Spark features

c45f8fa

feat: Add Spark feature checks to CI workflow

be3cc4a

cht42 changed the title ~~feat: Add with_spark_features to SessionStateBuilder~~ feat: Add with_spark_features to SessionStateBuilder Jan 17, 2026

github-actions Bot added development-process Related to development process of DataFusion core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) spark labels Jan 17, 2026

cht42 mentioned this pull request Jan 17, 2026

Spark date part #19823

Merged

cht42 changed the title ~~feat: Add with_spark_features to SessionStateBuilder~~ feat(core): Add with_spark_features to SessionStateBuilder Jan 17, 2026

Jefffrey reviewed Jan 17, 2026

feat: Refactor Spark feature integration and update dependencies

168747a

github-actions Bot removed the core Core DataFusion crate label Jan 17, 2026

cht42 added 2 commits January 17, 2026 19:27

feat: Remove datafusion-spark core feature check from CI workflow

a07eab4

f

6a96254

cht42 force-pushed the spark-features branch from d7934b2 to 6a96254 Compare January 17, 2026 15:28

github-actions Bot removed the development-process Related to development process of DataFusion label Jan 17, 2026

refactor: Remove unnecessary documentation for SessionStateBuilderSpark

5e44faf

fix: Update documentation for SessionStateBuilderSpark usage example

078a1bf

cht42 changed the title ~~feat(core): Add with_spark_features to SessionStateBuilder~~ feat(spark): Add SessionStateBuilderSpark to datafusion-spark Jan 17, 2026

cht42 changed the title ~~feat(spark): Add SessionStateBuilderSpark to datafusion-spark~~ feat(spark): Add SessionStateBuilderSpark to datafusion-spark Jan 17, 2026

Jefffrey approved these changes Jan 19, 2026

cht42 added 2 commits January 19, 2026 12:01

fix: Correct code block syntax in documentation for SessionStateBuild…

90faba8

…erSpark

fix: Update documentation for SessionStateBuilderSpark example and usage

5f89e47

alamb reviewed Jan 22, 2026

Jefffrey added this pull request to the merge queue Jan 27, 2026

Merged via the queue into apache:main with commit 8653851 Jan 27, 2026
31 checks passed

comphead mentioned this pull request Mar 24, 2026

Release DataFusion 53.0.0 (Feb 2026 / Mar 2026) #19692

Closed

26 tasks

Jefffrey merged 10 commits into apache:main from cht42:spark-features

3 participants