@alamb alamb commented May 30, 2025

Which issue does this PR close?

Rationale for this change

During the review of #16148 from @kosiew 🙏, @adriangb and I had some suggestions to simplify the API and make upgrading easier. However, it wasn't clear how they would look, so I tried them out, and it turns out they worked well.

What changes are included in this PR?

This PR simplifies the API:

  1. Changes FileFormat::with_schema_adapter_factory to return a Result
  2. Provides a default implementation for with_schema_adapter_factory (which avoids having to change FileFormat implementations)
  3. Removes the impl_schema_adapter_methods macro
  4. Consolidates the newly added standalone integration tests (individual test files result in longer build times, as each one ends up as its own binary)

Are these changes tested?

Yes, by CI

Are there any user-facing changes?

Since none of the modified APIs have been released, this is not a breaking change.

@github-actions bot added the core (Core DataFusion crate) and datasource (Changes to the datasource crate) labels May 30, 2025
}

impl_schema_adapter_methods!();
fn with_schema_adapter_factory(
Contributor Author

Instead of using the impl_schema_adapter_methods macro, I replicated the code in several places. While there is more duplicated code, I think it is clearer because what is happening is now more explicit and doesn't require a macro (even though impl_schema_adapter_methods was super well documented)
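For readers who have not seen the diff, here is a rough sketch of what writing the two methods out by hand (rather than generating them with a macro) might look like. It does not depend on DataFusion: the `SchemaAdapterFactory` and `FileSource` traits below are simplified stand-ins, and `MyFileSource`, `UppercaseAdapterFactory`, `batch_size`, and the `String` error type are hypothetical names chosen only for illustration.

```rust
use std::sync::Arc;

// Simplified stand-in for a schema adapter factory (assumption, not the real trait).
trait SchemaAdapterFactory {}

// Hypothetical factory used only to have something to pass in.
struct UppercaseAdapterFactory;
impl SchemaAdapterFactory for UppercaseAdapterFactory {}

// Simplified stand-in for the file source trait (assumption, not the real trait).
trait FileSource {
    fn batch_size(&self) -> usize;

    fn with_schema_adapter_factory(
        &self,
        factory: Arc<dyn SchemaAdapterFactory>,
    ) -> Result<Arc<dyn FileSource>, String>;

    fn schema_adapter_factory(&self) -> Option<Arc<dyn SchemaAdapterFactory>>;
}

// Hypothetical source that stores an optional factory next to its other settings.
#[derive(Clone)]
struct MyFileSource {
    batch_size: usize,
    schema_adapter_factory: Option<Arc<dyn SchemaAdapterFactory>>,
}

impl FileSource for MyFileSource {
    fn batch_size(&self) -> usize {
        self.batch_size
    }

    // Written out explicitly: return a copy of the source with the factory set,
    // leaving every other field unchanged.
    fn with_schema_adapter_factory(
        &self,
        factory: Arc<dyn SchemaAdapterFactory>,
    ) -> Result<Arc<dyn FileSource>, String> {
        Ok(Arc::new(Self {
            schema_adapter_factory: Some(factory),
            ..self.clone()
        }))
    }

    // Written out explicitly: hand back whatever factory was stored, if any.
    fn schema_adapter_factory(&self) -> Option<Arc<dyn SchemaAdapterFactory>> {
        self.schema_adapter_factory.clone()
    }
}

fn main() {
    let source = MyFileSource {
        batch_size: 1024,
        schema_adapter_factory: None,
    };
    let updated = source
        .with_schema_adapter_factory(Arc::new(UppercaseAdapterFactory))
        .expect("this source supports schema adapters");
    assert!(updated.schema_adapter_factory().is_some());
    assert_eq!(updated.batch_size(), 1024);
}
```

The cost is a few repeated lines per source; the benefit is that a reader can see exactly which field is set without expanding a macro.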

// Verify the schema adapter factory is present in the file source
assert!(config.source().schema_adapter_factory().is_some());
}

Contributor Author

this test is moved without change from datafusion/core/tests/test_adapter_updated.rs

/// automatically using the [`crate::impl_schema_adapter_methods`] macro.
/// The default implementation returns a not implemented error.
///
/// [`schema_adapter_factory`]: Self::schema_adapter_factory
Contributor Author

this signature is changed to return a Result and defaults to not implemented

As @kosiew points out, this means that implementors of a FileSource will only get an error at runtime, not compile time, but I think that is better than forcing everyone to implement a schema adapter even if they don't need it
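To make the tradeoff concrete, here is a minimal sketch of a trait whose method defaults to a "not implemented" error. It uses stand-in types rather than the DataFusion ones: `SourceError`, `NoopFactory`, and `PlainSource` are hypothetical, and the traits are simplified assumptions about the shape of the API.

```rust
use std::sync::Arc;

// Hypothetical error type standing in for the library's error enum.
#[derive(Debug, PartialEq)]
enum SourceError {
    NotImplemented(String),
}

// Simplified stand-in for a schema adapter factory (assumption).
trait SchemaAdapterFactory {}

struct NoopFactory;
impl SchemaAdapterFactory for NoopFactory {}

// Simplified stand-in for the file source trait (assumption).
trait FileSource {
    // Default: sources that do not opt in report the gap when called.
    fn with_schema_adapter_factory(
        &self,
        _factory: Arc<dyn SchemaAdapterFactory>,
    ) -> Result<Arc<dyn FileSource>, SourceError> {
        Err(SourceError::NotImplemented(
            "this FileSource does not support schema adapter factories".to_string(),
        ))
    }

    // Default: no factory is configured.
    fn schema_adapter_factory(&self) -> Option<Arc<dyn SchemaAdapterFactory>> {
        None
    }
}

// A source that needs no schema adapter compiles with an empty impl block.
struct PlainSource;
impl FileSource for PlainSource {}

fn main() {
    // The missing support only shows up here, at runtime.
    match PlainSource.with_schema_adapter_factory(Arc::new(NoopFactory)) {
        Err(SourceError::NotImplemented(msg)) => println!("not implemented: {msg}"),
        Ok(_) => unreachable!("PlainSource relies on the default implementation"),
    }
}
```

In this sketch `PlainSource` needs no changes to keep compiling, which is the upgrade-friendliness being argued for; the price is that a caller who expects adapter support finds out only when the `Err` comes back.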

@kosiew kosiew left a comment

LGTM

alamb commented Jun 3, 2025

@xudong963 would you have a moment to help review this PR? It is something I would like to get in before 48.0.0 is released (as it contains a change to an as yet unreleased API)

@xudong963 xudong963 left a comment

LGTM, thank you

@xudong963 xudong963 merged commit 3236cc0 into apache:main Jun 4, 2025
28 checks passed

alamb commented Jun 4, 2025

Thank you @xudong963

kosiew pushed a commit to kosiew/datafusion that referenced this pull request Jun 10, 2025
* Make FileFormat::with_schema_adapter_factory fallible, remove macros

* Remove standalone integration test

* Update doc test
Development

Successfully merging this pull request may close these issues.

Consolidate schema adapter tests in schema_adapter_integration_tests.rs
