fix: Support auto scan mode with Spark 4.0.0 #1975
Merged
andygrove merged 31 commits into apache:main on Aug 26, 2025
Codecov Report

@@             Coverage Diff              @@
##               main    #1975      +/-   ##
============================================
+ Coverage     56.12%   58.53%   +2.40%
- Complexity      976     1282     +306
============================================
  Files           119      143      +24
  Lines         11743    13254    +1511
  Branches       2251     2364     +113
============================================
+ Hits           6591     7758    +1167
- Misses         4012     4264     +252
- Partials       1140     1232      +92
Member (Author)
hive-1 failure to investigate:
Force-pushed from 0dca8fb to d641025
Member (Author)
Current failures: core-1: hive-1:
This was referenced Aug 21, 2025
comphead reviewed Aug 25, 2025
case StringType => ArrowType.Utf8.INSTANCE
case _: StringType => ArrowType.Utf8.INSTANCE
case dt if isStringCollationType(dt) =>
  // TODO collation information is lost with this transformation
Member (Author)
I removed this comment because Spark also does not track this information, as pointed out in #2190 (comment) by @parthchandra
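The difference between `case StringType` and `case _: StringType` in the snippet above can be illustrated with a minimal sketch. It assumes a string type that carries a collation id and a default singleton instance, which is how Spark 4 appears to model collated strings. The types below (`DataType`, `StringType`, `IntegerType`) are hypothetical stand-ins defined locally so the example runs without Spark; they are not the real Spark classes.

```scala
object CollationMatchSketch {
  // Hypothetical stand-ins for a Spark-like type hierarchy (not the real classes).
  sealed trait DataType

  // A string type that carries a collation id; equality compares collation ids.
  class StringType(val collationId: Int) extends DataType {
    override def equals(other: Any): Boolean = other match {
      case s: StringType => s.collationId == collationId
      case _ => false
    }
    override def hashCode(): Int = collationId
  }

  // Default instance with collation id 0, analogous to a companion singleton.
  case object StringType extends StringType(0)
  case object IntegerType extends DataType

  // Stable-identifier pattern: compares with ==, so only the default collation matches.
  def matchesByValue(dt: DataType): Boolean = dt match {
    case StringType => true
    case _ => false
  }

  // Type pattern: matches any StringType instance regardless of collation.
  def matchesByType(dt: DataType): Boolean = dt match {
    case _: StringType => true
    case _ => false
  }

  def main(args: Array[String]): Unit = {
    val collated = new StringType(1) // a non-default collation
    println(matchesByValue(StringType)) // true
    println(matchesByValue(collated))   // false
    println(matchesByType(collated))    // true
    println(matchesByType(IntegerType)) // false
  }
}
```

Under these assumptions, the type pattern covers collated strings as well, so mapping every string type to the same Arrow type drops the collation id. That is the information loss the removed TODO referred to.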
comphead reviewed Aug 25, 2025
typeChecker.isSchemaSupported(partitionSchema, fallbackReasons)

def hasMapsContainingStructs(dataType: DataType): Boolean = {
  def isComplexType(dt: DataType): Boolean = dt match {
Contributor
The isComplexType method now exists in the DataTypeSupport object.
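The suggestion is to reuse a shared helper instead of defining `isComplexType` locally. A rough sketch of that shape follows, with loud assumptions: the type hierarchy and the `TypeSupport` object are hypothetical stand-ins (not Spark's types or Comet's `DataTypeSupport`), and the body of `hasMapsContainingStructs` is a guess at what the name suggests, not the code from this PR.

```scala
object MapStructCheckSketch {
  // Hypothetical stand-ins for Spark SQL data types, so the sketch runs without Spark.
  sealed trait DataType
  case object IntType extends DataType
  case object StrType extends DataType
  final case class ArrayType(elementType: DataType) extends DataType
  final case class MapType(keyType: DataType, valueType: DataType) extends DataType
  final case class StructType(fields: Seq[(String, DataType)]) extends DataType

  // Hypothetical shared helper object, playing the role of a common utility.
  object TypeSupport {
    // Treats arrays, maps and structs as complex; everything else as primitive.
    def isComplexType(dt: DataType): Boolean = dt match {
      case _: ArrayType | _: MapType | _: StructType => true
      case _ => false
    }
  }

  // Assumed behavior: true if any map in the type tree has a complex key or value.
  def hasMapsContainingStructs(dt: DataType): Boolean = dt match {
    case StructType(fields) => fields.exists { case (_, t) => hasMapsContainingStructs(t) }
    case ArrayType(elem) => hasMapsContainingStructs(elem)
    case MapType(k, v) => TypeSupport.isComplexType(k) || TypeSupport.isComplexType(v)
    case _ => false
  }

  def main(args: Array[String]): Unit = {
    val person = StructType(Seq("name" -> StrType, "age" -> IntType))
    val flatMap = MapType(StrType, IntType)
    val nestedMap = MapType(StrType, person)
    println(hasMapsContainingStructs(flatMap))                             // false
    println(hasMapsContainingStructs(nestedMap))                           // true
    println(hasMapsContainingStructs(StructType(Seq("m" -> nestedMap))))   // true
  }
}
```

Calling the shared helper keeps a single definition of what counts as a complex type, which is the point of the review comment.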
parthchandra approved these changes Aug 25, 2025
coderfender pushed a commit to coderfender/datafusion-comet that referenced this pull request Dec 13, 2025
Which issue does this PR close?
Closes #1967

Rationale for this change
Provide better performance with Spark 4.0.0 by supporting auto scan mode. Also, increase test coverage of native_iceberg_compat, which we intend to eventually completely replace native_comet.

What changes are included in this PR?
- [...] auto fallback for Spark 4
- [...] CollationSuite to expect Comet plans
- [...] VariantShreddingSuite due to [native_iceberg_compat] VariantShreddingSuite test failures with Spark 4.0.0 #2209

How are these changes tested?