fix: Sort on single struct should fallback to Spark#811
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #811 +/- ##
============================================
- Coverage 33.94% 33.80% -0.14%
+ Complexity 874 870 -4
============================================
Files 112 112
Lines 42916 42914 -2
Branches 9464 9452 -12
============================================
- Hits 14567 14507 -60
- Misses 25379 25428 +49
- Partials 2970 2979 +9 ☔ View full report in Codecov by Sentry. |
| | spark.comet.scan.preFetch.enabled | Whether to enable pre-fetching feature of CometScan. By default is disabled. | false | | ||
| | spark.comet.scan.preFetch.threadNum | The number of threads running pre-fetching for CometScan. Effective if spark.comet.scan.preFetch.enabled is enabled. By default it is 2. Note that more pre-fetching threads means more memory requirement to store pre-fetched row groups. | 2 | | ||
| | spark.comet.shuffle.preferDictionary.ratio | The ratio of total values to distinct values in a string column to decide whether to prefer dictionary encoding when shuffling the column. If the ratio is higher than this config, dictionary encoding will be used on shuffling string column. This config is effective if it is higher than 1.0. By default, this config is 10.0. Note that this config is only used when `spark.comet.exec.shuffle.mode` is `jvm`. | 10.0 | | ||
| | spark.comet.sparkToColumnar.supportedOperatorList | A comma-separated list of operators that will be converted to Comet columnar format when 'spark.comet.sparkToColumnar.enabled' is true | Range,InMemoryTableScan | |
There was a problem hiding this comment.
nit: Shall we use ` instead of '
There was a problem hiding this comment.
This is not changed by this PR. I think there is previous PR changing it, but didn't update the document.
There was a problem hiding this comment.
The document is updated automatically when make release locally.
|
Thanks @huaxingao |
| if isCometOperatorEnabled(op.conf, CometConf.OPERATOR_SORT) => | ||
| // TODO: Remove this constraint when we upgrade to new arrow-rs including | ||
| // https://github.com/apache/arrow-rs/pull/6225 | ||
| if (child.output.length == 1 && child.output.head.dataType.isInstanceOf[StructType]) { |
There was a problem hiding this comment.
As we add support for other types, do we need to update this to make it recursive so that we check for Map or Array containing struct?
There was a problem hiding this comment.
Let me add more data types here according to arrow-rs.
(cherry picked from commit 071c780)
Which issue does this PR close?
Closes #807.
Rationale for this change
What changes are included in this PR?
How are these changes tested?