chore: Remove obsolete supportedSortType function after Arrow updates #1946

Closed
Ruchir28 wants to merge 1 commit into apache:main from Ruchir28:issue-1854

@Ruchir28

Which issue does this PR close?

Closes #1854

Rationale for this change

The supportedSortType function was added as a fallback mechanism to avoid Comet errors when sorting on a single column of a complex type. DataFusion's SortExec calls Arrow's lexsort_to_indices, which falls back to sort_to_indices in the single-column case. However, sort_to_indices does not support all data types (for example, struct), which led to errors, as reported in this issue.

Recent Arrow updates have since resolved these limitations (see the linked PR).

As a result, the supportedSortType fallback is no longer needed and, in fact, prevents us from taking advantage of the improved native performance. This PR removes the function and its usages, allowing Comet to handle these sorting operations directly.

What changes are included in this PR?

  • Removed the supportedSortType function and its usages
  • Updated the tests to confirm that these operations are handled by Comet instead of falling back to Spark

How are these changes tested?

The existing test case has been updated. By changing the test assertion from checkSparkAnswer to checkSparkAnswerAndOperator, we now verify that the operation is executed by the Comet native operator, confirming that the fallback to Spark is no longer triggered.

@codecov-commenter

codecov-commenter commented Jun 27, 2025

Codecov Report

❌ Patch coverage is 0% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 58.40%. Comparing base (f09f8af) to head (b640cbe).
⚠️ Report is 1166 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...ark/sql/comet/CometTakeOrderedAndProjectExec.scala | 0.00% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #1946      +/-   ##
============================================
+ Coverage     56.12%   58.40%   +2.27%     
- Complexity      976     1140     +164     
============================================
  Files           119      131      +12     
  Lines         11743    12878    +1135     
  Branches       2251     2383     +132     
============================================
+ Hits           6591     7521     +930     
- Misses         4012     4135     +123     
- Partials       1140     1222      +82     


@andygrove
Member

There are 4 failing Spark SQL tests. Here is the first one:

2025-06-27T15:04:42.5762508Z [info] - SPARK-47430 Support GROUP BY MapType *** FAILED *** (694 milliseconds)
2025-06-27T15:04:42.5778426Z [info]   org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1735.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1735.0 (TID 1482) (43ff0ed8e63a executor driver): org.apache.comet.CometNativeException: Invalid argument error: The data type type Map(Field { name: "entries", data_type: Struct([Field { name: "key", data_type: Int32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, Field { name: "value", data_type: Int32, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }]), nullable: false, dict_id: 0, dict_is_ordered: false, metadata: {} }, false) has no natural order

@andygrove
Member

@Ruchir28 I suspect that we can't completely remove supportedSortType yet, but it could be updated to remove many of the current restrictions.
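
One way to read this suggestion, as a minimal sketch rather than Comet's actual code: keep a type check, but let it recurse into nested types and reject only those that have no natural order, such as the map type in the failure above. The `SortType` enum and `supports_sort` function below are hypothetical, simplified stand-ins (not Arrow's or Comet's real types), and which types Arrow can actually sort natively would need to be verified against the Arrow version in use.

```rust
// Hypothetical, simplified model of column types; this is NOT Arrow's DataType.
#[derive(Debug, Clone)]
enum SortType {
    Int32,
    Utf8,
    List(Box<SortType>),
    Struct(Vec<SortType>),
    Map(Box<SortType>, Box<SortType>),
}

// Assumptions (to be verified against Arrow): primitive types sort natively,
// nested types sort only if every child type does, and maps never do, since
// the failing test reports that the map type "has no natural order".
fn supports_sort(dt: &SortType) -> bool {
    match dt {
        SortType::Int32 | SortType::Utf8 => true,
        SortType::List(elem) => supports_sort(elem),
        SortType::Struct(fields) => fields.iter().all(supports_sort),
        SortType::Map(_, _) => false,
    }
}
```

Under these assumptions, a struct of Int32 and Utf8 fields passes the check, while a map, or a struct that contains a map, would still fall back to Spark.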

@andygrove
Member

This PR seems inactive, so moving to draft for now

@andygrove marked this pull request as draft August 8, 2025 16:05
@andygrove
Member

@Ruchir28 do you still plan on working on this or should we close this PR for now?

@andygrove
Member

I will close this PR for now. Feel free to reopen if work resumes.

@andygrove closed this Aug 21, 2025

Development

Successfully merging this pull request may close these issues.

Relax sort fallback constraints
