@westonpace
Member

@westonpace westonpace commented Dec 31, 2022

The sorting done by orderby is not stable. This means, given the input:

a | b
--- | ---
1 | false
1 | true

the test could have generated either [false, true] or [true, false] for the b column. We likely did not encounter this before 498b645 because the entire thing was run serially (even though there was a parallel option, it was not set up correctly).

Now that things are properly running in parallel, the results are non-deterministic. We could remove the b column, but I feel it is a better stress test to have at least one payload column. So I changed the test to compare only the key array and not the payload array.
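
For illustration only, here is a small Python sketch of the relaxed check. It is not the actual C++ test: the `(a, b)` row layout and the helper names `key_column` and `is_sorted` are assumptions made up for this example. Two valid outputs for the input above differ in `b` but agree on `a`:

```python
# Hypothetical stand-in for the test data: each row is an (a, b) tuple,
# where `a` is the sort key and `b` is the payload.
run_1 = [(1, False), (1, True)]
run_2 = [(1, True), (1, False)]


def key_column(rows):
    """Return only the key values, dropping the payload."""
    return [a for a, _ in rows]


def is_sorted(values):
    """True if the values are in non-decreasing order."""
    return all(x <= y for x, y in zip(values, values[1:]))


# Comparing whole rows is flaky: both runs are valid but differ in `b`.
full_rows_equal = run_1 == run_2

# Comparing only the key column gives the same answer for both runs.
keys_equal = key_column(run_1) == key_column(run_2)
keys_sorted = is_sorted(key_column(run_1)) and is_sorted(key_column(run_2))
```

The trade-off is that a key-only comparison no longer verifies that payload values stay attached to the right rows.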

@github-actions

⚠️ GitHub issue #15141 has been automatically assigned in GitHub to PR creator.

@github-actions

⚠️ GitHub issue #15141 has no components, please add labels for components.

@westonpace
Member Author

@vibhatha would you be interested in providing a review / sanity check?

@vibhatha
Contributor

> @vibhatha would you be interested in providing a review / sanity check?

Sure @westonpace, I will.

Contributor

@vibhatha vibhatha left a comment

@westonpace This makes sense to me. It’s not possible to guarantee the outcome of this.

Just a question: is there a sort option which gives precedence to the index of the row and decides which comes first when we have a tie like this?

@westonpace
Member Author

> Just a question: is there a sort option which gives precedence to the index of the row and decides which comes first when we have a tie like this?

That's called a "stable sort". The underlying sort kernel (SortIndices) is stable. However, if the plan is run in parallel, then there is no guarantee the batches will accumulate in the same order. So even if the sort kernel is stable, the sort node is not.
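
A rough way to see this outside of Arrow is the Python sketch below. It is a simplified model, not the actual sort node: `sort_node` is a hypothetical function that concatenates batches in whatever order they arrive and then applies a stable sort (Python's `sorted` is stable).

```python
from itertools import permutations


def sort_node(batches_in_arrival_order):
    """Accumulate batches in arrival order, then stable-sort rows by key `a`."""
    rows = [row for batch in batches_in_arrival_order for row in batch]
    # sorted() is stable: rows with equal keys keep their accumulated order.
    return sorted(rows, key=lambda row: row[0])


# Two single-row batches of (a, b) rows that tie on the key.
batches = [[(1, False)], [(1, True)]]

# Simulate every possible arrival order, as a parallel scan might produce.
results = {tuple(sort_node(list(order))) for order in permutations(batches)}

# The payload orders seen across all arrival orders.
payload_orders = sorted([b for _, b in result] for result in results)
```

Here `results` ends up with two distinct outputs: the stable sort preserves the accumulated order of tied rows, but the accumulated order itself depends on which batch arrived first.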

Once we add proper ordering we can add a stable option to the sort node which resequences the data before sorting so that the sort node can remain stable.
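
As a sketch of what such resequencing could look like (an assumption about a possible design, not something that exists in the sort node today), suppose each batch carried a hypothetical `batch_index` assigned by the source. Restoring that order before a stable sort makes the result independent of arrival order:

```python
from itertools import permutations


def stable_sort_node(tagged_batches):
    """Resequence (batch_index, rows) pairs, then stable-sort rows by key `a`.

    `batch_index` is a made-up field standing in for whatever ordering
    information the source would attach to each batch.
    """
    resequenced = sorted(tagged_batches, key=lambda tagged: tagged[0])
    rows = [row for _, batch in resequenced for row in batch]
    # Stable sort: ties keep the original (resequenced) source order.
    return sorted(rows, key=lambda row: row[0])


tagged = [
    (0, [(1, False), (2, "x")]),
    (1, [(1, True)]),
    (2, [(0, "y"), (1, None)]),
]

# Every arrival order yields the same output once batches are resequenced.
outputs = [stable_sort_node(list(order)) for order in permutations(tagged)]
all_identical = all(output == outputs[0] for output in outputs)
```

The cost in this model is that all batches must be collected and reordered before sorting, which a sort node has to do anyway since it cannot emit output until it has seen all input.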

However, now that I write this, I realize it might be best to only apply my change when testing the parallel case, and to use the old comparison in the non-parallel case.

@vibhatha
Contributor

vibhatha commented Jan 1, 2023

> However, now that I write this, I realize it might be best to only apply my change when testing the parallel case, and to use the old comparison in the non-parallel case.

Yes, I also think it’s better that way.

@westonpace
Member Author

> Yes, I also think it’s better that way.

Hmm, I tried this, but it turns out not to be so simple. I'm going to proceed with this as it is for now. We can worry about a full comparison later when we add a stable sort. I'll add a new issue requesting that. I'll merge this so it doesn't bother CI.

@westonpace westonpace merged commit 5a57e6d into apache:master Jan 1, 2023

@ursabot

ursabot commented Jan 1, 2023

Benchmark runs are scheduled for baseline = db6c59d and contender = 5a57e6d. 5a57e6d is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.56% ⬆️1.11%] test-mac-arm
[Finished ⬇️1.79% ⬆️0.0%] ursa-i9-9960x
[Failed ⬇️0.0% ⬆️0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 5a57e6dd ec2-t3-xlarge-us-east-2
[Failed] 5a57e6dd test-mac-arm
[Finished] 5a57e6dd ursa-i9-9960x
[Failed] 5a57e6dd ursa-thinkcentre-m75q
[Finished] db6c59d1 ec2-t3-xlarge-us-east-2
[Failed] db6c59d1 test-mac-arm
[Finished] db6c59d1 ursa-i9-9960x
[Failed] db6c59d1 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@ursabot

ursabot commented Jan 1, 2023

['Python', 'R'] benchmarks have high level of regressions.
ursa-i9-9960x

EpsilonPrime pushed a commit to EpsilonPrime/arrow that referenced this pull request Jan 5, 2023
…che#15142)

* Closes: apache#15141

Authored-by: Weston Pace <weston.pace@gmail.com>
Signed-off-by: Weston Pace <weston.pace@gmail.com>
Development

Successfully merging this pull request may close these issues.

[CI] arrow-compute: ExecPlanExecution.StressSourceOrderBy may failed