[C++][Acero] Random hangs when joining tables with ExecutePlan #39582

@stenlarsson

Describe the bug, including details regarding any error messages, version, and platform.

We have had problems for a long time with a specific batch job that combines data from different sources. Something in the data triggers the issue, but I haven't been able to figure out exactly what. I have created a test case in which I tried my best to minimise and anonymise the data: https://github.com/stenlarsson/arrow-test

Sometimes it hangs after a random number of iterations:

$ ruby hang.rb
0
1
2

Sometimes it crashes:

$ ruby hang.rb
0
SEGV received in BUS handler
[1]    74331 abort      ruby hang.rb

I'm running macOS / Ruby 3.2.2 / Arrow 14.0.2 on my computer, but have also reproduced the error with Linux / Ruby 3.0.6 / Arrow 11.0.0. It doesn't seem to happen with Arrow 10.0.1.
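Because the failure is intermittent and shows up either as a hang or as a crash, it can help to run the script repeatedly under a time limit and tally the outcomes. The following is only a rough sketch of such a harness, not part of the linked test case: the helper name `classify_run` and its `timeout_seconds` parameter are made up for illustration, and any time limit passed to it is an arbitrary choice.

```ruby
require "timeout"

# Hypothetical helper (not from the arrow-test repository).
# Runs a command and classifies the outcome:
#   :ok    - the process exited with status 0
#   :crash - the process exited non-zero or was terminated by a signal
#   :hang  - the process was still running after timeout_seconds
def classify_run(command, timeout_seconds)
  # Discard the child's output; only the exit behaviour matters here.
  pid = Process.spawn(*command, out: File::NULL, err: File::NULL)
  begin
    status = Timeout.timeout(timeout_seconds) { Process.wait2(pid).last }
    # success? is nil for a signalled process, so an abort counts as a crash.
    status.success? ? :ok : :crash
  rescue Timeout::Error
    # Still running after the limit: treat it as a hang, then kill and reap it.
    Process.kill("KILL", pid)
    Process.wait(pid)
    :hang
  end
end
```

Calling `classify_run(["ruby", "hang.rb"], 60)` in a loop and counting the returned symbols would give a rough failure rate, which could make it easier to compare Arrow versions.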

Component(s)

Ruby
