[C++][Acero] Race condition in asof join causes execution to stall for large number of record batches #37796

@JerAguilon

Describe the bug, including details regarding any error messages, version, and platform.

  • Version: Repro'd on HEAD, v12.0.0, and v13.0.0

I've encountered a subtle race condition in the asof join node that shows up most often when scanning large Parquet files with many row groups:

  1. The left-hand side of the asof join completes, so InputFinished proceeds as expected. So far so good.
  2. The right-hand table(s) of the join are a huge dataset scan. They're still streaming and can legally keep calling AsofJoinNode::InputReceived (doc ref).
  3. Each input batch is blindly pushed to the InputStates, which in turn defer to BackpressureHandlers to decide whether to pause inputs (code pointer).
  4. If enough batches come in right after EndFromProcessThread is called, we might exceed the high_threshold and tell the input node to pause via the BackpressureController.
  5. At this point the asof join's process thread has already stopped, so the right-hand table(s) are never dequeued and BackpressureController::Resume() is never called. This causes a deadlock: the paused inputs wait forever for a resume.
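
To make the sequence above concrete, here is a rough single-threaded toy model of it. It does not use any Arrow code: `ToySource`, `ToyInputQueue`, `SimulateStall`, and the threshold values are hypothetical names and numbers made up for illustration, and the real node is multi-threaded. The only point is that the resume path lives on the consumer side, so once the consumer is gone a paused source stays paused.

```cpp
#include <cstddef>
#include <deque>

// Hypothetical upstream node: it only tracks whether it has been paused.
struct ToySource {
  bool paused = false;
};

// Hypothetical input queue with threshold-based backpressure.
class ToyInputQueue {
 public:
  ToyInputQueue(ToySource* source, std::size_t low, std::size_t high)
      : source_(source), low_(low), high_(high) {}

  // Producer side: always enqueues, and pauses the source once the queue
  // reaches the high threshold.
  void Push(int batch) {
    queue_.push_back(batch);
    if (queue_.size() >= high_) source_->paused = true;
  }

  // Consumer side: the only place where the source is ever resumed.
  bool TryPop() {
    if (queue_.empty()) return false;
    queue_.pop_front();
    if (queue_.size() <= low_) source_->paused = false;
    return true;
  }

 private:
  ToySource* source_;
  std::size_t low_;
  std::size_t high_;
  std::deque<int> queue_;
};

// Replays steps 1-5 in order. Returns true if the source ends up paused
// with no consumer left to resume it.
bool SimulateStall() {
  ToySource right_source;
  ToyInputQueue right_queue(&right_source, /*low=*/2, /*high=*/4);

  // Steps 1 and 4: the left side finished, so the process thread has ended.
  bool process_thread_running = false;

  // Steps 2-4: the right side keeps delivering batches until it is paused.
  for (int batch = 0; batch < 8 && !right_source.paused; ++batch) {
    right_queue.Push(batch);
  }

  // Step 5: draining only happens on the process thread, which is gone.
  while (process_thread_running && right_queue.TryPop()) {
  }

  return right_source.paused;
}
```

In this model `SimulateStall()` leaves the source paused because `TryPop` never runs after the process thread ends.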

I have hackily fixed this in a local checkout by storing a std::atomic<bool> that records whether EndFromProcessThread was called. Once it turns true, InputReceived short-circuits and returns Status::OK() without enqueueing the batch. In EndFromProcessThread, I also call ResumeProducing on all input nodes.

For good measure, I also call StopProducing() on all the inputs in EndFromProcessThread, though I don't know if it's necessary.
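
The shape of that local workaround, as a toy sketch rather than the actual patch: `FakeUpstream`, `GuardedInput`, and `SimulateGuardedShutdown` are hypothetical, the method names only mirror the ones mentioned above, and the sketch returns a bool where the real InputReceived would return Status::OK(). It is single-threaded, so it says nothing about a batch that slips in between the flag check and the flag being set.

```cpp
#include <atomic>
#include <cstddef>
#include <deque>

// Hypothetical upstream node: it only tracks whether it has been paused.
struct FakeUpstream {
  bool paused = false;
};

// Hypothetical join input that stops accepting batches after shutdown.
class GuardedInput {
 public:
  GuardedInput(FakeUpstream* upstream, std::size_t high)
      : upstream_(upstream), high_(high) {}

  // Returns true if the batch was enqueued, false if it was dropped because
  // processing has already ended.
  bool InputReceived(int batch) {
    if (process_ended_.load()) return false;  // short-circuit late batches
    queue_.push_back(batch);
    if (queue_.size() >= high_) upstream_->paused = true;
    return true;
  }

  // Marks processing as ended and resumes the upstream so it is never left
  // paused with nobody draining the queue.
  void EndFromProcessThread() {
    process_ended_.store(true);
    upstream_->paused = false;
  }

  std::size_t queued() const { return queue_.size(); }

 private:
  FakeUpstream* upstream_;
  std::size_t high_;
  std::deque<int> queue_;
  std::atomic<bool> process_ended_{false};
};

// Returns true if late batches are dropped and the upstream ends up resumed.
bool SimulateGuardedShutdown() {
  FakeUpstream upstream;
  GuardedInput input(&upstream, /*high=*/4);

  // Fill the queue to the threshold so the upstream is paused before shutdown.
  for (int batch = 0; batch < 4; ++batch) input.InputReceived(batch);
  const bool was_paused = upstream.paused;

  input.EndFromProcessThread();

  // Batches arriving after shutdown must not be enqueued or re-pause upstream.
  bool dropped_all = true;
  for (int batch = 4; batch < 8; ++batch) {
    dropped_all = dropped_all && !input.InputReceived(batch);
  }

  return was_paused && dropped_all && !upstream.paused && input.queued() == 4;
}
```

Dropping late batches is only safe here because the join has already finished producing output, so nothing downstream can still depend on them.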

Happy to submit a PR once I find bandwidth, but reporting this early in case others run into it.

Component(s)

C++
