Description of the bug:
1. SpawnLogModule.buildComplete() blocks the Guava EventBus thread.
SpawnLogModule subscribes to BuildCompleteEvent, and its handler calls spawnLogContext.close() synchronously. For CompactSpawnLogContext, close() blocks until the background writer thread of AsynchronousMessageOutputStream has drained all pending entries through ZstdOutputStream and into the pipe. While the EventBus thread is blocked there, BuildEventStreamer, the next subscriber in the same dispatch cycle, never runs, so BuildToolLogs is never published to BEP and CLOSE_EVENT_FUTURE is never enqueued. The BEP writer thread then stalls indefinitely, waiting for events that will never arrive.
2. ByteStreamBuildEventArtifactUploader reads the execlog pipe to compute a checksum.
After addLocalFile(outputPath) is called, ByteStreamBuildEventArtifactUploader.readPathMetadata() calls digestUtil.compute(path), which reads the entire file to hash it. For a named pipe, that read blocks until the write end is closed. The write end, however, is held by the EventBus thread, which is stuck in cause #1, so the two threads deadlock.
EventBus thread: SpawnLogModule.buildComplete()
└─ spawnLogContext.close() ──────── BLOCKED (pipe backpressure)
∴ BuildEventStreamer.buildComplete() never fires
∴ BuildToolLogs event never enqueued
∴ BEP writer stalls
ByteStream thread: readPathMetadata(execlog.pipe)
└─ digestUtil.compute(pipe) ─────── BLOCKED (write-end still open)
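
The second half of the deadlock can be seen outside Bazel with a plain FIFO. The following is a minimal sketch, not Bazel code: demo.pipe, fd 3, and the variable names are hypothetical stand-ins, and cksum stands in for digestUtil.compute(path). It only illustrates that a checksum over a named pipe produces no result while any write end is still open.

```shell
#!/bin/sh
# Sketch: a checksum over a FIFO cannot finish while a writer holds it open.
# demo.pipe and fd 3 are illustrative stand-ins, not names used by Bazel.
dir=$(mktemp -d)
mkfifo "$dir/demo.pipe"

# Reader: checksums the pipe up to EOF, like hashing a whole file.
cksum < "$dir/demo.pipe" > "$dir/digest" &
reader=$!

# Writer: open the write end on fd 3 and keep it open after writing.
exec 3> "$dir/demo.pipe"
printf 'spawn entry\n' >&3

sleep 1
# cksum prints only at EOF, so an empty digest file means it is still blocked.
if [ -s "$dir/digest" ]; then
  blocked_while_open=no
else
  blocked_while_open=yes
fi

# Closing the last write end delivers EOF and lets the checksum complete.
exec 3>&-
wait "$reader"
reader_status=$?
digest_bytes=$(awk '{print $2}' "$dir/digest")

echo "blocked while write end open: $blocked_while_open"
echo "reader exit status after close: $reader_status"
echo "bytes hashed: $digest_bytes"
rm -rf "$dir"
```

In the scenario described above, the step that corresponds to `exec 3>&-` is spawnLogContext.close(), which never completes, so the reader never sees EOF.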
Which category does this issue belong to?
No response
What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
mkfifo /tmp/bep.pipe /tmp/execlog.pipe
cat /tmp/bep.pipe > /dev/null &
cat /tmp/execlog.pipe > /dev/null &
bazel build //some:target \
--build_event_binary_file=/tmp/bep.pipe \
--execution_log_compact_file=/tmp/execlog.pipe
# hangs forever; "Waiting for build events upload: BinaryFormatFileTransport"
Which operating system are you running Bazel on?
No response
What is the output of bazel info release?
No response
If bazel info release returns development version or (@non-git), tell us how you built Bazel.
No response
What's the output of git remote get-url origin; git rev-parse HEAD?
If this is a regression, please try to identify the Bazel commit where the bug was introduced with bazelisk --bisect.
No response
Have you found anything relevant by searching the web?
No response
Any other information, logs, or outputs that you want to share?
No response