Add/improve tracing in the dataset writer #33880

@joosthooz

Describe the enhancement requested

The telemetry code in the dataset writer currently does not trace the asynchronous tasks submitted to the I/O thread pool. Those tasks perform the actual encoding, compression, and writing, so that work is missing from the traces.

Follow-up to #33738, where @westonpace noted: "Mentally, when I think of the dataset writer, I think there are two parts. The first part should be the trailing part of the fragment/pipeline that feeds the writer. In this first part we partition the batch, select the appropriate file queues, and deposit the batches into the queues. There is then a separate dedicated thread task to write each batch to the writer."
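To make the two parts concrete, here is a minimal sketch of what tracing both of them could look like. It is not the Arrow implementation: `SpanLog`, `ScopedSpan`, `WriteBatches`, and the span names are all hypothetical, `std::async` stands in for the I/O thread pool, and integers stand in for record batches.

```cpp
#include <future>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a tracing backend: stores the names of finished spans.
class SpanLog {
 public:
  void Record(std::string name) {
    std::lock_guard<std::mutex> lock(mutex_);
    names_.push_back(std::move(name));
  }

  // Number of finished spans recorded under `name`.
  int Count(const std::string& name) const {
    std::lock_guard<std::mutex> lock(mutex_);
    int count = 0;
    for (const auto& recorded : names_) {
      if (recorded == name) ++count;
    }
    return count;
  }

 private:
  mutable std::mutex mutex_;
  std::vector<std::string> names_;
};

// Hypothetical RAII span: it is recorded as finished when it goes out of scope.
class ScopedSpan {
 public:
  ScopedSpan(SpanLog* log, std::string name) : log_(log), name_(std::move(name)) {}
  ~ScopedSpan() { log_->Record(std::move(name_)); }
  ScopedSpan(const ScopedSpan&) = delete;
  ScopedSpan& operator=(const ScopedSpan&) = delete;

 private:
  SpanLog* log_;
  std::string name_;
};

// Part one (calling thread): one span covers handing the batches over.
// Part two (async tasks): each write task opens its own span, so the
// encoding, compression, and writing work shows up in the trace as well.
inline void WriteBatches(const std::vector<int>& batches, SpanLog* log) {
  std::vector<std::future<void>> tasks;
  {
    ScopedSpan queue_span(log, "DatasetWriter::QueueBatches");
    for (int batch : batches) {
      // std::async is an assumption standing in for the I/O thread pool.
      tasks.push_back(std::async(std::launch::async, [log, batch] {
        ScopedSpan write_span(log, "DatasetWriter::WriteBatch#" + std::to_string(batch));
        // Encoding, compression, and the actual write would happen here.
      }));
    }
  }
  // Wait for every write task so all spans are recorded before returning.
  for (auto& task : tasks) task.get();
}
```

Calling `WriteBatches({1, 2}, &log)` records one queueing span plus one span per write task. A real tracing backend would additionally need each task's span linked to the span that submitted it, which this sketch leaves out.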

Component(s)

C++
