Enable segments read/published stats on Compaction task completion reports #15947
…ble-segment-stats-for-compaction
…druid/indexing/common/task/batch/parallel/GeneratedPartitionsReport.java
I tried these scenarios:
- index_parallel with no additional subtasks (wikipedia): the task report has segmentsRead: null and segmentsPublished: null
- index_parallel with more than 1 subtask (kttm1): the task report has segmentsRead: null and segmentsPublished: null in the subtask, and segmentsRead: 0 and segmentsPublished: 1 in the index_parallel task
- Compaction task with no subtasks (kttm1): the task report has segmentsRead: null and segmentsPublished: null
- Compaction task with subtasks (wikipedia): no reports for the subtasks, but the compactTask has segmentsRead: 1 and segmentsPublished: 1
- index_kafka on kttm: the report has segmentsRead: null and segmentsPublished: null
Since this change is meant to be scoped to just compaction tasks, I am -1 on its current implementation. Many other task reports now have the segmentsRead and segmentsPublished fields added and return null for them, and index_parallel tasks have a report that says they read 0 segments (while true, that's a little confusing).
If these fields did not show up in the task reports where they are not meant to, I think the change would be good.
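A minimal sketch of one way to keep the counters out of unrelated reports, assuming the compaction-only fields live in a subclass of the shared report payload. The class names BaseReportData and CompactionReportData and the toReportFields method are hypothetical stand-ins for illustration, not Druid's actual classes or serialization path.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the report payload shared by all task types.
class BaseReportData {
    final String ingestionState;

    BaseReportData(String ingestionState) {
        this.ingestionState = ingestionState;
    }

    // Fields that would end up in the serialized task report.
    Map<String, Object> toReportFields() {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("ingestionState", ingestionState);
        return fields;
    }
}

// Hypothetical compaction-only payload: the only place the two counters exist,
// so other task types never emit them (not even as null).
class CompactionReportData extends BaseReportData {
    final long segmentsRead;
    final long segmentsPublished;

    CompactionReportData(String ingestionState, long segmentsRead, long segmentsPublished) {
        super(ingestionState);
        this.segmentsRead = segmentsRead;
        this.segmentsPublished = segmentsPublished;
    }

    @Override
    Map<String, Object> toReportFields() {
        Map<String, Object> fields = super.toReportFields();
        fields.put("segmentsRead", segmentsRead);
        fields.put("segmentsPublished", segmentsPublished);
        return fields;
    }
}

class ReportFieldsDemo {
    public static void main(String[] args) {
        // A non-compaction report carries no segment counters at all.
        System.out.println(new BaseReportData("COMPLETED").toReportFields());
        // A compaction report carries both counters.
        System.out.println(new CompactionReportData("COMPLETED", 1, 1).toReportFields());
    }
}
```

With this shape, an index_parallel or index_kafka report built from the base payload has no segmentsRead or segmentsPublished key, which is the behavior asked for above.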
suneet-s
left a comment
Looks better! The only other thing I noticed is that index_parallel jobs with no subtasks, which used to have task reports before this change, no longer show a task report after it. Can you look into that issue, please?
Here is the fix: #16042
kfaraz
left a comment
Thanks for adding the new fields, @adithyachakilam. I have left some comments with suggestions that would be more in line with the existing report class structure.
kfaraz
left a comment
Thanks for the changes, @adithyachakilam! Requested minor changes, otherwise looks good.
kfaraz
left a comment
LGTM, thanks for addressing the comments.
Description
This adds visibility into the number of segments read/published by each parallel compaction task.
This PR adds a new subclass for IngestionStatsAndErrorsTaskReportData to take the new fields, and updates ParallelIndexSupervisorTask to populate the segments read/published in the different parallel compaction algorithms.
Release note
Parallel compaction task completion reports now have segmentsRead and segmentsPublished fields to see how effective a compaction task is.
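As a rough illustration of how the two fields could be read to gauge effectiveness, here is a small sketch that computes the ratio of published to read segments. The CompactionEffectiveness class, the publishedToReadRatio helper, and the choice of this ratio as the metric are assumptions made for the example; the PR itself only adds the two counters to the report.

```java
import java.util.OptionalDouble;

// Hypothetical helper, not part of Druid.
class CompactionEffectiveness {
    // Returns segmentsPublished / segmentsRead. The result is empty when either
    // counter is absent from the report (null) or when no segments were read.
    static OptionalDouble publishedToReadRatio(Long segmentsRead, Long segmentsPublished) {
        if (segmentsRead == null || segmentsPublished == null || segmentsRead <= 0) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of(segmentsPublished.doubleValue() / segmentsRead);
    }

    public static void main(String[] args) {
        // Example values only: 4 segments read, 1 segment published.
        OptionalDouble ratio = publishedToReadRatio(4L, 1L);
        System.out.println(ratio.isPresent() ? "ratio = " + ratio.getAsDouble() : "ratio unavailable");
    }
}
```

A lower ratio means more input segments were merged into fewer output segments. Returning an empty value for missing counters keeps reports from non-compaction tasks from being read as "zero segments".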
Key changed/added classes in this PR
- ParallelIndexSupervisorTask.java
- IngestionStatsAndErrorsTaskReportData.java
This PR has: