Skip to content

Add logs and metrics for append segments upgraded by a concurrent replace task#16563

Closed
kfaraz wants to merge 9 commits intoapache:masterfrom
kfaraz:add_segment_upgrade_metrics
Closed

Add logs and metrics for append segments upgraded by a concurrent replace task#16563
kfaraz wants to merge 9 commits intoapache:masterfrom
kfaraz:add_segment_upgrade_metrics

Conversation

@kfaraz
Copy link
Copy Markdown
Contributor

@kfaraz kfaraz commented Jun 6, 2024

Changes

  • Add metrics segment/upgraded/count and segment/upgradedRealtime/count
    emitted with dimensions taskId, dataSource, taskType, groupId, interval, etc.
  • Add new logs and cleanup existing ones
  • Minor method renames

Copy link
Copy Markdown
Contributor

@AmatyaAvadhanula AmatyaAvadhanula left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @kfaraz. These metrics will be very helpful in monitoring concurrent append and replace.
I have a minor comment about changing realtime to pending in the metric name and description.
Also, do you think that we should also add a metric in SegmentTransactionalAppendAction whenever an upgrade segments entry is created because of a segment append within an interval locked by a REPLACE lock?

Comment thread docs/operations/metrics.md Outdated
|`segment/moved/bytes`|Size in bytes of segments moved/archived via the Move Task.| `dataSource`, `taskId`, `taskType`, `groupId`, `interval`, `tags`|Varies|
|`segment/nuked/bytes`|Size in bytes of segments deleted via the Kill Task.| `dataSource`, `taskId`, `taskType`, `groupId`, `interval`, `tags`|Varies|
| `segment/upgraded/count`| Number of published segments upgraded by a replace task.|`dataSource`, `taskId`, `taskType`, `groupId`, `interval`, `tags`|Varies|
| `segment/upgradedRealtime/count`| Number of realtime segments upgraded by a replace task.|`dataSource`, `taskId`, `taskType`, `groupId`, `interval`, `tags`|Varies|
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename this to pendingSegment/upgraded/count or something similar.
Upgraded pending segments need not correspond only to realtime tasks.

@Nullable
private final String errorMsg;
@Nullable
private final List<DataSegment> upgradedAppendSegments;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just upgradedSegments seems sufficient.

);
versionToNumUpgradedSegments.forEach(
(version, numUpgradedSegments) -> log.info(
"Task[%s] of datasource[%s] upgraded [%d] append segments to replace version[%s].",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... upgraded [%d] segments ...

Comment thread docs/operations/metrics.md Outdated
Comment thread docs/operations/metrics.md Outdated
Comment thread docs/operations/metrics.md Outdated
Comment on lines +2452 to +2455
final List<List<DataSegment>> segmentsLists = Lists.partition(
new ArrayList<>(segments),
SEGMENT_INSERT_BATCH_SIZE
);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think it is within the line limit.

Suggested change
final List<List<DataSegment>> segmentsLists = Lists.partition(
new ArrayList<>(segments),
SEGMENT_INSERT_BATCH_SIZE
);
final List<List<DataSegment>> segmentsLists = Lists.partition(new ArrayList<>(segments), SEGMENT_INSERT_BATCH_SIZE);

Copy link
Copy Markdown
Contributor Author

@kfaraz kfaraz Jun 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The soft limit for better readability is actually 80 chars.
120 chars is the hard limit which we should not exceed.


Obviously, all of this is subjective and depends on personal coding style.
But generally, I too avoid breaking lines in method/constructor calls if

  • they can be fit "comfortably" in a single line
  • they are not difficult to follow

The end goal is to have easily readable code.

So I would prefer

performAction(arg1, arg2, arg3, arg4, arg5);

over

performAction(
    arg1,
    arg2,
    arg3,
    arg4,
    arg5
);    

but NOT

performAction(computeFirstArg(), object.getSecondArg(), arg3, arg4, thirdArgIfNonNull());

over

performAction(
    computeFirstArg(),
    object.getSecondArg(),
    arg3,
    arg4,
    thirdArgIfNonNull()
);

For the case at hand, we have two options:

Option 1:

final List<List<DataSegment>> segmentsLists = Lists.partition(
        new ArrayList<>(segments),
        SEGMENT_INSERT_BATCH_SIZE
);

Option 2:

final List<List<DataSegment>> segmentsLists = Lists.partition(new ArrayList<>(segments), SEGMENT_INSERT_BATCH_SIZE);

I feel Option 1 here reads better for 2 reasons:

  • Option 2 is definitely too long, it's right at the 120 char limit.
  • Option 2 is a little more complex cognitively, it has too much stuff happening in a single line.

Hope you find that helpful (and not too verbose!). 🙂

Co-authored-by: Vishesh Garg <vishesh.garg@imply.io>
@kfaraz
Copy link
Copy Markdown
Contributor Author

kfaraz commented Jun 10, 2024

Thanks for the suggestions, @AmatyaAvadhanula ! Let me try to rename the added metrics and add another one for the transactional append action.

@github-actions
Copy link
Copy Markdown

This pull request has been marked as stale due to 60 days of inactivity.
It will be closed in 4 weeks if no further activity occurs. If you think
that's incorrect or this pull request should instead be reviewed, please simply
write any comment. Even if closed, you can still revive the PR at any time or
discuss it on the dev@druid.apache.org list.
Thank you for your contributions.

@github-actions github-actions Bot added the stale label Sep 12, 2024
@kfaraz kfaraz removed the stale label Sep 23, 2024
@github-actions
Copy link
Copy Markdown

This pull request has been marked as stale due to 60 days of inactivity.
It will be closed in 4 weeks if no further activity occurs. If you think
that's incorrect or this pull request should instead be reviewed, please simply
write any comment. Even if closed, you can still revive the PR at any time or
discuss it on the dev@druid.apache.org list.
Thank you for your contributions.

@github-actions github-actions Bot added the stale label Nov 23, 2024
@github-actions
Copy link
Copy Markdown

This pull request/issue has been closed due to lack of activity. If you think that
is incorrect, or the pull request requires review, you can revive the PR at any time.

@github-actions github-actions Bot closed this Dec 22, 2024
@kfaraz kfaraz deleted the add_segment_upgrade_metrics branch May 2, 2025 07:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants