Add logs and metrics for append segments upgraded by a concurrent replace task #16563
kfaraz wants to merge 9 commits into apache:master from
Conversation
AmatyaAvadhanula
left a comment
Thanks @kfaraz. These metrics will be very helpful in monitoring concurrent append and replace.
I have a minor comment about changing realtime to pending in the metric name and description.
Also, do you think we should add a metric in SegmentTransactionalAppendAction whenever an upgrade segments entry is created because a segment is appended within an interval locked by a REPLACE lock?
|`segment/moved/bytes`|Size in bytes of segments moved/archived via the Move Task.|`dataSource`, `taskId`, `taskType`, `groupId`, `interval`, `tags`|Varies|
|`segment/nuked/bytes`|Size in bytes of segments deleted via the Kill Task.|`dataSource`, `taskId`, `taskType`, `groupId`, `interval`, `tags`|Varies|
|`segment/upgraded/count`|Number of published segments upgraded by a replace task.|`dataSource`, `taskId`, `taskType`, `groupId`, `interval`, `tags`|Varies|
|`segment/upgradedRealtime/count`|Number of realtime segments upgraded by a replace task.|`dataSource`, `taskId`, `taskType`, `groupId`, `interval`, `tags`|Varies|
Please rename this to pendingSegment/upgraded/count or something similar.
Upgraded pending segments need not correspond only to realtime tasks.
    @Nullable
    private final String errorMsg;
    @Nullable
    private final List<DataSegment> upgradedAppendSegments;
Just upgradedSegments seems sufficient.
    );
    versionToNumUpgradedSegments.forEach(
        (version, numUpgradedSegments) -> log.info(
            "Task[%s] of datasource[%s] upgraded [%d] append segments to replace version[%s].",
... upgraded [%d] segments ...
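For illustration, here is a minimal sketch of the per-version counting and the log line with the wording suggested above. It is not the PR's code: the `Segment` class is a hypothetical stand-in for `DataSegment`, `countByVersion` and `formatLogLine` are made-up helper names, and `String.format` stands in for the logger call.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class UpgradedSegmentLogSketch
{
  /** Hypothetical stand-in for DataSegment, with only the fields this sketch needs. */
  static final class Segment
  {
    final String id;
    final String version;

    Segment(String id, String version)
    {
      this.id = id;
      this.version = version;
    }
  }

  /** Counts upgraded segments per replace version, keeping first-seen order. */
  static Map<String, Integer> countByVersion(List<Segment> upgradedSegments)
  {
    final Map<String, Integer> versionToNumUpgradedSegments = new LinkedHashMap<>();
    for (Segment segment : upgradedSegments) {
      versionToNumUpgradedSegments.merge(segment.version, 1, Integer::sum);
    }
    return versionToNumUpgradedSegments;
  }

  /** Builds the log line using the shorter "upgraded [%d] segments" wording. */
  static String formatLogLine(String taskId, String dataSource, int numUpgradedSegments, String version)
  {
    return String.format(
        "Task[%s] of datasource[%s] upgraded [%d] segments to replace version[%s].",
        taskId,
        dataSource,
        numUpgradedSegments,
        version
    );
  }

  public static void main(String[] args)
  {
    // Assumed sample data: three upgraded segments across two replace versions.
    final List<Segment> upgradedSegments = List.of(
        new Segment("seg_1", "v1"),
        new Segment("seg_2", "v2"),
        new Segment("seg_3", "v2")
    );
    countByVersion(upgradedSegments).forEach(
        (version, count) -> System.out.println(formatLogLine("task_1", "wiki", count, version))
    );
  }
}
```

Logging one line per version rather than one per segment keeps the output short even when a replace task upgrades many segments.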
… into add_segment_upgrade_metrics
    final List<List<DataSegment>> segmentsLists = Lists.partition(
        new ArrayList<>(segments),
        SEGMENT_INSERT_BATCH_SIZE
    );
I think this fits within the line limit.
    - final List<List<DataSegment>> segmentsLists = Lists.partition(
    -     new ArrayList<>(segments),
    -     SEGMENT_INSERT_BATCH_SIZE
    - );
    + final List<List<DataSegment>> segmentsLists = Lists.partition(new ArrayList<>(segments), SEGMENT_INSERT_BATCH_SIZE);
The soft limit for better readability is actually 80 chars.
120 chars is the hard limit which we should not exceed.
Obviously, all of this is subjective and depends on personal coding style.
But generally, I too avoid breaking lines in method/constructor calls if
- they can be fit "comfortably" in a single line
- they are not difficult to follow
The end goal is to have easily readable code.
So I would prefer

    performAction(arg1, arg2, arg3, arg4, arg5);

over

    performAction(
        arg1,
        arg2,
        arg3,
        arg4,
        arg5
    );

but NOT

    performAction(computeFirstArg(), object.getSecondArg(), arg3, arg4, thirdArgIfNonNull());

over

    performAction(
        computeFirstArg(),
        object.getSecondArg(),
        arg3,
        arg4,
        thirdArgIfNonNull()
    );

For the case at hand, we have two options:

Option 1:

    final List<List<DataSegment>> segmentsLists = Lists.partition(
        new ArrayList<>(segments),
        SEGMENT_INSERT_BATCH_SIZE
    );

Option 2:

    final List<List<DataSegment>> segmentsLists = Lists.partition(new ArrayList<>(segments), SEGMENT_INSERT_BATCH_SIZE);

I feel Option 1 here reads better for 2 reasons:
- Option 2 is definitely too long, it's right at the 120 char limit.
- Option 2 is a little more complex cognitively, it has too much stuff happening in a single line.
Hope you find that helpful (and not too verbose!). 🙂
Co-authored-by: Vishesh Garg <vishesh.garg@imply.io>
Thanks for the suggestions, @AmatyaAvadhanula! Let me try to rename the added metrics and add another one for the transactional append action.
This pull request has been marked as stale due to 60 days of inactivity.
This pull request/issue has been closed due to lack of activity. If you think that
Changes
`segment/upgraded/count` and `segment/upgradedRealtime/count` emitted with dimensions
`taskId`, `dataSource`, `taskType`, `groupId`, `interval`, etc.
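
As a rough sketch of the shape of such an event, the snippet below builds a metric with the dimensions listed above. `MetricEvent` and `buildEvent` are hypothetical names invented for this example and are not Druid's emitter API. The metric names are the ones in this description, which may change given the rename requested in review.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

public class UpgradeMetricSketch
{
  /** Hypothetical event holder; a real task would go through Druid's emitter instead. */
  static final class MetricEvent
  {
    final String metric;
    final long value;
    final Map<String, String> dimensions;

    MetricEvent(String metric, long value, Map<String, String> dimensions)
    {
      this.metric = metric;
      this.value = value;
      this.dimensions = dimensions;
    }
  }

  /** Attaches the task-level dimensions named in the description to a count metric. */
  static MetricEvent buildEvent(
      String metric,
      long value,
      String dataSource,
      String taskId,
      String taskType,
      String groupId,
      String interval
  )
  {
    final Map<String, String> dimensions = new LinkedHashMap<>();
    dimensions.put("dataSource", dataSource);
    dimensions.put("taskId", taskId);
    dimensions.put("taskType", taskType);
    dimensions.put("groupId", groupId);
    dimensions.put("interval", interval);
    return new MetricEvent(metric, value, Collections.unmodifiableMap(dimensions));
  }

  public static void main(String[] args)
  {
    // Assumed sample values, for illustration only.
    final MetricEvent event = buildEvent(
        "segment/upgraded/count",
        3,
        "wiki",
        "task_1",
        "compact",
        "group_1",
        "2024-01-01/2024-01-02"
    );
    System.out.println(event.metric + " = " + event.value + " " + event.dimensions);
  }
}
```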