-
Notifications
You must be signed in to change notification settings - Fork 3.7k
Offloader metrics #13833
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Offloader metrics #13833
Conversation
…oader-metrics-prometheus
|
TRB, tests will be complete later. |
@tjiuming - the content of this PR description is copied from #11555. Is the content of the PR also copied from that PR? Several committers had requested changes on that PR, so it'd be helpful to understand how this PR compares to #11555. I see that @codelipenghui just closed that PR a couple days ago. @frankxieke, were you no longer planning on finishing #11555? |
#11555 is not finished, @frankxieke leaved streamnative, so I take over this issue and fix existing problems. @michaeljmarshall |
eolivelli
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall the patch LGTM
but I left one comment about topics unloading, please take a look
managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/LedgerOffloaderStatsImpl.java
Show resolved
Hide resolved
managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/LedgerOffloaderStatsImpl.java
Outdated
Show resolved
Hide resolved
eolivelli
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm
| this.brokerClientSharedTimer = | ||
| new HashedWheelTimer(new DefaultThreadFactory("broker-client-shared-timer"), 1, TimeUnit.MILLISECONDS); | ||
|
|
||
| int interval = config.getManagedLedgerStatsPeriodSeconds(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if interval is set to <=0 ? Does that disable stats?
|
|
||
| static LedgerOffloaderStats create(boolean exposeManagedLedgerStats, boolean exposeTopicLevelMetrics, | ||
| ScheduledExecutorService scheduler, int interval) { | ||
| if (!exposeManagedLedgerStats) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should also contain the condition || interval <= 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally speaking, the value of managedLedgerStatsPeriodSeconds always > 0(default value is 60000), but I can create another PR to fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally speaking, the value of managedLedgerStatsPeriodSeconds always > 0(default value is 60000), but I can create another PR to fix this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to address in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh this was already merged, so please address in a new PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tjiuming did you follow up on this comment ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eolivelli interval actually is managedLedgerStatsPeriodSeconds in ServiceConfiguration, and managedLedgerStatsPeriodSeconds is annotated by @FieldContext(minValue = 1). So, interval always >= 1
### Motivation Currently, there is no offload metrics for tiered storage, so it is very hard for us to debug the performance issues. For example , we can not find why offload is slow or why read offload is slow. For above reasons. we need to add some offload metrics for monitoring. ### Modifications Add metrics during offload procedure and read offload data procedure. Including offloadTime, offloadError, offloadRate, readLedgerLatency, writeStoreLatency, writeStoreError, readOffloadIndexLatency, readOffloadDataLatency, readOffloadRate, readOffloadError.
### Motivation Currently, there is no offload metrics for tiered storage, so it is very hard for us to debug the performance issues. For example , we can not find why offload is slow or why read offload is slow. For above reasons. we need to add some offload metrics for monitoring. ### Modifications Add metrics during offload procedure and read offload data procedure. Including offloadTime, offloadError, offloadRate, readLedgerLatency, writeStoreLatency, writeStoreError, readOffloadIndexLatency, readOffloadDataLatency, readOffloadRate, readOffloadError.
### Motivation Currently, there is no offload metrics for tiered storage, so it is very hard for us to debug the performance issues. For example , we can not find why offload is slow or why read offload is slow. For above reasons. we need to add some offload metrics for monitoring. ### Modifications Add metrics during offload procedure and read offload data procedure. Including offloadTime, offloadError, offloadRate, readLedgerLatency, writeStoreLatency, writeStoreError, readOffloadIndexLatency, readOffloadDataLatency, readOffloadRate, readOffloadError. (cherry picked from commit 732049f)
Currently, there is no offload metrics for tiered storage, so it is very hard for us to debug the performance issues. For example , we can not find why offload is slow or why read offload is slow. For above reasons. we need to add some offload metrics for monitoring. Add metrics during offload procedure and read offload data procedure. Including offloadTime, offloadError, offloadRate, readLedgerLatency, writeStoreLatency, writeStoreError, readOffloadIndexLatency, readOffloadDataLatency, readOffloadRate, readOffloadError. (cherry picked from commit 732049f)
Currently, there is no offload metrics for tiered storage, so it is very hard for us to debug the performance issues. For example , we can not find why offload is slow or why read offload is slow. For above reasons. we need to add some offload metrics for monitoring. Add metrics during offload procedure and read offload data procedure. Including offloadTime, offloadError, offloadRate, readLedgerLatency, writeStoreLatency, writeStoreError, readOffloadIndexLatency, readOffloadDataLatency, readOffloadRate, readOffloadError. (cherry picked from commit 732049f)
### Motivation Currently, there is no offload metrics for tiered storage, so it is very hard for us to debug the performance issues. For example , we can not find why offload is slow or why read offload is slow. For above reasons. we need to add some offload metrics for monitoring. ### Modifications Add metrics during offload procedure and read offload data procedure. Including offloadTime, offloadError, offloadRate, readLedgerLatency, writeStoreLatency, writeStoreError, readOffloadIndexLatency, readOffloadDataLatency, readOffloadRate, readOffloadError. (cherry picked from commit 732049f)
Currently, there is no offload metrics for tiered storage, so it is very hard for us to debug the performance issues. For example , we can not find why offload is slow or why read offload is slow. For above reasons. we need to add some offload metrics for monitoring. Add metrics during offload procedure and read offload data procedure. Including offloadTime, offloadError, offloadRate, readLedgerLatency, writeStoreLatency, writeStoreError, readOffloadIndexLatency, readOffloadDataLatency, readOffloadRate, readOffloadError. (cherry picked from commit 732049f)
Currently, there is no offload metrics for tiered storage, so it is very hard for us to debug the performance issues. For example , we can not find why offload is slow or why read offload is slow. For above reasons. we need to add some offload metrics for monitoring. Add metrics during offload procedure and read offload data procedure. Including offloadTime, offloadError, offloadRate, readLedgerLatency, writeStoreLatency, writeStoreError, readOffloadIndexLatency, readOffloadDataLatency, readOffloadRate, readOffloadError. (cherry picked from commit 732049f)
### Motivation Currently, there is no offload metrics for tiered storage, so it is very hard for us to debug the performance issues. For example , we can not find why offload is slow or why read offload is slow. For above reasons. we need to add some offload metrics for monitoring. ### Modifications Add metrics during offload procedure and read offload data procedure. Including offloadTime, offloadError, offloadRate, readLedgerLatency, writeStoreLatency, writeStoreError, readOffloadIndexLatency, readOffloadDataLatency, readOffloadRate, readOffloadError. (cherry picked from commit 732049f) Co-authored-by: Tao Jiuming <95597048+tjiuming@users.noreply.github.com>
* [ManagedLedger] Make 'StatsPeroidSeconds' configurable (apache#11584) Make `StatsPeroidSeconds` configurable. - Move ‘StatsPeriodSeconds’ from ManagedLedgerFactoryImpl to ManagedLedgerFactoryConfig. - Add config `managedLedgerStatsPeriodSeconds`. * Offloader metrics (apache#13833) Currently, there is no offload metrics for tiered storage, so it is very hard for us to debug the performance issues. For example , we can not find why offload is slow or why read offload is slow. For above reasons. we need to add some offload metrics for monitoring. Add metrics during offload procedure and read offload data procedure. Including offloadTime, offloadError, offloadRate, readLedgerLatency, writeStoreLatency, writeStoreError, readOffloadIndexLatency, readOffloadDataLatency, readOffloadRate, readOffloadError. (cherry picked from commit 732049f) Co-authored-by: GuoJiwei <technoboy@apache.org> Co-authored-by: Tao Jiuming <95597048+tjiuming@users.noreply.github.com>
* [ManagedLedger] Make 'StatsPeroidSeconds' configurable (apache#11584) Make `StatsPeroidSeconds` configurable. - Move ‘StatsPeriodSeconds’ from ManagedLedgerFactoryImpl to ManagedLedgerFactoryConfig. - Add config `managedLedgerStatsPeriodSeconds`. (cherry picked from commit ddeae65) * Offloader metrics (apache#13833) Currently, there is no offload metrics for tiered storage, so it is very hard for us to debug the performance issues. For example , we can not find why offload is slow or why read offload is slow. For above reasons. we need to add some offload metrics for monitoring. Add metrics during offload procedure and read offload data procedure. Including offloadTime, offloadError, offloadRate, readLedgerLatency, writeStoreLatency, writeStoreError, readOffloadIndexLatency, readOffloadDataLatency, readOffloadRate, readOffloadError. (cherry picked from commit 732049f) Co-authored-by: GuoJiwei <technoboy@apache.org> Co-authored-by: Tao Jiuming <95597048+tjiuming@users.noreply.github.com>
After #13833, we change the signature of LedgerOffloaderFactory#create. But some users may use the old version LedgerOffloaderFactory in the offloader nar, when pulsar load the offloader nar, it will use the LedgerOffloaderFactory in the offloader nar, the LedgerOffloaderFactory#create has no param offloaderStats. Modifications: Make LedgerOffloaderFactory can load the old nar.
After #13833, we change the signature of LedgerOffloaderFactory#create. But some users may use the old version LedgerOffloaderFactory in the offloader nar, when pulsar load the offloader nar, it will use the LedgerOffloaderFactory in the offloader nar, the LedgerOffloaderFactory#create has no param offloaderStats. Modifications: Make LedgerOffloaderFactory can load the old nar.
Motivation
Currently, there is no offload metrics for tiered storage, so it is very hard for us to debug the performance issues. For example , we can not find why offload is slow or why read offload is slow. For above reasons. we need to add some offload metrics for monitoring.
Modifications
Add metrics during offload procedure and read offload data procedure. Including offloadTime, offloadError, offloadRate, readLedgerLatency, writeStoreLatency, writeStoreError, readOffloadIndexLatency, readOffloadDataLatency, readOffloadRate, readOffloadError.
Does this pull request potentially affect one of the following parts:
If
yeswas chosen, please highlight the changesDocumentation
Check the box below or label this PR directly (if you have committer privilege).
Need to update docs?
doc-required(If you need help on updating docs, create a doc issue)
no-need-doc(Please explain why)
doc(If this PR contains doc changes)