Add deep storage segment metric#16072
Conversation
After this patch, will such segments be reported as available or unavailable? |
| return datasourceToUnavailableSegments; | ||
| } | ||
|
|
||
| public Object2IntMap<String> getDatasourceToDeepStorageQueryOnlySegmentCount() |
There was a problem hiding this comment.
Nit: Some of the code in this method may be commoned out with getDatasourceToUnavailableSegmentCount.
| |`segment/unneededEternityTombstone/count`|Number of non-overshadowed eternity tombstones marked as unused.| |Varies| | ||
| |`segment/unavailable/count`|Number of unique segments left to load until all used segments are available for queries.|`dataSource`|0| | ||
| |`segment/underReplicated/count`|Number of segments, including replicas, left to load until all used segments are available for queries.|`tier`, `dataSource`|0| | ||
| |`segment/deepStorageOnly/count`|Number of unique segments that are only available for querying directly from deep storage.|`dataSource`|Varies| |
There was a problem hiding this comment.
How about the metric name deepStorageQueryOnly or better yet availableDeepStorageOnly as it aligns better with unavailable/count?
that should say "not be reported as unavailable", my bad. i think they should not be reported as unavailable b/c that metric is supposed to be 0 during steady state and enabling query from deep storage shouldn't change this |
|
What is the purpose of this metric? The doc says it's typical value is I can imagine tracking the size of cold data as a useful metric, but I'm not sure about the use of seeing how the count of cold segments changes over time will be helpful. Maybe some doc updates about that would help clarify the use of the metric. |
Thanks, @georgew5656 , the steady state reasoning makes sense to me. @suneet-s , one use of this metric would be to find out if the number of segments that are not available for querying using the native engine. For example, if there are some segments that are available for querying from deep storage using MSQ, they would not be counted towards the |
Addressing some followup comments from the change in the metric made in #16020
Description
If a datasource has a replicaCount=0 rule that applies to some segments that are not covered by a replicaCount>0 rule, those segments will not be reported as unavailable in segment/unavailable/count. Since they are not queryable by native queries (only using the query from deep storage feature), we want a separate metric that lets us check how many segments fit these criteria.
Fixed the bug ...
Renamed the class ...
Added a forbidden-apis entry ...
Add metric segment/deepStorage/count that counts the number of segments that are only queryable through the query from deep storage metric.
Release note
New metric to support query from deep storage feature.
Key changed/added classes in this PR
DruidCoordinatorStatsThis PR has: