KAFKA-9924: Add docs for RocksDB properties-based metrics #9895
ableegoldman merged 4 commits into apache:trunk from
Call for review: @guozhangwang @ableegoldman @lct45
| each metric reports an aggregation over the RocksDB instances of the state store.
| RocksDB Metrics are grouped into statistics-based metrics and properties-based metrics.
| The former are recorded from statistics that a RocksDB state store collects whereas the latter are recorded from
| properties that RocksDB exposes.
I'm still a little confused about the difference between the two after reading this. Maybe an example of something that is exposed by RocksDB but not collected by RocksDB would help.
I elaborated on the two types of metrics. Let me know if it is better now.
Thanks for the explanation; I also didn't understand the difference at first.
Ahh I see now, thanks for the additional info! LGTM
Co-authored-by: leah <lthomas@confluent.io>
| If a state store consists of multiple RocksDB instances, as is the case for aggregations over time and session windows,
| each metric reports the sum over all the RocksDB instances of the state store, except for the block cache metrics
| <code>block-cache-*</code>. The block cache metrics report the sum over all RocksDB instances if each instance uses its
| own block cache, and they report the recorded value from only one instance if a single block cache is shared
Technically I think it's possible for users to share a single block cache among only some stores, but not others. Or they could have two block caches shared across different state stores, etc. Would we detect this case and report the correct block cache for a given state store?
(Idk how common that pattern could possibly be, but this made me wonder)
Good point! Currently, we treat this mixed pattern as an illegal state and throw an IllegalStateException. Probably not the best way to handle it. Allowing such a mixed pattern complicates the measurement of the cache metrics. I opened KAFKA-12223 to document this.
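For readers following this thread, here is a minimal sketch of the aggregation rule described in the doc text above and of the mixed-pattern check mentioned in the reply. It is not the Kafka Streams implementation: `InstanceSample` and `BlockCacheAggregationSketch` are hypothetical names, and the assumption that "mixed" means a mix within the instances of one state store is mine.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for one RocksDB instance of a state store
// (for example, one segment of a windowed store). Not a Kafka or RocksDB class.
final class InstanceSample {
    final long blockCacheValue;    // value recorded for some block-cache-* metric
    final boolean usesSharedCache; // true if the instance shares a single block cache

    InstanceSample(long blockCacheValue, boolean usesSharedCache) {
        this.blockCacheValue = blockCacheValue;
        this.usesSharedCache = usesSharedCache;
    }
}

final class BlockCacheAggregationSketch {

    // Sum over all instances when each has its own block cache; report the value
    // of only one instance when a single block cache is shared by all of them.
    static long aggregate(List<InstanceSample> samples) {
        if (samples.isEmpty()) {
            return 0L;
        }
        boolean shared = samples.get(0).usesSharedCache;
        long sum = 0L;
        for (InstanceSample sample : samples) {
            if (sample.usesSharedCache != shared) {
                // Assumed reading of the "mixed pattern" from the reply above.
                throw new IllegalStateException(
                    "Mix of shared and per-instance block caches is not supported");
            }
            sum += sample.blockCacheValue;
        }
        return shared ? samples.get(0).blockCacheValue : sum;
    }

    public static void main(String[] args) {
        // Each instance has its own cache: values are summed (10 + 20 = 30).
        System.out.println(aggregate(Arrays.asList(
            new InstanceSample(10, false), new InstanceSample(20, false))));
        // One shared cache: every instance sees the same value, so it is reported once (20).
        System.out.println(aggregate(Arrays.asList(
            new InstanceSample(20, true), new InstanceSample(20, true))));
    }
}
```

Summing in the shared case would count the same cache once per instance, which is presumably why the docs single out the `block-cache-*` metrics.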
| </tr>
| <tr>
| <td>compaction-pending</td>
| <td>This metric reports 1 if at least one compaction is pending, otherwise it reports 0.</td>
Do you know if this applies only to a single RocksDB instance, or across all instances? By default, RocksDB has a single shared Environment (basically a thread pool) for compactions, so it seems like it could go either way. But it's ok if you don't know, I wouldn't waste hours digging through the RocksDB code to figure it out 😉
No, I do not know. I assumed that it applies only to a single RocksDB instance. I am wondering how RocksDB can share a thread pool between state stores that are started independently of each other.
I think it has to do with the Env class, which specifies the thread pool. By default it uses a static Env, so the Env (and also the underlying threads) is shared between all stores within the process. Something like that.
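To make the "static Env" idea concrete, here is a toy model of how stores that are created independently can still end up on the same thread pool. It only illustrates the general pattern: `ToyEnv` and `ToyStore` are hypothetical names, not RocksDB classes, and the real Env behaves "something like that" at best.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Toy model only: a process-wide default environment that owns a thread pool.
final class ToyEnv {
    // One static instance per process (strictly, per class loader).
    private static final ToyEnv DEFAULT = new ToyEnv();

    // Daemon threads so that the toy pool never keeps the JVM alive.
    private final ExecutorService backgroundPool = Executors.newFixedThreadPool(1, runnable -> {
        Thread thread = new Thread(runnable, "toy-background");
        thread.setDaemon(true);
        return thread;
    });

    static ToyEnv getDefault() {
        return DEFAULT;
    }

    ExecutorService backgroundPool() {
        return backgroundPool;
    }
}

// Toy model only: a store that falls back to the default environment.
final class ToyStore {
    private final String name;
    private final ToyEnv env;

    ToyStore(String name) {
        this(name, ToyEnv.getDefault()); // no explicit env: use the shared default
    }

    ToyStore(String name, ToyEnv env) {
        this.name = name;
        this.env = env;
    }

    String name() {
        return name;
    }

    ToyEnv env() {
        return env;
    }
}

final class SharedEnvSketch {
    public static void main(String[] args) {
        ToyStore first = new ToyStore("store-a");
        ToyStore second = new ToyStore("store-b");
        // Both stores were created independently but hold the same environment object.
        System.out.println(first.env() == second.env());
    }
}
```

Neither store knows about the other; they share threads only because both fall back to the same static default.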
Co-authored-by: A. Sophie Blee-Goldman <ableegoldman@gmail.com>
Merged to trunk and cherry-picked to 2.7
Document the new properties-based metrics for RocksDB

Reviewers: Leah Thomas <lthomas@confluent.io>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>
Looks like this broke the build; see #9935. I can see a number of test failures in the last Jenkins job. Let's please verify the tests before merging.
@ijuma I'm confused: I don't see any failures for the test that this broke (RocksDBMetricsTest) in the build on the last commit. I saw some test failures, but they were all unrelated. Am I looking in the wrong place?
