Skip to content

Conversation

@heesung-sohn
Copy link
Contributor

Motivation

We have the following error from websocket service when running different GCs other than G1GC. This is because websocket uses its own org.apache.pulsar.websocket.stats.JvmMetrics class, which always looks up G1GC stats, regardless of the runtime GC. I think we can replace this duplicated org.apache.pulsar.websocket.stats.JvmMetrics with the common one org.apache.pulsar.common.stats.JvmMetrics, which supports different GC metrics.

2022-05-20T10:54:29,322-0700 [pulsar-websocket-57-2] ERROR org.apache.pulsar.websocket.stats.JvmMetrics - Failed to collect GC stats: java.lang:type=GarbageCollector,name=G1 Young Generation

Modifications

  • Removed org.apache.pulsar.websocket.stats.JvmMetrics
  • Replaced org.apache.pulsar.websocket.stats.JvmMetrics dependency with org.apache.pulsar.common.stats.JvmMetrics in ProxyStats.

Behavior changes

Before

### when running websocket on G1GC
curl http://localhost:8080/admin/v2/proxy-stats/metrics/
[{"metrics":{
"jvm_gc_old_count":0,
"jvm_gc_old_pause":0,
"jvm_gc_young_count":0,
"jvm_gc_young_pause":0,
"jvm_heap_used":52622168,
"jvm_max_direct_memory":4294967296,
"jvm_max_memory":2147483648,
"jvm_thread_cnt":24,
"jvm_total_memory":2147483648},
"dimensions":{"system":"jvm"}}]


### when running websocket on ZGC
websocket.stats.JvmMetrics constantly fails to collect G1GC metrics and throws the error.

2022-05-20T15:26:00,499-0700 [pulsar-websocket-2-1] ERROR org.apache.pulsar.websocket.stats.JvmMetrics - Failed to collect GC stats: java.lang:type=GarbageCollector,name=G1 Young Generation
...

curl http://localhost:8080/admin/v2/proxy-stats/metrics/
[{"metrics":{
"jvm_gc_old_count":0,
"jvm_gc_old_pause":0,
"jvm_gc_young_count":0,
"jvm_gc_young_pause":0,
"jvm_heap_used":117440512,
"jvm_max_direct_memory":4294967296,
"jvm_max_memory":2147483648,
"jvm_thread_cnt":24,
"jvm_total_memory":2147483648},
"dimensions":{"system":"jvm"}}]

After

### when running websocket on G1GC

curl http://localhost:8080/admin/v2/proxy-stats/metrics/
[{"metrics":{
"jvm_direct_memory_used":0,
"jvm_gc_old_count":0,
"jvm_gc_old_pause":0,
"jvm_gc_young_count":0,
"jvm_gc_young_pause":0,
"jvm_heap_used":58425032,
"jvm_max_direct_memory":4294967296,
"jvm_max_memory":2147483648,
"jvm_thread_cnt":28,
"jvm_total_memory":2147483648,
"prx_default_pool_allocated":0,
"prx_default_pool_used":0},
"dimensions":{"metric":"jvm_metrics"}}]

### when running websocket on ZGC
curl http://localhost:8080/admin/v2/proxy-stats/metrics/
[{"metrics":{
"jvm_ZGC Cycles_gc_count":0,
"jvm_ZGC Cycles_gc_pause":0,
"jvm_ZGC Pauses_gc_count":0,
"jvm_ZGC Pauses_gc_pause":0,
"jvm_direct_memory_used":0,
"jvm_full_gc_count":1,
"jvm_full_gc_pause":0,
"jvm_heap_used":117440512,
"jvm_max_direct_memory":4294967296,
"jvm_max_memory":2147483648,
"jvm_thread_cnt":24,
"jvm_total_memory":2147483648,
"prx_default_pool_allocated":0,
"prx_default_pool_used":0},
"dimensions":{"metric":"jvm_metrics"}}]

Verifying this change

  • Make sure that the change passes the CI checks.

This change is a trivial rework / code cleanup without any test coverage.

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (yes / no)
  • The public API: (yes / no)
  • The schema: (yes / no / don't know)
  • The default values of configurations: (yes / no)
  • The wire protocol: (yes / no)
  • The rest endpoints: (yes / no)
  • The admin cli options: (yes / no)
  • Anything that affects deployment: (yes / no / don't know)

Documentation

Check the box below or label this PR directly.

Need to update docs?

  • doc-required
    (Your PR needs to update docs and you will update later)

  • [x ] no-need-doc
    (Please explain why)

  • doc
    (Your PR contains doc changes)

  • doc-added
    (Docs have been already added)

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label May 21, 2022
@tjiuming
Copy link
Contributor

LGTM

@tuteng
Copy link
Member

tuteng commented Oct 13, 2022

This exception was thrown in version 2.10, so consider releasing it to version 2.10.x, add label cherry-picked/branch-2.10 and release/2.10.2

@codelipenghui codelipenghui added type/bug The PR fixed a bug or issue reported a bug area/metrics and removed cherry-picked/branch-2.10 labels Oct 13, 2022
@codelipenghui
Copy link
Contributor

@tuteng I have removed the cherry-picked/branch-2.10 label. We are using this label to check if the PR has been cherry-picked. After cherry-picked, we will add this label.

codelipenghui pushed a commit that referenced this pull request Oct 13, 2022
Co-authored-by: Matteo Merli <mmerli@apache.org>
(cherry picked from commit 7eeeba1)
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request Oct 18, 2022
…e#15692)

Co-authored-by: Matteo Merli <mmerli@apache.org>
(cherry picked from commit 7eeeba1)
(cherry picked from commit c1417bf)
congbobo184 pushed a commit that referenced this pull request Dec 19, 2022
Co-authored-by: Matteo Merli <mmerli@apache.org>
(cherry picked from commit 7eeeba1)
@congbobo184 congbobo184 added release/2.9.5 cherry-picked/branch-2.9 Archived: 2.9 is end of life labels Dec 19, 2022
@heesung-sohn heesung-sohn deleted the websocket-jvm-metric-fix branch April 2, 2024 17:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/metrics cherry-picked/branch-2.9 Archived: 2.9 is end of life cherry-picked/branch-2.10 doc-not-needed Your PR changes do not impact docs release/2.9.5 release/2.10.3 type/bug The PR fixed a bug or issue reported a bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants