Add taskStatus dimension to service/heartbeat metric #17488

Merged
zachjsh merged 18 commits into apache:master from zachjsh:heartbeat-metric-taskStatus-dim
Dec 6, 2024
Contributor

zachjsh commented Nov 18, 2024

Description

Added a taskStatus dimension to the service/heartbeat metric. This can be used to detect cases where a task has been reporting its heartbeat in the same state for an unusually long time, for example a streaming task that is stuck in the paused state. For now, only tasks derived from SeekableStreamIndexTask report their taskStatus with this metric; other task types do not.
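As a rough illustration of the use case described above, the sketch below shows how a consumer of the emitted metrics might flag a task that keeps reporting the same taskStatus for too long. It is not part of this PR or of Druid: the class `StuckTaskDetector`, its method `onHeartbeat`, and the status strings in the usage note are hypothetical, and the sketch assumes each heartbeat event can be reduced to a task ID, a taskStatus value, and an event timestamp.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/**
 * Hypothetical consumer-side helper (not part of Druid). Tracks, per task,
 * the taskStatus seen on heartbeat events and when that status was first seen.
 */
public class StuckTaskDetector {

    /** The status a task last reported and the time it was first observed. */
    private static final class StatusSince {
        final String status;
        final Instant since;

        StatusSince(String status, Instant since) {
            this.status = status;
            this.since = since;
        }
    }

    private final Map<String, StatusSince> lastSeen = new HashMap<>();
    private final Duration threshold;

    public StuckTaskDetector(Duration threshold) {
        this.threshold = threshold;
    }

    /**
     * Records one heartbeat event. Returns the time spent in the current
     * status if it has reached the threshold, otherwise an empty Optional.
     */
    public Optional<Duration> onHeartbeat(String taskId, String taskStatus, Instant eventTime) {
        StatusSince previous = lastSeen.get(taskId);
        if (previous == null || !previous.status.equals(taskStatus)) {
            // First heartbeat for this task, or the status changed: restart the clock.
            lastSeen.put(taskId, new StatusSince(taskStatus, eventTime));
            return Optional.empty();
        }
        Duration inStatus = Duration.between(previous.since, eventTime);
        return inStatus.compareTo(threshold) >= 0 ? Optional.of(inStatus) : Optional.empty();
    }
}
```

With a threshold of ten minutes, for instance, feeding this helper successive heartbeats for one task that all carry an illustrative status such as `"PAUSED"` would yield a non-empty result once the heartbeats span ten minutes or more, and a heartbeat with a different status would reset the timer. In practice the same check would more likely be expressed as an alert rule in whatever system receives the metrics.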

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

Contributor

kfaraz left a comment

Left some suggestions, similar to what had been suggested in an older PR #17268 (comment).

Comment thread docs/operations/metrics.md Outdated
Comment thread indexing-service/src/main/java/org/apache/druid/indexing/common/task/Task.java Outdated
Comment thread services/src/main/java/org/apache/druid/cli/CliPeon.java Outdated
zachjsh requested a review from kfaraz on December 4, 2024 20:26
Contributor

kfaraz left a comment

Thanks for the update, @zachjsh. I have left some suggestions/queries.

Comment thread services/src/main/java/org/apache/druid/cli/CliPeon.java Outdated
zachjsh requested a review from kfaraz on December 5, 2024 20:36
Contributor

kfaraz left a comment

Looks good, minor non-blocking suggestions.

Comment thread services/src/main/java/org/apache/druid/cli/CliPeon.java Outdated
zachjsh merged commit 3b6a3ae into apache:master on Dec 6, 2024
zachjsh deleted the heartbeat-metric-taskStatus-dim branch on December 6, 2024 22:19
Akshat-Jain added a commit to Akshat-Jain/druid that referenced this pull request Dec 18, 2024
@adarshsanjeev adarshsanjeev added this to the 32.0.0 milestone Jan 16, 2025

3 participants