Add maximumCapacity to taskRunner#17107

Merged
georgew5656 merged 3 commits into apache:master from
georgew5656:maximumCapacityResponse
Oct 7, 2024

@georgew5656
Contributor

Move the logic for calculating maximumCapacity (total capacity based on max workers from autoscaling) into the task runners.

Description

The /totalWorkerCapacity endpoint can return inconsistent responses when mmless ingestion is enabled and the overlord dynamic.autoscaler config is set. This happens because TaskQueryTool.getTotalWorkerCapacity gets totalCapacity from the overlord's task runner but reads maximumCapacity directly from the dynamic config.

I think it makes sense to expose a getMaximumCapacity method on TaskRunner that defaults to -1 (the default value of maximumCapacity) and is overridden by the task runner implementations.
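
A minimal sketch of that shape, assuming a simplified interface. SketchTaskRunner, FixedCapacityRunner, and AutoscalingRunner are hypothetical names and not Druid classes. Computing the maximum as max workers times slots per worker is an assumption made for illustration only.

```java
// Hypothetical, simplified stand-in for the TaskRunner interface; not the actual Druid code.
interface SketchTaskRunner
{
  /** Capacity currently available across running workers. */
  int getTotalCapacity();

  /**
   * Capacity if autoscaling reached its worker limit.
   * Returns -1 when the runner has no notion of a maximum.
   */
  default int getMaximumCapacity()
  {
    return -1;
  }
}

// A runner without autoscaling simply inherits the -1 default.
class FixedCapacityRunner implements SketchTaskRunner
{
  private final int slots;

  FixedCapacityRunner(int slots)
  {
    this.slots = slots;
  }

  @Override
  public int getTotalCapacity()
  {
    return slots;
  }
}

// A runner with autoscaling overrides the default.
// Assumption: maximum capacity = max workers * slots per worker.
class AutoscalingRunner implements SketchTaskRunner
{
  private final int currentWorkers;
  private final int maxWorkers;
  private final int slotsPerWorker;

  AutoscalingRunner(int currentWorkers, int maxWorkers, int slotsPerWorker)
  {
    this.currentWorkers = currentWorkers;
    this.maxWorkers = maxWorkers;
    this.slotsPerWorker = slotsPerWorker;
  }

  @Override
  public int getTotalCapacity()
  {
    return currentWorkers * slotsPerWorker;
  }

  @Override
  public int getMaximumCapacity()
  {
    return maxWorkers * slotsPerWorker;
  }
}
```

With a default method, runners that have no autoscaler need no changes, and callers can treat -1 as "no maximum known".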

Fixed the bug ...

Renamed the class ...

Added a forbidden-apis entry ...

To move the getMaximumCapacity logic into each TaskRunner, I had to duplicate code between HttpRemoteTaskRunner and RemoteTaskRunner, since both use the same logic (checking the overlord dynamic configuration). I think this is acceptable so that each task runner is responsible for its own maximum capacity value.
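
As a rough illustration of what "responsible for their own max capacity values" means for a caller, the sketch below builds a capacity response from the runner alone, with no separate read of the dynamic config. CapacitySource, CapacityResponse, and CapacityQuery are hypothetical names invented for this example; the real TaskQueryTool and response types may look different.

```java
// Hypothetical minimal runner contract used only by this sketch.
interface CapacitySource
{
  int getTotalCapacity();

  int getMaximumCapacity();
}

// Hypothetical holder for the two values reported to the client.
final class CapacityResponse
{
  final int totalCapacity;
  final int maximumCapacity;

  CapacityResponse(int totalCapacity, int maximumCapacity)
  {
    this.totalCapacity = totalCapacity;
    this.maximumCapacity = maximumCapacity;
  }
}

final class CapacityQuery
{
  private CapacityQuery()
  {
  }

  // Both values come from the same runner, so they cannot disagree about
  // which kind of runner is active.
  static CapacityResponse fromRunner(CapacitySource runner)
  {
    return new CapacityResponse(runner.getTotalCapacity(), runner.getMaximumCapacity());
  }
}
```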

Release note

  • TaskRunner exposes a getMaximumCapacity method

Key changed/added classes in this PR
  • TaskRunner
  • HttpRemoteTaskRunner/RemoteTaskRunner
  • KubernetesAndWorkerTaskRunner/KubernetesTaskRunner
  • TaskQueryTool

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

Contributor

@kfaraz kfaraz left a comment

Overall looks good, left some minor suggestions.

Comment thread indexing-service/src/main/java/org/apache/druid/indexing/overlord/TaskRunner.java Outdated
Comment thread indexing-service/src/main/java/org/apache/druid/indexing/overlord/TaskRunner.java Outdated
}

@Override
public int getMaximumCapacity()
Contributor

Maybe add a Javadoc here listing the cases in which this method returns -1.

Contributor

I wonder if this method body should be factored out and shared between RemoteTaskRunner and HttpRemoteTaskRunner.

Contributor Author

Were you thinking of introducing a new class to do this?
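
One possible shape for such a shared class, sketched under assumptions rather than taken from the PR: a small static helper that both runners could call. AutoscalerLimits and MaximumCapacityHelper are hypothetical names, and the conditions under which -1 is returned here (no autoscaler settings, or a non-positive per-worker slot count) are illustrative guesses, not the merged behavior.

```java
// Hypothetical view of the autoscaler settings a runner reads from dynamic config.
final class AutoscalerLimits
{
  final int maxWorkers;
  final int slotsPerWorker;

  AutoscalerLimits(int maxWorkers, int slotsPerWorker)
  {
    this.maxWorkers = maxWorkers;
    this.slotsPerWorker = slotsPerWorker;
  }
}

// Hypothetical shared helper so two runners do not repeat the same method body.
final class MaximumCapacityHelper
{
  private MaximumCapacityHelper()
  {
  }

  /**
   * Returns -1 when no autoscaler limits are configured (null) or when the
   * per-worker slot count is not positive; otherwise maxWorkers * slotsPerWorker.
   */
  static int compute(AutoscalerLimits limits)
  {
    if (limits == null || limits.slotsPerWorker <= 0) {
      return -1;
    }
    return limits.maxWorkers * limits.slotsPerWorker;
  }
}
```

The trade-off is the one raised in the description: a helper removes the duplication, while keeping the logic inline leaves each runner fully in charge of its own value.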

@georgew5656 georgew5656 requested a review from kfaraz October 1, 2024 14:48
@georgew5656 georgew5656 merged commit 5d7c7a8 into apache:master Oct 7, 2024
@adarshsanjeev adarshsanjeev added this to the 32.0.0 milestone Jan 16, 2025