Improve lag-based autoscaler config persistence #18745

Merged
kfaraz merged 9 commits into apache:master from Fly-Style:fix/70147-autoscaler-persisted-cfg on Dec 12, 2025

Conversation

@Fly-Style (Contributor) commented Nov 14, 2025

  • When a supervisor is updated via the API, the effective task count is resolved in this order:
    provided taskCountStart > provided taskCount > existing taskCount > provided taskCountMin.
Key changed/added classes in this PR
  • SupervisorManager
  • SeekableStreamSupervisorSpec
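The resolution order above can be sketched as a small helper. This is an illustrative sketch only; the class and method names are hypothetical, and the actual merge logic lives in SupervisorManager / SeekableStreamSupervisorSpec.

```java
// Hypothetical sketch of the task-count resolution order described above:
// provided taskCountStart > provided taskCount > existing taskCount > provided taskCountMin.
// Names are illustrative, not the real Druid API.
class TaskCountResolution
{
  static int resolve(
      Integer providedTaskCountStart,
      Integer providedTaskCount,
      Integer existingTaskCount,
      int providedTaskCountMin
  )
  {
    if (providedTaskCountStart != null) {
      return providedTaskCountStart; // immutable baseline always wins
    }
    if (providedTaskCount != null) {
      return providedTaskCount;      // explicit user-provided count
    }
    if (existingTaskCount != null) {
      return existingTaskCount;      // sticky: carry forward the persisted count
    }
    return providedTaskCountMin;     // fresh supervisor with no hints
  }
}
```

For example, resubmitting a spec with no taskCountStart and no taskCount keeps the previously persisted count, while a fresh supervisor with no hints starts at taskCountMin.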

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever it would not be obvious to an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • been tested in a test Druid cluster.

Comment on lines +569 to +578
AutoScalerConfig autoScalerError = mapper.convertValue(
    ImmutableMap.of(
        "enableTaskAutoScaler", "true",
        "taskCountMax", "1",
        "taskCountMin", "4"
    ),
    AutoScalerConfig.class
);

Check notice — Code scanning / CodeQL: Unread local variable (Note, test)

Variable 'AutoScalerConfig autoScalerError' is never read.
@Fly-Style Fly-Style marked this pull request as ready for review November 14, 2025 12:01
@jtuglu1 (Contributor) commented Nov 14, 2025

When the autoscaler does a scaling action, it updates ioConfig.autoScalerConfig.taskCountStart, not ioConfig.taskCount

Hi @Fly-Style, thanks for the change. Not sure the above makes sense to me. autoScalerConfig.taskCountStart is supposed to be an immutable value used on supervisor submit to allow the scaler to not have to start at the minimum task count (this makes having a smaller minimum easier/possible). Scaling actions should not touch this taskCountStart value (it is meant to be an immutable value in the spec that allows the supervisor to return to baseline on resubmit). In a way, you can expect this to be the "average" task count. From the docs:

Optional config to specify the number of ingestion tasks to start with. When you enable the autoscaler, Druid ignores the value of taskCount in ioConfig and, if specified, starts with the taskCountStart number of tasks. Otherwise, defaults to taskCountMin.

ioConfig.taskCount is the currently running taskCount and should be what is updated. There are other things that rely on the ioConfig.taskCount value being updated/accurate, like the stopTaskCount calculations (both fixed and variable).

Looks like you want to have a way to not "reset" the task count during re-submits, and have the value be sticky. I would prefer this not be the default behavior, but rather opt-in since:

  • Your "current" task count may not accurately reflect the true current load of the system. Resubmitting the supervisor should be (IMO) equivalent to resetting the supervisor.
  • It's nice to have a way to reset this value back to the expected baseline task count. For supervisors running large numbers (~1000s) of tasks, supervisors can often become bloated and not scale down fast enough. Resubmitting them allows the task count to return to baseline.

@jtuglu1 jtuglu1 self-requested a review November 14, 2025 17:19
@jtuglu1 (Contributor) commented Nov 14, 2025

Similarly, have you tested what happens when a supervisor is terminated (tombstoned) and then resubmitted without a taskCountStart? It'd be good to make sure that we don't end up merging an old supervisor (with same ID) data with potentially unrelated, new supervisor's data.

@kfaraz (Contributor) commented Dec 9, 2025

@Fly-Style , I think @jtuglu1 makes a fair point. I too would prefer that taskCountStart remain immutable and we retain the capability to reset the task count upon resubmission.

How about the following approach instead?

  • When submitting a supervisor with auto-scaler disabled, use the taskCount as is.
  • When submitting a (new or existing) supervisor with auto-scaler enabled, set taskCount = taskCountStart == null ? taskCountMin : taskCountStart, then persist the supervisor.
  • For any auto-scaling event, update the taskCount the same way that we do today.
  • When the supervisor starts, it should start with the taskCount as the starting number of tasks rather than taskCountStart.

This would ensure:

  • the issue with Overlord restarts is fixed
  • taskCountStart remains an immutable config
  • taskCount always reflects the current task count in the supervisor

What do you think?

@gianm (Contributor) commented Dec 9, 2025

At our deployment, we do want supervisors to stick to the current task count when resubmitted. The reasons are:

  • Supervisors need to be updated periodically to do stuff like change tuning configs, change schemas, etc. When we do a change like this, we don't want the task count to change from whatever is currently running.
  • With the way we use autoscaling, the taskCountStart is just some fixed starter value that is the same for all tables, not anything actually related to that particular table. We rely on the autoscaler to find the ideal value. So, reverting back to the taskCountStart is not desirable.

I'm open to some way of offering both behaviors. I personally feel like sticking to the current task count on resubmitting is more intuitive as a default (since it seems odd that changing schema would lead to a reset of the task count). But I can live with the current behavior as default, as long as there's some way to get the stick-to-current behavior. Currently, there isn't.

@kfaraz (Contributor) left a review comment

Took an initial pass assuming that we continue with the original approach in this PR, i.e. update taskCountStart when an auto-scaling event occurs and carry forward the current value of taskCountStart when the supervisor is resubmitted.

Overall code flow makes sense. Left some nitpicks here and there.

@kfaraz (Contributor) commented Dec 9, 2025

Similarly, have you tested what happens when a supervisor is terminated (tombstoned) and then resubmitted without a taskCountStart? It'd be good to make sure that we don't end up merging an old supervisor (with same ID) data with potentially unrelated, new supervisor's data.

@jtuglu1 , this is already handled since the SupervisorManager contains only the non-tombstoned latest supervisor versions in memory. But we could add an embedded test to verify the same.

@Fly-Style , let's try to add an embedded test for LagBasedAutoScaler similar to what you are doing in the other PR for cost-based auto-scaler. The test could have a method which verifies that we do not pick up the taskCountStart from tombstoned versions, even if the supervisor id is the same.

@Fly-Style (Contributor, Author) commented:

let's try to add an embedded test for LagBasedAutoScaler similar to what you are doing in the other PR for cost-based auto-scaler.

Would love to, but in a separate PR if you don't mind :)

@gianm (Contributor) commented Dec 9, 2025

I'm open to some way of offering both behaviors. I personally feel like sticking to the current task count on resubmitting is more intuitive as a default (since it seems odd that changing schema would lead to a reset of the task count). But I can live with the current behavior as default, as long as there's some way to get the stick-to-current behavior. Currently, there isn't.

A possibility:

  • When a supervisor is started through supervisor.start(), initial task count is always taken from taskCount. This keeps task count consistent through Overlord restarts, etc.
  • When a supervisor is posted through SupervisorResource#specPost, taskCount is set to:
    1. taskCountStart if that is nonnull
    2. else, the user-provided taskCount if that is nonnull
    3. else, the taskCount from the previous supervisor spec (in the DB) if one exists
    4. else, the user-provided taskCountMin

This I think allows @jtuglu1 and me to both have the behavior we want: @jtuglu1 would set taskCountStart and the task count would always reset to that. I would set neither taskCount nor taskCountStart, and the task count would start out at taskCountMin for a fresh supervisor, then retain the current count from then on.
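The two usage styles described above can be walked through with a small sketch. This is purely illustrative under the proposed precedence; the class and method names are hypothetical and not part of Druid's actual API.

```java
// Illustrative walk-through of the two behaviors, assuming the proposed
// precedence: taskCountStart > provided taskCount > persisted taskCount > taskCountMin.
// Names are hypothetical, not real Druid code.
class ResubmitScenarios
{
  static int onSpecPost(
      Integer taskCountStart,
      Integer providedTaskCount,
      Integer persistedTaskCount,
      int taskCountMin
  )
  {
    if (taskCountStart != null) {
      return taskCountStart;
    }
    if (providedTaskCount != null) {
      return providedTaskCount;
    }
    if (persistedTaskCount != null) {
      return persistedTaskCount;
    }
    return taskCountMin;
  }

  public static void main(String[] args)
  {
    // Reset style: taskCountStart=4 is set, so every resubmit resets to that
    // baseline, even if the autoscaler had grown the persisted count to 10.
    int resetStyle = onSpecPost(4, null, 10, 1);

    // Sticky style: neither taskCountStart nor taskCount is set, so a resubmit
    // sticks to whatever count the autoscaler last persisted (10 here).
    int stickyStyle = onSpecPost(null, null, 10, 1);

    // Fresh supervisor in sticky style: nothing persisted yet, start at taskCountMin.
    int freshStart = onSpecPost(null, null, null, 1);

    System.out.println(resetStyle + " " + stickyStyle + " " + freshStart); // 4 10 1
  }
}
```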

@kfaraz (Contributor) commented Dec 10, 2025

Thanks for the suggestion, @gianm ! I think that would work nicely and meet all our needs.

@jtuglu1 (Contributor) commented Dec 10, 2025

SGTM

@kfaraz (Contributor) left a review comment

@Fly-Style , based on the decided approach, we should do the following:

  • Continue persisting taskCount (and not taskCountStart) whenever an auto-scale event occurs.
  • Leave taskCountStart as an immutable config
  • While merging spec on resubmission, follow this order:
    provided taskCountStart > provided taskCount > existing taskCount > provided taskCountMin.
  • Update the docs and add a release note in the PR description to reflect this change in behaviour.

Please let me know if anything seems ambiguous.

@Fly-Style (Contributor, Author) commented:

@gianm, @kfaraz, @jtuglu1 thanks a lot for a productive discussion! Happy to implement this according to all requests!

@Fly-Style Fly-Style force-pushed the fix/70147-autoscaler-persisted-cfg branch from 26b82e7 to fa86de5 Compare December 10, 2025 12:00
@Fly-Style Fly-Style requested a review from kfaraz December 10, 2025 13:09
@kfaraz (Contributor) left a review comment

Minor final suggestions.

// If the autoscaler is absent or taskCountStart is specified, just return.
if (thisAutoScalerConfig == null || thisAutoScalerConfig.getTaskCountStart() != null) {
return;
}
@kfaraz (Contributor):
Should we also return early if this.ioConfig.getTaskCount() is specified?

@Fly-Style (Contributor, Author):

Not really; taskCountStart has higher priority :)

the priority: provided taskCountStart > provided taskCount > existing taskCount > provided taskCountMin.

@Fly-Style Fly-Style force-pushed the fix/70147-autoscaler-persisted-cfg branch from 426d296 to 9feffbc Compare December 11, 2025 16:55
@kfaraz (Contributor) left a review comment
Looks good, thanks for the changes @Fly-Style ! 🚀

@Fly-Style Fly-Style force-pushed the fix/70147-autoscaler-persisted-cfg branch from 9feffbc to f9cd5ec Compare December 11, 2025 17:04
@Fly-Style Fly-Style force-pushed the fix/70147-autoscaler-persisted-cfg branch from f9cd5ec to c7a1468 Compare December 11, 2025 17:08
@Fly-Style Fly-Style force-pushed the fix/70147-autoscaler-persisted-cfg branch from 93a1a37 to f3d20e0 Compare December 11, 2025 17:27
@kfaraz kfaraz merged commit 4aedee9 into apache:master Dec 12, 2025
55 checks passed
@kgyrtkirk kgyrtkirk added this to the 36.0.0 milestone Jan 19, 2026