Implementing perpetually running tasks for Streaming Ingestion#18466
uds5501 wants to merge 49 commits into apache:master from
Conversation
| this.ioConfig = req.getIoConfig();
| this.stream = ioConfig.getStartSequenceNumbers().getStream();
| this.endOffsets = new ConcurrentHashMap<>(ioConfig.getEndSequenceNumbers().getPartitionSequenceNumberMap());
| minMessageTime = Configs.valueOrDefault(ioConfig.getMinimumMessageTime(), DateTimes.MIN);
| maxMessageTime = Configs.valueOrDefault(ioConfig.getMaximumMessageTime(), DateTimes.MAX);

Maybe put this part in a new method so that the constructor can also reuse the code.
| public Response updateConfig(TaskConfigUpdateRequest<PartitionIdType, SequenceOffsetType> req) throws InterruptedException
| {
|   try {
|     requestPause();

We should call pause() instead of requestPause(), since the latter only requests a pause but doesn't ensure that we reach a paused state. If the pause() call returns a non-OK response, we should return that same response immediately.

Before starting with the pause, we can log the new config that we are trying to update to. After the update finishes, let's log the new config and also emit an event.
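A hedged sketch of the pause-then-update flow discussed in this thread; all names here (pause(), plain-int 200 status, updateConfig()) are illustrative stand-ins, not the actual Druid task runner API:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ConfigUpdateSketch
{
  private final AtomicBoolean paused = new AtomicBoolean(false);

  // Stand-in for pause(): reaches the paused state and reports an HTTP-style status.
  int pause()
  {
    paused.set(true);
    return 200;
  }

  int updateConfig()
  {
    // Ensure we actually reach a paused state before touching the config;
    // merely *requesting* a pause would let the update race with ingestion.
    int pauseStatus = pause();
    if (pauseStatus != 200) {
      return pauseStatus; // propagate the non-OK response immediately
    }
    // ... log and apply the new ioConfig here, then emit an event ...
    return 200;
  }

  public static void main(String[] args)
  {
    ConfigUpdateSketch sketch = new ConfigUpdateSketch();
    System.out.println(sketch.updateConfig());
    System.out.println(sketch.paused.get());
  }
}
```

The key point is that the update path only proceeds once the pause has demonstrably completed, and any non-OK pause response is surfaced to the caller unchanged.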
| * Creates new sequences for the ingestion process. It currently accepts the ioConfig given by the request as the correct offsets
| * and ignores the offsets it may have stored in currOffsets and endOffsets.
| */
| private void createNewSequenceFromIoConfig(SeekableStreamIndexTaskIOConfig<PartitionIdType, SequenceOffsetType> ioConfig)

For the 2 new methods, the SeekableStreamIndexTaskRunner must already be performing these actions. Let's try to put them in common methods so that we can use the same method in the normal flow as well as on update config.
Consider looking at the Kafka StickyAssignor implementation for inspiration here.
The Kafka StickyAssignor is a partition assignment strategy for Kafka consumers within a consumer group. Its primary goal is to achieve both a balanced distribution of partitions among consumers and to minimize the movement of partitions during rebalances.
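For illustration only, here is a tiny sticky-style assignment in the spirit described above (not the real Kafka StickyAssignor): partitions stay with their previous owner up to a fair-share cap, and only the overflow moves to the least-loaded consumers. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class StickyAssignSketch
{
  // Keep each partition on its previous owner when possible, move partitions
  // only when an owner would exceed its fair share of the total.
  static Map<String, List<Integer>> assign(
      List<String> consumers,
      List<Integer> partitions,
      Map<Integer, String> previousOwner
  )
  {
    Map<String, List<Integer>> result = new LinkedHashMap<>();
    consumers.forEach(c -> result.put(c, new ArrayList<>()));
    int cap = (int) Math.ceil((double) partitions.size() / consumers.size());

    List<Integer> unassigned = new ArrayList<>();
    for (int p : partitions) {
      String prev = previousOwner.get(p);
      if (prev != null && result.containsKey(prev) && result.get(prev).size() < cap) {
        result.get(prev).add(p); // stickiness: previous owner keeps the partition
      } else {
        unassigned.add(p);
      }
    }
    // Balance: hand the remainder to the least-loaded consumers.
    for (int p : unassigned) {
      String target = Collections.min(
          result.keySet(),
          Comparator.comparingInt((String c) -> result.get(c).size())
      );
      result.get(target).add(p);
    }
    return result;
  }

  public static void main(String[] args)
  {
    Map<Integer, String> prev = new HashMap<>();
    prev.put(0, "c1");
    prev.put(1, "c1");
    prev.put(2, "c1");
    prev.put(3, "c1");
    // c2 joins: only the overflow beyond c1's fair share should move.
    System.out.println(assign(List.of("c1", "c2"), List.of(0, 1, 2, 3), prev));
  }
}
```

With four partitions all previously on c1 and a second consumer joining, c1 keeps its first two partitions and only the overflow moves, which is exactly the "minimize movement during rebalances" property the comment refers to.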
@uds5501, we should probably also add an API to get the latest config from the task, since it would have diverged from the original task payload.
Force-pushed from f9c98ee to aa67322.
Edit: This has now been included in the original approach :)

There's a flaw in the original approach. When the autoscaler event is triggered, the offsets are of time Instead, the chronology has to be:

Concerns:

New approach will perform the following:
Force-pushed from 9f608bd to dcd3549.
| final int numSegments = Integer.parseInt(
|     cluster.runSql("SELECT COUNT(*) FROM sys.segments WHERE datasource = '%s'", dataSource)
| );

CodeQL check notice: Missing catch of NumberFormatException

| final int numRows = Integer.parseInt(
|     cluster.runSql("SELECT COUNT(*) FROM %s", dataSource)
| );

CodeQL check notice: Missing catch of NumberFormatException
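One way to address this notice; `parseCount` and its error message are illustrative, not part of the embedded-test helper API:

```java
public class ParseCountSketch
{
  // Guard Integer.parseInt against a malformed SQL result instead of letting
  // NumberFormatException escape the test with an unhelpful stack trace.
  static int parseCount(String sqlResult)
  {
    try {
      return Integer.parseInt(sqlResult.trim());
    }
    catch (NumberFormatException e) {
      throw new IllegalStateException(
          "Expected a numeric COUNT(*) result but got: " + sqlResult, e
      );
    }
  }

  public static void main(String[] args)
  {
    System.out.println(parseCount(" 42 "));
    try {
      parseCount("not-a-number");
    }
    catch (IllegalStateException e) {
      System.out.println("rejected");
    }
  }
}
```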
| SeekableStreamStartSequenceNumbers<KafkaTopicPartition, Long> startSequenceNumbers =
|     new SeekableStreamStartSequenceNumbers<>(
|         spec.getIoConfig().getStream(),

CodeQL check notice: Deprecated method or constructor invocation
| existingTaskGroup.getMaximumMessageTime(),
| spec.getIoConfig().getInputFormat(),
| spec.getIoConfig().getConfigOverrides(),
| spec.getIoConfig().isMultiTopic(),

CodeQL check notice: Deprecated method or constructor invocation
| spec.getIoConfig().getInputFormat(),
| spec.getIoConfig().getConfigOverrides(),
| spec.getIoConfig().isMultiTopic(),
| spec.getIoConfig().getTaskDuration().getStandardMinutes()

CodeQL check notice: Deprecated method or constructor invocation
| - SeekableStreamIndexTaskTuningConfig ss = spec.getSpec().getTuningConfig().convertToTaskTuningConfig();
| + SeekableStreamIndexTaskTuningConfig ss = spec.getSpec().getTuningConfig().convertToTaskTuningConfig(spec.usePerpetuallyRunningTasks());

CodeQL check notice: Unread local variable
| /**
|  * Test implementation of OrderedSequenceNumber for Long values
|  */
| private static class TestSequenceNumber extends OrderedSequenceNumber<Long>

CodeQL check warning: Inconsistent compareTo
| /**
|  * Test implementation that throws exceptions on comparison
|  */
| private static class TestExceptionSequenceNumber extends OrderedSequenceNumber<Long>

CodeQL check warning: Inconsistent compareTo
Thanks for the changes, @uds5501 !
I have taken a pass through the SeekableStreamIndexTaskRunner, left some initial feedback.
Will take a look at the supervisor side changes after this.
Also, as discussed offline, I think we should not be trying to update all the task payloads whenever there is a scaling event. Instead, we should perhaps do the following:
- Add a new nullable field, say taskIoConfigBuilder (or some other better name), to SeekableStreamSupervisorSpec. This field should have enough info to create task ioConfigs as needed.
- Include a String version inside the taskIoConfigBuilder. This same version should be sent to the tasks in TaskConfigUpdateRequest.
- Update the supervisor spec whenever there is a scaling event.

Persisting the update in the supervisor spec has multiple benefits:
- The supervisor table is already versioned, so we have the history at our disposal.
- We need not update all the task payloads in the metadata store every time there is a scaling event. (There can be a large number of tasks; also, this breaks the current model of immutable Task payloads.)
- The version provides the supervisor with an easy way to quickly check if a task is running the same spec/ioConfig or not.

The only drawback is that changes made to the supervisor spec due to scaling might be difficult to distinguish from changes made by the user. But we can remedy that with some update message (the audit logs could also be used for this).

Haven't thought this through completely, will try to give it some more thought.
| final int compareToEnd = this.compareTo(end);
| return isEndOffsetExclusive ? compareToEnd < 0 : compareToEnd <= 0;
| }
| catch (Exception e) {

What kind of exception can happen here? I don't think we should be catching it.

Fair enough. We've already handled the case where the end's sequence number is null, so we don't need to capture this exception.
| @Test
| @Timeout(60)
| public void test_ingest20kRows_ofSelfClusterMetricsWithScaleOuts_andVerifyValues()
I think it would make more sense to put these test methods in a new test class which is more focused on scaling, task duration, etc. This test class, KafkaClusterMetricsTest, was mostly about ingesting metrics of a cluster. The new test class can be called something like KafkaTaskScalingTest, and it can use data published directly to a Kafka topic (similar to KafkaSupervisorTest) rather than self-cluster metrics.
Some typical test cases can be something like (all of these cases should be added to a new class KafkaTaskScalingTest):

Case 1:
- No auto-scaling
- Task duration = 0.5s, perpetual = false
- Start the supervisor
- Do not publish any data to the Kafka topic
- Verify that tasks finish within 1s (completion time might take the total duration to 2 or 3s, but definitely < 5s)
- Stop the supervisor

Case 2:
- No auto-scaling
- Task duration = 0.5s, perpetual = true
- Start the supervisor
- Do not publish any data to the Kafka topic
- Verify that tasks do not finish even after 10s
- Stop the supervisor

Case 3:
- Auto-scaling enabled
- Task duration = 0.5s, perpetual = true
- Start the supervisor
- Do not publish any data to the Kafka topic
- Verify that tasks are scaled down after some time
- Stop the supervisor

Case 4:
- Auto-scaling enabled
- Task duration = 0.5s, perpetual = true
- Start the supervisor
- Publish some data and force the task to enter into some lag that triggers a scale-up action (you may use druid.unsafe.cluster.testing for this, see ClusterTestingModule)
- Verify that tasks are scaled up after some time
- Verify all kinds of events here: that the tasks were paused and checkpointed, then they waited for the config update, and then they were finally resumed
- Don't send any more data
- Verify that the tasks are scaled down
- Verify the events again
- Stop the supervisor
| @Nullable
| private final InputRowParser<ByteBuffer> parser;
| - private final String stream;
| + private String stream;

The input source stream never changes for a supervisor. See SeekableStreamSupervisorSpec.validateSpecUpdateTo().
| private volatile DateTime minMessageTime;
| private volatile DateTime maxMessageTime;
| private final ScheduledExecutorService rejectionPeriodUpdaterExec;
| private AtomicBoolean waitForConfigUpdate = new AtomicBoolean(false);

Suggested change:
| - private AtomicBoolean waitForConfigUpdate = new AtomicBoolean(false);
| + private final AtomicBoolean waitForConfigUpdate = new AtomicBoolean(false);
| //milliseconds waited for created segments to be handed off
| long handoffWaitMs = 0L;
|
| log.info("Task perpetually running: %s", task.isPerpetuallyRunning());

Move this log line to runInternal.
It's already in runInternal().

👍🏻 Move this log line to the start of this method.

Suggested change:
| - log.info("Task perpetually running: %s", task.isPerpetuallyRunning());
| + log.info("Running task[%s] in persisted[%s] mode.", task.getId(), task.isPerpetuallyRunning());
| return Response.status(409).entity("Task must be paused for checkpoint completion before updating config").build();
| }
| try {
|   log.info("Attempting to update config to [%s]", request.getIoConfig());

Please move the logic inside the try into a separate private method, and add a short javadoc to it outlining the steps involved.
| log.info("Attempting to update config to [%s]", request.getIoConfig());
|
| SeekableStreamIndexTaskIOConfig<PartitionIdType, SequenceOffsetType> newIoConfig = (SeekableStreamIndexTaskIOConfig<PartitionIdType, SequenceOffsetType>)
|     toolbox.getJsonMapper().convertValue(request.getIoConfig(), SeekableStreamIndexTaskIOConfig.class);

Why convert? Isn't the payload already a SeekableStreamIndexTaskIOConfig?

My bad, I thought it required conversion to understand the generics. Going to remove the conversion.
| */
| public class TaskConfigUpdateRequest
| {
|   private final SeekableStreamIndexTaskIOConfig ioConfig;

Should this field and the class have generic args for the partition id type and offset type?

I remember some discussion earlier: if we set the partition id types etc. here, the supervisor may change the partition id type and that would be invalid later, so we decided to strip generics from here.

No, let's retain the generics. It will work as expected.
| seekToStartingSequence(recordSupplier, assignment);
| } else {
|   // if there is no assignment, it means that there was no partition assigned to this task after scaling down.
|   pause();

Wouldn't the task already be in a paused state?

It would be, fair enough.
| createNewSequenceFromIoConfig(newIoConfig);
|
| assignment = assignPartitions(recordSupplier);
| boolean shouldResume = true;

Rather than this boolean, just call resume() inside the if.
| spec.getSpec().getIOConfig().getConfigOverrides(),
| spec.getSpec().getIOConfig().isMultiTopic()

I don't think this change is needed. spec.getIoConfig() already does the right thing.

It's deprecated, and the IDE + GitHub seem to be complaining about that.

Okay. Although, let's postpone it for a later PR if possible.
| if (spec.usePerpetuallyRunningTasks()) {
|   int taskGroupId = getRangeBasedTaskGroupId(partitionId, taskCount);
|   log.debug("Range-based assignment for partition [%s]: taskGroupId [%d] when taskCount is [%d]", partitionId, taskGroupId, taskCount);
|   return taskGroupId;
| } else {
|   if (partitionId.isMultiTopicPartition()) {
|     return Math.abs(31 * partitionId.topic().hashCode() + partitionId.partition()) % taskCount;
|   } else {
|     return partitionId.partition() % taskCount;
|   }
| }

Please simplify this if:

| if (partitionId.isMultiTopicPartition()) {
|   return Math.abs(31 * partitionId.topic().hashCode() + partitionId.partition()) % taskCount;
| } else if (spec.usePerpetuallyRunningTasks()) {
|   return getRangeBasedTaskGroupId(partitionId, taskCount);
| } else {
|   return partitionId.partition() % taskCount;
| }

Removed this altogether.
| {
|   int minPartitionsPerTaskGroup = totalPartitions / taskCount;
|
|   if (partitionId.isMultiTopicPartition()) {

This method shouldn't need to handle multi-topic stuff right now.
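For context, range-based assignment along the lines of the `minPartitionsPerTaskGroup` computation quoted above could look like the following sketch. It assumes integer partition ids `0..totalPartitions-1`; the method name and remainder handling are illustrative, not Druid's actual implementation.

```java
public class RangeAssignSketch
{
  // Contiguous ranges of partitions map to task groups; the first
  // (totalPartitions % taskCount) groups each own one extra partition.
  static int rangeBasedTaskGroupId(int partition, int totalPartitions, int taskCount)
  {
    int minPerGroup = totalPartitions / taskCount;
    int remainder = totalPartitions % taskCount;
    // Partitions below this boundary belong to the larger groups.
    int boundary = remainder * (minPerGroup + 1);
    if (partition < boundary) {
      return partition / (minPerGroup + 1);
    }
    return remainder + (partition - boundary) / minPerGroup;
  }

  public static void main(String[] args)
  {
    // 5 partitions over 2 groups: group 0 owns {0,1,2}, group 1 owns {3,4}.
    StringBuilder sb = new StringBuilder();
    for (int p = 0; p < 5; p++) {
      sb.append(rangeBasedTaskGroupId(p, 5, 2));
    }
    System.out.println(sb);
  }
}
```

Unlike the hash-modulo scheme, contiguous ranges keep each partition's group stable as long as the task count does not change, which matters for perpetually running tasks.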
| return suspended;
| }
|
| public Optional<String> getVersion()

CodeQL check notice: Missing Override annotation
| }
| }
|
| private Void updateEntryWithHandle(

nit: perhaps add a javadoc heading like insertEntryWithHandle has.
| @JsonCreator
| public TaskConfigUpdateRequest(
|     @JsonProperty("ioConfig") @Nullable SeekableStreamIndexTaskIOConfig<PartitionIdType, SequenceOffsetType> ioConfig,
|     @JsonProperty("supervisorSpecVersion") String supervisorSpecVersion
| )
| {
|   this.ioConfig = ioConfig;
|   this.supervisorSpecVersion = supervisorSpecVersion;

nit: can we assert this is non-null like we do for other task actions' required params?
| }
| if (isDynamicAllocationOngoing.get()) {
|   checkpointsToWaitFor -= setEndOffsetFutures.size();
|   if (checkpointsToWaitFor <= 0) {
|     log.info("All tasks in current task groups have been checkpointed, resuming dynamic allocation");
|     pendingConfigUpdateHook.call();

Is pendingConfigUpdateHook written/read across threads? Might want to make this volatile or an AtomicReference.
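A minimal sketch of the AtomicReference variant suggested here; the field and method names are stand-ins for the PR's actual code, not its API:

```java
import java.util.concurrent.atomic.AtomicReference;

public class HookSketch
{
  // Holding the hook in an AtomicReference makes the writer thread's update
  // visible to the checkpoint-handling thread without extra locking.
  private final AtomicReference<Runnable> pendingConfigUpdateHook = new AtomicReference<>();

  void schedule(Runnable hook)
  {
    pendingConfigUpdateHook.set(hook);
  }

  void onAllCheckpointsDone()
  {
    // getAndSet(null) guarantees the hook fires at most once even if two
    // threads observe the final checkpoint concurrently.
    Runnable hook = pendingConfigUpdateHook.getAndSet(null);
    if (hook != null) {
      hook.run();
    }
  }

  public static void main(String[] args)
  {
    HookSketch s = new HookSketch();
    s.schedule(() -> System.out.println("resuming dynamic allocation"));
    s.onAllCheckpointsDone();
    s.onAllCheckpointsDone(); // second call is a no-op
  }
}
```

Compared to a bare volatile field, the atomic fire-and-clear also removes the race where two threads both see the hook non-null and run it twice.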
| // For end sequences, use NOT_SET to indicate open-ended reading
| Map<KafkaTopicPartition, Long> endingSequences = new HashMap<>();
| for (KafkaTopicPartition partition : partitions) {
|   endingSequences.put(partition, END_OF_PARTITION);

END_OF_PARTITION or NOT_SET? The comment is a bit misleading.
| log.info("Task [%s] paused successfully & checkpoint requested successfully", id);
| return deserializeOffsetsMap(r.getContent());
| } else if (r.getStatus().equals(HttpResponseStatus.ACCEPTED)) {
|   return null;

Do we expect to see this kind of response?
| //milliseconds waited for created segments to be handed off
| long handoffWaitMs = 0L;
|
| log.info("Task perpetually running: %s", task.isPerpetuallyRunning());

nit: add the task ID to this log (unless the thread ID is there).
| return Response.ok().entity("Task is already paused for checkpoint completion").build();
| }
| Response pauseResponse = pause();
| if (pauseResponse.getStatus() == 409) {

nit: use .getStatus().equals(HttpResponseStatus.CONFLICT)

| Response pauseResponse = pause();
| if (pauseResponse.getStatus() == 409) {
|   waitForConfigUpdate.set(false);
|   return pauseResponse;

Possibly naive: do we need to resume() here?
| }
|
| @Override
| public void update(String id, Task entry)
| }
|
| @Test
| public void testUpdateTask()

I don't think this is needed anymore.
| // The default implementation does not do any validation checks.
| }
|
| default Optional<String> getVersion()

We shouldn't put version into the SupervisorSpec since there is already a VersionedSupervisorSpec. You can maintain the version as a separate variable in the supervisor, or just use a VersionedSupervisorSpec.
| this.interval = interval;
| }
|
| private TestTask(String id, String dataSource, Interval interval, Map<String, Object> context)
| TestSequenceNumber current = new TestSequenceNumber(null, false);
| TestSequenceNumber end = new TestSequenceNumber(10L, false);
|
| Assert.assertThrows(NullPointerException.class, () -> current.isMoreToReadBeforeReadingRecord(end, false));

Please put the args on separate lines.
| private boolean changeTaskCountForPerpetualTasks(int desiredActiveTaskCount,
|     Runnable successfulScaleAutoScalerCallback
| )
| public void runInternal()
| {
|   if (isDynamicAllocationOngoing.get()) {
|     log.info("Skipping run because dynamic allocation is ongoing.");

This can get noisy, as task scaling can be a slow operation.
| .get()
| .withDataSchema(schema -> schema.withTimestamp(new TimestampSpec("timestamp", "iso", null)))
| - .withTuningConfig(tuningConfig -> tuningConfig.withMaxRowsPerSegment(maxRowsPerSegment))
| + .withTuningConfig(tuningConfig -> tuningConfig

Please revert all the changes to this file.
| * Embedded test to verify task scaling behaviour of {@code KafkaSupervisor} ingesting from a custom kafka topic.
| */
| @SuppressWarnings("resource")
| public class KafkaTaskScalingTest extends EmbeddedClusterTestBase

Why is this class separate from KafkaTaskAutoScalingTest? We can merge the two classes into one.
| .build();
| } else {
|   try {
|     // Don't acquire a lock if the task is already paused for checkpoint completion, avoiding deadlock

This comment is not valid anymore.
| public Response setEndOffsets(
|     Map<PartitionIdType, SequenceOffsetType> sequenceNumbers,
|     boolean finish, // this field is only for internal purposes, shouldn't be usually set by users

This old comment does not add any value. Please remove it:

| - boolean finish, // this field is only for internal purposes, shouldn't be usually set by users
| + boolean finish,
| @JsonCreator
| public TaskConfigResponse(
|     @JsonProperty("ioConfig") @Nullable SeekableStreamIndexTaskIOConfig<PartitionIdType, SequenceOffsetType> ioConfig,
|     @JsonProperty("supervisorSpecVersion") String supervisorSpecVersion

We might as well just call it supervisorVersion; the version always denotes the spec anyway. Please make the same change in all the relevant places.
Proposed changes
As part of this PR, I aim to introduce perpetually running tasks for seekable stream ingestion (tasks that should never shut down). To enable this feature, users need to specify usePerpetuallyRunningTasks: true in the supervisor spec.

Once users have enabled this flag, they can expect the following changes in Supervisor and SeekableStreamIndexTaskRunner behaviours:

Task rollovers
- pendingTaskGroup and eventually shut down.
- (ioConfig.taskDuration is ignored).

Auto Scaling
To facilitate the coordination of this flow, there will be one flag each on the supervisor and task runner end.
- The isDynamicAllocationOngoing flag ensures that another scale event is not accepted until the existing one is finished, and it determines whether the checkpoint action needs to call the update config on the other tasks.
- waitForConfigUpdate is set to true when pauseAndCheckpoint is triggered, and can only be unset on /configUpdate.

Checkpoint mechanism
- /offsets/end used to resume the tasks with a new sequence added in the task runner.
- /offsets/end will update the end of the latest sequence but not add a new sequence to the sequences list if the task runner is waiting for a config update. The task resume responsibility will now land on the /configUpdate API in the event of auto scaling.

Druid cluster upgrades

Partition <> Task Mapping assignment

Misc.
This will cover bits that are necessary to ensure that the supervisor doesn't end up killing valid tasks / other race conditions.

Supervisor
- TaskQueue.update() so that in the next runInternal() loop, the supervisor knows about the change of ioConfig that took place during task discovery.
- activelyRuningTaskGroups.
- isCheckpointSignatureValid() has been introduced to return true for now; earlier, the check verified that partitions and offsets are the same in the checkpoints. However, with the possible change of partition mapping in tasks, this no longer holds true.

Task Runner
What the overall changes look like:
- /pauseAndCheckpoint API.
- /offsets/end POST API.
- /configs API.
- /updateConfig with updated partitions is called across task groups. (This needs to be guarded behind a toggle.)
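The two-flag coordination described above (supervisor-side isDynamicAllocationOngoing, task-side waitForConfigUpdate) can be illustrated with a toy sketch. Every method here is a stand-in for the described behaviour, not the PR's actual API.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ScalingFlowSketch
{
  final AtomicBoolean isDynamicAllocationOngoing = new AtomicBoolean(false);
  final AtomicBoolean waitForConfigUpdate = new AtomicBoolean(false);

  boolean startScaleEvent()
  {
    // Reject a new scale event until the current one has finished.
    return isDynamicAllocationOngoing.compareAndSet(false, true);
  }

  void pauseAndCheckpoint()
  {
    // The task stays paused until /configUpdate arrives.
    waitForConfigUpdate.set(true);
  }

  void configUpdate()
  {
    // Only /configUpdate may resume the task and close the scale event.
    waitForConfigUpdate.set(false);
    isDynamicAllocationOngoing.set(false);
  }

  public static void main(String[] args)
  {
    ScalingFlowSketch s = new ScalingFlowSketch();
    System.out.println(s.startScaleEvent());          // first event accepted
    System.out.println(s.startScaleEvent());          // overlapping event rejected
    s.pauseAndCheckpoint();
    System.out.println(s.waitForConfigUpdate.get());  // task is waiting
    s.configUpdate();
    System.out.println(s.waitForConfigUpdate.get());  // task released
    System.out.println(s.startScaleEvent());          // next event may proceed
  }
}
```

The compareAndSet on the supervisor side is what makes "another scale event is not accepted until the existing one is finished" race-free.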