Add compaction templates and CompactionJobQueue #18402

Merged
kfaraz merged 34 commits into apache:master from kfaraz:cascade_compact_with_templates
Oct 31, 2025

Conversation

@kfaraz (Contributor) commented Aug 14, 2025

Note

A lot of the changes related to compaction template implementations and persisting templates in Druid catalog were once a part of this PR but have been removed until there is consensus on the best approach.

This PR now deals with only refactoring the OverlordCompactionScheduler to use the CompactionJobQueue and other related changes.

Changes

Functionality

  • Reset the compaction job queue every 15 minutes by default.
  • Whenever a compaction task finishes, check if jobs pending in the queue can be launched.
  • When a compaction supervisor is stopped, remove its jobs from the queue.
  • When a compaction supervisor is started (or restarted), create jobs for it and add them to the queue.

Classes

  • Add BatchIndexingJob which may contain either a ClientTaskQuery or a ClientSqlQuery (for MSQ jobs).
  • Add BatchIndexingJobTemplate that can create jobs for a given source and destination.
  • Add CompactionConfigBasedJobTemplate which implements CompactionJobTemplate.
  • Update CompactionSupervisor to create jobs using templates.
  • Add CompactionJobQueue to create and submit compaction jobs to the Overlord.
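The relationships between the classes listed above could be sketched roughly as follows. This is a minimal, hypothetical sketch inferred only from this description; the actual Druid interfaces and method signatures may differ.

```java
import java.util.List;

// Hypothetical shapes inferred from the PR description above; the actual
// Druid interfaces and method signatures may differ.
interface BatchIndexingJob
{
  String getDataSource();
}

interface BatchIndexingJobTemplate
{
  // Creates jobs for a given source and destination.
  List<BatchIndexingJob> createJobs(String source, String destination);
}

interface CompactionJobTemplate extends BatchIndexingJobTemplate
{
}

// Stand-in for CompactionConfigBasedJobTemplate: a real template would derive
// one job per eligible time chunk from a DataSourceCompactionConfig.
class CompactionConfigBasedJobTemplateSketch implements CompactionJobTemplate
{
  @Override
  public List<BatchIndexingJob> createJobs(String source, String destination)
  {
    BatchIndexingJob job = () -> source;  // illustrative single job
    return List.of(job);
  }
}
```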

Refactor for reuse

  • Move common code from CompactSegments to CompactionSlotManager, CompactionSnapshotBuilder
  • Update CompactionStatus, CompactionStatusTracker and DataSourceCompactibleSegmentIterator

Future work

  • Have a common BatchIndexingSupervisor which uses templates to create jobs.
    • It could be implemented by ScheduledBatchSupervisor and CompactionSupervisor.
    • This change was originally included in this patch but has been left out to keep the changes small.
  • Reset the job queue (if required) when the cluster-level compaction config changes
  • Allow force reset of the job queue by calling an API per supervisor (e.g. supervisor reset) and overall
  • Cancel tasks which are not aligned with the supervisor spec anymore
  • Do not queue up jobs for recently compacted intervals even if the segment timeline has been updated
  • Determine if any change was made to a time chunk of the segment timeline in the latest metadata store poll

Release note

TODO


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

```java
config.getTaskPriority(),
ClientCompactionTaskQueryTuningConfig.from(
    config.getTuningConfig(),
    config.getMaxRowsPerSegment(),
```

Check notice — Code scanning / CodeQL: Deprecated method or constructor invocation. Invoking DataSourceCompactionConfig.getMaxRowsPerSegment should be avoided because it has been deprecated.
```java
if (partitionsSpecFromTuningConfig == null) {
  final long maxTotalRows = Configs.valueOrDefault(tuningConfig.getMaxTotalRows(), Long.MAX_VALUE);
  return new DynamicPartitionsSpec(tuningConfig.getMaxRowsPerSegment(), maxTotalRows);
final Long maxTotalRows = tuningConfig.getMaxTotalRows();
```


Check notice — Code scanning / CodeQL: Deprecated method or constructor invocation. Invoking ClientCompactionTaskQueryTuningConfig.getMaxTotalRows should be avoided because it has been deprecated.
```java
final long maxTotalRows = Configs.valueOrDefault(tuningConfig.getMaxTotalRows(), Long.MAX_VALUE);
return new DynamicPartitionsSpec(tuningConfig.getMaxRowsPerSegment(), maxTotalRows);
final Long maxTotalRows = tuningConfig.getMaxTotalRows();
final Integer maxRowsPerSegment = tuningConfig.getMaxRowsPerSegment();
```


Check notice — Code scanning / CodeQL: Deprecated method or constructor invocation. Invoking ClientCompactionTaskQueryTuningConfig.getMaxRowsPerSegment should be avoided because it has been deprecated.
@uds5501 (Contributor) left a comment:

Had a qq -

Comment on lines +143 to +150
```java
// Filter out jobs if they are outside the search interval
final List<CompactionJob> validJobs = new ArrayList<>();
for (CompactionJob job : allJobs) {
  final Interval compactionInterval = job.getCandidate().getCompactionInterval();
  if (searchInterval.contains(compactionInterval)) {
    validJobs.add(job);
  }
}
```
Contributor:

Why would you get jobs outside of the interval? That seems like a problem with the contract or the specific implementation rather than a concern that everybody who ever calls the method needs to apply?

Contributor (Author):

Thanks for catching this. This should not be needed anymore.

Comment on lines +97 to +101
```java
final BatchIndexingJobTemplate delegate
    = resolvedTable.decodeProperty(IndexingTemplateDefn.PROPERTY_PAYLOAD);
if (delegate instanceof CompactionJobTemplate) {
  return (CompactionJobTemplate) delegate;
} else {
```
Contributor:

Perhaps this is an established pattern. But I would never have thought that you are supposed to get the table from the catalog and then pass a magical parameter to "decodeProperty" in order to get it to become an object that is a template.

Why isn't there a method that's like asJobTemplate() or as() or something like that? What's PROPERTY_PAYLOAD and why is it special? Will a decoded property always generate a BatchIndexingJobTemplate? What does "decode property" have to do with generating a BatchIndexingJobTemplate?

Maybe as I read more of the code I'll understand, but only reading the usage-side of this is not very intuitive.

Contributor (Author):

Yes, this felt hacky to me too. Unfortunately, the Druid catalog currently only understands a "table" as the top-level object, so I just stuck to that model for the time being. I have also put some relevant points in the PR description under "Open questions".

@clintropolis , do you have any suggestions on what would be the preferred approach to store an indexing/compaction template in the catalog?

```java
{
  final CompactionJobTemplate delegate = getDelegate();
  if (delegate == null) {
    return List.of();
```
Contributor:

This would mean that the table doesn't actually exist right? Should we offer some sort of indication that there is a reference to a table that doesn't exist? an exception? a log line? Something that can help figure out that there's an issue with how things are setup?

Contributor (Author):

Fixed to throw a NotFound exception with the proper error message.

Comment on lines +108 to +112
```java
@Override
public String getType()
{
  throw new UnsupportedOperationException("This template type cannot be serialized");
}
```
Contributor:

What is someone supposed to do if this exception gets thrown? Why would it have happened? Are there any hints that can be provided to the developer who sees this and needs to fix it? Also, it should probably be either a DruidException.defensive() or a NotYetImplemented exception.

Contributor (Author):

Converted to a defensive exception with a better error message.

```java
/**
 * Parameters used while creating a {@link CompactionJob} using a {@link CompactionJobTemplate}.
 */
public class CompactionJobParams implements JobParams
```
Contributor:

In some other code, you are adjusting the input with a searchInterval in order to limit the time interval seen. When I initially read that, I wondered "isn't that a param? why isn't it on the param object?". After seeing this class, I still don't have a good answer: why isn't that a param on the param object?

@kfaraz (Contributor, Author) commented Oct 27, 2025:

The search interval seemed like a better fit for the InputSource since it defines the time range of the data that is the input for a job template.

It seemed redundant to include it in CompactionJobParams too since the passed DruidInputSource already contains the search interval.

Comment on lines +51 to +70
```java
@Override
public DateTime getScheduleStartTime()
{
  return scheduleStartTime;
}

public ClusterCompactionConfig getClusterCompactionConfig()
{
  return clusterCompactionConfig;
}

public SegmentTimeline getTimeline(String dataSource)
{
  return timelineProvider.getTimelineForDataSource(dataSource);
}

public CompactionSnapshotBuilder getSnapshotBuilder()
{
  return snapshotBuilder;
}
```
Contributor:

Where is the right place to find the contract of what these objects are and what they do and why they exist? If I'm just a lowly developer trying to create a new CompactionTask thingie and I'm passing in a CompactionJobParams how do I go about figuring out what the semantics of the things that were given to me are and what I need to do with them?

Contributor (Author):

Added javadocs for these methods. Please let me know if they do not seem adequate.

Comment on lines +55 to +59
```java
/**
 * Iterates over all eligible compaction jobs in order of their priority.
 * A fresh instance of this class must be used in every run of the
 * {@link CompactionScheduler}.
 */
```
Contributor:

What's the scale of compaction jobs? Like, how many do we expect this to be iterating at any point in time?

Contributor (Author):

Each time chunk (using the target segment granularity) of each datasource would translate to a single compaction job.

On a large cluster with, say, 1 year of hourly segment data for 10 datasources, this number could easily be 365 * 24 * 10 = 87,600, i.e. close to 90k.

Every time the queue is reset, we re-create all of the jobs and check which ones are already done and which ones need to be queued up.
Some of this work is wasteful, and we may optimize it in follow-up PRs to simply not re-create jobs for intervals which we know to be already compacted.

I have added some metrics to easily monitor the size of the job queue.

Comment on lines +153 to +155
```java
final long segmentPollPeriodSeconds =
    segmentManagerConfig.getPollDuration().toStandardDuration().getMillis();
this.schedulePeriodMillis = Math.min(5_000, segmentPollPeriodSeconds);
```
Contributor:

A 5 second poll is pretty rapid, what's the logic behind the need for such a rapid poll and why it won't be a significant resource burden?

Contributor (Author):

Yes, this value has been carried over from the original iteration of the OverlordCompactionScheduler.
I had intended the scheduler to be able to pick up compaction jobs as soon as compaction task slots become available, but it is wasteful to recompute the entire queue for that, especially since, on large clusters, recomputation of the entire queue may take up to a couple of minutes (as seen from the coordinator/time metric).

We can do the following instead:

  • Keep the schedule period at 5 minutes (or maybe even higher?)
  • When the scheduler kicks in, reset the job queue.
  • We already receive a callback whenever a task completes.
  • When a task completes and slots become available, check if there is any pending job still in the queue from the last scheduled run. If there is a pending job, launch that.

Please let me know if this makes sense.
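The flow proposed in the steps above (a periodic full reset, plus launching pending jobs from the task-completion callback) could be sketched roughly like this. This is an illustrative sketch, not the actual OverlordCompactionScheduler code; the class and method names are made up.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Illustrative sketch of the proposed scheduling flow; names are hypothetical.
class JobQueueSketch
{
  private final Queue<String> pendingJobs = new ArrayDeque<>();

  // Runs on the schedule period (e.g. every 5 minutes): recompute the full queue.
  void reset(Iterable<String> recomputedJobs)
  {
    pendingJobs.clear();
    recomputedJobs.forEach(pendingJobs::add);
  }

  // Task-completion callback: if a slot has freed up, launch the next pending
  // job from the last scheduled run instead of recomputing the whole queue.
  String launchNextIfSlotAvailable(int availableSlots)
  {
    return availableSlots > 0 ? pendingJobs.poll() : null;
  }
}
```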

```java
/**
 * Provides parameters required to create a {@link BatchIndexingJob}.
 */
public interface JobParams
```
Contributor:

Why interface instead of class? It seems like this interface is very likely to only ever carry getters and it's unclear to me why it's important that different classes can implement this instead of just having a reference to one of these lying around.

Contributor (Author):

Fixed, made JobParams a concrete class.
We would still need to have a CompactionJobParams which extends JobParams since compaction job templates use some extra stuff like CompactionSnapshotBuilder and ClusterCompactionConfig.

@capistrant (Contributor) left a comment:

looking really cool. minor comments/questions

```java
 * separate to allow:
 * <ul>
 * <li>fields to be nullable so that only non-null fields are used for matching</li>
 * <li>legacy "compaction-incompatible" fields to be removed</li>
```
Contributor:

I think this may be me coming late to the party on compaction. For my sake, could you elaborate on this list item to get me up to speed. I guess I'm unclear on what you refer to as compaction-incompatible fields

Contributor (Author):

Updated the javadoc to clarify this point. But it mostly refers to things like the transformSpec which can filter out data or even the dimensionsSpec where we may choose to drop some dimensions and thus cause an irreversible aggregation of the data.

(Side note: the more desirable way to achieve this pre-aggregation to improve query perf would be to use projections).

```java
  return jobs;
}

private ClientSqlQuery createQueryForJob(String dataSource, Interval compactionInterval)
```
Contributor:

wondering aloud on if there is a way to make the query and formatting more flexible/extensible if folks want to use more than these standard format vars. Perhaps if someone wanted to do some filtering during compaction? Or would you suggest that if a person desires that, that it would be in some custom template type that they roll on their own?

Contributor:

I guess since this is just inline, the filter could be defined inline. But if someone changed the query after it existed for some time that filter wouldn't be re-applied to already compacted segments since it is not persisted in the target state.

Contributor (Author):

Yes, the filter can simply be included in the SQL.

But if someone changed the query after it existed for some time that filter wouldn't be re-applied to already compacted segments since it is not persisted in the target state.

Yes, that is by design. We don't want compaction to be triggered as long as the compaction state has not changed.
This ensures that every minor tweak made to the SQL query does not end up recompacting everything.
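The match-before-recompact behavior described here could be sketched as follows. This is a hypothetical helper, assuming the target state exposes its fields as a map and that null target fields are ignored for matching (as the nullable-fields javadoc above suggests); it is not the actual CompactionStatus implementation.

```java
import java.util.Map;
import java.util.Objects;

// Hypothetical helper: a segment needs recompaction only if some non-null
// field of the target state differs from the persisted compaction state.
// The SQL text itself is not part of the matched state, so minor query
// tweaks alone would not trigger recompaction.
class StateMatcherSketch
{
  static boolean needsCompaction(Map<String, String> persistedState, Map<String, String> targetState)
  {
    for (Map.Entry<String, String> field : targetState.entrySet()) {
      if (field.getValue() != null
          && !Objects.equals(persistedState.get(field.getKey()), field.getValue())) {
        return true;
      }
    }
    return false;
  }
}
```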

Comment on a newly added file (191 lines):
Contributor:

when I was playing with this on my local I accidentally got into a state where I had defined an expected granularity of MONTH, but the query in the template was DAY. It looked like it just went into an infinite compact loop. Could we force the templated query to honor segment granularity from the state matcher by injecting it? I guess this forces everyone creating a rule to select a segment gran though, which is probably not desired.

Contributor (Author):

Allowing the state matcher to be distinct from the query is by design, as it allows us to make improvements to the query without worrying about all the intervals being recompacted when not desired.

For the time being, it is up to the user to ensure that the state matcher and the query are compatible with each other.
In future versions, we will include some kind of template validation.


kfaraz commented Oct 30, 2025

Since there is a lack of clarity on how the catalog would be used to persist compaction templates, the catalog bits have been removed from this PR for the time being.

For posterity, some possible options for storing compaction templates are:

a. As a table inside a new schema index_template (currently used in this PR)
b. OR as a table inside the druid schema: Currently used for datasources only
c. OR as a single row inside sys.templates: probably not preferable since the catalog models everything as tables and their properties, but this would be neither.

Note: In all of the above cases, the template is always physically stored as a single row in druid_tableDefs in the metadata store.

d. A separate metadata table druid_indexingTemplates. The schema used to access the contents would still be one of a, b, or c.

@kfaraz kfaraz changed the title Add catalog templates to power cascading compaction Add compaction templates and CompactionJobQueue Oct 30, 2025
@gianm (Contributor) left a comment:

LGTM

@kfaraz kfaraz merged commit 30d98b0 into apache:master Oct 31, 2025
51 checks passed
@kfaraz kfaraz deleted the cascade_compact_with_templates branch October 31, 2025 02:14

kfaraz commented Oct 31, 2025

Thanks for the reviews, @capistrant, @gianm, @uds5501, @cheddar!

@kgyrtkirk kgyrtkirk added this to the 36.0.0 milestone Jan 19, 2026