Add compaction templates and CompactionJobQueue #18402
Conversation
config.getTaskPriority(),
ClientCompactionTaskQueryTuningConfig.from(
    config.getTuningConfig(),
    config.getMaxRowsPerSegment(),
Check notice (Code scanning / CodeQL): Deprecated method or constructor invocation
if (partitionsSpecFromTuningConfig == null) {
  final long maxTotalRows = Configs.valueOrDefault(tuningConfig.getMaxTotalRows(), Long.MAX_VALUE);
  return new DynamicPartitionsSpec(tuningConfig.getMaxRowsPerSegment(), maxTotalRows);
  final Long maxTotalRows = tuningConfig.getMaxTotalRows();
Check notice (Code scanning / CodeQL): Deprecated method or constructor invocation
final long maxTotalRows = Configs.valueOrDefault(tuningConfig.getMaxTotalRows(), Long.MAX_VALUE);
return new DynamicPartitionsSpec(tuningConfig.getMaxRowsPerSegment(), maxTotalRows);
final Long maxTotalRows = tuningConfig.getMaxTotalRows();
final Integer maxRowsPerSegment = tuningConfig.getMaxRowsPerSegment();
Check notice (Code scanning / CodeQL): Deprecated method or constructor invocation
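The diffs above replace a defaulted `Configs.valueOrDefault(...)` call with plain nullable getters. A simplified, standalone sketch of that idea, using a stand-in class rather than Druid's actual `DynamicPartitionsSpec`:

```java
// Stand-in for Druid's DynamicPartitionsSpec, illustrating the change under
// review: nullable tuning values are passed through as-is instead of being
// defaulted to Long.MAX_VALUE at the call site.
class DynamicSpec {
    final Integer maxRowsPerSegment; // null => let the spec pick its default
    final Long maxTotalRows;         // null => no row limit

    DynamicSpec(Integer maxRowsPerSegment, Long maxTotalRows) {
        this.maxRowsPerSegment = maxRowsPerSegment;
        this.maxTotalRows = maxTotalRows;
    }
}

class NullablePartitionsDemo {
    // Old style: new DynamicSpec(maxRows, maxTotal == null ? Long.MAX_VALUE : maxTotal)
    // New style: keep the nulls and let the spec interpret them downstream.
    static DynamicSpec fromTuning(Integer maxRowsPerSegment, Long maxTotalRows) {
        return new DynamicSpec(maxRowsPerSegment, maxTotalRows);
    }

    public static void main(String[] args) {
        DynamicSpec spec = fromTuning(5_000_000, null);
        System.out.println(spec.maxTotalRows == null); // nulls survive the conversion
    }
}
```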
// Filter out jobs if they are outside the search interval
final List<CompactionJob> validJobs = new ArrayList<>();
for (CompactionJob job : allJobs) {
  final Interval compactionInterval = job.getCandidate().getCompactionInterval();
  if (searchInterval.contains(compactionInterval)) {
    validJobs.add(job);
  }
}
Why would you get jobs outside of the interval? That seems like a problem with the contract or the specific implementation rather than a concern that everybody who ever calls the method needs to apply?
Thanks for catching this. This should not be needed anymore.
final BatchIndexingJobTemplate delegate
    = resolvedTable.decodeProperty(IndexingTemplateDefn.PROPERTY_PAYLOAD);
if (delegate instanceof CompactionJobTemplate) {
  return (CompactionJobTemplate) delegate;
} else {
Perhaps this is an established pattern. But I would never have thought that you are supposed to get the table from the catalog and then pass a magical parameter to "decodeProperty" in order to get it to become an object that is a template.
Why isn't there a method that's like asJobTemplate() or as() or something like that? What's PROPERTY_PAYLOAD and why is it special? Will a decoded property always generate a BatchIndexingJobTemplate? What does "decode property" have to do with generating a BatchIndexingJobTemplate?
Maybe as I read more of the code I'll understand, but only reading the usage-side of this is not very intuitive.
Yes, this felt hacky to me too. Unfortunately, the Druid catalog currently only understands a "table" as the top-level object, so I just stuck to that model for the time being. I have also put some relevant points in the PR description under "Open questions".
@clintropolis , do you have any suggestions on what would be the preferred approach to store an indexing/compaction template in the catalog?
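One possible shape for the accessor the reviewer is asking for, sketched with stand-in classes: `ResolvedTable` and `decodeProperty` mirror the names in the diff, but this `as(...)` method is only an illustration, not code from the PR.

```java
// Hypothetical sugar for the pattern discussed above: instead of every caller
// fetching PROPERTY_PAYLOAD via decodeProperty and instanceof-checking the
// result, the table exposes a typed as(...) accessor that fails fast.
class ResolvedTable {
    static final String PROPERTY_PAYLOAD = "payload";
    private final java.util.Map<String, Object> properties;

    ResolvedTable(java.util.Map<String, Object> properties) {
        this.properties = properties;
    }

    // stand-in for the catalog's generic property decoding
    Object decodeProperty(String key) {
        return properties.get(key);
    }

    // suggested accessor: clear error message instead of a bare cast at call sites
    <T> T as(Class<T> templateType) {
        final Object payload = decodeProperty(PROPERTY_PAYLOAD);
        if (!templateType.isInstance(payload)) {
            throw new IllegalStateException(
                "Table payload [" + payload + "] is not a " + templateType.getSimpleName());
        }
        return templateType.cast(payload);
    }
}
```

Callers would then write `resolvedTable.as(CompactionJobTemplate.class)` rather than decoding a magic property key themselves.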
{
  final CompactionJobTemplate delegate = getDelegate();
  if (delegate == null) {
    return List.of();
This would mean that the table doesn't actually exist right? Should we offer some sort of indication that there is a reference to a table that doesn't exist? an exception? a log line? Something that can help figure out that there's an issue with how things are setup?
Fixed to throw a NotFound exception with the proper error message.
@Override
public String getType()
{
  throw new UnsupportedOperationException("This template type cannot be serialized");
}
What is someone supposed to do if this exception gets thrown? Why would it have happened? Are there any hints that can be provided to the developer who sees this and needs to fix it? Also, it should probably be either a DruidException.defensive() or a NotYetImplemented exception.
Converted to a defensive exception with a better error message.
/**
 * Parameters used while creating a {@link CompactionJob} using a {@link CompactionJobTemplate}.
 */
public class CompactionJobParams implements JobParams
In some other code, you are adjusting the input with a searchInterval in order to limit the time interval seen. When I initially read that, I wondered "isn't that a param? why isn't it on the param object?" After seeing this class, I still don't have a good answer: why isn't that a param on the param object?
The search interval seemed like a better fit for the InputSource since it defines the time range of the data that is the input for a job template.
It seemed redundant to include it in CompactionJobParams too since the passed DruidInputSource already contains the search interval.
@Override
public DateTime getScheduleStartTime()
{
  return scheduleStartTime;
}

public ClusterCompactionConfig getClusterCompactionConfig()
{
  return clusterCompactionConfig;
}

public SegmentTimeline getTimeline(String dataSource)
{
  return timelineProvider.getTimelineForDataSource(dataSource);
}

public CompactionSnapshotBuilder getSnapshotBuilder()
{
  return snapshotBuilder;
}
Where is the right place to find the contract of what these objects are and what they do and why they exist? If I'm just a lowly developer trying to create a new CompactionTask thingie and I'm passing in a CompactionJobParams how do I go about figuring out what the semantics of the things that were given to me are and what I need to do with them?
Added javadocs for these methods. Please let me know if they do not seem adequate.
/**
 * Iterates over all eligible compaction jobs in order of their priority.
 * A fresh instance of this class must be used in every run of the
 * {@link CompactionScheduler}.
 */
What's the scale of compaction jobs? Like, how many do we expect this to be iterating at any point in time?
Each time chunk (using the target segment granularity) of each datasource would translate to a single compaction job.
On a large cluster with, say, 1 year of hourly segment data for 10 datasources, this number could easily be 365 * 24 * 10 ≈ 88k.
Every time the queue is reset, we re-create all of the jobs and check which ones are already done and which ones need to be queued up.
Some of the work is wasteful and we may optimize it in follow up PRs to simply not re-create jobs for intervals which we know to be already compacted.
I have added some metrics to easily monitor the size of the job queue.
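A quick back-of-the-envelope check of that scale estimate (one job per datasource per time chunk at the target granularity); the helper name here is illustrative, not from the PR:

```java
class CompactionJobCountEstimate {
    // One compaction job per (datasource, time chunk at the target granularity).
    static long estimateJobs(int datasources, int days, int chunksPerDay) {
        return (long) datasources * days * chunksPerDay;
    }

    public static void main(String[] args) {
        // 10 datasources with one year of HOUR-granularity segments
        System.out.println(estimateJobs(10, 365, 24)); // 87600
    }
}
```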
final long segmentPollPeriodSeconds =
    segmentManagerConfig.getPollDuration().toStandardDuration().getMillis();
this.schedulePeriodMillis = Math.min(5_000, segmentPollPeriodSeconds);
A 5-second poll is pretty rapid. What's the reasoning behind such a rapid poll, and why won't it be a significant resource burden?
Yes, this value has been carried over from the original iteration of the OverlordCompactionScheduler.
I had intended the scheduler to be able to pick up compaction jobs as soon as compaction task slots become available, but it is wasteful to recompute the entire queue for that, especially since on large clusters a full recomputation may take up to a couple of minutes (as seen from the coordinator/time metric).
We can do the following instead:
- Keep the schedule period at 5 minutes (or maybe even higher?)
- When the scheduler kicks in, reset the job queue.
- We already receive a callback whenever a task completes.
- When a task completes and slots become available, check if there is any pending job still in the queue from the last scheduled run. If there is a pending job, launch that.
Please let me know if this makes sense.
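A minimal sketch of that proposal, with illustrative names rather than the actual scheduler API: the queue is rebuilt only on the slow schedule period, while the task-completion callback drains leftover jobs from the previous run as slots free up.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch of the behaviour proposed above. Not Druid code:
// class and method names are stand-ins for the scheduler/queue interplay.
class PendingJobDrainer {
    private final Queue<String> pendingJobs = new ArrayDeque<>();
    private int freeSlots;

    PendingJobDrainer(int freeSlots) {
        this.freeSlots = freeSlots;
    }

    /** Called on the schedule period (e.g. every 5 minutes): full queue rebuild. */
    void resetQueue(Iterable<String> jobs) {
        pendingJobs.clear();
        jobs.forEach(pendingJobs::add);
        launchWhileSlotsAvailable();
    }

    /** Called from the task-completion callback: a slot just freed up. */
    void onTaskComplete() {
        freeSlots++;
        launchWhileSlotsAvailable();
    }

    private void launchWhileSlotsAvailable() {
        // launch pending jobs from the last scheduled run without rebuilding the queue
        while (freeSlots > 0 && !pendingJobs.isEmpty()) {
            String job = pendingJobs.poll();
            freeSlots--;
            System.out.println("launching " + job);
        }
    }

    int pendingCount() {
        return pendingJobs.size();
    }
}
```

This avoids the expensive full recomputation on every slot change while still keeping slots busy between scheduled runs.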
/**
 * Provides parameters required to create a {@link BatchIndexingJob}.
 */
public interface JobParams
Why interface instead of class? It seems like this interface is very likely to only ever carry getters and it's unclear to me why it's important that different classes can implement this instead of just having a reference to one of these lying around.
Fixed, made JobParams a concrete class.
We would still need to have a CompactionJobParams which extends JobParams since compaction job templates use some extra stuff like CompactionSnapshotBuilder and ClusterCompactionConfig.
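Roughly, the resulting shape would look like this; the fields here are illustrative stand-ins (an `Instant` instead of Druid's `DateTime`, a plain int instead of `ClusterCompactionConfig`), not the actual classes:

```java
import java.time.Instant;

// JobParams as a concrete class carrying the parameters common to all
// batch indexing jobs.
class JobParams {
    private final Instant scheduleStartTime;

    JobParams(Instant scheduleStartTime) {
        this.scheduleStartTime = scheduleStartTime;
    }

    Instant getScheduleStartTime() {
        return scheduleStartTime;
    }
}

// Compaction-specific params extend the base class with compaction-only
// state (stand-in for ClusterCompactionConfig, CompactionSnapshotBuilder, etc.).
class CompactionJobParams extends JobParams {
    private final int compactionTaskSlots;

    CompactionJobParams(Instant scheduleStartTime, int compactionTaskSlots) {
        super(scheduleStartTime);
        this.compactionTaskSlots = compactionTaskSlots;
    }

    int getCompactionTaskSlots() {
        return compactionTaskSlots;
    }
}
```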
capistrant left a comment:
looking really cool. minor comments/questions
 * separate to allow:
 * <ul>
 * <li>fields to be nullable so that only non-null fields are used for matching</li>
 * <li>legacy "compaction-incompatible" fields to be removed</li>
I think this may be me coming late to the party on compaction. For my sake, could you elaborate on this list item to get me up to speed? I'm unclear on what you refer to as "compaction-incompatible" fields.
Updated the javadoc to clarify this point. But it mostly refers to things like the transformSpec which can filter out data or even the dimensionsSpec where we may choose to drop some dimensions and thus cause an irreversible aggregation of the data.
(Side note: the more desirable way to achieve this pre-aggregation to improve query perf would be to use projections).
  return jobs;
}

private ClientSqlQuery createQueryForJob(String dataSource, Interval compactionInterval)
Wondering aloud whether there is a way to make the query and formatting more flexible/extensible if folks want to use more than these standard format vars. Perhaps if someone wanted to do some filtering during compaction? Or would you suggest that anyone who wants that should use a custom template type that they roll on their own?
I guess since this is just inline, the filter could be defined inline. But if someone changed the query after it existed for some time that filter wouldn't be re-applied to already compacted segments since it is not persisted in the target state.
Yes, the filter can simply be included in the SQL.
But if someone changed the query after it existed for some time that filter wouldn't be re-applied to already compacted segments since it is not persisted in the target state.
Yes, that is by design. We don't want compaction to be triggered as long as the compaction state has not changed.
This ensures that every minor tweak made to the SQL query does not end up recompacting everything.
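For illustration, an inline filter in a templated compaction query might look like the following. The SQL shape follows Druid's `REPLACE ... OVERWRITE WHERE ... PARTITIONED BY` ingestion syntax, but the builder method and the `country` filter are hypothetical, not the PR's actual template variables:

```java
// Hypothetical compaction SQL built by something like createQueryForJob().
// The filter lives only in the SQL text; since it is not part of the
// persisted compaction state, later edits to it do not retrigger compaction
// of intervals whose stored state still matches.
class CompactionSqlSketch {
    static String createSql(String dataSource, String start, String end) {
        String timeFilter = "__time >= TIMESTAMP '" + start + "'"
                          + " AND __time < TIMESTAMP '" + end + "'";
        return "REPLACE INTO \"" + dataSource + "\" OVERWRITE WHERE " + timeFilter + "\n"
             + "SELECT * FROM \"" + dataSource + "\"\n"
             + "WHERE " + timeFilter + "\n"
             + "  AND country <> 'unknown'\n"  // user-added inline filter
             + "PARTITIONED BY DAY";
    }

    public static void main(String[] args) {
        System.out.println(createSql("wikipedia", "2024-01-01", "2024-02-01"));
    }
}
```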
@@ -0,0 +1,191 @@
/*
when I was playing with this on my local I accidentally got into a state where I had defined an expected granularity of MONTH, but the query in the template was DAY. It looked like it just went into an infinite compact loop. Could we force the templated query to honor segment granularity from the state matcher by injecting it? I guess this forces everyone creating a rule to select a segment gran though, which is probably not desired.
Allowing the state matcher to be distinct from the query is by design, as it allows us to make improvements to the query without worrying about all the intervals being recompacted when not desired.
For the time being, it is up to the user to ensure that the state matcher and the query are compatible with each other.
In future versions, we will include some kind of template validation.
Since there is a lack of clarity on how the catalog would be used to persist compaction templates, the catalog bits have been removed from this PR for the time being. For posterity, some possible options for storing compaction templates are:

a. As a table inside a new schema

Note: In all of the above cases, the template is always physically stored as a single row in

d. A separate metadata table
Thanks for the reviews, @capistrant, @gianm, @uds5501, @cheddar!
Note
A lot of the changes related to compaction template implementations and persisting templates in the Druid catalog were once a part of this PR but have been removed until there is consensus on the best approach.
This PR now deals only with refactoring the OverlordCompactionScheduler to use the CompactionJobQueue and other related changes.

Changes

Functionality

Classes
- BatchIndexingJob which may contain either a ClientTaskQuery or a ClientSqlQuery (for MSQ jobs)
- BatchIndexingJobTemplate that can create jobs for a given source and destination
- CompactionConfigBasedJobTemplate which implements CompactionJobTemplate
- CompactionSupervisor to create jobs using templates
- CompactionJobQueue to create and submit compaction jobs to the Overlord

Refactor for reuse
- CompactSegments to CompactionSlotManager, CompactionSnapshotBuilder
- CompactionStatus, CompactionStatusTracker and DataSourceCompactibleSegmentIterator

Future work
- BatchIndexingSupervisor which uses templates to create jobs
- ScheduledBatchSupervisor and CompactionSupervisor

Release note
TODO