Granularity interval materialization#10742

Merged
suneet-s merged 32 commits into apache:master from loquisgon:granularity-interval-materialization on Jan 29, 2021

Conversation


@loquisgon loquisgon commented Jan 11, 2021

Description

The UniformGranularitySpec currently materializes all intervals when it is constructed. This can cause OOMs in the Overlord and other services whenever the resulting interval list is very large. The changes here avoid that materialization in all places where the Overlord uses the UniformGranularitySpec. One method of the uniform granularity spec, public Optional<Interval> bucketInterval(DateTime dt), still materializes all the intervals, but care is taken that they are not materialized unless that particular method is invoked. Since this method is only called outside the Overlord, the OOM issues should be less severe than they are today.


Key changed/added classes in this PR

A public API change was made to the interface GranularitySpec: the method bucketIntervals() now returns an Iterable<Interval> rather than the materialized set of intervals it returned before this change. The other significant change avoids materializing the intervals in the constructor of UniformGranularitySpec.

Most of the work is delegated to a new helper class, IntervalsByGranularity, which takes a Collection of intervals and a Granularity in its constructor and builds an Iterator that avoids materializing the intervals. This class also ensures that the intervals returned by the iterator are appropriately sorted. It is then used in the critical places where materialization was taking place. The class ArbitraryGranularitySpec still materializes all intervals, so use it with care.
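The lazy-bucketing idea can be sketched in plain Java. This is a hypothetical illustration using java.time and a fixed bucket Duration, not Druid's actual IntervalsByGranularity (which works with Joda-Time Intervals and Granularity objects); the point is that buckets are produced one at a time on demand instead of being materialized up front:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch: given one input interval [start, end) and a bucket
// duration, emit aligned buckets lazily instead of building the whole list.
class LazyBuckets implements Iterator<Instant[]> {
  private final Instant end;
  private final Duration bucket;
  private Instant cursor;

  LazyBuckets(Instant start, Instant end, Duration bucket) {
    long millis = bucket.toMillis();
    // align the cursor down to the enclosing bucket boundary
    this.cursor = Instant.ofEpochMilli(Math.floorDiv(start.toEpochMilli(), millis) * millis);
    this.end = end;
    this.bucket = bucket;
  }

  @Override
  public boolean hasNext() {
    return cursor.isBefore(end);
  }

  @Override
  public Instant[] next() {
    if (!hasNext()) {
      throw new NoSuchElementException();
    }
    Instant bucketStart = cursor;
    cursor = cursor.plus(bucket); // only one bucket exists in memory at a time
    return new Instant[]{bucketStart, cursor};
  }
}
```

Iterating a year of daily buckets this way holds a single cursor rather than 365 Interval objects, which is the memory behavior the PR is after.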


This PR has:

  • [x] been self-reviewed.
  • [ ] added documentation for new or modified features or behaviors.
  • [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • [ ] added or updated version, license, or notice information in licenses.yaml
  • [ ] added comments explaining the "why" and the intent of the code wherever it would not be obvious for an unfamiliar reader.
  • [x] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • [ ] added integration tests.
  • [ ] been tested in a test Druid cluster.

@loquisgon loquisgon force-pushed the granularity-interval-materialization branch from e866686 to 48b49a6 on January 11, 2021 23:11
Author:

I will clean up the javadoc in the next commit.

Author:

done

Author:

The contract indicates that inputIntervals must be left "as is"; I will remove the sorting in the next commit.

Author:

done

@loquisgon loquisgon marked this pull request as ready for review January 12, 2021 18:37
@loquisgon loquisgon force-pushed the granularity-interval-materialization branch from a1a334c to 9f81cf7 on January 13, 2021 00:45
@loquisgon loquisgon force-pushed the granularity-interval-materialization branch from 9f81cf7 to 86c3f54 on January 13, 2021 00:52
ArrayList<Interval> condensedIntervals = JodaUtils.condenseIntervals(() -> uniqueIntervals.iterator());
intervalIterator = condensedIntervals.iterator();
} else {
IntervalsByGranularity intervalsByGranularity = new IntervalsByGranularity(intervals, segmentGranularity);
Contributor:

Are we no longer condensing the intervals if segmentGranularity != null?

Author (Jan 15, 2021):

Fixed

private final Iterable<Interval> intervalIterable;
private final TreeSet<Interval> intervals;

public LookupIntervalBuckets(Iterable<Interval> intervalIterable)
Contributor:

Is this class basically something like a lazy load of the TreeSet?

Author (Jan 14, 2021):

I added this class to more cleanly re-use the TreeSet logic for bucket extraction between ArbitraryGranularitySpec and UniformGranularitySpec. The goal of the code in this PR is to avoid materialization of intervals in the Overlord only; we will deal with materialization issues elsewhere later. Thus, when code outside the Overlord needs to get a bucket for a particular DateTime, it will actually materialize all the intervals and store them in a fast lookup data structure (like a TreeSet) to return the bucket. In UniformGranularitySpec we could still iterate through all the intervals to find the bucket (since the intervals are sorted), but that would be linear in the number of intervals on every call, which is not acceptable given how frequently the bucket-lookup method is invoked. We feel we can improve on this, but we decided to divide the work and, at least for now, prevent the materialization of intervals in the Overlord (see the PR description).
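The lazy-lookup idea described here can be sketched as follows. This is a hypothetical illustration using primitive {start, end} millisecond pairs and a TreeMap in place of Joda Intervals and Druid's LookupIntervalBuckets; the key point is that the sorted structure is only built on the first bucket query, so callers that never ask for a bucket never pay the memory cost:

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: defer building the sorted lookup structure until the
// first bucket query (names are illustrative, not Druid's).
class LazyBucketLookup {
  private final Iterable<long[]> intervalSource; // each entry is {start, end}
  private TreeMap<Long, long[]> byStart;         // built lazily, on first use

  LazyBucketLookup(Iterable<long[]> intervalSource) {
    this.intervalSource = intervalSource;
  }

  // Returns the interval containing t, or null if none does.
  long[] bucketFor(long t) {
    if (byStart == null) {
      byStart = new TreeMap<>();
      for (long[] iv : intervalSource) {
        byStart.put(iv[0], iv); // materialize once; O(log n) lookups after
      }
    }
    Map.Entry<Long, long[]> e = byStart.floorEntry(t);
    return (e != null && t < e.getValue()[1]) ? e.getValue() : null;
  }
}
```

The floorEntry lookup is the same trick a TreeSet of Intervals enables: find the greatest bucket start not exceeding the timestamp, then verify the timestamp falls before that bucket's end.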

return Optional.of(JodaUtils.condenseIntervals(setOptional.get()));
Iterable<Interval> bucketIntervals = schema.getDataSchema().getGranularitySpec().bucketIntervals();
if (bucketIntervals.iterator().hasNext()) {
return Optional.of(JodaUtils.condenseIntervals(schema.getDataSchema().getGranularitySpec().bucketIntervals()));
Contributor:

nit: can reuse bucketIntervals variable here

Author:

Fixed.

IntervalsByGranularity intervalsByGranularity = new IntervalsByGranularity(intervals, segmentGranularity);
intervalIterator = intervalsByGranularity.granularityIntervalsIterator();
// the following is calling a condense that does not materialize the intervals:
uniqueCondensedIntervals.addAll(JodaUtils.condenseIntervals(intervalsByGranularity.granularityIntervalsIterator()));
Contributor:

doesn't addAll materialize all the intervals? Is materializing the condensed intervals fine as long as we are not materializing the original intervals?

Author:

Yes, addAll is using the materialized intervals returned by the condenseIntervals call. The assumption here is that the condensed intervals are much fewer than the bucket intervals, which I think will hold in the majority of real cases. In the rare corner case where the original intervals cannot be condensed we would still have issues, but not in the majority of production cases.

Contributor:

Yes, addAll is using the materialized intervals being returned by the condenseIntervals call. The assumption here is that the condensed intervals are much fewer than the bucket intervals. I think this assumption will hold in the majority of real cases. I think that in a rare corner case where the original intervals cannot be condensed then we still would have issues, but not in the majority of production cases.

I agree but it doesn't seem hard to avoid materialization. How about adding a new method condensedIntervalsIterator() which returns an iterator that lazily computes condensed intervals? It could be something like this:

  public static Iterator<Interval> condensedIntervalsIterator(Iterator<Interval> sortedIntervals)
  {
    if (!sortedIntervals.hasNext()) {
      return Iterators.emptyIterator();
    }

    final PeekingIterator<Interval> peekingIterator = Iterators.peekingIterator(sortedIntervals);
    return new Iterator<Interval>()
    {
      @Override
      public boolean hasNext()
      {
        return peekingIterator.hasNext();
      }

      @Override
      public Interval next()
      {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        Interval currInterval = peekingIterator.next();
        while (peekingIterator.hasNext()) {
          Interval next = peekingIterator.peek();

          if (currInterval.abuts(next)) {
            currInterval = new Interval(currInterval.getStart(), next.getEnd());
            peekingIterator.next();
          } else if (currInterval.overlaps(next)) {
            DateTime nextEnd = next.getEnd();
            DateTime currEnd = currInterval.getEnd();
            currInterval = new Interval(
                currInterval.getStart(),
                nextEnd.isAfter(currEnd) ? nextEnd : currEnd
            );
            peekingIterator.next();
          } else {
            break;
          }
        }
        return currInterval;
      }
    };
  }

Then, tryTimeChunkLock() doesn't have to materialize intervals at all after condensing.

  protected boolean tryTimeChunkLock(TaskActionClient client, List<Interval> intervals) throws IOException
  {
    // The given intervals are first converted to align with segment granularity. This is because,
    // when an overwriting task finds a version for a given input row, it expects the interval
    // associated to each version to be equal or larger than the time bucket where the input row falls in.
    // See ParallelIndexSupervisorTask.findVersion().
    final Iterator<Interval> intervalIterator;
    final Granularity segmentGranularity = getSegmentGranularity();
    if (segmentGranularity == null) {
      intervalIterator = JodaUtils.condenseIntervals(intervals).iterator();
    } else {
      IntervalsByGranularity intervalsByGranularity = new IntervalsByGranularity(intervals, segmentGranularity);
      // the following is calling a condense that does not materialize the intervals:
      intervalIterator = JodaUtils.condensedIntervalsIterator(intervalsByGranularity.granularityIntervalsIterator());
    }

    // Intervals are already condensed to avoid creating too many locks.
    // Intervals are also sorted and thus it's safe to compare only the previous interval and current one for dedup.
    Interval prev = null;
    while (intervalIterator.hasNext()) {
      final Interval cur = intervalIterator.next();
      if (prev != null && cur.equals(prev)) {
        continue;
      }
      prev = cur;
      final TaskLock lock = client.submit(new TimeChunkLockTryAcquireAction(TaskLockType.EXCLUSIVE, cur));
      if (lock == null) {
        return false;
      }
    }
    return true;
  }

Contributor:

Hmm, on second thought, I think the current code is good enough. However, please add a comment explaining why it is OK to materialize intervals here; your comment above would be a nice fit.

Author (Jan 26, 2021):

I like your suggestion so I followed it and will avoid materializing them.

Comment thread: core/src/main/java/org/apache/druid/java/util/common/JodaUtils.java (Outdated)
/**
* This method does not materialize the intervals represented by the
* sortedIntervals iterator. However, the caller needs to ensure that sortedIntervals
* is already sorted in ascending order.
Contributor:

Since Interval is not Comparable (it's not because there could be multiple ways to compare them), it would be nice to explicitly state the order should be Comparators.intervalsByStartThenEnd() to make it more clear.

Contributor:

How about adding a sanity check in the method, so that it can fail fast if this expectation isn't met? Otherwise, it will be hard to debug if something goes wrong in the future.

Author:

Done.

intervalSet.addAll(intervals);
this.sortedIntervals = new ArrayList<>(intervals.size());
this.sortedIntervals.addAll(intervalSet);
this.sortedIntervals.sort(Comparators.intervalsByStartThenEnd());
Contributor:

It seems that IntervalIterator would work only when sortedIntervals don't overlap each other because the intervals returned from IntervalIterator may not be sorted otherwise. Is this correct? Then, please add a sanity check after dedup to make sure there is no overlap.

Author (Jan 27, 2021):

Great catch! I will add the sanity check

Contributor:

@jihoonson @loquisgon How can we be sure that the intervals don't overlap each other?
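The sanity check discussed in this thread could look roughly like the sketch below, with {start, end} millisecond pairs standing in for Joda Intervals (the method name and representation are illustrative, not the PR's actual code):

```java
import java.util.List;

// Hypothetical sketch: verify a sorted list of {start, end} intervals is
// disjoint before relying on the "merged iterator stays sorted" invariant.
class IntervalChecks {
  static void checkSortedAndDisjoint(List<long[]> intervals) {
    long[] prev = null;
    for (long[] cur : intervals) {
      // With intervals sorted by start (then end), an overlap shows up as the
      // current start falling before the previous end.
      if (prev != null && cur[0] < prev[1]) {
        throw new IllegalArgumentException(
            "intervals overlap: [" + prev[0] + "," + prev[1] + ") and ["
            + cur[0] + "," + cur[1] + ")");
      }
      prev = cur;
    }
  }
}
```

Because the input is already sorted, a single linear pass comparing each interval only against its predecessor is sufficient; abutting intervals (cur start == prev end) pass the check.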

if (currentIterator.hasNext()) {

// drop all subsequent intervals that are the same as the previous...
while (previous != null && previous.equals(currentIterator.peek())) {
Contributor:

Does the iterator created from granularity.getIterable() ever return the same interval? It doesn't seem like so and this check seems unnecessary.

Author:

I put comments in the code to illustrate when this can happen (also I have unit tests to catch this)

if (sortedIntervals.isEmpty()) {
ite = Collections.emptyIterator();
} else {
ite = new IntervalIterator(sortedIntervals);
Contributor:

Does IntervalIterator just flatten the nested iterators and concatenate them? If so, I suggest using FluentIterable.from(sortedIntervals).transformAndConcat(interval -> granularity.getIterable(interval)) instead, because it's better to use a well-tested library than to reinvent the wheel.

Author:

Done
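For reference, the JDK analog of Guava's transformAndConcat is Stream.flatMap. Below is a minimal sketch of the same flatten-and-concat shape, with intervals as {start, end} pairs and unit-length buckets standing in for granularity buckets (all names here are illustrative, not Druid's):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Sketch: expand each sorted interval into its per-unit buckets and
// concatenate the results, using flatMap for the lazy concatenation.
class FlattenSketch {
  static List<long[]> buckets(List<long[]> sortedIntervals) {
    return sortedIntervals.stream()
        .flatMap(iv -> Stream.iterate(iv[0], s -> s + 1) // lazy per-interval expansion
            .limit(iv[1] - iv[0])
            .map(s -> new long[]{s, s + 1}))
        .collect(Collectors.toList()); // collected here only to inspect the result
  }
}
```

The stream pipeline itself is lazy; only the final collect materializes anything, which mirrors why transformAndConcat fit the PR's goal of deferring materialization.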

groupByJob.setOutputFormatClass(SequenceFileOutputFormat.class);
groupByJob.setPartitionerClass(DetermineHashedPartitionsPartitioner.class);
if (!config.getSegmentGranularIntervals().isPresent()) {
if (!config.getSegmentGranularIntervals().iterator().hasNext()) {
Contributor:

This code just wants to know whether inputIntervals are set or not. I suggest adding a new method hasInputIntervals() to GranularitySpec for better readability.

Contributor:

Same comment for the other places that call iterator.hasNext() for this purpose.

Author (Jan 28, 2021):

Rather than adding a new method I decided to use GranularitySpec::inputIntervals().isEmpty(). Done.


interval = maybeInterval.get();
if (!bucketIntervals.get().contains(interval)) {
if (!Iterators.contains(bucketIntervals.iterator(), interval)) {
Contributor:

This method is called whenever subtasks need to allocate a new segment via the supervisor task; as a result, this code is never called in the Overlord. It might be better to do this check without materialization for less memory pressure, but that would degrade ingestion performance. I suggest keeping the current behaviour of materializing all bucket intervals here because I'm not sure what the best way to handle this is yet. We need to think more about how to fix the OOM error in the task as a follow-up.

Author:

I will materialize them, then.

import java.util.Iterator;
import java.util.TreeSet;

public class LookupIntervalBuckets
Contributor:

Please add some javadoc explaining what this class does and what it is for.

Author:

Done

* @return Iterable of all time groups
*/
Optional<SortedSet<Interval>> bucketIntervals();
Iterable<Interval> bucketIntervals();
Contributor:

Suggest sortedBucketIntervals() to make it clear the results are sorted.

Author:

Done

ingestionSchema,
getContext(),
intervalToPartitions
toolbox,
Contributor:

nit: I think this PR is OK, but just FYI, it's usually recommended not to fix code style unless you are modifying that area, because it makes it hard to find the real changes in the PR.

Author:

That was probably an automatic IntelliJ change.

@jihoonson (Contributor):

Forgot to mention one more thing. I think this PR won't fix all OOM errors in the Overlord and the next place where we should fix is likely TaskLockbox that manages all task locks per interval. However, I would say this PR is a good start to look into the OOM errors in the Overlord 🙂

@jihoonson left a review comment:

LGTM. +1 after CI. @loquisgon thank you!

@suneet-s suneet-s merged commit 0e4750b into apache:master Jan 29, 2021
@loquisgon loquisgon deleted the granularity-interval-materialization branch January 29, 2021 16:55
@clintropolis clintropolis added this to the 0.22.0 milestone Aug 12, 2021