Auto-compaction with segment granularity should skip segments that already have the configured segmentGranularity #11009
}
} else if (!config.getGranularitySpec().getSegmentGranularity().equals(existingSegmentGranularity)) {
  log.info(
      "Configured granularitySpec[%s] is different from the one[%s] of segments. Needs compaction",
nit: maybe you can just log the segment granularity instead of the granularity spec?
    .map(segment -> GranularityType.fromPeriod(segment.getInterval().toPeriod()).getDefaultGranularity())
    .collect(Collectors.toSet());
if (segmentGranularities.size() != 1 || !segmentGranularities.contains(config.getGranularitySpec().getSegmentGranularity())) {
  needsCompaction = true;
do we need a log statement here as well?
// We need to check if all segments have the same segment granularity and if it is the same
// as the configured segment granularity.
Set<Granularity> segmentGranularities = candidates.segments.stream()
    .map(segment -> GranularityType.fromPeriod(segment.getInterval().toPeriod()).getDefaultGranularity())
GranularityType.fromPeriod() can throw an exception if the segment interval is not one of the enums in GranularityType. I think we should probably compare the duration of the granularity as well as their timezone and origin.
Should we just catch the exception and set needsCompaction to true in that case?
An exception there would mean the segment's granularity cannot be the same as the configured segmentGranularity in the auto compaction spec: if it were the same, it would have to be resolvable to a Granularity.
Done. I think this should work
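For readers following this thread, the idea of catching the exception and treating it as "needs compaction" can be sketched roughly as below. This is a self-contained illustration, not the code in this PR: `SimpleGranularity`, `SegmentInterval`, and `needsCompaction` are hypothetical stand-ins built on `java.time`, and the assumption that an unresolvable period throws `IllegalArgumentException` is mine.

```java
import java.time.LocalDate;
import java.time.Period;
import java.util.List;

class GranularityResolveSketch {
    /** Hypothetical stand-in for an enum of supported granularities. */
    public enum SimpleGranularity {
        DAY(Period.ofDays(1)),
        MONTH(Period.ofMonths(1)),
        YEAR(Period.ofYears(1));

        private final Period period;

        SimpleGranularity(Period period) {
            this.period = period;
        }

        /** Assumed behavior: throws when no constant matches the given period. */
        public static SimpleGranularity fromPeriod(Period period) {
            for (SimpleGranularity g : values()) {
                if (g.period.equals(period)) {
                    return g;
                }
            }
            throw new IllegalArgumentException("Cannot convert [" + period + "] to a granularity");
        }
    }

    /** Hypothetical segment interval covering [start, endExclusive). */
    public static final class SegmentInterval {
        final LocalDate start;
        final LocalDate endExclusive;

        public SegmentInterval(LocalDate start, LocalDate endExclusive) {
            this.start = start;
            this.endExclusive = endExclusive;
        }

        Period toPeriod() {
            return Period.between(start, endExclusive);
        }
    }

    /** Returns true if any segment differs from, or cannot be resolved to, the configured granularity. */
    public static boolean needsCompaction(List<SegmentInterval> segments, SimpleGranularity configured) {
        try {
            for (SegmentInterval segment : segments) {
                if (SimpleGranularity.fromPeriod(segment.toPeriod()) != configured) {
                    return true;
                }
            }
            return false;
        } catch (IllegalArgumentException e) {
            // An interval that resolves to no granularity cannot equal the configured one.
            return true;
        }
    }

    public static void main(String[] args) {
        List<SegmentInterval> daily = List.of(
            new SegmentInterval(LocalDate.of(2021, 1, 1), LocalDate.of(2021, 1, 2)));
        List<SegmentInterval> odd = List.of(
            new SegmentInterval(LocalDate.of(2021, 1, 1), LocalDate.of(2021, 1, 8)));
        System.out.println(needsCompaction(daily, SimpleGranularity.DAY));
        System.out.println(needsCompaction(odd, SimpleGranularity.DAY));
    }
}
```

The catch block encodes the argument made above: a segment whose interval cannot be resolved cannot already have the configured granularity, so flagging it for compaction is the conservative choice.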
jihoonson
left a comment
LGTM, but please address my comment before merging this PR. The CI seems flaky, so I restarted the failed jobs.
// Candidate segments were all compacted without segment granularity set.
// We need to check if all segments have the same segment granularity as the configured segment granularity.
needsCompaction = candidates.segments.stream()
    .anyMatch(segment -> !config.getGranularitySpec().getSegmentGranularity().isAligned(segment.getInterval()));
Do you want to print a log in this case too?
Thanks for the review. I accidentally removed the logging message. Added it back in.
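As a rough illustration of what an alignment-based check like the one in this diff could mean, here is a self-contained sketch using `java.time` instead of Druid's classes. It assumes "aligned" means the interval is exactly one bucket of the granularity (here a UTC day); that reading, along with the `Interval` class and the `isAlignedToDay` helper, is an assumption for illustration rather than the library's definition.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.List;

class AlignmentSketch {
    /** Hypothetical interval covering [start, end). */
    public static final class Interval {
        final Instant start;
        final Instant end;

        public Interval(String start, String end) {
            this.start = Instant.parse(start);
            this.end = Instant.parse(end);
        }
    }

    /** True when the interval is exactly one UTC day bucket: starts at midnight and ends 24 hours later. */
    public static boolean isAlignedToDay(Interval interval) {
        Instant bucketStart = interval.start.truncatedTo(ChronoUnit.DAYS);
        Instant bucketEnd = bucketStart.plus(1, ChronoUnit.DAYS);
        return interval.start.equals(bucketStart) && interval.end.equals(bucketEnd);
    }

    /** Mirrors the anyMatch shape of the diff: one misaligned segment is enough to compact. */
    public static boolean needsCompaction(List<Interval> segments) {
        boolean needed = segments.stream().anyMatch(segment -> !isAlignedToDay(segment));
        if (needed) {
            // Logging the reason, as requested in the review.
            System.out.println("Segment intervals are not aligned with DAY granularity. Needs compaction");
        }
        return needed;
    }

    public static void main(String[] args) {
        List<Interval> daily = List.of(new Interval("2021-01-01T00:00:00Z", "2021-01-02T00:00:00Z"));
        List<Interval> hourly = List.of(new Interval("2021-01-01T00:00:00Z", "2021-01-01T01:00:00Z"));
        System.out.println(needsCompaction(daily));
        System.out.println(needsCompaction(hourly));
    }
}
```

Unlike resolving the interval to an enum constant, an alignment check of this kind does not throw for unusual interval lengths; they simply fail the comparison.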
Description
Auto-compaction now supports setting SegmentGranularity. However, it currently submits compaction tasks unnecessarily for segments that were already compacted (lastCompactionState != null) but have lastCompactionState.getGranularitySpec() == null (i.e. they were compacted when the auto compaction spec did not set SegmentGranularity), even when the segments' SegmentGranularity is already the same as the SegmentGranularity configured in the auto compaction spec. This change makes auto compaction check the existing segments' SegmentGranularity to determine whether compaction is needed, even when SegmentGranularity is not stored in lastCompactionState.
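The decision described above can be summarized in a small sketch. It is a simplified model that only considers granularity: the `Segment` fields, the `Gran` enum, and the rule that a never-compacted segment always needs compaction are illustrative assumptions, not the actual classes or the complete logic in this PR.

```java
class CompactionDecisionSketch {
    /** Hypothetical granularity values. */
    public enum Gran { HOUR, DAY, MONTH }

    /** Hypothetical view of a segment, reduced to what the granularity decision needs. */
    public static final class Segment {
        final boolean compacted;        // models lastCompactionState != null
        final Gran storedGranularity;   // null models getGranularitySpec() == null
        final Gran intervalGranularity; // granularity implied by the segment's interval

        public Segment(boolean compacted, Gran storedGranularity, Gran intervalGranularity) {
            this.compacted = compacted;
            this.storedGranularity = storedGranularity;
            this.intervalGranularity = intervalGranularity;
        }
    }

    public static boolean needsCompaction(Segment segment, Gran configured) {
        if (!segment.compacted) {
            return true; // assumption: never-compacted segments are always candidates
        }
        if (configured == null) {
            return false; // no granularity configured, so no granularity-based reason
        }
        if (segment.storedGranularity != null) {
            return segment.storedGranularity != configured;
        }
        // Nothing stored at compaction time: fall back to the segment's own interval.
        return segment.intervalGranularity != configured;
    }

    public static void main(String[] args) {
        Segment alreadyDaily = new Segment(true, null, Gran.DAY);
        Segment stillHourly = new Segment(true, null, Gran.HOUR);
        System.out.println(needsCompaction(alreadyDaily, Gran.DAY));
        System.out.println(needsCompaction(stillHourly, Gran.DAY));
    }
}
```

The last branch is the case this PR targets: the segment was compacted without a stored granularity, so the interval itself decides whether another task is submitted.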
Note that, with this change, we actually never need to check SegmentGranularity in lastCompactionState to determine if segments need compaction or not. However, I still keep that check there for the following reasons:
This PR has: