Stateful auto compaction #8489

@jihoonson

Description

Motivation

In auto compaction, the coordinator searches for segments to compact based on their byte size. This search is currently stateless: on each coordinator run, it traverses all segments of all datasources from the newest to the oldest, compares each segment's size against targetCompactionSizeBytes (this check is currently missing, which causes #8481), and issues a compaction task whenever it finds segments smaller than targetCompactionSizeBytes.

However, comparing the segment size against targetCompactionSizeBytes alone is not enough to tell whether a given segment requires further compaction, because the segment could have been created with any type of partitionsSpec and later compacted with another. (As of now, auto compaction supports only maxRowsPerSegment, maxTotalRows, and targetCompactionSizeBytes, but it should support all partitionsSpec types in the future.)

As of now, there are three partitionsSpec types: DynamicPartitionsSpec, HashedPartitionsSpec, and SingleDimensionPartitionsSpec.

  • DynamicPartitionsSpec has maxRowsPerSegment, and maxTotalRows.
  • HashedPartitionsSpec has targetPartitionSize (target number of rows per segment), numShards, and partitionDimensions.
  • SingleDimensionPartitionsSpec has targetPartitionSize (target number of rows per segment), maxPartitionSize (max number of rows per segment), and partitionDimensions.

In the coordinator, most of these configurations are hard to use for finding segments that need compaction, because the number of rows in each segment is not available there. Even if it were, hash- or range-partitioned segments could have row counts far from targetPartitionSize, which makes it hard to tell whether a given segment needs compaction. Note that a segment does not need further compaction if it was already compacted with the same partitionsSpec. Auto compaction based on parallel indexing complicates things further: in parallel indexing with DynamicPartitionsSpec, the last segment created by each task can have fewer rows than maxRowsPerSegment.

Proposed changes

To address this issue, I propose storing the state of auto compaction in the metadata store, so that auto compaction can limit its search for compaction candidates to segments that have not been compacted yet.

DataSegment change

A new compactionPartitionsSpec field will be added to DataSegment. compactionPartitionsSpec will be filled in only in the coordinator.

public class DataSegment
{
  private final Integer binaryVersion;
  private final SegmentId id;
  @Nullable
  private final Map<String, Object> loadSpec;
  private final List<String> dimensions;
  private final List<String> metrics;
  private final ShardSpec shardSpec;
  // New field: the partitionsSpec this segment was last compacted with.
  // Null for segments created by non-compaction tasks; filled in only in the coordinator.
  @Nullable
  private final PartitionsSpec compactionPartitionsSpec;
  private final long size;
}

Changes in publishing segments

When a compaction task creates segments, it adds its partitionsSpec to them (via Appenderator.add()); other task types do not add it.
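The tagging rule above can be sketched as follows. This is a hypothetical stand-in, not Druid's actual API: PartitionsSpec and SegmentMetadata here are minimal placeholder classes, and the real change would live in the compaction task's publishing path through Appenderator.add().

```java
// Hypothetical sketch: only a compaction task stamps the partitionsSpec it
// ran with onto the segments it publishes; other task types leave it null.
public class CompactionTagging
{
  // Minimal stand-in for Druid's PartitionsSpec.
  public static class PartitionsSpec
  {
    final Integer maxRowsPerSegment;
    final Long maxTotalRows;

    public PartitionsSpec(Integer maxRowsPerSegment, Long maxTotalRows)
    {
      this.maxRowsPerSegment = maxRowsPerSegment;
      this.maxTotalRows = maxTotalRows;
    }
  }

  // Minimal stand-in for the segment metadata written at publish time.
  public static class SegmentMetadata
  {
    final String segmentId;
    final PartitionsSpec compactionPartitionsSpec; // null for non-compaction tasks

    SegmentMetadata(String segmentId, PartitionsSpec compactionPartitionsSpec)
    {
      this.segmentId = segmentId;
      this.compactionPartitionsSpec = compactionPartitionsSpec;
    }
  }

  public static SegmentMetadata publish(String segmentId, boolean fromCompactionTask, PartitionsSpec taskSpec)
  {
    // Only compaction tasks record the spec they were run with.
    return new SegmentMetadata(segmentId, fromCompactionTask ? taskSpec : null);
  }
}
```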

Changes in the coordinator

SQLMetadataSegmentManager loads DataSegments and keeps them in memory, same as now. Since there is usually little variety in partitionsSpec values across segments, interning them would help reduce memory usage.
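A sketch of the interning idea, using a hypothetical generic SpecInterner (Guava's Interners.newWeakInterner() would serve the same purpose). It assumes PartitionsSpec implements value-based equals()/hashCode(), so that the many segments sharing the same spec end up referencing a single object.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: deduplicate equal spec instances while loading
// segments, so all segments with the same spec share one object.
public class SpecInterner<T>
{
  private final Map<T, T> pool = new ConcurrentHashMap<>();

  public T intern(T spec)
  {
    // putIfAbsent returns the previously interned instance, if any.
    T existing = pool.putIfAbsent(spec, spec);
    return existing == null ? spec : existing;
  }
}
```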

DruidCoordinatorSegmentCompactor checks whether compactionPartitionsSpec is set for a given segment. If it is missing, the segment must be a new segment created by a non-compaction task, so it becomes a compaction candidate. If it is present, the coordinator compares it with the partitionsSpec in the auto compaction configuration; the segment becomes a compaction candidate if the two differ.
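The check described above can be sketched as a single predicate. The PartitionsSpec class below is a hypothetical stand-in with value equality, not Druid's real class hierarchy:

```java
import java.util.Objects;

// Hypothetical sketch of the coordinator-side candidate check: a segment
// needs compaction if it was never compacted (null spec), or was compacted
// with a spec different from the current auto compaction configuration.
public class CompactionCandidateCheck
{
  // Minimal stand-in for Druid's PartitionsSpec with value equality.
  public static class PartitionsSpec
  {
    final Integer maxRowsPerSegment;
    final Long maxTotalRows;

    public PartitionsSpec(Integer maxRowsPerSegment, Long maxTotalRows)
    {
      this.maxRowsPerSegment = maxRowsPerSegment;
      this.maxTotalRows = maxTotalRows;
    }

    @Override
    public boolean equals(Object o)
    {
      if (!(o instanceof PartitionsSpec)) {
        return false;
      }
      PartitionsSpec that = (PartitionsSpec) o;
      return Objects.equals(maxRowsPerSegment, that.maxRowsPerSegment)
             && Objects.equals(maxTotalRows, that.maxTotalRows);
    }

    @Override
    public int hashCode()
    {
      return Objects.hash(maxRowsPerSegment, maxTotalRows);
    }
  }

  public static boolean needsCompaction(PartitionsSpec segmentSpec, PartitionsSpec configuredSpec)
  {
    return segmentSpec == null || !segmentSpec.equals(configuredSpec);
  }
}
```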

Rationale

Dropped alternative

The coordinator could get the number of rows of each segment from the system schema to find compaction candidates. However, as mentioned in the motivation section, the number of rows is not enough to determine whether a given segment needs compaction.

Operational impact

There should be no operational impact.

Test plan

  • Will add unit tests
  • Will test in our internal cluster

Future work

In DataSegment, similar to compactionPartitionsSpec, there are a couple of fields which are loaded only in particular places and are null or some default value elsewhere. Even when null or default, each of these fields still costs 8 more bytes per segment. It would probably be worth splitting DataSegment into several classes, so that only the necessary fields are loaded and memory usage is reduced.
