Optimize coordinator API to retrieve segments with overshadowed status #7571

@surekhasaharan

The Coordinator API /druid/coordinator/v1/metadata/segments?includeOvershadowedStatus builds a VersionedIntervalTimeline, which can be memory-intensive and expensive when this API is called repeatedly by multiple brokers in a cluster.

The comment here suggests:

  • isOvershadowed becomes a non-final field of DataSegment object itself, not participating in equals() and hashCode().
  • Add interface SegmentsAccess { ImmutableDruidDataSource prepare(String dataSource); Iterable<DataSegment> iterateAll(); } (strawman naming)
  • Add DataSourceAccess computeOvershadowed() method to SQLSegmentMetadataManager, which performs this computation for every snapshot of SQLSegmentMetadataManager.dataSources (which is updated in poll()) at most once, lazily.
  • Both endpoints in MetadataResource and Coordination balancing logic (which currently computes isOvershadowed status on its own, too) use this API.
  • On the side of MetadataSegmentView, maintain something like a Map<DataSegment, DataSegment> and update overshadowed status like map.get(segmentFromCoordinator).setOvershadowed(segmentFromCoordinator.isOvershadowed())
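The suggestions above can be sketched roughly as follows. This is only an illustration, not Druid code: Segment, SegmentsSnapshot and BrokerView are hypothetical stand-ins for DataSegment, a polled snapshot of SQLSegmentMetadataManager.dataSources, and MetadataSegmentView. The overshadowing rule is deliberately simplified to "same interval string, lower version", where the real logic would use VersionedIntervalTimeline.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for DataSegment: the overshadowed flag is mutable
// and deliberately excluded from equals() and hashCode().
final class Segment {
    final String interval;
    final int version;
    private volatile boolean overshadowed;

    Segment(String interval, int version) {
        this.interval = interval;
        this.version = version;
    }

    boolean isOvershadowed() { return overshadowed; }
    void setOvershadowed(boolean value) { overshadowed = value; }

    @Override public boolean equals(Object o) {
        if (!(o instanceof Segment)) return false;
        Segment other = (Segment) o;
        return version == other.version && interval.equals(other.interval);
    }

    @Override public int hashCode() { return Objects.hash(interval, version); }
}

// Hypothetical snapshot produced by one poll(): overshadowed status is
// computed lazily, and at most once, no matter how many callers ask for it.
final class SegmentsSnapshot {
    private final List<Segment> segments;
    private boolean computed;
    private int computations; // exposed only to demonstrate the "at most once" property

    SegmentsSnapshot(List<Segment> segments) {
        this.segments = Collections.unmodifiableList(new ArrayList<>(segments));
    }

    synchronized List<Segment> segmentsWithOvershadowedStatus() {
        if (!computed) {
            computations++;
            // Simplifying assumption: a segment is overshadowed if another
            // segment with the identical interval has a higher version.
            Map<String, Integer> newest = new HashMap<>();
            for (Segment s : segments) newest.merge(s.interval, s.version, Math::max);
            for (Segment s : segments) s.setOvershadowed(s.version < newest.get(s.interval));
            computed = true;
        }
        return segments;
    }

    synchronized int computations() { return computations; }
}

// Hypothetical broker-side view: a Map<Segment, Segment> lets the broker find
// its own cached instance and copy the status reported by the coordinator.
final class BrokerView {
    private final Map<Segment, Segment> cached = new HashMap<>();

    void update(Segment fromCoordinator) {
        Segment existing = cached.putIfAbsent(fromCoordinator, fromCoordinator);
        if (existing != null) {
            existing.setOvershadowed(fromCoordinator.isOvershadowed());
        }
    }

    Segment get(Segment key) { return cached.get(key); }
}
```

With this shape, repeated calls from several brokers against the same snapshot reuse one computation, and a new snapshot created by the next poll would start with a fresh, not yet computed state. Because the flag is outside equals() and hashCode(), changing it never invalidates the segment's position in a hash-based map.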
