
Segment publishing order should be preserved in kafka indexing service #6001

@jihoonson

In the Kafka indexing service, the overlord performs a sanity check when segments are published: the start offsets of the partitions in the segments being published must match the offsets stored in the metastore. This guarantees that all segments are published in order. Because of this check, some tasks might fail in the following scenario.

  1. The supervisor created a task (T1) with a start offset O1.
  2. For some reason, the supervisor couldn't send an endOffset O2 to T1 within taskDuration. Instead, it sent an endOffset O3 to T1 after taskDuration * 10. (In our case, the supervisor couldn't send it because of frequent HTTP connection refused errors.)
  3. T1 started to merge, push, and publish segments.
  4. The supervisor created a new task, T2, with a start offset O3.
  5. After taskDuration, it sent an endOffset O4 to T2.
  6. T2 started to merge, push, and publish segments.
  7. Since T1 had run for a much longer time, it had many more segments to publish than T2. As a result, T2 tried to publish before T1 finished publishing.
  8. T2 failed to publish because of the sanity check when updating the metastore: T1 had not yet committed O3, so the stored offsets did not match T2's start offset.
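
The failure in steps 7 and 8 can be reproduced with a small toy model. This is only a sketch under stated assumptions: `MetadataStore` and its `publish` method are hypothetical stand-ins for the overlord's check, not Druid's actual classes, and the offset values are made up for illustration.

```python
class MetadataStore:
    """Hypothetical stand-in for the committed Kafka offsets in the metastore."""

    def __init__(self, committed_offsets):
        # Maps partition id -> last committed offset.
        self.committed = dict(committed_offsets)

    def publish(self, task_id, start_offsets, end_offsets):
        # Sanity check: the task's start offsets must equal the committed ones.
        if start_offsets != self.committed:
            return False
        # The check passed, so the end offsets become the new committed state.
        self.committed = dict(end_offsets)
        return True


# Illustrative offsets for one partition: O1 = 100, O3 = 1100, O4 = 1200.
store = MetadataStore({0: 100})
t1 = ("T1", {0: 100}, {0: 1100})    # T1 reads O1 -> O3
t2 = ("T2", {0: 1100}, {0: 1200})   # T2 reads O3 -> O4

t2_first = store.publish(*t2)   # T2 publishes before T1: check fails
t1_then = store.publish(*t1)    # T1 publishes: check passes
t2_retry = store.publish(*t2)   # T2 would succeed if it published after T1
```

In this model T2's first attempt is rejected only because of timing; the same publish succeeds once T1 has committed O3.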

So, I think the supervisor should guarantee the segment publishing order across all running tasks, as shown below.

T1: indexing ===> publishing ===> handoff
              T2: indexing ===> publishing ===> handoff
                            T3: indexing ===> publishing ===> handoff
                                          ...
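
One way to picture the proposed ordering is a queue that lets a task publish only when every task created before it has already published. The sketch below is an assumption about how such a guarantee could look, not the supervisor's actual design; `PublishSequencer`, `register`, and `try_publish` are hypothetical names.

```python
from collections import deque


class PublishSequencer:
    """Hypothetical helper: lets tasks publish strictly in creation order."""

    def __init__(self):
        self._order = deque()   # task ids in the order the supervisor created them
        self.published = []     # task ids in the order they actually published

    def register(self, task_id):
        # Called when the supervisor creates a task.
        self._order.append(task_id)

    def try_publish(self, task_id):
        # Only the oldest unpublished task may publish; others must wait and retry.
        if not self._order or self._order[0] != task_id:
            return False
        self._order.popleft()
        self.published.append(task_id)
        return True


seq = PublishSequencer()
for task in ("T1", "T2", "T3"):
    seq.register(task)

attempts = [
    seq.try_publish("T2"),  # too early: T1 has not published yet
    seq.try_publish("T1"),
    seq.try_publish("T2"),
    seq.try_publish("T3"),
]
```

A real implementation would also need to handle failed or killed tasks so that one stuck task does not block every later publish; that is left out of this sketch.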
