Coordinator Election -- Dual Coordinator Election After ZK Crash #17781

@razinbouzar

Affected Version

28.0.1

Description

The issue occurs as follows:

  • ZK fails, and a new leader is elected. coordinator-0 was the leader at the time of the ZK failure.
  • After the failure, coordinator-0 loses leader status (observed in the log line "I am no longer the leader.").
  • coordinator-1 becomes leader (observed in the log line "I am the leader of the coordinators, all must bow!").
  • Seemingly simultaneously, a third coordinator, coordinator-2, thinks it is the leader and begins killing tasks started by coordinator-1, and vice versa. No log lines indicate that coordinator-2 was elected leader, as was seen in an earlier issue.
  • Killing coordinator-2 restores cluster health, and tasks resume normally.
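
For anyone trying to confirm the same pattern in their own logs, here is a rough sketch of how the sequence above could be checked. It relies only on the two log phrases quoted in this report. The function names, the (coordinator, line) input shape, and the sample lines are made up for illustration and are not part of any Druid tooling; the set of coordinators "acting as leader" has to be determined by hand, for example from which nodes are killing tasks.

```python
from typing import Iterable, Set, Tuple

# Log phrases quoted in the report above.
GAINED = "I am the leader of the coordinators"
LOST = "I am no longer the leader"


def leaders_from_logs(lines: Iterable[Tuple[str, str]]) -> Set[str]:
    """Return coordinators whose most recent leadership log line is a 'gained' line.

    `lines` is a time-ordered iterable of (coordinator_name, log_line) pairs.
    This input shape is an assumption made for the sketch.
    """
    leaders: Set[str] = set()
    for name, line in lines:
        if GAINED in line:
            leaders.add(name)
        elif LOST in line:
            leaders.discard(name)
    return leaders


def unannounced_actors(lines: Iterable[Tuple[str, str]], acting: Iterable[str]) -> Set[str]:
    """Return coordinators observed acting as leader without a logged election."""
    return set(acting) - leaders_from_logs(lines)


# Hypothetical log lines mirroring the sequence described in this issue.
sample = [
    ("coordinator-0", "I am the leader of the coordinators, all must bow!"),
    ("coordinator-0", "I am no longer the leader."),
    ("coordinator-1", "I am the leader of the coordinators, all must bow!"),
]

# Coordinators seen killing tasks, identified manually (hypothetical input).
acting = {"coordinator-1", "coordinator-2"}

logged_leaders = leaders_from_logs(sample)
suspects = unannounced_actors(sample, acting)
```

With the sample above, `suspects` contains only coordinator-2, which matches the behavior reported here: it acts as leader while its log shows no election line.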
