[Bug] Lots of version incomplete replica created when decommission BE #4147

@morningman

Describe the bug
When decommissioning a BE, some tablets may end up with many replicas generated by clone tasks.

Why

An example of how this happens:

  1. Tablet X has 3 replicas, on BEs A, B, and C.
  2. C is being decommissioned, so we choose Backend D as the destination for the relocated replica.
  3. After relocating, Tablet X has 4 replicas: A, B, C (decommissioned), and D (possibly version incomplete).
  4. D may be version incomplete because the clone task ran for a long time and a new version was published in the meantime.
  5. At the next tablet check, Tablet X's status is still REPLICA_RELOCATING.
    If D is not chosen as the dest BE for the new relocation, a new backend E is chosen
    to store the new replica. This repeats, so the number of replicas keeps growing.

So a better solution is to select D as the dest BE again for the clone task. This may trigger an incremental clone task,
which can finish faster.
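
To make the difference concrete, here is a minimal, self-contained sketch of the two destination-selection policies. It is not the actual Doris FE code: the `Replica` class, `chooseDest`, and `simulate` are hypothetical names used only for illustration, and the simulation assumes the worst case from the example, where every clone finishes version incomplete.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class RelocationDestDemo {
    // Hypothetical, simplified replica record; not the real Doris FE class.
    static final class Replica {
        final String backend;
        final boolean decommissioned;
        final boolean versionComplete;

        Replica(String backend, boolean decommissioned, boolean versionComplete) {
            this.backend = backend;
            this.decommissioned = decommissioned;
            this.versionComplete = versionComplete;
        }
    }

    // With reuseIncomplete set, prefer a backend that already holds a
    // version-incomplete replica; otherwise fall back to a fresh backend.
    static String chooseDest(List<Replica> replicas, String freshBackend, boolean reuseIncomplete) {
        if (reuseIncomplete) {
            Optional<Replica> incomplete = replicas.stream()
                    .filter(r -> !r.decommissioned && !r.versionComplete)
                    .findFirst();
            if (incomplete.isPresent()) {
                return incomplete.get().backend;
            }
        }
        return freshBackend;
    }

    // Runs `rounds` tablet checks in which the tablet stays in REPLICA_RELOCATING
    // and returns the number of replicas the tablet ends up with.
    static int simulate(int rounds, boolean reuseIncomplete) {
        List<Replica> replicas = new ArrayList<>();
        replicas.add(new Replica("A", false, true));
        replicas.add(new Replica("B", false, true));
        replicas.add(new Replica("C", true, true)); // being decommissioned
        char next = 'D';
        for (int i = 0; i < rounds; i++) {
            String dest = chooseDest(replicas, String.valueOf(next), reuseIncomplete);
            boolean exists = replicas.stream().anyMatch(r -> r.backend.equals(dest));
            if (!exists) {
                // A new replica is created on a backend that had none before.
                replicas.add(new Replica(dest, false, false));
                next++;
            }
        }
        return replicas.size();
    }

    public static void main(String[] args) {
        System.out.println("always fresh backend: " + simulate(3, false));
        System.out.println("reuse incomplete dest: " + simulate(3, true));
    }
}
```

In this model, always picking a fresh backend adds one replica per check round, while reusing D keeps the count at 4 regardless of how many rounds run.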

Labels

area/balance: Issues or PRs related to data balance
kind/fix: Categorizes issue or PR as related to a bug.
