Support parallel replacements for pods from the same rack #873

@adejanovski

What is missing?

If an entire availability zone (AZ) gets disconnected, we want the ability to replace all the nodes of the affected rack concurrently.
The only option I can think of right now is to allow ReplaceNode CassandraTasks to run concurrently and re-bootstrap multiple nodes in parallel (that is, allow multiple jobs in a single task).
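
To make the idea concrete, here is a minimal sketch of "multiple replacement jobs in a single task" for pods of one rack. It is an illustration under assumptions, not the operator's implementation: `replace_rack_concurrently`, `replace_node`, and `max_parallel` are hypothetical names, and the actual replacement call is passed in by the caller.

```python
from concurrent.futures import ThreadPoolExecutor


def replace_rack_concurrently(pods, replace_node, max_parallel=None):
    """Run replace_node(pod_name) for every pod of a single rack in parallel.

    pods: list of (pod_name, rack_name) tuples (hypothetical input shape).
    replace_node: caller-supplied callable that performs one replacement.
    max_parallel: optional cap on concurrent replacements (assumption).

    Returns a dict mapping pod_name to True on success, or to the
    exception raised by that pod's replacement.
    """
    if not pods:
        return {}

    # Safety check: only pods from the same rack may be replaced together.
    racks = {rack for _, rack in pods}
    if len(racks) > 1:
        raise ValueError(f"pods span multiple racks: {sorted(racks)}")

    workers = max_parallel or len(pods)
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # One job per pod, all submitted up front so they run concurrently.
        futures = {name: pool.submit(replace_node, name) for name, _ in pods}
        for name, future in futures.items():
            try:
                future.result()
                results[name] = True
            except Exception as exc:
                # A failed replacement does not cancel the other jobs.
                results[name] = exc
    return results
```

The single-rack check reflects the scope of this request: parallelism is limited to pods from the same rack, so replicas placed in other racks stay untouched while the replacements run.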

The starting sequence needs to be reviewed so that we can have a "fast startup path" for replacements, like we have for restarts.

We need to verify that Cassandra allows multiple concurrent replacements, and whether that requires additional JVM options to be set (and, if so, whether they reduce the overall safety of the system).

Why is this needed?

With very high densities, running repair can be more challenging than replacing nodes. However, CassandraTasks currently allow replacing only a single node at a time, which might not be efficient enough.

Labels: enhancement (New feature or request)
