I've been working on how to operate Autopilot Pattern apps across multiple data centers (geographically distinct data centers connected over a WAN). In Consul, that led to a data center naming question (autopilotpattern/consul#23), among others.
As I explore how to do this in MySQL (using autopilotpattern/wordpress#27 as the scenario), I'm trying to determine the importance of data center awareness. On the one hand, it's important to have a solid strategy for recovering from complete data center failures. On the other, the risk of split-brain scenarios grows dramatically over a WAN.
For the purpose of this question and the scenario in autopilotpattern/wordpress#27, let's assume a standard master-replica replication topology (not multi-master, not sharded).
From a data center that's remote from the primary, how can we distinguish between a failure of the primary, a failure of the entire data center the primary is in, and a network partition between the two data centers?
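
To make the question concrete, here is a minimal sketch of the decision a replica in the remote data center would need to make. Everything in it is an assumption for illustration: the `diagnose` function, the `Diagnosis` names, and the three input signals are hypothetical and are not part of Consul, ContainerPilot, or MySQL. In particular, the optional third signal assumes some witness outside both data centers, which this scenario does not necessarily have.

```python
from enum import Enum
from typing import Optional


class Diagnosis(Enum):
    HEALTHY = "primary reachable"
    PRIMARY_FAILED = "primary down, its data center still reachable"
    DC_FAILED = "primary's data center appears down"
    PARTITIONED = "network partition between the data centers"
    AMBIGUOUS = "cannot tell data center failure from partition"


def diagnose(
    primary_reachable: bool,
    primary_dc_peers_reachable: bool,
    witness_sees_primary_dc: Optional[bool] = None,
) -> Diagnosis:
    """Classify what the remote data center is observing.

    All three inputs are hypothetical health signals:
    - primary_reachable: we can reach the MySQL primary over the WAN
    - primary_dc_peers_reachable: we can reach other nodes in the primary's DC
    - witness_sees_primary_dc: what a third-site witness reports,
      or None when no witness exists or it did not answer
    """
    if primary_reachable:
        return Diagnosis.HEALTHY
    if primary_dc_peers_reachable:
        # The WAN link works, so only the primary itself looks unhealthy.
        return Diagnosis.PRIMARY_FAILED
    if witness_sees_primary_dc is None:
        # Two vantage points are not enough: from here, a dead data center
        # and a severed WAN link look exactly the same.
        return Diagnosis.AMBIGUOUS
    if witness_sees_primary_dc:
        return Diagnosis.PARTITIONED
    return Diagnosis.DC_FAILED


if __name__ == "__main__":
    print(diagnose(False, True))
    print(diagnose(False, False))
    print(diagnose(False, False, witness_sees_primary_dc=True))
```

The `AMBIGUOUS` branch is the crux of the question: without a third vantage point, promoting a replica in the remote data center risks split brain if the real cause is a partition. Even with a witness, the classification is only as trustworthy as the witness's own connectivity.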