StackExchange.Redis GetStatus() returns multiple masters on failover in Sentinel cluster

Hi. I didn't see a similar question here or anywhere.
I was trying to see why it took a long time ( ~ 30s ) for an application to start working with Redis successfully again after a failover is triggered (debug sleep 30). Sentinel and Redis instances show the correct replication info (a new master) within 5-6 seconds (as expected with failover timeout set to 5s).
However, I created a simple console application which reads a string from Redis in a loop just to see what's going on and this confuses me and I'd be grateful if someone explained it to me.
Setup is 3 Redis instances and 5 Sentinel instances. I have changed IPs and hostnames so that security from my company doesn't haunt me.

// Before failover
```
13:40:44.5727: 10.100.100.100:6379: Standalone v6.2.5, master; 16 databases; keep-alive: 00:01:00; int: ConnectedEstablished; sub: n/a
13:40:44.5727: 10.100.100.100:6379: int ops=57, qu=0, qs=0, qc=0, wr=0, socks=1; sub ops=0, qu=0, qs=0, qc=0, wr=0
13:40:44.5727: 10.100.100.100:6379: Circular op-count snapshot; int: 0+19+6+7+5+6+7+6+1=57 (5,70 ops/s; spans 10s); sub:
13:40:44.5727: 10.100.100.300:6379: Standalone v6.2.5, replica; 16 databases; keep-alive: 00:01:00; int: ConnectedEstablished; sub: n/a
13:40:44.5727: 10.100.100.300:6379: int ops=10, qu=0, qs=0, qc=0, wr=0, socks=1; sub ops=0, qu=0, qs=0, qc=0, wr=0
13:40:44.5728: 10.100.100.300:6379: Circular op-count snapshot; int: 0+10=10 (1,00 ops/s; spans 10s); sub:
13:40:44.5728: 10.100.100.200:6379: Standalone v6.2.5, replica; 16 databases; keep-alive: 00:01:00; int: ConnectedEstablished; sub: n/a
13:40:44.5728: 10.100.100.200:6379: int ops=10, qu=0, qs=0, qc=0, wr=0, socks=1; sub ops=0, qu=0, qs=0, qc=0, wr=0
13:40:44.5728: 10.100.100.200:6379: Circular op-count snapshot; int: 0+10=10 (1,00 ops/s; spans 10s); sub:
13:40:44.5728: Sync timeouts: 0; async timeouts: 0; fire and forget: 0; last heartbeat: 0s ago
```
StackExchange.Redis is conected to three Redis instances and the first one is master. All good.

// Failover starts happening
```
SocketClosed (ReadEndOfStream, last-recv: 0) on 10.100.100.200:6379/Interactive, Idle/MarkProcessed, last: GET, origin: ReadFromPipe, outstanding: 0, last-read: 0s ago, last-write: 13s ago, keep-alive: 60s, state: ConnectedEstablished, mgr: 9 of 10 available, in: 0, in-pipe: 0, out-pipe: 0, last-heartbeat: 0s ago, last-mbeat: 0s ago, global: 0s ago, v: 2.2.88.56325, SocketClosed

Timeout awaiting response (outbound=0KiB, inbound=0KiB, 5797ms elapsed, timeout is 5000ms), command=GET, next: GET RedisFailoverTest.TestKey, inst: 0, qu: 0, qs: 2, aw: False, rs: ReadAsync, ws: Idle, in: 0, in-pipe: 0, out-pipe: 0, serverEndpoint: 10.100.100.100:6379, mc: 1/1/0, mgr: 10 of 10 available, clientName: LocalComputerHostname, IOCP: (Busy=0,Free=1000,Min=12,Max=1000), WORKER: (Busy=3,Free=32764,Min=12,Max=32767), v: 2.2.88.56325 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)

RedisConnectionRestored: 10.100.100.200:6379, , None

SocketClosed (ReadEndOfStream, last-recv: 0) on 10.100.100.300:6379/Interactive, Idle/MarkProcessed, last: PING, origin: ReadFromPipe, outstanding: 0, last-read: 0s ago, last-write: 0s ago, keep-alive: 60s, state: ConnectedEstablished, mgr: 9 of 10 available, in: 0, in-pipe: 0, out-pipe: 0, last-heartbeat: 0s ago, last-mbeat: 0s ago, global: 0s ago, v: 2.2.88.56325, SocketClosed

RedisConnectionRestored: 10.100.100.300:6379, , None

Timeout awaiting response (outbound=0KiB, inbound=0KiB, 5813ms elapsed, timeout is 5000ms), command=GET, next: GET RedisFailoverTest.TestKey, inst: 0, qu: 0, qs: 4, aw: False, rs: ReadAsync, ws: Idle, in: 0, in-pipe: 0, out-pipe: 0, serverEndpoint: 10.100.100.100:6379, mc: 1/1/0, mgr: 10 of 10 available, clientName: LocalComputerHostname, IOCP: (Busy=0,Free=1000,Min=12,Max=1000), WORKER: (Busy=3,Free=32764,Min=12,Max=32767), v: 2.2.88.56325 (Please take a look at this article for some common client-side issues that can cause timeouts: https://stackexchange.github.io/StackExchange.Redis/Timeouts)
```
This is pretty much what one expects, StackExchange.Redis reconnects to live instances and the commands that were in the queue fail with timeout. Still all good.

// After failover
```
13:41:02.7888: 10.100.100.100:6379: Standalone v6.2.5, master; 16 databases; keep-alive: 00:01:00; int: ConnectedEstablished; sub: n/a
13:41:02.7888: 10.100.100.100:6379: int ops=62, qu=0, qs=5, qc=0, wr=0, socks=1; sub ops=0, qu=0, qs=0, qc=0, wr=0
13:41:02.7888: 10.100.100.100:6379: Circular op-count snapshot; int: 60+1+1=62 (0,20 ops/s; spans 10s); sub:
13:41:02.7888: 10.100.100.300:6379: Standalone v6.2.5, replica; 16 databases; keep-alive: 00:01:00; int: ConnectedEstablished; sub: n/a
13:41:02.7888: 10.100.100.300:6379: int ops=22, qu=0, qs=0, qc=0, wr=0, socks=2; sub ops=0, qu=0, qs=0, qc=0, wr=0
13:41:02.7888: 10.100.100.300:6379: Circular op-count snapshot; int: 21+1=22 (0,10 ops/s; spans 10s); sub:
13:41:02.7888: 10.100.100.200:6379: Standalone v6.2.5, master; 16 databases; keep-alive: 00:01:00; int: ConnectedEstablished; sub: n/a
13:41:02.7888: 10.100.100.200:6379: int ops=25, qu=0, qs=0, qc=0, wr=0, socks=2; sub ops=0, qu=0, qs=0, qc=0, wr=0
13:41:02.7888: 10.100.100.200:6379: Circular op-count snapshot; int: 20+1+2+2=25 (0,50 ops/s; spans 10s); sub:
13:41:02.7888: Sync timeouts: 0; async timeouts: 5; fire and forget: 0; last heartbeat: 0s ago
```
This puzzles me. After failover, **both the new master and the old master are marked as "master"** and only the one slave that remained a slave is marked as "replica". 
Is this expected or a bug? How does it work? Am I doing something wrong?

After ~20s command to Redis fails on the "wrong" master, it gets reconnected and from that point forward it's marked as replica. 

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

StackExchange.Redis GetStatus() returns multiple masters on failover in Sentinel cluster #1910

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

StackExchange.Redis GetStatus() returns multiple masters on failover in Sentinel cluster #1910

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions