redis-sentinel 模式观察

未分类 xiaogua 366℃ 0评论

分析发现,当从服务器如果死掉.再重新启动的时候.

redis-sentinel 启动时的事件消息

81:X 11 Jul 02:58:10.742 # Sentinel ID is d9cf89999123e4229b17fafaa4b92ec7e9bef4ff
81:X 11 Jul 02:58:10.742 # +monitor master mymaster 127.0.0.1 6382 quorum 2
81:X 11 Jul 02:58:10.743 * +slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6382
81:X 11 Jul 02:58:10.744 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6382
81:X 11 Jul 02:58:12.784 * +sentinel sentinel 3132498f7650a152b114aa1e8d95b13dd04c3349 127.0.0.1 26381 @ mymaster 127.0.0.1 6382
81:X 11 Jul 02:58:12.843 * +sentinel sentinel 51eaf1b9699de4435940afa4144fae28089fd76b 127.0.0.1 26380 @ mymaster 127.0.0.1 6382

主节点发现了这个从服务器. 主服务器日志

Accepted 127.0.0.1:42564
1:M 10 Jul 09:27:57.419 * Slave 127.0.0.1:6380 asks for synchronization
1:M 10 Jul 09:27:57.419 * Full resync requested by slave 127.0.0.1:6380
1:M 10 Jul 09:27:57.419 * Starting BGSAVE for SYNC with target: disk
1:M 10 Jul 09:27:57.423 * Background saving started by pid 130
1:M 10 Jul 09:27:58.896 – DB 0: 7 keys (0 volatile) in 8 slots HT.
1:M 10 Jul 09:27:58.896 – 6 clients connected (2 slaves), 324115440 bytes in use
130:C 10 Jul 09:27:59.995 * DB saved on disk
130:C 10 Jul 09:27:59.998 * RDB: 4 MB of memory used by copy-on-write
1:M 10 Jul 09:28:00.099 * Background saving terminated with success
1:M 10 Jul 09:28:00.218 * Synchronization with slave 127.0.0.1:6380 succeeded

从服务器器的日志信息

126:S 10 Jul 09:27:57.418 * DB loaded from append only file: 5.206 seconds
126:S 10 Jul 09:27:57.418 * The server is now ready to accept connections on port 6380
126:S 10 Jul 09:27:57.418 – DB 0: 7 keys (0 volatile) in 8 slots HT.
126:S 10 Jul 09:27:57.418 – 6 clients connected (0 slaves), 323075512 bytes in use
126:S 10 Jul 09:27:57.418 * Connecting to MASTER 127.0.0.1:6379
126:S 10 Jul 09:27:57.418 * MASTER <-> SLAVE sync started
126:S 10 Jul 09:27:57.418 * Non blocking connect for SYNC fired the event.
126:S 10 Jul 09:27:57.419 * Master replied to PING, replication can continue…
126:S 10 Jul 09:27:57.419 * Partial resynchronization not possible (no cached master)
126:S 10 Jul 09:27:57.424 * Full resync from master: 7da74582f33f93b2716543cacfcea5436c810860:733715
126:S 10 Jul 09:28:00.100 * MASTER <-> SLAVE sync: receiving 54444637 bytes from master
126:S 10 Jul 09:28:00.251 * MASTER <-> SLAVE sync: Flushing old data
126:S 10 Jul 09:28:02.273 * MASTER <-> SLAVE sync: Loading DB in memory
126:S 10 Jul 09:28:02.273 . Unrecognized RDB AUX field: ‘redis-ver’
126:S 10 Jul 09:28:02.274 . Unrecognized RDB AUX field: ‘redis-bits’
126:S 10 Jul 09:28:02.274 . Unrecognized RDB AUX field: ‘ctime’
126:S 10 Jul 09:28:02.274 . Unrecognized RDB AUX field: ‘used-mem’
126:S 10 Jul 09:28:04.790 * MASTER <-> SLAVE sync: Finished with success
126:S 10 Jul 09:28:04.812 * Background append only file rewriting started by pid 131
126:S 10 Jul 09:28:06.917 – DB 0: 7 keys (0 volatile) in 8 slots HT.
126:S 10 Jul 09:28:06.917 – 7 clients connected (0 slaves), 322981928 bytes in use
126:S 10 Jul 09:28:07.031 * AOF rewrite child asks to stop sending diffs.
131:C 10 Jul 09:28:07.031 * Parent agreed to stop sending diffs. Finalizing AOF…
131:C 10 Jul 09:28:07.031 * Concatenating 0.00 MB of AOF diff received from parent.
131:C 10 Jul 09:28:07.031 * SYNC append only file rewrite performed
131:C 10 Jul 09:28:07.037 * AOF rewrite: 0 MB of memory used by copy-on-write
126:S 10 Jul 09:28:07.117 * Background AOF rewrite terminated with success
126:S 10 Jul 09:28:07.117 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
126:S 10 Jul 09:28:07.118 * Background AOF rewrite finished successfully
126:S 10 Jul 09:28:07.118 – Background AOF rewrite signal handler took 299us

模拟主节点挂掉

kill -9 pid |date 查看关闭的时间

Tue Jul 11 03:12:58 UTC 2017

此时从服务器就一直在重试,

945440 bytes in use
57:S 11 Jul 03:13:07.430 * Connecting to MASTER 127.0.0.1:6382
57:S 11 Jul 03:13:07.430 * MASTER <-> SLAVE sync started
57:S 11 Jul 03:13:07.430 # Error condition on socket for SYNC: Connection refused
57:S 11 Jul 03:13:08.430 * Connecting to MASTER 127.0.0.1:6382
57:S 11 Jul 03:13:08.431 * MASTER <-> SLAVE sync started
57:S 11 Jul 03:13:08.431 # Error condition on socket for SYNC: Connection refused

观察端口 26382 的哨兵的日志

81:X 11 Jul 03:14:58.624 # +sdown master mymaster 127.0.0.1 6382
81:X 11 Jul 03:14:58.707 # +odown master mymaster 127.0.0.1 6382 #quorum 2/2
81:X 11 Jul 03:14:58.707 # +new-epoch 1
81:X 11 Jul 03:14:58.707 # +try-failover master mymaster 127.0.0.1 6382
81:X 11 Jul 03:14:58.708 # +vote-for-leader d9cf89999123e4229b17fafaa4b92ec7e9bef4ff 1
81:X 11 Jul 03:14:58.709 # 51eaf1b9699de4435940afa4144fae28089fd76b voted for d9cf89999123e4229b17fafaa4b92ec7e9bef4ff 1
81:X 11 Jul 03:14:58.709 # 3132498f7650a152b114aa1e8d95b13dd04c3349 voted for d9cf89999123e4229b17fafaa4b92ec7e9bef4ff 1
81:X 11 Jul 03:14:58.808 # +elected-leader master mymaster 127.0.0.1 6382
81:X 11 Jul 03:14:58.808 # +failover-state-select-slave master mymaster 127.0.0.1 6382
81:X 11 Jul 03:14:58.871 # +selected-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6382
81:X 11 Jul 03:14:58.871 * +failover-state-send-slaveof-noone slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6382
81:X 11 Jul 03:14:58.927 * +failover-state-wait-promotion slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6382
81:X 11 Jul 03:14:59.166 # +promoted-slave slave 127.0.0.1:6381 127.0.0.1 6381 @ mymaster 127.0.0.1 6382
81:X 11 Jul 03:14:59.166 # +failover-state-reconf-slaves master mymaster 127.0.0.1 6382
81:X 11 Jul 03:14:59.223 * +slave-reconf-sent slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6382
81:X 11 Jul 03:14:59.780 # -odown master mymaster 127.0.0.1 6382
81:X 11 Jul 03:15:00.194 * +slave-reconf-inprog slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6382
81:X 11 Jul 03:15:07.565 * +slave-reconf-done slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6382
81:X 11 Jul 03:15:07.649 # +failover-end master mymaster 127.0.0.1 6382
81:X 11 Jul 03:15:07.649 # +switch-master mymaster 127.0.0.1 6382 127.0.0.1 6381
81:X 11 Jul 03:15:07.649 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381
81:X 11 Jul 03:15:07.649 * +slave slave 127.0.0.1:6382 127.0.0.1 6382 @ mymaster 127.0.0.1 6381
81:X 11 Jul 03:17:07.666 # +sdown slave 127.0.0.1:6382 127.0.0.1 6382 @ mymaster 127.0.0.1 6381

哨兵 26381的日志信息

87:X 11 Jul 03:14:58.634 # +sdown master mymaster 127.0.0.1 6382
87:X 11 Jul 03:14:58.708 # +new-epoch 1
87:X 11 Jul 03:14:58.708 # +vote-for-leader d9cf89999123e4229b17fafaa4b92ec7e9bef4ff 1
87:X 11 Jul 03:14:58.710 # +odown master mymaster 127.0.0.1 6382 #quorum 3/2
87:X 11 Jul 03:14:58.710 # Next failover delay: I will not start a failover before Tue Jul 11 03:16:58 2017
87:X 11 Jul 03:14:59.223 # +config-update-from sentinel d9cf89999123e4229b17fafaa4b92ec7e9bef4ff 127.0.0.1 26382 @ mymaster 127.0.0.1 6382
87:X 11 Jul 03:14:59.223 # +switch-master mymaster 127.0.0.1 6382 127.0.0.1 6381
87:X 11 Jul 03:14:59.224 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381
87:X 11 Jul 03:14:59.224 * +slave slave 127.0.0.1:6382 127.0.0.1 6382 @ mymaster 127.0.0.1 6381
87:X 11 Jul 03:16:59.256 # +sdown slave 127.0.0.1:6382 127.0.0.1 6382 @ mymaster 127.0.0.1 6381

哨兵 26380的日志信息

85:X 11 Jul 03:14:58.582 # +sdown master mymaster 127.0.0.1 6382
85:X 11 Jul 03:14:58.708 # +new-epoch 1
85:X 11 Jul 03:14:58.708 # +vote-for-leader d9cf89999123e4229b17fafaa4b92ec7e9bef4ff 1
85:X 11 Jul 03:14:59.223 # +config-update-from sentinel d9cf89999123e4229b17fafaa4b92ec7e9bef4ff 127.0.0.1 26382 @ mymaster 127.0.0.1 6382
85:X 11 Jul 03:14:59.223 # +switch-master mymaster 127.0.0.1 6382 127.0.0.1 6381
85:X 11 Jul 03:14:59.223 * +slave slave 127.0.0.1:6380 127.0.0.1 6380 @ mymaster 127.0.0.1 6381
85:X 11 Jul 03:14:59.223 * +slave slave 127.0.0.1:6382 127.0.0.1 6382 @ mymaster 127.0.0.1 6381
85:X 11 Jul 03:16:59.231 # +sdown slave 127.0.0.1:6382 127.0.0.1 6382 @ mymaster 127.0.0.1 6381

 

通过上面的日志我们看到了一个精彩的竞选过程.最终由 26382当选领导权

 

与此同时 从服务器之间发生了一些变化,很幸运的是从服务6381获得了晋升资格.

57:M 11 Jul 03:14:58.927 * Discarding previously cached master state.
57:M 11 Jul 03:14:58.927 * MASTER MODE enabled (user request from ‘id=3 addr=127.0.0.1:40114 fd=6 name=sentinel-d9cf8999-cmd age=1008 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=0 qbuf-free=32768 obl=36 oll=0 omem=0 events=r cmd=exec’)
57:M 11 Jul 03:14:58.927 # CONFIG REWRITE executed with success.
57:M 11 Jul 03:14:58.934 – Accepted 127.0.0.1:42146
57:M 11 Jul 03:14:59.006 – Accepted 127.0.0.1:42148
57:M 11 Jul 03:14:59.224 – Client closed connection
57:M 11 Jul 03:14:59.224 – Reading from client: Connection reset by peer
57:M 11 Jul 03:14:59.224 – Client closed connection
57:M 11 Jul 03:14:59.224 – Client closed connection
57:M 11 Jul 03:14:59.641 – Accepted 127.0.0.1:42168
57:M 11 Jul 03:14:59.641 – Accepted 127.0.0.1:42170
57:M 11 Jul 03:14:59.847 – Accepted 127.0.0.1:42172
57:M 11 Jul 03:14:59.848 – Accepted 127.0.0.1:42174
57:M 11 Jul 03:15:00.075 – Accepted 127.0.0.1:42180
57:M 11 Jul 03:15:00.076 * Slave 127.0.0.1:6380 asks for synchronization
57:M 11 Jul 03:15:00.076 * Full resync requested by slave 127.0.0.1:6380
57:M 11 Jul 03:15:00.076 * Starting BGSAVE for SYNC with target: disk
57:M 11 Jul 03:15:00.083 * Background saving started by pid 154
154:C 11 Jul 03:15:01.973 * DB saved on disk
154:C 11 Jul 03:15:01.978 * RDB: 2 MB of memory used by copy-on-write
57:M 11 Jul 03:15:02.031 * Background saving terminated with success
57:M 11 Jul 03:15:02.127 * Synchronization with slave 127.0.0.1:6380 succeeded
57:M 11 Jul 03:15:02.632 – DB 0: 10 keys (0 volatile) in 16 slots HT.
57:M 11 Jul 03:15:02.632 – 6 clients connected (1 slaves), 323987600 bytes in use
57:M 11 Jul 03:15:07.640 – DB 0: 10 keys (0 volatile) in 16 slots HT.
57:M 11 Jul 03:15:07.640 – 6 clients connected (1 slaves), 324029016 bytes in use
57:M 11 Jul 03:15:07.649 – Client closed connection
57:M 11 Jul 03:15:07.649 – Client closed connection
57:M 11 Jul 03:15:08.157 – Accepted 127.0.0.1:42282
57:M 11 Jul 03:15:08.158 – Accepted 127.0.0.1:42284

 

另外一台从服务器被要求使用新的主节点

56:S 11 Jul 03:14:59.223 * Discarding previously cached master state.
56:S 11 Jul 03:14:59.223 * SLAVE OF 127.0.0.1:6381 enabled (user request from ‘id=3 addr=127.0.0.1:42166 fd=6 name=sentinel-d9cf8999-cmd age=1009 idle=0 flags=x db=0 sub=0 psub=0 multi=3 qbuf=133 qbuf-free=32635 obl=36 oll=0 omem=0 events=r cmd=exec’)
56:S 11 Jul 03:14:59.224 # CONFIG REWRITE executed with success.
56:S 11 Jul 03:14:59.225 – Client closed connection
56:S 11 Jul 03:14:59.225 – Client closed connection
56:S 11 Jul 03:14:59.261 – Accepted 127.0.0.1:44208
56:S 11 Jul 03:14:59.261 – Accepted 127.0.0.1:44210
56:S 11 Jul 03:14:59.306 – Accepted 127.0.0.1:44220
56:S 11 Jul 03:14:59.307 – Accepted 127.0.0.1:44222
56:S 11 Jul 03:15:00.075 * Connecting to MASTER 127.0.0.1:6381
56:S 11 Jul 03:15:00.075 * MASTER <-> SLAVE sync started
56:S 11 Jul 03:15:00.075 * Non blocking connect for SYNC fired the event.
56:S 11 Jul 03:15:00.076 * Master replied to PING, replication can continue…
56:S 11 Jul 03:15:00.076 * Partial resynchronization not possible (no cached master)
56:S 11 Jul 03:15:00.083 * Full resync from master: 5ddf0d9e182cacc1cbf0a2c965d5768abe92fc3f:1
56:S 11 Jul 03:15:02.031 * MASTER <-> SLAVE sync: receiving 54444685 bytes from master
56:S 11 Jul 03:15:02.152 * MASTER <-> SLAVE sync: Flushing old data
56:S 11 Jul 03:15:04.211 * MASTER <-> SLAVE sync: Loading DB in memory
56:S 11 Jul 03:15:04.211 . Unrecognized RDB AUX field: ‘redis-ver’
56:S 11 Jul 03:15:04.211 . Unrecognized RDB AUX field: ‘redis-bits’
56:S 11 Jul 03:15:04.211 . Unrecognized RDB AUX field: ‘ctime’
56:S 11 Jul 03:15:04.211 . Unrecognized RDB AUX field: ‘used-mem’
56:S 11 Jul 03:15:06.737 * MASTER <-> SLAVE sync: Finished with success
56:S 11 Jul 03:15:06.743 * Background append only file rewriting started by pid 155
56:S 11 Jul 03:15:07.644 – DB 0: 10 keys (0 volatile) in 16 slots HT.
56:S 11 Jul 03:15:07.644 – 7 clients connected (0 slaves), 323116008 bytes in use
56:S 11 Jul 03:15:07.649 – Client closed connection
56:S 11 Jul 03:15:07.649 – Client closed connection
56:S 11 Jul 03:15:07.726 – Accepted 127.0.0.1:44334
56:S 11 Jul 03:15:07.726 – Accepted 127.0.0.1:44336
56:S 11 Jul 03:15:08.947 * AOF rewrite child asks to stop sending diffs.
155:C 11 Jul 03:15:08.947 * Parent agreed to stop sending diffs. Finalizing AOF…
155:C 11 Jul 03:15:08.947 * Concatenating 0.00 MB of AOF diff received from parent.
155:C 11 Jul 03:15:08.947 * SYNC append only file rewrite performed
155:C 11 Jul 03:15:08.952 * AOF rewrite: 2 MB of memory used by copy-on-write
56:S 11 Jul 03:15:09.045 * Background AOF rewrite terminated with success
56:S 11 Jul 03:15:09.045 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
56:S 11 Jul 03:15:09.045 * Background AOF rewrite finished successfully
56:S 11 Jul 03:15:09.053 – Background AOF rewrite signal handler took 7782us

ss

转载请注明:南瓜的小窝 » redis-sentinel 模式观察

喜欢 (0)
发表我的评论
取消评论
表情

Hi,您需要填写昵称和邮箱!

  • 昵称 (必填)
  • 邮箱 (必填)
  • 网址