A Redis Sentinel Automatic Failover Failure: Diagnosis and Fix

IT小Chen 2023-07-10

Environment:

    Redis instances:
    192.168.126.128:6379 master
    192.168.126.128:6380 replica
    192.168.126.128:6381 replica

    Redis Sentinels:
    192.168.126.128:26379
    192.168.126.128:26380
    192.168.126.128:26381

    Redis version:
    Similar behavior was reproduced on 4.x, 5.x, and 6.x.

Symptom 1:

In a Redis Sentinel deployment, after the master was shut down, no automatic failover took place. The Sentinel log showed:

          tail -100f sentinel_26379.log
            ......
            9569:X 08 Jul 2023 18:57:09.888 # +sdown master mymaster 192.168.126.128 6379
            9569:X 08 Jul 2023 18:57:09.956 # +odown master mymaster 192.168.126.128 6379 #quorum 2/2
            9569:X 08 Jul 2023 18:57:09.956 # +new-epoch 1
            9569:X 08 Jul 2023 18:57:09.956 # +try-failover master mymaster 192.168.126.128 6379
            9569:X 08 Jul 2023 18:57:09.966 # +vote-for-leader d3c28e6d3315e1b63cefffc09de23131f67c9309 1
            9569:X 08 Jul 2023 18:57:10.268 # b98812cd6e0b3e0a3a8527244cc09bb3a0c5f251 voted for d3c28e6d3315e1b63cefffc09de23131f67c9309 1
            9569:X 08 Jul 2023 18:57:10.270 # c56a629c526c80483792ecae8d80eba98796583b voted for d3c28e6d3315e1b63cefffc09de23131f67c9309 1
            9569:X 08 Jul 2023 18:57:10.343 # +elected-leader master mymaster 192.168.126.128 6379
            9569:X 08 Jul 2023 18:57:10.343 # +failover-state-select-slave master mymaster 192.168.126.128 6379
            9569:X 08 Jul 2023 18:57:10.410 # -failover-abort-no-good-slave master mymaster 192.168.126.128 6379
            9569:X 08 Jul 2023 18:57:10.463 # Next failover delay: I will not start a failover before Sat Jul 8 19:03:10 2023

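The -failover-abort-no-good-slave event means Sentinel found no replica eligible for promotion, so the failover was aborted. Sentinel then retried every 6 minutes, which is twice the default failover-timeout of 180000 ms: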

              [redis@cjc-db-01 log]$ cat sentinel_26379.log |grep delay
              9569:X 08 Jul 2023 18:57:10.463 # Next failover delay: I will not start a failover before Sat Jul 8 19:03:10 2023
              9569:X 08 Jul 2023 19:03:10.337 # Next failover delay: I will not start a failover before Sat Jul 8 19:09:10 2023
              9569:X 08 Jul 2023 19:09:10.515 # Next failover delay: I will not start a failover before Sat Jul 8 19:15:10 2023
              9569:X 08 Jul 2023 19:15:10.697 # Next failover delay: I will not start a failover before Sat Jul 8 19:21:10 2023
              9569:X 08 Jul 2023 19:21:11.081 # Next failover delay: I will not start a failover before Sat Jul 8 19:27:11 2023
              9569:X 08 Jul 2023 19:27:11.709 # Next failover delay: I will not start a failover before Sat Jul 8 19:33:11 2023
              ......

Every attempt failed; the failover never succeeded.
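To confirm this pattern quickly, the aborted attempts can be counted straight from the Sentinel log. The following is a minimal sketch; the function name count_failover_aborts is hypothetical, and the log path is whatever the logfile directive of your Sentinel points to.

```shell
# Count failover attempts that were aborted because no replica was eligible.
# Usage: count_failover_aborts /redis/log/sentinel_26379.log
count_failover_aborts() {
    # -c prints the number of matching lines, -F treats the pattern as a fixed
    # string, and -e keeps the leading '-' from being parsed as an option.
    grep -c -F -e '-failover-abort-no-good-slave' "$1"
}
```

A count that keeps growing by one every few minutes matches the repeated "Next failover delay" entries shown above.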

Symptom 2:

Triggering a failover manually through Sentinel also returned an error:

                127.0.0.1:26379> sentinel failover mymaster
                (error) NOGOODSLAVE No suitable replica to promote

Both symptoms show that the master/replica switchover could not complete.

Root cause:

In general, the message "Next failover delay: I will not start a failover before ..." has the following common causes:

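1. The bind directive

    bind is missing or misconfigured in the configuration file. Unless there are special requirements, it can be set as follows:
    bind 0.0.0.0

2. The auth-pass directive

    The Sentinel configuration file does not record the Redis password, or records the wrong one.
    Add the directive, for example:
    sentinel auth-pass mymaster <password>

3. The protected-mode directive

    protected-mode is enabled in the Sentinel configuration file and needs to be turned off. Add:
    protected-mode no

4. rename-command CONFIG

    redis.conf contains a rename-command CONFIG directive, for example:
    rename-command CONFIG ""
    This disables the CONFIG command to prevent manual changes.
    The directive needs to be commented out:
    ###rename-command CONFIG ""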
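Cause 4 is easy to overlook across several instances, so a quick scan of each redis.conf can help. This is only a sketch: the function name config_is_renamed is hypothetical, and it assumes the directive is written on a single uncommented line.

```shell
# Succeed (exit 0) if the given redis.conf renames or disables CONFIG.
# Usage: config_is_renamed /redis/conf/redis_6379.conf && echo "CONFIG is renamed"
config_is_renamed() {
    # -E: extended regex, -i: case-insensitive, -q: no output, status only.
    # Lines that start with '#' do not match, so commented-out directives are ignored.
    grep -Eiq '^[[:space:]]*rename-command[[:space:]]+CONFIG([[:space:]]|$)' "$1"
}
```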

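Investigation ruled out all four.

Since the failover could not find a replica to promote, one more possibility remained:

slave-priority (named replica-priority from Redis 5.0 onward) is set to 0 on the replicas; a value of 0 means the replica must never be promoted to master.

Checking slave-priority/replica-priority on each instance: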

    127.0.0.1:6379> config get slave-priority
    1) "slave-priority"
    2) "0"
    127.0.0.1:6381> config get slave-priority
    1) "slave-priority"
    2) "0"
    127.0.0.1:6380> config get slave-priority
    1) "slave-priority"
    2) "0"

All of them were 0, so none could be promoted to master. The configuration files said 0 as well:

                                [redis@cjc-db-01 conf]$ cat redis_6379.conf |grep -i priority
                                replica-priority 0
                                [redis@cjc-db-01 conf]$ cat redis_6380.conf |grep -i priority
                                replica-priority 0
                                [redis@cjc-db-01 conf]$ cat redis_6381.conf |grep -i priority
                                replica-priority 0
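The per-file grep above can be folded into one pass over all configuration files. This is a minimal sketch under the assumption that each file sets the priority at most once; the function name find_zero_priority is hypothetical.

```shell
# Print every config file whose slave-priority / replica-priority is 0.
# Usage: find_zero_priority redis_6379.conf redis_6380.conf redis_6381.conf
find_zero_priority() {
    local f
    for f in "$@"; do
        # Match either directive name in the first field, then test the value.
        # awk exits 0 only when such a line with value 0 was seen.
        if awk 'tolower($1) ~ /^(slave|replica)-priority$/ && $2 == 0 { found = 1 }
                END { exit !found }' "$f"; then
            echo "$f"
        fi
    done
}
```

If every replica's file is listed, Sentinel has nothing it is allowed to promote after those instances restart.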

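This was odd: slave-priority/replica-priority defaults to 100, and replica-priority was also set to 100 when these instances were installed.

So why did replica-priority become 0 when nobody had edited redis.conf by hand?

During earlier manual switchover tests, slave-priority had indeed been changed at runtime, but only as a one-off temporary change to steer the switchover; redis.conf itself was never edited.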

                                  redis-cli -p 6379
                                  config set slave-priority 0

Does a Redis parameter change behave like Oracle's scope=both, taking effect in memory and in the configuration file at once?

Verification:

Before the change:

                                    127.0.0.1:6379> config get slave-priority
                                    1) "slave-priority"
                                    2) "100"

Check the configuration file:

                                      [redis@cjc-db-01 conf]$ cat redis_6379.conf |grep -i priority
                                      replica-priority 100

Change the parameter:

                                        127.0.0.1:6379> config set slave-priority 0

Check the configuration file again; the value has not changed:

                                          [redis@cjc-db-01 conf]$ cat redis_6379.conf |grep -i priority
                                          replica-priority 100

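Now run a failover through Sentinel:

    redis-cli -p 26379

Check the current state:

    info Sentinel

Fail over:

    sentinel failover mymaster

Check again; the failover has completed:

    info Sentinel

Check the configuration file once more: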

                                                    [redis@cjc-db-01 conf]$ cat redis_6379.conf |grep -i priority
                                                    replica-priority 0

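At this point replica-priority in redis_6379.conf had changed from 100 to 0 on its own. The likely mechanism is that Sentinel, when it reconfigures an instance during a failover, also sends CONFIG REWRITE, which writes the instance's current in-memory settings, including the temporary priority of 0, back to redis.conf.

If this instance is restarted, the value is still 0 after the restart.

Solution:

In a Sentinel deployment, to keep automatic failover working, set slave-priority/replica-priority to 100 on every node.

That is:

The running value should be 100:

    127.0.0.1:6379> config get slave-priority

If it is not 100, change it:

    127.0.0.1:6379> config set slave-priority 100

replica-priority in the configuration file should also be 100:

    cat redis_6379.conf |grep -i priority

If it is not, edit the file by hand:

    replica-priority 100

When rehearsing a switchover, make sure at least one replica keeps a non-zero slave-priority/replica-priority.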
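The running-value check can also be scripted. The sketch below only interprets the reply of CONFIG GET; it assumes the raw two-line output that redis-cli prints when its output is piped (parameter name, then value). The function name priority_is_promotable is hypothetical.

```shell
# Read the raw reply of `config get slave-priority` on stdin
# (name on line 1, value on line 2) and succeed only if the value is non-zero.
priority_is_promotable() {
    local value
    value=$(sed -n '2p')
    [ -n "$value" ] && [ "$value" -ne 0 ]
}

# Example against this lab's replica (password 111 is the one used in this post):
#   redis-cli -p 6380 -a 111 config get slave-priority | priority_is_promotable \
#       || echo "6380 cannot be promoted"
```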

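Summary:

Sentinel failover had been rehearsed several times on this system. To steer the master role to a specific node, slave-priority was changed from 100 to 0 at runtime on the other replica, so that when the master failed, the role moved to the replica whose slave-priority had not been touched. Afterwards the modified parameter was set back to 100.

However:

Once the failover completed, replica-priority in the redis.conf of the node whose slave-priority had been changed was automatically rewritten to 0. If that node was restarted later, it came back with a running value of 0, because that is what redis.conf recorded. Once both replicas had been through this, both ran with slave-priority/replica-priority set to 0, and failover could no longer proceed.

The full experiment follows.

Start the instances: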

                                                              redis-server redis/conf/redis_6379.conf 
                                                              redis-server redis/conf/redis_6380.conf
                                                              redis-server redis/conf/redis_6381.conf

Check the processes:

                                                                ps -ef|grep redis|grep redis
                                                                redis 8984 1 0 17:56 ? 00:00:00 redis-server 0.0.0.0:6379
                                                                redis 8990 1 0 17:56 ? 00:00:00 redis-server 0.0.0.0:6380
                                                                redis 8998 1 0 17:56 ? 00:00:00 redis-server 0.0.0.0:6381

Check replication status:

                                                                  [redis@cjc-db-01 conf]$ redis-cli -p 6379
                                                                  127.0.0.1:6379> auth 111
                                                                  OK
                                                                  127.0.0.1:6379> info Replication
                                                                  # Replication
                                                                  role:master
                                                                  connected_slaves:2
                                                                  min_slaves_good_slaves:2
                                                                  slave0:ip=192.168.126.128,port=6380,state=online,offset=112,lag=0
                                                                  slave1:ip=192.168.126.128,port=6381,state=online,offset=112,lag=1
                                                                  master_failover_state:no-failover
                                                                  master_replid:009ba10699b6689dbce7d1919495ca5786fe9ba4
                                                                  master_replid2:0000000000000000000000000000000000000000

Write test data:

                                                                    127.0.0.1:6379> get xxx
                                                                    (nil)
                                                                    127.0.0.1:6379> set xxx cjc
                                                                    OK
                                                                    127.0.0.1:6379> get xxx
                                                                    "cjc"

Verify that the data has replicated:

                                                                      [redis@cjc-db-01 conf]$ redis-cli -p 6380
                                                                      127.0.0.1:6380> auth 111
                                                                      OK
                                                                      127.0.0.1:6380> get xxx
                                                                      "cjc"


                                                                      [redis@cjc-db-01 conf]$ redis-cli -p 6381
                                                                      127.0.0.1:6381> auth 111
                                                                      OK
                                                                      127.0.0.1:6381> get xxx
                                                                      "cjc"

First, without starting Sentinel, perform a manual switchover.

Simulate a master failure:

                                                                        [redis@cjc-db-01 conf]$ ps -ef|grep 6379
                                                                        [redis@cjc-db-01 conf]$ kill -9 8984

Check the replica's status:

                                                                          [redis@cjc-db-01 conf]$ redis-cli -p 6380
                                                                          127.0.0.1:6380> auth 111
                                                                          OK
                                                                          127.0.0.1:6380> info Replication
                                                                          # Replication
                                                                          role:slave
                                                                          master_host:192.168.126.128
                                                                          master_port:6379
                                                                          master_link_status:down
                                                                          master_last_io_seconds_ago:-1
                                                                          ......

Without Sentinel, no automatic failover happens.

Replicas reject writes:

                                                                            127.0.0.1:6380> get xxx
                                                                            "cjc"
                                                                            127.0.0.1:6380> set yyy aaa
                                                                            (error) READONLY You can't write against a read only replica.
                                                                              [redis@cjc-db-01 conf]$ redis-cli -p 6381
                                                                              127.0.0.1:6381> auth 111
                                                                              OK
                                                                              127.0.0.1:6381> info Replication
                                                                              # Replication
                                                                              role:slave
                                                                              master_host:192.168.126.128
                                                                              master_port:6379
                                                                              master_link_status:down
                                                                              ......


                                                                              127.0.0.1:6381> get xxx
                                                                              "cjc"
                                                                              127.0.0.1:6381> set yyy aaa
                                                                              (error) READONLY You can't write against a read only replica.

Manual switchover

Promote 6380 to master

On 6380:

                                                                                [redis@cjc-db-01 conf]$ redis-cli -p 6380
                                                                                127.0.0.1:6380> auth 111
                                                                                OK

Break the replication link. The node becomes a master, and the dataset it has already synchronized is kept:

                                                                                  127.0.0.1:6380> slaveof no one
                                                                                  OK
                                                                                  127.0.0.1:6380> info Replication
                                                                                  # Replication
                                                                                  role:master
                                                                                  connected_slaves:0
                                                                                  min_slaves_good_slaves:0
                                                                                  master_failover_state:no-failover
                                                                                  master_replid:43bb3d9953e3af5b34a8c83193043c55ad45fb09
                                                                                  master_replid2:009ba10699b6689dbce7d1919495ca5786fe9ba4
                                                                                  master_repl_offset:390
                                                                                  second_repl_offset:391
                                                                                  repl_backlog_active:1
                                                                                  repl_backlog_size:1048576
                                                                                  repl_backlog_first_byte_offset:1
                                                                                  repl_backlog_histlen:390

Point 6381 at 6380 as its new master:

                                                                                    [redis@cjc-db-01 conf]$ redis-cli -p 6381
                                                                                    127.0.0.1:6381> auth 111
                                                                                    OK


                                                                                    127.0.0.1:6381> info Replication
                                                                                    # Replication
                                                                                    role:slave
                                                                                    master_host:192.168.126.128
                                                                                    master_port:6379
                                                                                    master_link_status:down
                                                                                    ......


                                                                                    127.0.0.1:6381> slaveof 192.168.126.128 6380
                                                                                    OK


                                                                                    127.0.0.1:6381> info Replication
                                                                                    # Replication
                                                                                    role:slave
                                                                                    master_host:192.168.126.128
                                                                                    master_port:6380
                                                                                    master_link_status:up
                                                                                    master_last_io_seconds_ago:2
                                                                                    ......

Check replication on the new master:

                                                                                      [redis@cjc-db-01 conf]$ redis-cli -p 6380
                                                                                      127.0.0.1:6380> auth 111
                                                                                      OK


                                                                                      127.0.0.1:6380> info Replication
                                                                                      # Replication
                                                                                      role:master
                                                                                      connected_slaves:1
                                                                                      min_slaves_good_slaves:1
                                                                                      slave0:ip=192.168.126.128,port=6381,state=online,offset=488,lag=0
                                                                                      master_failover_state:no-failover
                                                                                      master_replid:43bb3d9953e3af5b34a8c83193043c55ad45fb09
                                                                                      master_replid2:009ba10699b6689dbce7d1919495ca5786fe9ba4


                                                                                      127.0.0.1:6380> set zzz iii

Start 6379 and add it back to the replication group:

                                                                                        redis-server redis/conf/redis_6379.conf 
                                                                                        [redis@cjc-db-01 conf]$ redis-cli -p 6379
                                                                                        127.0.0.1:6379> auth 111
                                                                                        OK


                                                                                        127.0.0.1:6379> info Replication
                                                                                        # Replication
                                                                                        role:master
                                                                                        connected_slaves:0
                                                                                        min_slaves_good_slaves:0


Join the replication group:

                                                                                        127.0.0.1:6379> get zzz
                                                                                        (nil)
                                                                                        127.0.0.1:6379> slaveof 192.168.126.128 6380
                                                                                        OK
                                                                                        127.0.0.1:6379> get zzz
                                                                                        "iii"


                                                                                        [redis@cjc-db-01 conf]$ redis-cli -p 6380
                                                                                        127.0.0.1:6380> auth 111
                                                                                        OK
                                                                                        127.0.0.1:6380> info Replication
                                                                                        # Replication
                                                                                        role:master
                                                                                        connected_slaves:2
                                                                                        min_slaves_good_slaves:2
                                                                                        slave0:ip=192.168.126.128,port=6381,state=online,offset=2068,lag=1
                                                                                        slave1:ip=192.168.126.128,port=6379,state=online,offset=2068,lag=0
                                                                                        master_failover_state:no-failover

Manually switch back to the original master:

                                                                                          [redis@cjc-db-01 conf]$ redis-cli -p 6379
                                                                                          127.0.0.1:6379> auth 111
                                                                                          OK
                                                                                          127.0.0.1:6379> SLAVEOF NO ONE 


                                                                                          [redis@cjc-db-01 conf]$ redis-cli -p 6380
                                                                                          127.0.0.1:6380> auth 111
                                                                                          OK
                                                                                          127.0.0.1:6380> slaveof 192.168.126.128 6379
                                                                                          OK


                                                                                          [redis@cjc-db-01 conf]$ redis-cli -p 6381
                                                                                          127.0.0.1:6381> auth 111
                                                                                          OK
                                                                                          127.0.0.1:6381> slaveof 192.168.126.128 6379
                                                                                          OK

Check the result:

                                                                                            [redis@cjc-db-01 conf]$ redis-cli -p 6379
                                                                                            127.0.0.1:6379> auth 111
                                                                                            OK


                                                                                            127.0.0.1:6379> info Replication
                                                                                            # Replication
                                                                                            role:master
                                                                                            connected_slaves:2
                                                                                            min_slaves_good_slaves:2
                                                                                            slave0:ip=192.168.126.128,port=6380,state=online,offset=2222,lag=0
                                                                                            slave1:ip=192.168.126.128,port=6381,state=online,offset=2222,lag=0
                                                                                            master_failover_state:no-failover
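The manual steps walked through above follow a fixed pattern: promote one node with slaveof no one, then repoint every other node at it. The sketch below only prints that command sequence as a dry run and executes nothing; the function name manual_failover_plan is hypothetical, and authentication (auth 111 in this lab) is left out and would have to be added before running the commands for real.

```shell
# Print (do not execute) the redis-cli commands for a manual switchover.
# Arguments: host, port of the node to promote, then the ports of the other nodes.
manual_failover_plan() {
    local host=$1 new_master=$2 port
    shift 2
    # Step 1: detach the chosen node from its old master so it becomes a master.
    echo "redis-cli -p $new_master slaveof no one"
    # Step 2: repoint each remaining node at the new master.
    for port in "$@"; do
        echo "redis-cli -p $port slaveof $host $new_master"
    done
}

# Example: manual_failover_plan 192.168.126.128 6380 6381 6379
```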

That is the manual failover procedure. With Sentinel, all of the above is done automatically.

Start Sentinel:

                                                                                              redis-sentinel /redis/conf/redis_26379.conf

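After Sentinel 26379 starts, its configuration file gains a section headed # Generated by CONFIG REWRITE, and the two replicas have been discovered automatically.

Original file contents: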

                                                                                                [redis@cjc-db-01 conf]$ cat redis_26379.conf 
                                                                                                port 26379
                                                                                                logfile "/redis/log/sentinel_26379.log"
                                                                                                dir "/redis/26379/data"
                                                                                                pidfile "/redis/26379/pid/redis_26379.pid"
                                                                                                bind 0.0.0.0
                                                                                                protected-mode no
                                                                                                daemonize yes
                                                                                                sentinel monitor mymaster 192.168.126.128 6380 2
                                                                                                sentinel down-after-milliseconds mymaster 8000
                                                                                                sentinel auth-pass mymaster 111

File contents after startup:

                                                                                                  [redis@cjc-db-01 conf]$ cat redis_26379.conf 


                                                                                                  port 26379
                                                                                                  logfile "/redis/log/sentinel_26379.log"
                                                                                                  dir "/redis/26379/data"
                                                                                                  pidfile "/redis/26379/pid/redis_26379.pid"
                                                                                                  bind 0.0.0.0
                                                                                                  protected-mode no
                                                                                                  daemonize yes
                                                                                                  sentinel monitor mymaster 192.168.126.128 6379 2
                                                                                                  sentinel down-after-milliseconds mymaster 8000
                                                                                                  sentinel auth-pass mymaster 111
                                                                                                  # Generated by CONFIG REWRITE
                                                                                                  user default on nopass ~* &* +@all
                                                                                                  sentinel myid d3c28e6d3315e1b63cefffc09de23131f67c9309
                                                                                                  sentinel config-epoch mymaster 0
                                                                                                  sentinel leader-epoch mymaster 0
                                                                                                  sentinel current-epoch 0
                                                                                                  sentinel known-replica mymaster 192.168.126.128 6381
                                                                                                  sentinel known-replica mymaster 192.168.126.128 6380

                                                                                                  Start the other sentinels:

                                                                                                    redis-sentinel /redis/conf/redis_26380.conf 
                                                                                                    redis-sentinel /redis/conf/redis_26381.conf

                                                                                                    Check the final sentinel configuration files.

                                                                                                    Section written automatically to 26379:

                                                                                                      # Generated by CONFIG REWRITE
                                                                                                      user default on nopass ~* &* +@all
                                                                                                      sentinel myid d3c28e6d3315e1b63cefffc09de23131f67c9309
                                                                                                      sentinel config-epoch mymaster 0
                                                                                                      sentinel leader-epoch mymaster 0
                                                                                                      sentinel current-epoch 0
                                                                                                      sentinel known-replica mymaster 192.168.126.128 6381
                                                                                                      sentinel known-replica mymaster 192.168.126.128 6380
                                                                                                      sentinel known-sentinel mymaster 192.168.126.128 26380 b98812cd6e0b3e0a3a8527244cc09bb3a0c5f251
                                                                                                      sentinel known-sentinel mymaster 192.168.126.128 26381 c56a629c526c80483792ecae8d80eba98796583b

                                                                                                      Section written automatically to 26380:

                                                                                                        # Generated by CONFIG REWRITE
                                                                                                        user default on nopass ~* &* +@all
                                                                                                        sentinel myid b98812cd6e0b3e0a3a8527244cc09bb3a0c5f251
                                                                                                        sentinel config-epoch mymaster 0
                                                                                                        sentinel leader-epoch mymaster 0
                                                                                                        sentinel current-epoch 0
                                                                                                        sentinel known-replica mymaster 192.168.126.128 6380
                                                                                                        sentinel known-replica mymaster 192.168.126.128 6381
                                                                                                        sentinel known-sentinel mymaster 192.168.126.128 26381 c56a629c526c80483792ecae8d80eba98796583b
                                                                                                        sentinel known-sentinel mymaster 192.168.126.128 26379 d3c28e6d3315e1b63cefffc09de23131f67c9309

                                                                                                        Section written automatically to 26381:

                                                                                                          # Generated by CONFIG REWRITE
                                                                                                          user default on nopass ~* &* +@all
                                                                                                          sentinel myid c56a629c526c80483792ecae8d80eba98796583b
                                                                                                          sentinel config-epoch mymaster 0
                                                                                                          sentinel leader-epoch mymaster 0
                                                                                                          sentinel current-epoch 0
                                                                                                          sentinel known-replica mymaster 192.168.126.128 6380
                                                                                                          sentinel known-replica mymaster 192.168.126.128 6381
                                                                                                          sentinel known-sentinel mymaster 192.168.126.128 26380 b98812cd6e0b3e0a3a8527244cc09bb3a0c5f251
                                                                                                          sentinel known-sentinel mymaster 192.168.126.128 26379 d3c28e6d3315e1b63cefffc09de23131f67c9309

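                                                                                                          As shown, once a sentinel starts it automatically writes into its configuration file the IP and port of each replica, plus the IP and port of the other two sentinels.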
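To compare what each sentinel has recorded without reading the files by eye, the auto-generated lines can be extracted with a small helper. This is a minimal sketch: the function name `list_known_peers` is hypothetical, and it assumes the line layout shown above (`sentinel known-replica <master> <ip> <port>` and `sentinel known-sentinel <master> <ip> <port> <runid>`). It also accepts `known-slave`, which older releases are assumed to write instead of `known-replica`.

```shell
# List the replicas and peer sentinels recorded in a Sentinel config file.
# Reads the file only; no running Redis instance is required.
list_known_peers() {
    awk '$1 == "sentinel" && $2 ~ /^known-(replica|slave|sentinel)$/ {
        kind = $2
        sub(/^known-/, "", kind)      # keep only replica / slave / sentinel
        print kind, $4 ":" $5         # $3 is the master name, $4 the IP, $5 the port
    }' "$1"
}
```

For example, `list_known_peers /redis/conf/redis_26379.conf` prints one `kind ip:port` pair per line, so the three sentinel files can be diffed against each other.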

                                                                                                          Stop the master and watch how the sentinels react:

                                                                                                            [redis@cjc-db-01 conf]$ redis-cli -p 6379
                                                                                                            127.0.0.1:6379> auth 111
                                                                                                            OK
                                                                                                            127.0.0.1:6379> info Replication
                                                                                                            # Replication
                                                                                                            role:master
                                                                                                            connected_slaves:2
                                                                                                            min_slaves_good_slaves:2
                                                                                                            slave0:ip=192.168.126.128,port=6380,state=online,offset=317423,lag=1
                                                                                                            slave1:ip=192.168.126.128,port=6381,state=online,offset=317423,lag=0
                                                                                                            master_failover_state:no-failover
                                                                                                            master_replid:fc7968519b15a9090407133791ea3551cdd67716
                                                                                                            master_replid2:43bb3d9953e3af5b34a8c83193043c55ad45fb09
                                                                                                            ......
                                                                                                            127.0.0.1:6379> shutdown save
                                                                                                            not connected>

                                                                                                            Check the sentinel log:

                                                                                                              tail -100f sentinel_26379.log 
                                                                                                              ......
                                                                                                              9569:X 08 Jul 2023 18:57:09.888 # +sdown master mymaster 192.168.126.128 6379
                                                                                                              9569:X 08 Jul 2023 18:57:09.956 # +odown master mymaster 192.168.126.128 6379 #quorum 2/2
                                                                                                              9569:X 08 Jul 2023 18:57:09.956 # +new-epoch 1
                                                                                                              9569:X 08 Jul 2023 18:57:09.956 # +try-failover master mymaster 192.168.126.128 6379
                                                                                                              9569:X 08 Jul 2023 18:57:09.966 # +vote-for-leader d3c28e6d3315e1b63cefffc09de23131f67c9309 1
                                                                                                              9569:X 08 Jul 2023 18:57:10.268 # b98812cd6e0b3e0a3a8527244cc09bb3a0c5f251 voted for d3c28e6d3315e1b63cefffc09de23131f67c9309 1
                                                                                                              9569:X 08 Jul 2023 18:57:10.270 # c56a629c526c80483792ecae8d80eba98796583b voted for d3c28e6d3315e1b63cefffc09de23131f67c9309 1
                                                                                                              9569:X 08 Jul 2023 18:57:10.343 # +elected-leader master mymaster 192.168.126.128 6379
                                                                                                              9569:X 08 Jul 2023 18:57:10.343 # +failover-state-select-slave master mymaster 192.168.126.128 6379
                                                                                                              9569:X 08 Jul 2023 18:57:10.410 # -failover-abort-no-good-slave master mymaster 192.168.126.128 6379
                                                                                                              9569:X 08 Jul 2023 18:57:10.463 # Next failover delay: I will not start a failover before Sat Jul 8 19:03:10 2023

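                                                                                                              The log shows that the elected sentinel aborted the failover because it found no suitable replica to promote (-failover-abort-no-good-slave), and that it will not start another failover before 19:03:10, six minutes later.

                                                                                                              After waiting the six minutes, the automatic failover still did not complete.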
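When the log is long, the decisive line is easier to find by filtering for the failover outcome. The sketch below is one way to do that; the function name `last_failover_event` is hypothetical, and it assumes the log uses the standard Sentinel event names (`-failover-abort-*` for an aborted attempt, `+switch-master` for a completed switch). The six-minute gap matches twice the default `failover-timeout` of 180000 ms, which is the delay Sentinel is understood to apply between attempts.

```shell
# Print the most recent failover outcome recorded in a Sentinel log file:
# an aborted attempt (-failover-abort-*) or a completed switch (+switch-master).
last_failover_event() {
    # -e is needed because the pattern itself starts with a dash.
    grep -E -e '-failover-abort-|\+switch-master' "$1" | tail -n 1
}
```

Run against the log above, `last_failover_event sentinel_26379.log` would return the `-failover-abort-no-good-slave` line, which points at the replicas rather than at the election.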

                                                                                                              Check the replication status:

                                                                                                                [redis@cjc-db-01 conf]$ redis-cli -p 6380
                                                                                                                127.0.0.1:6380> auth 111
                                                                                                                OK
                                                                                                                127.0.0.1:6380> info Replication
                                                                                                                # Replication
                                                                                                                role:slave
                                                                                                                master_host:192.168.126.128
                                                                                                                master_port:6379
                                                                                                                master_link_status:down
                                                                                                                  [redis@cjc-db-01 conf]$ redis-cli -p 6381
                                                                                                                  127.0.0.1:6381> auth 111
                                                                                                                  OK
                                                                                                                  127.0.0.1:6381> info Replication
                                                                                                                  # Replication
                                                                                                                  role:slave
                                                                                                                  master_host:192.168.126.128
                                                                                                                  master_port:6379
                                                                                                                  master_link_status:down
                                                                                                                  master_last_io_seconds_ago:-1
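Both replicas still point at the stopped master, so the next thing worth checking is whether either of them is eligible for promotion. The following sketch condenses `INFO replication` into the fields that matter here. The function name `replica_summary` is hypothetical, and the `slave_priority` field is an assumption: it does not appear in the truncated output above, but stock Redis reports it on replicas.

```shell
# Summarize the INFO replication fields relevant to promotion.
# Reads the INFO text on stdin, so it works on live or saved output.
replica_summary() {
    # redis-cli emits CRLF line endings for INFO; strip the carriage returns first.
    tr -d '\r' | awk -F: '
        $1 == "role"               { role = $2 }
        $1 == "master_link_status" { link = $2 }
        $1 == "slave_priority"     { prio = $2 }
        END { printf "role=%s link=%s priority=%s\n", role, link, prio }'
}
```

A typical call would be `redis-cli -p 6380 -a 111 info replication | replica_summary`. A reported priority of 0 would explain the abort, since Sentinel does not promote replicas with priority 0.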

                                                                                                                  Check the sentinel status:

                                                                                                                    127.0.0.1:26379> info Sentinel
                                                                                                                    # Sentinel
                                                                                                                    sentinel_masters:1
                                                                                                                    sentinel_tilt:0
                                                                                                                    sentinel_running_scripts:0
                                                                                                                    sentinel_scripts_queue_length:0
                                                                                                                    sentinel_simulate_failure_flags:0
                                                                                                                    master0:name=mymaster,status=odown,address=192.168.126.128:6379,slaves=2,sentinels=3
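The `master0` line carries the whole picture in one place: the master is objectively down, yet Sentinel still knows about two replicas. A sketch for pulling those fields out of `INFO sentinel` follows; the function name `master_status` is hypothetical and the parsing assumes the comma-separated `key=value` layout shown above.

```shell
# Extract name, status and replica count from the masterN lines of INFO sentinel.
# Reads the INFO text on stdin.
master_status() {
    tr -d '\r' | awk -F'[:,]' '/^[ \t]*master[0-9]+:/ {
        for (i = 2; i <= NF; i++) {
            split($i, kv, "=")        # each field is key=value
            f[kv[1]] = kv[2]
        }
        print f["name"], f["status"], "slaves=" f["slaves"]
    }'
}
```

Usage would look like `redis-cli -p 26379 info sentinel | master_status`.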

                                                                                                                    Start the original master, 6379:

                                                                                                                      [redis@cjc-db-01 conf]$ redis-server redis_6379.conf 
                                                                                                                      [redis@cjc-db-01 conf]$ redis-cli -p 6379
                                                                                                                      127.0.0.1:6379> auth 111
                                                                                                                      OK

                                                                                                                      After it starts, the original instance resumes the master role and both replicas reconnect to it:

                                                                                                                        127.0.0.1:6379> info Replication
                                                                                                                        # Replication
                                                                                                                        role:master
                                                                                                                        connected_slaves:2
                                                                                                                        min_slaves_good_slaves:2
                                                                                                                        slave0:ip=192.168.126.128,port=6380,state=online,offset=6658,lag=1
                                                                                                                        slave1:ip=192.168.126.128,port=6381,state=online,offset=6658,lag=1
                                                                                                                        master_failover_state:no-failover
                                                                                                                        master_replid:361d438f4fae84a4c9501954f515d02cf31669ca
                                                                                                                        master_replid2:0000000000000000000000000000000000000000

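                                                                                                                        Adjust the replica priority on every instance (a replica whose priority is 0 is never selected for promotion):

                                                                                                                          redis-cli -p 6379 -a 111 config set slave-priority 100
                                                                                                                          redis-cli -p 6380 -a 111 config set slave-priority 100
                                                                                                                          redis-cli -p 6381 -a 111 config set slave-priority 100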
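Before repeating the failover test it can help to confirm that no instance is still at priority 0. The sketch below loops over the ports and prints the ones that would be skipped. The function name `zero_priority_ports` and the `REDIS_CLI` override variable are hypothetical; the password `111` is the one used throughout this article.

```shell
# Print the port of every instance whose replica priority is 0.
# REDIS_CLI may be pointed at a stub for a dry run; it defaults to redis-cli.
zero_priority_ports() {
    local port prio
    for port in "$@"; do
        # CONFIG GET prints the parameter name, then its value; keep the value.
        prio=$("${REDIS_CLI:-redis-cli}" -p "$port" -a 111 config get slave-priority 2>/dev/null \
               | tr -d '\r' | tail -n 1)
        if [ "$prio" = "0" ]; then
            echo "$port"
        fi
    done
}
```

Calling `zero_priority_ports 6379 6380 6381` should print nothing once the setting has been applied everywhere. Note that `config set` only changes the running instance; running `config rewrite` on each instance afterwards writes the value back to its configuration file.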

                                                                                                                          Shut down the master again.

                                                                                                                          Check again: the automatic failover now works.

                                                                                                                            127.0.0.1:6379> info Replication
                                                                                                                            # Replication
                                                                                                                            role:master
                                                                                                                            connected_slaves:2
                                                                                                                            min_slaves_good_slaves:2
                                                                                                                            slave0:ip=192.168.126.128,port=6380,state=online,offset=63656,lag=0
                                                                                                                            slave1:ip=192.168.126.128,port=6381,state=online,offset=63510,lag=1

                                                                                                                            After the failover, the redis.conf files are updated automatically. Check the file for 6381:

                                                                                                                              cat redis_6381.conf

                                                                                                                              After the automatic failover, this line was removed:

                                                                                                                                slaveof 192.168.126.128 6379

                                                                                                                                And the following was added:

                                                                                                                                  replicaof 192.168.126.128 6380


                                                                                                                                  # Generated by CONFIG REWRITE
                                                                                                                                  user default on #f6e0a1e2ac41945a9aa7ff8a8aaa0cebc12a3bcc981a929ad5cf810a090e11ae ~* &* +@all
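To see at a glance which master each instance now follows, the replication directive can be read straight from the rewritten files. This is a minimal sketch with a hypothetical function name, `replication_target`; it accepts both the older `slaveof` spelling and the newer `replicaof` one seen above.

```shell
# Print the master (ip:port) that a Redis config file points at, if any.
# A file with no replicaof/slaveof directive produces no output.
replication_target() {
    awk '$1 == "replicaof" || $1 == "slaveof" { print $2 ":" $3 }' "$1"
}
```

For example, `replication_target redis_6381.conf` would print `192.168.126.128:6380` after the failover described here, while the file of the newly promoted master would print nothing.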

                                                                                                                                  Check node 6380:

                                                                                                                                    cat redis_6380.conf

                                                                                                                                    This line was removed automatically:

                                                                                                                                      slaveof 192.168.126.128 6379

                                                                                                                                      And the following was added automatically:

                                                                                                                                        # Generated by CONFIG REWRITE
                                                                                                                                        user default on #f6e0a1e2ac41945a9aa7ff8a8aaa0cebc12a3bcc981a929ad5cf810a090e11ae ~* &* +@all

                                                                                                                                        ###chenjuchao 20230708 17:00###
