
openGauss 6.0 Primary/Standby Switching in Practice: switchover and failover

openGauss 2024-12-06

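Primary/Standby Architecture Overview

An openGauss primary/standby deployment keeps data in sync through replication. The primary database handles business requests and records data changes in the WAL (Write-Ahead Logging) log; the standby database reads the primary's WAL to stay in sync with it. Primary/standby switching exists to keep the database highly available: when the primary fails, service can be switched to the standby so that the business keeps running.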

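There are two ways to switch roles:

  • switchover: a role swap between the primary and the standby, normally used for planned changes. For example, when the primary node needs maintenance, you manually promote the standby to primary and demote the primary to standby.

  • failover: after the primary node fails, the standby is promoted to primary to keep the system available.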

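Environment

The environment used here is as follows:

Role     Hostname  IP address    OS version       Database version
Primary  master    10.10.10.165  Kylin Linux V10  openGauss 6.0.0
Standby  slave     10.10.10.166  Kylin Linux V10  openGauss 6.0.0


The primary node is currently 10.10.10.165 and the standby node is 10.10.10.166. Both run the Kylin (银河麒麟) operating system.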

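switchover

A switchover can only complete when both the primary and the standby are healthy. If you run switchover while the primary is down, the command cannot obtain the primary's state and exits with a timeout.

Check primary/standby status

Check the state of all nodes: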

    ### Switch to the omm user
    # su - omm


    ### Check the state of all nodes
    [omm@master ~]$ gs_om -t status --detail
    [ Cluster State ]


    cluster_state : Normal
    redistributing : No
    current_az : AZ_ALL


    [ Datanode State ]


    node node_ip port instance state
    ------------------------------------------------------------------------------------------------------------
    1  master 10.10.10.165    15400      6001 /opt/software/openGauss5.0/install/data/dn   P Primary Normal
    2  slave  10.10.10.166    15400      6002 /opt/software/openGauss5.0/install/data/dn   S Standby Normal
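The datanode rows above can also be read by a script, which helps when you want to confirm which node is primary before or after a switch. The following is a minimal sketch, not part of openGauss: the function names `list_datanodes` and `current_primary` are made up for illustration, and the column layout is assumed to match the sample output shown in this article.

```shell
# Print "<node> <ip> <role> <state>" for each datanode row found in
# `gs_om -t status --detail` output read from stdin.
# Assumed row layout (taken from the sample output in this article):
#   <idx> <node> <ip> <port> <instance> <datadir> <P|S> <role> <state...>
list_datanodes() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 9 {
        state = $9
        for (i = 10; i <= NF; i++) state = state " " $i
        print $2, $3, $8, state
    }'
}

# Print the node name of every row whose role column is "Primary".
# More than one line of output means a dual-primary condition.
current_primary() {
    list_datanodes | awk '$3 == "Primary" { print $1 }'
}
```

Typical use would be `gs_om -t status --detail | current_primary`. For a stopped instance the role column holds `Down` rather than a role name, so such rows are simply not reported as primary.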

[Figure: description of the node state parameters (image not available)]

Perform the switchover

Log in to the standby node as the omm user and run the following command to complete the switchover.

      [root@slave ~]# su - omm
      Last login: Fri Nov 8 12:20:56 CST 2024 on pts/0
      [omm@slave ~]$ gs_ctl switchover -D /opt/software/openGauss5.0/install/data/dn
      [2024-11-08 12:25:00.619][102526][][gs_ctl]: gs_ctl switchover ,datadir is /opt/software/openGauss5.0/install/data/dn
      [2024-11-08 12:25:00.619][102526][][gs_ctl]: switchover term (1)
      [2024-11-08 12:25:00.629][102526][][gs_ctl]: waiting for server to switchover........
      [2024-11-08 12:25:05.723][102526][][gs_ctl]: done
      [2024-11-08 12:25:05.723][102526][][gs_ctl]: switchover completed (/opt/software/openGauss5.0/install/data/dn)

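Notes:

1. For a given database, a new switchover cannot be started until the previous one has finished.
2. If switchover is issued while business workloads are active, threads on the primary may not stop in time and switchover reports a timeout. The operation is actually still running in the background and completes once the primary's threads have stopped.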
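Because a switchover that reports a timeout may still finish in the background, it can be useful to poll the local role instead of re-issuing the command. The sketch below is one way to do that under stated assumptions: `wait_until` and `is_local_primary` are hypothetical helper names, and `DATADIR` is an assumed variable holding the data directory you pass to `-D`.

```shell
# wait_until <attempts> <interval_seconds> <command...>
# Re-run <command> until it exits 0 or the attempts are used up.
# Returns 0 on success, 1 if the condition never became true.
wait_until() {
    local attempts interval i
    attempts=$1
    interval=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Sleep between attempts, but not after the last one.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
    return 1
}

# Hypothetical check: succeeds once the local instance reports local_role Primary.
# Uses the same `gs_ctl query -Cv -D` invocation that appears in this article.
is_local_primary() {
    gs_ctl query -Cv -D "$DATADIR" |
        grep -Eq '^[[:space:]]*local_role[[:space:]]*:[[:space:]]*Primary'
}
```

For example, `wait_until 30 2 is_local_primary && echo "switchover finished"` would check every 2 seconds, up to 30 times. The attempt count and interval are arbitrary illustration values.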

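Verify the switchover result

After the switchover completes, run the following commands to check the current primary/standby information:

        ### Refresh the primary/standby configuration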
        $ gs_om -t refreshconf
        Generating dynamic configuration file for all nodes.
        Successfully generated dynamic configuration file.


        ### Check the result (10.10.10.166 has changed from Standby Normal to Primary Normal)
        [omm@slave ~]$ gs_om -t status --detail
        [ Cluster State ]


        cluster_state : Normal
        redistributing : No
        current_az : AZ_ALL


        [ Datanode State ]


        node node_ip port instance state
        ------------------------------------------------------------------------------------------------------------
        1  master 10.10.10.165    15400      6001 /opt/software/openGauss5.0/install/data/dn   P Standby Normal
        2  slave  10.10.10.166    15400      6002 /opt/software/openGauss5.0/install/data/dn   S Primary Normal


The primary is now 10.10.10.166 and the standby is 10.10.10.165.

Insert test data on the new primary:

          INSERT INTO t_user(id, name, age) VALUES
          (1008, 'Kim',19),
          (1009, 'Annie ',16);

Query the data on the new standby:

            openGauss=# select * from t_user;
            id | name | age
            ------+--------+-----
            1001 | Ross | 17
            1002 | Julie | 19
            1003 | Tom | 20
            1004 | Joan | 21
            1005 | Joes | 18
            1006 | Lily | 19
            1007 | Linda | 17
            1008 | Kim | 19
            1009 | Annie | 16
            (9 rows)

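failover

A failover is carried out by the standby alone and does not need to interact with the primary. If the primary is still running normally, executing failover results in two primaries (dual primary).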
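Since failover never consults the primary, a small guard in front of the command can reduce the risk of creating a dual primary by accident. This is only a sketch: `guard_failover` and `tcp_probe` are hypothetical helpers, and a successful TCP connect merely shows that something is listening on the port, not that the instance is a healthy primary.

```shell
# guard_failover <probe-command...>
# Runs the probe. If it succeeds, the primary is assumed to be alive and the
# function refuses (returns 1). If the probe fails, it returns 0.
guard_failover() {
    if "$@" >/dev/null 2>&1; then
        echo "primary still reachable: refusing failover, consider switchover" >&2
        return 1
    fi
    return 0
}

# Hypothetical probe: a plain TCP connect through bash's /dev/tcp redirection,
# limited to 3 seconds by coreutils `timeout`.
tcp_probe() {
    local host port
    host=$1
    port=$2
    timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port"
}
```

A possible use, with `PRIMARY_IP` and `DATADIR` as assumed variables: `guard_failover tcp_probe "$PRIMARY_IP" 15400 && gs_ctl failover -D "$DATADIR"`. Any stronger probe, such as a query against the primary, can be passed in the same way.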

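Simulate a primary node outage

Shut down the machine that hosts the primary instance to simulate a primary failure. Once it is off, only the standby node remains. Log in to the standby node as the omm user and query the status with gs_om: gs_om runs very slowly and returns no cluster information. Use the gs_ctl tool to query the status instead; it shows the standby in the state Standby Need repair(Disconnected).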

              -- Querying with gs_om hangs
              [omm@master ~]$ gs_om -t status --detail


              -- Query with gs_ctl instead
              [omm@master ~]$ gs_ctl query -Cv -D /opt/software/openGauss5.0/install/data/dn
              [2024-11-08 13:25:28.334][17440][][gs_ctl]: gs_ctl query ,datadir is /opt/software/openGauss5.0/install/data/dn
              HA state:
              local_role : Standby
              static_connections : 1
              db_state : Need repair
              detail_information : Disconnected


              Senders info:
              No information
              Receiver info:
              No information


              ### -D above specifies the data directory; in this installation it is /opt/software/openGauss5.0/install/data/dn
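When gs_om is unusable, the `name : value` lines printed by `gs_ctl query` are the main source of state, so a tiny extractor is handy in scripts. The sketch below assumes the output format shown above; `ha_field` is a made-up helper name, not an openGauss tool.

```shell
# ha_field <name>
# Read `gs_ctl query` output on stdin and print the value of the first
# "<name> : <value>" line, e.g. local_role, db_state, detail_information.
ha_field() {
    awk -v key="$1" '
        {
            pos = index($0, ":")
            if (pos == 0) next
            name = substr($0, 1, pos - 1)
            value = substr($0, pos + 1)
            # Trim surrounding blanks from both parts.
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (name == key) { print value; exit }
        }'
}
```

For instance, `gs_ctl query -Cv -D "$DATADIR" | ha_field db_state` would print `Need repair` for the output above, where `DATADIR` is an assumed variable for the data directory. Timestamped log lines are skipped because the text before their first colon never equals a field name.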

Run failover on the standby node

Run the following command on the current standby node to complete the failover:

                [omm@master ~]$ gs_ctl failover -D /opt/software/openGauss5.0/install/data/dn
                [2024-11-08 13:26:11.898][17799][][gs_ctl]: gs_ctl failover ,datadir is /opt/software/openGauss5.0/install/data/dn
                [2024-11-08 13:26:11.898][17799][][gs_ctl]: failover term (1)
                [2024-11-08 13:26:11.904][17799][][gs_ctl]: waiting for server to failover...
                .
                [2024-11-08 13:26:12.945][17799][][gs_ctl]: done
                [2024-11-08 13:26:12.946][17799][][gs_ctl]:  failover completed (/opt/software/openGauss5.0/install/data/dn)

Check the status

                  ### Querying with gs_om still hangs
                  [omm@master ~]$ gs_om -t status --detail
                  ### Query with gs_ctl
                  [omm@master ~]$ gs_ctl query -Cv -D /opt/software/openGauss5.0/install/data/dn
                  [2024-11-08 13:26:44.070][18724][][gs_ctl]: gs_ctl query ,datadir is /opt/software/openGauss5.0/install/data/dn
                  HA state:
                  local_role : Primary
                  static_connections : 1
                  db_state : Normal
                  detail_information : Normal


                  Senders info:
                  No information
                  Receiver info:
                  No information


                  ### Only one node is currently running

Recover the failed node

Power the shut-down machine back on, then query the status again:

                    [omm@master ~]$ gs_om -t status --detail
                    [ Cluster State ]


                    cluster_state : Degraded
                    redistributing : No
                    current_az : AZ_ALL


                    [ Datanode State ]


                    node node_ip port instance state
                    ------------------------------------------------------------------------------------------------------------
                    1  master 10.10.10.165    15400      6001 /opt/software/openGauss5.0/install/data/dn   P Primary Normal
                    2  slave  10.10.10.166    15400      6002 /opt/software/openGauss5.0/install/data/dn   S Down    Manually stopped

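The instance on that node is in the Down state.

Restore the primary/standby relationship

If you start the instance on the standby node directly with gs_om -t start or gs_ctl start -D /opt/software/openGauss5.0/install/data/dn, you may end up with two primaries, which breaks the whole cluster.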

                      [omm@master ~]$ gs_om -t status --detail
                      [ Cluster State ]


                      cluster_state : Unavailable
                      redistributing : No
                      current_az : AZ_ALL


                      [ Datanode State ]


                      node node_ip port instance state
                      ------------------------------------------------------------------------------------------------------------
                      1  master 10.10.10.165    15400      6001 /opt/software/openGauss5.0/install/data/dn   P Primary Normal
                      2  slave  10.10.10.166    15400      6002 /opt/software/openGauss5.0/install/data/dn   S Primary Normal


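                      ### Both nodes now report Primary: a dual-primary condition

Solution:

If the cluster is already in dual-primary mode, first stop the service on the node that is meant to be the standby, then start that instance with its run mode set to standby:

                        ## Stop the standby node
                        gs_ctl stop -D /opt/software/openGauss5.0/install/data/dn


                        ## Start the standby node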
                        gs_ctl start -M standby -D /opt/software/openGauss5.0/install/data/dn
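The stop-then-start sequence above can be wrapped in one function so that the two steps are always run together and the start is skipped if the stop fails. This is a sketch only: `restart_as_standby` is a made-up name, and `RUN` is an assumed hook for this example that lets you print the commands instead of executing them.

```shell
# restart_as_standby <datadir>
# Stop the local instance, then start it again in standby mode.
# RUN is an assumed hook: leave it unset to execute the commands, or set
# RUN=echo to print them without touching the instance.
restart_as_standby() {
    local datadir
    datadir=$1
    ${RUN:-} gs_ctl stop -D "$datadir" || return 1
    ${RUN:-} gs_ctl start -M standby -D "$datadir"
}
```

A dry run such as `RUN=echo restart_as_standby /opt/software/openGauss5.0/install/data/dn` prints the two gs_ctl commands in order, which makes it easy to review them before running the function for real on the node that should become the standby.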

Verify the result

After the instance has started, refresh the cluster configuration with refreshconf:

                          [omm@master ~]$ gs_om -t refreshconf
                          Generating dynamic configuration file for all nodes.
                          Successfully generated dynamic configuration file.
                          [omm@master ~]$ gs_om -t status --detail
                          [ Cluster State ]


                          cluster_state : Degraded
                          redistributing : No
                          current_az : AZ_ALL


                          [ Datanode State ]


                          node node_ip port instance state
                          ------------------------------------------------------------------------------------------------------------
                          1 master 10.10.10.165 15400 6001 /opt/software/openGauss5.0/install/data/dn P Primary Normal
                          2  slave  10.10.10.166    15400      6002 /opt/software/openGauss5.0/install/data/dn   S Standby Need repair(WAL)

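When the instance status is queried, the standby node shows "Need repair(WAL)".

Run the following command on the standby node to repair it. The output contains some "failed" messages; these can be ignored as long as the last few lines report success: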

                            [omm@slave ~]$ gs_ctl build -D /opt/software/openGauss5.0/install/data/dn
                            [2024-11-08 13:00:25.623][12030][][gs_ctl]: gs_ctl incremental build ,datadir is /opt/software/openGauss5.0/install/data/dn
                            [2024-11-08 13:00:25.623][12030][][gs_ctl]: fopen build pid file "/opt/software/openGauss5.0/install/data/dn/gs_build.pid" success
                            [2024-11-08 13:00:25.623][12030][][gs_ctl]: fprintf build pid file "/opt/software/openGauss5.0/install/data/dn/gs_build.pid" success
                            [2024-11-08 13:00:25.624][12030][][gs_ctl]: fsync build pid file "/opt/software/openGauss5.0/install/data/dn/gs_build.pid" success
                            waiting for server to shut down..... done
                            server stopped
                            [2024-11-08 13:00:27.662][12030][dn_6001_6002][gs_ctl]: Get repl_auth_mode is and repl_uuid is
                            [2024-11-08 13:00:27.669][12030][dn_6001_6002][gs_ctl]: build try host(10.10.10.165) port(15401) success
                            [2024-11-08 13:00:27.670][12030][dn_6001_6002][gs_rewind]: set gaussdb state file when incremental build:db state(BUILDING_STATE), server mode(STANDBY_MODE), build mode(INC_BUILD).
                            [2024-11-08 13:00:27.693][12030][dn_6001_6002][gs_rewind]: connected to server: host=10.10.10.165 port=15401 dbname=postgres application_name=gs_rewind connect_timeout=5 rw_timeout=600
                            [2024-11-08 13:00:27.715][12030][dn_6001_6002][gs_rewind]: connect to primary success
                            [2024-11-08 13:00:27.719][12030][dn_6001_6002][gs_rewind]: get pg_control success
                            [2024-11-08 13:00:27.719][12030][dn_6001_6002][gs_rewind]: target server was interrupted in mode 2.
                            [2024-11-08 13:00:27.719][12030][dn_6001_6002][gs_rewind]: sanityChecks success
                            [2024-11-08 13:00:27.719][12030][dn_6001_6002][gs_rewind]: find last checkpoint at 0/4B32AC0 and checkpoint redo at 0/4B32A40 from source control file
                            [2024-11-08 13:00:27.719][12030][dn_6001_6002][gs_rewind]: find last checkpoint at 0/4B32240 and checkpoint redo at 0/4B32240 from target control file
                            [2024-11-08 13:00:27.744][12030][dn_6001_6002][gs_rewind]: find max lsn success, find max lsn rec (0/4B32240) success.
                            [2024-11-08 13:00:27.744][12030][dn_6001_6002][gs_rewind]: Get repl_auth_mode is and repl_uuid is
                            [2024-11-08 13:00:27.749][12030][dn_6001_6002][gs_rewind]: build try host(10.10.10.165) port(15401) success
                            [2024-11-08 13:00:27.750][12030][dn_6001_6002][gs_rewind]: request lsn is 0/4B32240 and its crc(source, target):[406373680, 3196163370]
                            [2024-11-08 13:00:27.755][12030][dn_6001_6002][gs_rewind]: build try host(10.10.10.165) port(15401) success
                            [2024-11-08 13:00:27.756][12030][dn_6001_6002][gs_rewind]: request lsn is 0/4B321A0 and its crc(source, target):[1420499081, 1310807395]
                            [2024-11-08 13:00:27.761][12030][dn_6001_6002][gs_rewind]: build try host(10.10.10.165) port(15401) success
                            [2024-11-08 13:00:27.761][12030][dn_6001_6002][gs_rewind]: request lsn is 0/4B32080 and its crc(source, target):[428762520, 2353216142]
                            [2024-11-08 13:00:27.767][12030][dn_6001_6002][gs_rewind]: build try host(10.10.10.165) port(15401) success
                            [2024-11-08 13:00:27.767][12030][dn_6001_6002][gs_rewind]: request lsn is 0/4B31F48 and its crc(source, target):[1942017569, 1250331274]
                            [2024-11-08 13:00:27.773][12030][dn_6001_6002][gs_rewind]: build try host(10.10.10.165) port(15401) success
                            [2024-11-08 13:00:27.773][12030][dn_6001_6002][gs_rewind]: request lsn is 0/4B31E28 and its crc(source, target):[603467580, 3056955434]
                            [2024-11-08 13:00:27.779][12030][dn_6001_6002][gs_rewind]: build try host(10.10.10.165) port(15401) success
                            [2024-11-08 13:00:27.780][12030][dn_6001_6002][gs_rewind]: request lsn is 0/4B31D08 and its crc(source, target):[2885812938, 2885812938]
                            [2024-11-08 13:00:27.780][12030][dn_6001_6002][gs_rewind]: find common checkpoint 0/4B31D08
                            [2024-11-08 13:00:27.780][12030][dn_6001_6002][gs_rewind]: find diverge point success
                            [2024-11-08 13:00:27.780][12030][dn_6001_6002][gs_rewind]: read checkpoint redo (0/4B31D08) success before rewinding.
                            [2024-11-08 13:00:27.780][12030][dn_6001_6002][gs_rewind]: rewinding from checkpoint redo point at 0/4B31D08 on timeline 1
                            [2024-11-08 13:00:27.780][12030][dn_6001_6002][gs_rewind]: diverge xlogfile is 000000010000000000000004, older ones will not be copied or removed.
                            [2024-11-08 13:00:27.783][12030][dn_6001_6002][gs_rewind]: targetFileStatThread success pid 140323422025472.
                            [2024-11-08 13:00:27.783][12030][dn_6001_6002][gs_rewind]: reading source file list
                            [2024-11-08 13:00:27.783][12030][dn_6001_6002][gs_rewind]: traverse_datadir start.
                            [2024-11-08 13:00:27.783][12030][dn_6001_6002][gs_rewind]: Get log directory guc is /var/log/omm/omm/pg_log/dn_6002
                            [2024-11-08 13:00:27.807][12030][dn_6001_6002][gs_rewind]: filemap_list_to_array start.
                            [2024-11-08 13:00:27.808][12030][dn_6001_6002][gs_rewind]: filemap_list_to_array end sort start. length is 2018
                            [2024-11-08 13:00:27.808][12030][dn_6001_6002][gs_rewind]: sort end.
                            [2024-11-08 13:00:27.816][12030][dn_6001_6002][gs_rewind]: targetFileStatThread return success.
                            [2024-11-08 13:00:27.827][12030][dn_6001_6002][gs_rewind]: reading target file list
                            [2024-11-08 13:00:27.832][12030][dn_6001_6002][gs_rewind]: traverse target datadir success
                            [2024-11-08 13:00:27.832][12030][dn_6001_6002][gs_rewind]: reading WAL in target
                            [2024-11-08 13:00:27.832][12030][dn_6001_6002][gs_rewind]: could not read WAL record at 0/4B322E0: invalid record length at 0/4B322E0: wanted 32, got 0
                            [2024-11-08 13:00:27.833][12030][dn_6001_6002][gs_rewind]: calculate totals rewind success
                            [2024-11-08 13:00:27.833][12030][dn_6001_6002][gs_rewind]: need to copy 60MB (total source directory size is 109MB)
                            [2024-11-08 13:00:27.833][12030][dn_6001_6002][gs_rewind]: starting background WAL receiver
                            [2024-11-08 13:00:27.833][12030][dn_6001_6002][gs_rewind]: Starting copy xlog, start point: 0/4B31D08
                            [2024-11-08 13:00:27.833][12030][dn_6001_6002][gs_rewind]: in gs_rewind proecess,so no need remove.
                            [2024-11-08 13:00:27.840][12030][dn_6001_6002][gs_rewind]: build try host(10.10.10.165) port(15401) success
                            [2024-11-08 13:00:27.841][12030][dn_6001_6002][gs_rewind]: check identify system success
                            [2024-11-08 13:00:27.841][12030][dn_6001_6002][gs_rewind]: send START_REPLICATION 0/4000000 success
                            Begin fetching files
                            Progress: [==================================================] 100% (61721/61721KB). fetch files
                            Finish fetching files
                            [2024-11-08 13:00:28.301][12030][dn_6001_6002][gs_rewind]: execute file map success
                            [2024-11-08 13:00:28.313][12030][dn_6001_6002][gs_rewind]: find minRecoveryPoint success from xlog insert location 0/4B37148
                            [2024-11-08 13:00:28.313][12030][dn_6001_6002][gs_rewind]: update pg_control file success, minRecoveryPoint: 0/4B37148, ckpLoc:0/4B32AC0, ckpRedo:0/4B32A40, preCkp:0/4B329A0
                            [2024-11-08 13:00:28.316][12030][dn_6001_6002][gs_rewind]: update pg_dw file success
                            [2024-11-08 13:00:28.317][12030][dn_6001_6002][gs_rewind]: xlog end point: 0/4B37148
                            [2024-11-08 13:00:28.317][12030][dn_6001_6002][gs_rewind]: waiting for background process to finish streaming...
                            [2024-11-08 13:00:32.903][12030][dn_6001_6002][gs_rewind]: truncating and removing old xlog files
                            [2024-11-08 13:00:32.911][12030][dn_6001_6002][gs_rewind]: truncate and remove old xlog files success
                            [2024-11-08 13:00:32.911][12030][dn_6001_6002][gs_rewind]: creating backup label and updating control file
                            [2024-11-08 13:00:32.911][12030][dn_6001_6002][gs_rewind]: create backup label success
                            [2024-11-08 13:00:32.911][12030][dn_6001_6002][gs_rewind]: read checkpoint redo (0/4B31D08) success.
                            [2024-11-08 13:00:32.911][12030][dn_6001_6002][gs_rewind]: read checkpoint rec (0/4B31D08) success.
                            [2024-11-08 13:00:32.912][12030][dn_6001_6002][gs_rewind]: dn incremental build completed.
                            [2024-11-08 13:00:32.917][12030][dn_6001_6002][gs_rewind]: build try host(10.10.10.165) port(15401) success
                            [2024-11-08 13:00:32.917][12030][dn_6001_6002][gs_rewind]: fetching MOT checkpoint
                            [2024-11-08 13:00:32.936][12030][dn_6001_6002][gs_ctl]: waiting for server to start...
                            .0 LOG: [Alarm Module]can not read GAUSS_WARNING_TYPE env.


                            0 LOG: [Alarm Module]Host Name: slave


                            0 LOG: [Alarm Module]Host IP: slave. Copy hostname directly in case of taking 10s to use 'gethostbyname' when /etc/hosts does not contain <HOST IP>


                            0 LOG: [Alarm Module]Cluster Name: dbCluster


                            0 LOG: [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 58


                            0 WARNING: failed to open feature control file, please check whether it exists: FileName=gaussdb.version, Errno=2, Errmessage=No such file or directory.
                            0 WARNING: failed to parse feature control file: gaussdb.version.
                            0 WARNING: Failed to load the product control file, so gaussdb cannot distinguish product version.
                            The core dump path is an invalid directory
                            2024-11-08 13:00:32.973 672d9af0.1 [unknown] 140367339682176 [unknown] 0 dn_6001_6002 00000 0 [BACKEND] LOG: base_page_saved_interval is 400, ori is 400.
                            2024-11-08 13:00:32.977 672d9af0.1 [unknown] 140367339682176 [unknown] 0 dn_6001_6002 DB010 0 [REDO] LOG: Recovery parallelism, cpu count = 2, max = 4, actual = 2
                            2024-11-08 13:00:32.977 672d9af0.1 [unknown] 140367339682176 [unknown] 0 dn_6001_6002 DB010 0 [REDO] LOG: ConfigRecoveryParallelism, true_max_recovery_parallelism:4, max_recovery_parallelism:4
                            2024-11-08 13:00:32.981 672d9af0.1 [unknown] 140367339682176 [unknown] 0 dn_6001_6002 00000 0 [BACKEND] LOG: [Alarm Module]can not read GAUSS_WARNING_TYPE env.


                            2024-11-08 13:00:32.981 672d9af0.1 [unknown] 140367339682176 [unknown] 0 dn_6001_6002 00000 0 [BACKEND] LOG: [Alarm Module]Host Name: slave


                            2024-11-08 13:00:32.981 672d9af0.1 [unknown] 140367339682176 [unknown] 0 dn_6001_6002 00000 0 [BACKEND] LOG: [Alarm Module]Host IP: slave. Copy hostname directly in case of taking 10s to use 'gethostbyname' when /etc/hosts does not contain <HOST IP>


                            2024-11-08 13:00:32.981 672d9af0.1 [unknown] 140367339682176 [unknown] 0 dn_6001_6002 00000 0 [BACKEND] LOG: [Alarm Module]Cluster Name: dbCluster


                            2024-11-08 13:00:32.981 672d9af0.1 [unknown] 140367339682176 [unknown] 0 dn_6001_6002 00000 0 [BACKEND] LOG: [Alarm Module]Invalid data in AlarmItem file! Read alarm English name failed! line: 58


                            2024-11-08 13:00:32.982 672d9af0.1 [unknown] 140367339682176 [unknown] 0 dn_6001_6002 00000 0 [BACKEND] LOG: loaded library "security_plugin"
                            2024-11-08 13:00:32.983 672d9af0.1 [unknown] 140367339682176 [unknown] 0 dn_6001_6002 01000 0 [BACKEND] WARNING: could not create any HA TCP/IP sockets
                            2024-11-08 13:00:32.984 672d9af0.1 [unknown] 140367339682176 [unknown] 0 dn_6001_6002 00000 0 [BACKEND] LOG: InitNuma numaNodeNum: 1 numa_distribute_mode: none inheritThreadPool: 0.
                            2024-11-08 13:00:32.985 672d9af0.1 [unknown] 140367339682176 [unknown] 0 dn_6001_6002 01000 0 [BACKEND] WARNING: Failed to initialize the memory protect for g_instance.attr.attr_storage.cstore_buffers (1024 Mbytes) or shared memory (3696 Mbytes) is larger.
                            2024-11-08 13:00:33.030 672d9af0.1 [unknown] 140367339682176 [unknown] 0 dn_6001_6002 00000 0 [CACHE] LOG: set data cache size(805306368)
                            2024-11-08 13:00:33.106 672d9af0.1 [unknown] 140367339682176 [unknown] 0 dn_6001_6002 00000 0 [SEGMENT_PAGE] LOG: Segment-page constants: DF_MAP_SIZE: 8156, DF_MAP_BIT_CNT: 65248, DF_MAP_GROUP_EXTENTS: 4175872, IPBLOCK_SIZE: 8168, EXTENTS_PER_IPBLOCK: 1021, IPBLOCK_GROUP_SIZE: 4090, BMT_HEADER_LEVEL0_TOTAL_PAGES: 8323072, BktMapEntryNumberPerBlock: 2038, BktMapBlockNumber: 25, BktBitMaxMapCnt: 512
                            2024-11-08 13:00:33.130 672d9af0.1 [unknown] 140367339682176 [unknown] 0 dn_6001_6002 00000 0 [BACKEND] LOG: gaussdb: fsync file "/opt/software/openGauss5.0/install/data/dn/gaussdb.state.temp" success
                            2024-11-08 13:00:33.131 672d9af0.1 [unknown] 140367339682176 [unknown] 0 dn_6001_6002 00000 0 [BACKEND] LOG: create gaussdb state file success: db state(STARTING_STATE), server mode(Standby), connection index(1)
                            2024-11-08 13:00:33.159 672d9af0.1 [unknown] 140367339682176 [unknown] 0 dn_6001_6002 00000 0 [BACKEND] LOG: max_safe_fds = 972, usable_fds = 1000, already_open = 18
                            The core dump path is an invalid directory
                            .
                            [2024-11-08 13:00:35.020][12030][dn_6001_6002][gs_ctl]: done
                            [2024-11-08 13:00:35.020][12030][dn_6001_6002][gs_ctl]: server started (/opt/software/openGauss5.0/install/data/dn)
                            [2024-11-08 13:00:35.023][12030][dn_6001_6002][gs_ctl]: fopen build pid file "/opt/software/openGauss5.0/install/data/dn/gs_build.pid" success
                            [2024-11-08 13:00:35.023][12030][dn_6001_6002][gs_ctl]: fprintf build pid file "/opt/software/openGauss5.0/install/data/dn/gs_build.pid" success
                            [2024-11-08 13:00:35.032][12030][dn_6001_6002][gs_ctl]: fsync build pid file "/opt/software/openGauss5.0/install/data/dn/gs_build.pid" success
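The recovery steps in this walkthrough depend on what the state column shows for the node that should end up as standby. The sketch below summarizes that mapping as a lookup. `recovery_hint` is a hypothetical helper, `<datadir>` is a placeholder for your data directory, and the mapping covers only the states that appear in this article.

```shell
# recovery_hint <state text>
# Map the role/state text of the intended standby's row to the step used in
# this walkthrough. Anything else is left for manual inspection.
recovery_hint() {
    case "$1" in
        *"Need repair(WAL)"*)
            echo "gs_ctl build -D <datadir>" ;;
        *"Primary Normal"*)
            # Dual primary: this node must be restarted in standby mode.
            echo "gs_ctl stop -D <datadir>, then gs_ctl start -M standby -D <datadir>" ;;
        *Down*)
            echo "gs_ctl start -M standby -D <datadir>" ;;
        *"Standby Normal"*)
            echo "nothing to do" ;;
        *)
            echo "unrecognized state: inspect manually" ;;
    esac
}
```

For example, `recovery_hint "Standby Need repair(WAL)"` prints the build command. A state such as Need repair(Disconnected) deliberately falls through to manual inspection, because in this article it appeared while the primary was powered off and a rebuild had nothing to connect to.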

After the repair completes, check the instance status:

                              [omm@master ~]$ gs_om -t status --detail
                              [ Cluster State ]


                              cluster_state : Normal
                              redistributing : No
                              current_az : AZ_ALL


                              [ Datanode State ]


                              node node_ip port instance state
                              ------------------------------------------------------------------------------------------------------------
                              1 master 10.10.10.165 15400 6001 /opt/software/openGauss5.0/install/data/dn P Primary Normal
                              2  slave  10.10.10.166    15400      6002 /opt/software/openGauss5.0/install/data/dn   S Standby Normal

The instance status of the database is now normal.



This article is reposted from openGauss.
