
Oracle RAC Operations and Maintenance: Deleting and Adding a Node

DBA随笔记 2024-08-14

STEP1: Delete the instance

    [grid@p19c01:/home/grid]$ olsnodes -s -t
    p19c01 Active Unpinned
    p19c02 Active Unpinned
    [grid@p19c01:/home/grid]$ srvctl config database -d p19c0
    Database unique name: p19c0
    Database name: p19c0
    Oracle home: /u01/app/oracle/product/19.3.0/db
    Oracle user: oracle
    Spfile: +DATA/P19C0/PARAMETERFILE/spfile.267.1101425147
    Password file: +DATA/P19C0/PASSWORD/pwdp19c0.256.1101422929
    Domain:
    Start options: open
    Stop options: immediate
    Database role: PRIMARY
    Management policy: AUTOMATIC
    Server pools:
    Disk Groups: DATA
    Mount point paths:
    Services:
    Type: RAC
    Start concurrency:
    Stop concurrency:
    OSDBA group: dba
    OSOPER group: oper
    Database instances: p19c01,p19c02
    Configured nodes: p19c01,p19c02
    CSS critical: no
    CPU count: 0
    Memory target: 0
    Maximum memory: 0
    Default network number for database services:
    Database is administrator managed


    View the existing OCR backups
    [root@p19c01:/root]$ ocrconfig -showbackup
    Take a manual OCR backup
    [root@p19c01:/root]$ ocrconfig -manualbackup
    p19c01 2022/04/09 21:42:52 +OCR:/p19c-cluster/OCRBACKUP/backup_20220409_214252.ocr.267.1101591773 3331580692
    p19c02 2022/04/08 17:14:23 +OCR:/p19c-cluster/OCRBACKUP/backup_20220408_171423.ocr.263.1101489263 3331580692


    1.1 Stop the instance (from any node)
    [root@p19c01:/root]$ srvctl stop instance -d p19c0 -n p19c02


    1.2 As the oracle user on a remaining node, run dbca in silent mode to delete the DB instance of the node being removed
    dbca -silent -deleteInstance -nodeList p19c02 -gdbName p19c0 -instanceName p19c02 -sysDBAUserName sys -sysDBAPassword oracle


    [root@p19c01:/root]$ su - oracle
    Last login: Sat Apr 9 21:49:45 CST 2022
    [oracle@p19c01:/home/oracle]$ dbca -silent -deleteInstance -nodeList p19c02 -gdbName p19c0 -instanceName p19c02 -sysDBAUserName sys -sysDBAPassword oracle
    [WARNING] [DBT-19203] The Database Configuration Assistant will delete the Oracle instance and its associated OFA directory structure. All information about this instance will be deleted.


    Prepare for db operation
    40% complete
    Deleting instance
    48% complete
    52% complete
    56% complete
    60% complete
    64% complete
    68% complete
    72% complete
    76% complete
    80% complete
    Completing instance management.
    100% complete
    Instance "p19c02" deleted successfully from node "p19c02".
    Look at the log file "/u01/app/oracle/cfgtoollogs/dbca/p19c0/p19c01.log" for further details.




    [oracle@p19c01:/home/oracle]$ srvctl config database -d p19c0
    Database unique name: p19c0
    Database name: p19c0
    Oracle home: /u01/app/oracle/product/19.3.0/db
    Oracle user: oracle
    Spfile: +DATA/P19C0/PARAMETERFILE/spfile.267.1101425147
    Password file: +DATA/P19C0/PASSWORD/pwdp19c0.256.1101422929
    Domain:
    Start options: open
    Stop options: immediate
    Database role: PRIMARY
    Management policy: AUTOMATIC
    Server pools:
    Disk Groups: DATA
    Mount point paths:
    Services:
    Type: RAC
    Start concurrency:
    Stop concurrency:
    OSDBA group: dba
    OSOPER group: oper
    Database instances: p19c01
    Configured nodes: p19c01
    CSS critical: no
    CPU count: 0
    Memory target: 0
    Maximum memory: 0
    Default network number for database services:
    Database is administrator managed
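
The change in the `Database instances` line between the two `srvctl config database` listings above can also be checked by script. The following is a minimal sketch, assuming the output format shown in this article; the function name `instance_is_configured` is hypothetical and not part of any Oracle tool.

```shell
#!/bin/sh
# instance_is_configured INSTANCE
# Reads `srvctl config database -d <db>` output on stdin and returns 0 if
# INSTANCE appears in the "Database instances:" line, 1 otherwise.
# Hypothetical helper; it relies on the output format shown in this article.
instance_is_configured() {
    awk -v inst="$1" '
        /^[ \t]*Database instances:/ {
            sub(/^[^:]*:[ \t]*/, "")        # drop the "Database instances:" label
            n = split($0, names, ",")       # the instance list is comma separated
            for (i = 1; i <= n; i++) {
                gsub(/[ \t\r]/, "", names[i])
                if (names[i] == inst) found = 1
            }
        }
        END { exit(found ? 0 : 1) }
    '
}

# Usage on a remaining node (not executed here):
#   srvctl config database -d p19c0 | instance_is_configured p19c02 \
#       && echo "p19c02 is still configured" \
#       || echo "p19c02 is no longer configured"
```

The function only parses text, so it can be tried against saved output before it is pointed at a live cluster.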

    STEP2: Remove the database software

      2.1 Disable and stop the listener on the node being deleted
      [grid]$ srvctl disable listener -listener LISTENER -node p19c02
      [grid]$ srvctl stop listener -listener LISTENER -node p19c02


      2.2 Update the inventory (run on the node being deleted)
      [oracle@p19c02 bin]$ cd $ORACLE_HOME/oui/bin
      [oracle@p19c02 bin]$ ./runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES=p19c02" -local
      Here p19c02 is the node being deleted.


      2.3 Deinstall the Oracle home (run on the node being deleted); this removes the Oracle Database software
      [oracle@p19c02 db_home]$ $ORACLE_HOME/deinstall/deinstall -local




      2.4 Update the inventory (run on the remaining node)
      [oracle@p19c01 bin]$ cd $ORACLE_HOME/oui/bin
      [oracle@p19c01 bin]$ ./runInstaller -updateNodeList ORACLE_HOME=$ORACLE_HOME "CLUSTER_NODES=p19c01" -local
      CLUSTER_NODES lists the nodes that remain in the cluster.

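      STEP3: Remove the node from the clusterware

        Unless a step says otherwise, perform the following steps on the node being deleted,
        logged in to the operating system as the grid user.


        3.1 Check the node status (the sample output below uses the generic host names rac1 and rac2 in place of p19c01 and p19c02)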
        [grid@rac2 bin]$ olsnodes -s -t
        rac1 Active Unpinned
        rac2 Inactive Unpinned


        If the node is pinned, unpin it with the following command
        [root@p19c02 ~]# crsctl unpin css -n p19c02
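
Whether the unpin step is needed can be read from the third column of the `olsnodes -s -t` output. The sketch below assumes the three-column layout shown above (node name, status, pin state) and assumes that a pinned node is reported as `Pinned`; the function names `node_pin_state` and `needs_unpin` are hypothetical.

```shell
#!/bin/sh
# node_pin_state NODE
# Reads `olsnodes -s -t` output on stdin (columns: name, status, pin state)
# and prints the pin state of NODE. Returns 1 if NODE is not listed.
node_pin_state() {
    awk -v node="$1" '
        $1 == node { print $3; found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# needs_unpin NODE
# Returns 0 only when NODE is listed with the pin state "Pinned" (assumed label).
needs_unpin() {
    [ "$(node_pin_state "$1")" = "Pinned" ]
}

# Usage as root (not executed here):
#   if olsnodes -s -t | needs_unpin p19c02; then
#       crsctl unpin css -n p19c02
#   fi
```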




        3.2 Remove the RAC grid home (run on the node being deleted)
        [grid@p19c02:/u01/app/19.3.0/grid/deinstall]$ cd $ORACLE_HOME/deinstall
        [grid@p19c02:/u01/app/19.3.0/grid/deinstall]$ ./deinstall -local
        deinstall prompts you to run the rootcrs.sh script as the root user
        [root@p19c02:/]$ /u01/app/19.3.0/grid/crs/install/rootcrs.sh -force -deconfig -paramfile "/tmp/deinstall2022-04-09_11-10-16PM/response/deinstall_OraGI19Home1.rsp"


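        3.4 On the remaining node, run the following commands to update the inventory
        cd $GRID_HOME/oui/bin
        ./runInstaller -updateNodeList ORACLE_HOME=$GRID_HOME "CLUSTER_NODES=p19c01" CRS=TRUE -silent
        Here p19c01 is the node that remains.


        Example:
        As the grid user, update the inventory on every remaining node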
        [grid@p19c01:/home/grid]$ cd $ORACLE_HOME/oui/bin
        [grid@p19c01:/home/grid]$ ./runInstaller -updateNodeList ORACLE_HOME=/u01/app/19.3.0/grid "CLUSTER_NODES={p19c01}" CRS=TRUE -silent -local




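        3.5 At this point the directories /u01/app/19.3.0 and /u01/app/grid are still left in place
        On one of the remaining nodes, run the following commands to delete the node from the cluster:
        [root@p19c01 ~]# cd /u01/app/19.3.0/grid/bin/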
        [root@p19c01 bin]# ./crsctl delete node -n p19c02
        CRS-4661: Node p19c02 successfully deleted.


        [root@p19c01 bin]# ./olsnodes -s -t
        p19c01 Active Unpinned


        3.6 Run the following CVU command to verify that the specified node has been removed from the cluster:
        $ cluvfy stage -post nodedel -n node_list [-verbose]


        [grid@p19c01 bin]$ cluvfy stage -post nodedel -n p19c02 -verbose
        [grid@p19c01 bin]$ olsnodes -s -t




        [grid@p19c01:/u01/app/19.3.0/grid/oui/bin]$ cluvfy stage -post nodedel -n p19c02 -verbose
        This software is "360" days old. It is a best practice to update the CRS home by downloading and applying the latest release update. Refer to MOS note 2731675.1 for more details.


        Verifying Node Removal ...
        Verifying CRS Integrity ...PASSED
        Verifying Clusterware Version Consistency ...PASSED
        Verifying Node Removal ...PASSED


        Post-check for node removal was successful.


        CVU operation performed: stage -post nodedel
        Date: Apr 9, 2022 11:39:15 PM
        CVU home: /u01/app/19.3.0/grid/
        User: grid


        [grid@p19c01:/u01/app/19.3.0/grid/oui/bin]$ olsnodes -s -t
        p19c01 Active Unpinned
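
When the node deletion is scripted, the `cluvfy stage -post nodedel` result shown above can be turned into an exit status. This is a minimal sketch that only inspects the text of the report; it assumes that a failed check is printed as `...FAILED` in the same position where the report above prints `...PASSED`, and the function name `nodedel_postcheck_ok` is hypothetical.

```shell
#!/bin/sh
# nodedel_postcheck_ok
# Reads `cluvfy stage -post nodedel ...` output on stdin. Returns 0 only when
# the final success message is present and no check line reports FAILED.
# The "...FAILED" pattern is an assumption modelled on the "...PASSED" lines.
nodedel_postcheck_ok() {
    awk '
        /\.\.\.[ \t]*FAILED/ { failed = 1 }
        /Post-check for node removal was successful/ { ok = 1 }
        END { exit((ok && !failed) ? 0 : 1) }
    '
}

# Usage as the grid user on a remaining node (not executed here):
#   cluvfy stage -post nodedel -n p19c02 -verbose | nodedel_postcheck_ok \
#       || echo "node removal post-check did not pass" >&2
```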

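        STEP4: Add the node

        4.1  Prepare the environment

        Reinstall the operating system on node 2, configure the Oracle environment, and configure the shared storage.

        4.2 Configure SSH user equivalence

        Configure SSH user equivalence for the grid and oracle users.

          cd $ORACLE_HOME/oui/prov/resources/scripts
          [grid@p19c01:/u01/app/19.3.0/grid/oui/prov/resources/scripts]$ ./sshUserSetup.sh -user grid -hosts "p19c01 p19c02" -advanced -noPromptPassphrase
          [grid@p19c01:/u01/app/19.3.0/grid/oui/prov/resources/scripts]$ ./sshUserSetup.sh -user oracle -hosts "p19c01 p19c02" -advanced -noPromptPassphrase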
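
After `sshUserSetup.sh` has run, passwordless login can be confirmed for each user before moving on to the CVU checks. The following is a minimal sketch that uses only standard OpenSSH options; the function name `check_ssh_equivalence` and the `SSH_CMD` override are hypothetical and exist only so the function can be exercised without a real cluster.

```shell
#!/bin/sh
# check_ssh_equivalence HOST...
# Tries a non-interactive login to each host and prints OK or FAIL per host.
# BatchMode=yes makes ssh fail instead of prompting for a password, so a
# password prompt counts as "equivalence not configured".
# SSH_CMD is a hypothetical override for the ssh binary (used for testing).
check_ssh_equivalence() {
    rc=0
    for host in "$@"; do
        if "${SSH_CMD:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$host" true \
                </dev/null >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            rc=1
        fi
    done
    return "$rc"
}

# Usage, once as grid and once as oracle, on every node (not executed here):
#   check_ssh_equivalence p19c01 p19c02
```

The return status is non-zero if any host fails, which makes the function usable as a gate in a larger script.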

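          4.3 Use CVU to verify that the node being added meets the requirements

          As the grid user on an existing cluster node, run the following commands to verify that the new node meets the GI software requirements (the pre-installation check for the new node)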

            [grid@p19c01 .ssh]$ cluvfy comp peer -refnode p19c01 -n p19c02 -verbose
            [grid@p19c01 .ssh]$ cluvfy stage -pre nodeadd -n p19c02 -verbose -fixup

            4.4 Add the Clusterware

            Run the following commands to add the Clusterware software to the new node (as the grid user on an existing cluster node)

              >> Run the GI add-node operation from node p19c01
              [grid@p19c01 ~]$ cd /u01/app/19.3.0/grid/addnode/


              If DNS is not configured, the DNS check can be skipped like this
              [grid@p19c01 ~]$ export IGNORE_PREADDNODE_CHECKS=Y
              [grid@p19c01 ~]$ ./addnode.sh -silent -ignorePrereq "CLUSTER_NEW_NODES={p19c02}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={p19c02-vip}" "CLUSTER_NEW_NODE_ROLES={hub}"


              Output:
              Update Inventory in progress.
              You can find the log of this install session at:
              /u01/app/oraInventory/logs/addNodeActions2022-04-10_11-27-03AM.log


              Update Inventory successful.
              .................................................. 97% Done.


              As a root user, execute the following script(s):
              1. /u01/app/oraInventory/orainstRoot.sh
              2. /u01/app/19.3.0/grid/root.sh


              Execute /u01/app/oraInventory/orainstRoot.sh on the following nodes:
              [p19c02]
              Execute /u01/app/19.3.0/grid/root.sh on the following nodes:
              [p19c02]


              The scripts can be executed in parallel on all the nodes.


              Successfully Setup Software with warning(s).
              .................................................. 100% Done.
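
The three `addnode.sh` parameters used above follow one pattern: a brace-enclosed, comma-separated list with one entry per new node. The sketch below builds them for any number of nodes. It assumes, as in this article, that each VIP is named `<node>-vip` and that every node takes the `hub` role; the function names `join_braced` and `addnode_args` are hypothetical.

```shell
#!/bin/sh
# join_braced ITEM...
# Prints the items as a brace-enclosed, comma-separated list: {a,b,c}
join_braced() {
    list=""
    for item in "$@"; do
        list="${list:+$list,}$item"
    done
    printf '{%s}\n' "$list"
}

# addnode_args NODE...
# Prints the three addnode.sh parameters, one per line, for the given nodes.
# Assumptions: each VIP is named <node>-vip and every node has the hub role.
addnode_args() {
    vips=""
    roles=""
    for node in "$@"; do
        vips="${vips:+$vips }${node}-vip"
        roles="${roles:+$roles }hub"
    done
    printf 'CLUSTER_NEW_NODES=%s\n' "$(join_braced "$@")"
    # $vips and $roles are left unquoted on purpose so they split into words
    printf 'CLUSTER_NEW_VIRTUAL_HOSTNAMES=%s\n' "$(join_braced $vips)"
    printf 'CLUSTER_NEW_NODE_ROLES=%s\n' "$(join_braced $roles)"
}

# Example: print the parameters for the single node added in this article
addnode_args p19c02
```

Each printed line is meant to be passed to `addnode.sh` as one double-quoted argument, in the same way as in the command shown in step 4.4.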

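              After the previous step succeeds, run the following two scripts as root on the new node

                # /u01/app/oraInventory/orainstRoot.sh
                # /u01/app/19.3.0/grid/root.sh
                root.sh has run successfully only when you see "'UpdateNodeList' was successful."
                The root.sh script also starts the related services.


                [root@p19c02 ~]# /u01/app/oraInventory/orainstRoot.sh
                [root@p19c02 ~]# /u01/app/19.3.0/grid/root.sh
                  Running root.sh again on top of an earlier attempt produces errors, so before each rerun it is best to deconfigure what the previous attempt left behind
                  [root@p19c02 ~]# /u01/app/19.3.0/grid/crs/install/rootcrs.pl -deconfig -force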

                  4.5 Verification

                    [grid@p19c01 ~]$ crsctl status res -t
                    [grid@p19c01 ~]$ crsctl status res -t -init
                    [grid@p19c01 ~]$ crsctl check cluster -all
                    [grid@p19c01 ~]$ olsnodes -n
                    [grid@p19c01 ~]$ srvctl status asm
                    [grid@p19c01 ~]$ srvctl status listener

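                    4.6 Install the Oracle Database software on the new node

                    Add the Database software to the new node (as the oracle user on an existing cluster node)

                      [oracle]$ cd /u01/app/oracle/product/19.3.0/db/addnode/
                      [oracle@p19c01:/u01/app/oracle/product/19.3.0/db/addnode]$ ./addnode.sh -silent -ignorePrereq "CLUSTER_NEW_NODES={p19c02}"

                      After the previous step completes, run the following script as root on the new node

                        Output: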
                        Setup Oracle Base successful.
                        .................................................. 96% Done.


                        As a root user, execute the following script(s):
                        1. /u01/app/oracle/product/19.3.0/db/root.sh


                        Execute /u01/app/oracle/product/19.3.0/db/root.sh on the following nodes:
                        [p19c02]


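                        On the new node p19c02, run the root script
                        [root@p19c02 ~]# /u01/app/oracle/product/19.3.0/db/root.sh

                        On an existing cluster node or the new node, run the following command as the grid and oracle users to verify that the Clusterware and Database software were added correctly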

                          [grid]$ cluvfy stage -post nodeadd -n p19c02 -verbose

                          4.7 Add the DB instance

                            Log in to p19c01 and run the following query
                            SQL> select instance_name from Gv$instance;


                            INSTANCE_NAME
                            ----------------
                            p19c01
                            Only one instance is present in the whole cluster.

                            Option 1:

                            Use dbca in silent mode to add the database instance for the new node (as the oracle user on an existing cluster node)

                              [oracle@p19c01 ~]$ dbca -silent -addInstance -gdbName "p19c0" -nodeName "p19c02" -instanceName "p19c02" -sysDBAUserName "sys" -sysDBAPassword "oracle"

                              Option 2:

                              Run dbca interactively as the oracle user on an existing node

                              Oracle RAC database instance management --> Add an instance

                              4.8 Check that the cluster and the database are healthy

                                SQL> select instance_number,instance_name,status from gv$instance;
                                SQL> select thread#,status,instance from gv$thread;
                                [grid@p19c01 ~]$ crsctl status res -t
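
The `gv$instance` check above can be reduced to a single number for use in a script. This is a minimal sketch that counts rows whose third column is `OPEN`; it assumes the query output has been stripped of headings and feedback, and the function name `open_instance_count` is hypothetical.

```shell
#!/bin/sh
# open_instance_count
# Reads rows of "instance_number instance_name status" on stdin and prints
# how many of them have the status OPEN. Rows with any other status, blank
# lines, and rows with fewer than three columns are not counted.
open_instance_count() {
    awk '$3 == "OPEN" { n++ } END { print n + 0 }'
}

# Usage as the oracle user (not executed here); the SQL*Plus settings remove
# headings and the "rows selected" feedback so that only data rows remain:
#   sqlplus -s / as sysdba <<'EOF' | open_instance_count
#   set heading off feedback off pagesize 0
#   select instance_number, instance_name, status from gv$instance;
#   EOF
```

After the instance has been added, a result of 2 would match the two-node cluster described in this article.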




                                This article is reposted from DBA随笔记.
