
Huawei GaussDB T: Replacing a Host Without Changing Its IP

墨天轮 2019-09-28

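Replacing a Host Without Changing Its IP

When a host in the cluster fails and must be replaced with a new one, the replacement can be done directly with commands, provided the new host keeps the same IP address and host name as the failed host.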

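Prerequisites

  • Prerequisites for replacing a host in distributed deployment mode:
    • The faulty node must not host a CN instance.
    • The cluster status must be queryable, the primary CM Server must be normal, and the cluster status must not be Unavailable.
    • The cluster must not be locked before the replacement.
    • The node replacement must be run from a healthy host.
    • Before the node replacement, the preinstallation must have completed successfully against the current cluster configuration file.
    • The IP address, host name, and other information of the new physical node must match those of the damaged physical node.
    • The time on the new physical node must be consistent with the other cluster nodes.
    • No DDL or DML operations may run on the cluster while instances are being replaced.
    • The system time of all nodes must be synchronized.
  • Prerequisites for replacing a host in primary/standby deployment mode:
    • The cluster status must be queryable, and the primary CM Server must be normal.
    • For each group that contains a DN on the faulty node: in an HA cluster, the primary DN must be normal; in a Z-Paxos cluster, the primary DN must be normal and the number of normal DNs in each group must be at least half of the group's total (excluding passive DNs), rounded down, plus 1.
    • Before the replacement, the groups that contain DNs on the faulty node must not be locked.
    • The instance replacement must be run from a healthy host.
    • The IP address, host name, and other information of the new physical node must match those of the damaged physical node.
    • The time on the new physical node must be consistent with the other cluster nodes.
    • Before the node replacement, the preinstallation must have completed successfully against the current cluster configuration file.
    • No DDL or DML operations may run on the groups that contain DNs on the faulty node while the node is being replaced.
    • The system time of all nodes must be synchronized.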
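
The Z-Paxos condition above is simple integer arithmetic, and it can help to compute the threshold before starting. The following is a minimal sketch, not part of the GaussDB tooling: the function names min_healthy_dn and group_has_quorum are made up for illustration, and the DN counts are assumed to be read by the operator from the cluster status output.

```shell
# Minimum number of normal DNs a Z-Paxos group needs, per the rule above:
# floor(total / 2) + 1, where total excludes passive DNs.
min_healthy_dn() {
    # $1: number of non-passive DNs in the group
    echo $(( $1 / 2 + 1 ))
}

# Exit status 0 when the group has enough normal DNs, 1 otherwise.
group_has_quorum() {
    # $1: number of non-passive DNs in the group
    # $2: number of DNs currently in a normal state
    [ "$2" -ge "$(min_healthy_dn "$1")" ]
}
```

For example, a group with 3 non-passive DNs needs at least 2 normal DNs, and a group with 5 needs at least 3.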

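Notes

  • Do not run DML operations (inserts, deletes, updates) or DDL operations during the replacement; otherwise data may be lost and become inconsistent.
  • A replacement can start only after the previous one has finished, so do not run replacements from several hosts at the same time.
  • The IP address, host name, and other information of the new host must match those of the damaged host.
  • Before the replacement, run the preinstallation script to prepare the environment. Before running gs_preinstall, it is advisable to clear the contents of /tmp/username, the binary installation directory, the instance data directory, and /var/log/gaussdb/username on the new host.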

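Procedure

The following uses plat3 as the name of the host being replaced.

  • Log in to the new host as user omm.
  • Check the character set.

    echo $LANG
    en_US.UTF-8

    The character set, encoding, and related settings on the new host must match those on the other hosts.

    If they differ, set the character set as follows.

    export LANG="en_US.UTF-8"

  • Log in as root to any healthy host in the cluster.
  • Prepare the cluster environment.

    plat1:/opt/software/gaussdb/script # ./gs_preinstall -U omm -G dbgrp -X /opt/software/gaussdb/clusterconfig.xml --alarm-type=1

    Here omm is the database administrator (and also the OS user that runs the cluster), dbgrp is the group of the OS user that runs the cluster, and /opt/software/gaussdb/clusterconfig.xml is the path of the cluster configuration file.

    The initial password of the local user SYS is Changeme_123. The default password that the database administrator omm uses to log in to the database cluster is gaussdb_123. For security, change both the initial SYS password and the default omm password as soon as possible after the cluster is installed and you log in for the first time.

    To log in to a CN without a password, use "zsql / as clsmgr -D cn_data_dir". To log in to a DN without a password, use "zsql / as sysdba -D dn_data_dir".

  • Switch to the OS user that runs GaussDB 100.

    plat1:~ # su - omm

  • Run the following command to install the new host.

    gs_replace -t install -h plat3

  • Run the following command to configure the new host.

    gs_replace -t config -h plat3

  • Run the following command to start the new host.

    gs_replace -t start -h plat3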
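
The three gs_replace commands above must run in order, and a later phase should not start if an earlier one did not succeed. The following is a minimal sketch of a wrapper that enforces this. It assumes gs_replace returns a non-zero exit status when a phase fails, which the source does not state; the names replace_host and GS_REPLACE are illustrative and not part of the GaussDB tooling.

```shell
# Command used to run each phase. Override it for a dry run,
# e.g. GS_REPLACE="echo gs_replace" prints the commands instead of running them.
GS_REPLACE="${GS_REPLACE:-gs_replace}"

# Run the install, config, and start phases in order for one host,
# stopping at the first phase that fails.
replace_host() {
    # $1: name of the host being replaced, e.g. plat3
    for phase in install config start; do
        echo "== phase: $phase =="
        # Assumption: the tool exits non-zero when a phase fails.
        $GS_REPLACE -t "$phase" -h "$1" || {
            echo "phase '$phase' failed; stopping" >&2
            return 1
        }
    done
}

# Usage, as the cluster OS user on a healthy host:
#   replace_host plat3
```

Setting GS_REPLACE="echo gs_replace" before calling replace_host gives a dry run that only prints the three commands, which is a cheap way to check the host name before touching the cluster.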

Example (replacing a host without changing its IP)

  • Prepare the cluster environment.

    plat1:/opt/software/gaussdb/script # ./gs_preinstall -U omm -G dbgrp -X /opt/software/gaussdb/clusterconfig.xml
    Parsing the configuration file.
    Successfully parsed the configuration file.
    Installing the tools on the local node.
    Successfully installed the tools on the local node.
    Are you sure you want to create trust for root (yes/no)? yes
    Please enter password for root.
    Password:
    Creating SSH trust for the root permission user.
    Checking network information.
    All nodes in the network are Normal.
    Successfully checked network information.
    Creating SSH trust.
    Creating the local key file.
    Successfully created the local key files.
    Appending local ID to authorized_keys.
    Successfully appended local ID to authorized_keys.
    Updating the known_hosts file.
    Successfully updated the known_hosts file.
    Appending authorized_key on the remote node.
    Successfully appended authorized_key on all remote node.
    Checking common authentication file content.
    Successfully checked common authentication content.
    Distributing SSH trust file to all node.
    Successfully distributed SSH trust file to all node.
    Verifying SSH trust on all hosts.
    Successfully verified SSH trust on all hosts.
    Successfully created SSH trust.
    Successfully created SSH trust for the root permission user.
    Pass over configuring LVM
    Distributing package.
    Successfully distributed package.
    Are you sure you want to create the user[omm] and create trust for it (yes/no)? yes
    Please enter password for cluster user.
    Password:
    Please enter password for cluster user again.
    Password:
    Creating [omm] user on all nodes.
    Successfully created [omm] user on all nodes.
    Installing the tools in the cluster.
    Successfully installed the tools in the cluster.
    Checking hostname mapping.
    Successfully checked hostname mapping.
    Creating SSH trust for [omm] user.
    Please enter password for current user[omm].
    Password:
    Checking network information.
    All nodes in the network are Normal.
    Successfully checked network information.
    Creating SSH trust.
    Creating the local key file.
    Successfully created the local key files.
    Appending local ID to authorized_keys.
    Successfully appended local ID to authorized_keys.
    Updating the known_hosts file.
    Successfully updated the known_hosts file.
    Appending authorized_key on the remote node.
    Successfully appended authorized_key on all remote node.
    Checking common authentication file content.
    Successfully checked common authentication content.
    Distributing SSH trust file to all node.
    Successfully distributed SSH trust file to all node.
    Verifying SSH trust on all hosts.
    Successfully verified SSH trust on all hosts.
    Successfully created SSH trust.
    Successfully created SSH trust for [omm] user.
    Checking OS version.
    Successfully checked OS version.
    Creating cluster's path.
    Successfully created cluster's path.
    Setting SCTP service.
    Successfully set SCTP service.
    Set and check OS parameter.
    Setting OS parameters.
    Successfully set OS parameters.
    Set and check OS parameter completed.
    Preparing CRON service.
    Successfully prepared CRON service.
    Preparing SSH service.
    Successfully prepared SSH service.
    Setting user environmental variables.
    Successfully set user environmental variables.
    Configuring alarms on the cluster nodes.
    Successfully configured alarms on the cluster nodes.
    Setting the dynamic link library.
    Successfully set the dynamic link library.
    Fixing server package owner.
    Successfully fixed server package owner.
    Create logrotate service.
    Successfully create logrotate service.
    Setting finish flag.
    Successfully set finish flag.
    Preinstallation succeeded.

  • Install the new host.

    omm@plat1:/opt/software/gaussdb/script> gs_replace -t install -h plat3
    Distributing configuration to remote host.
    Successfully distributed configuration to remote host.
    Installing.
    Checking installation environment on nodes.
    Successfully checking installation environment on nodes.
    Installing applications on nodes.
    Installation is completed.
    ==============================
    Time statistics:
    Install replacement nodes: 9s
    total: 9s

  • Configure the new host.

    omm@plat1:/opt/software/gaussdb/script> gs_replace -t config -h plat3
    Check cluster status for replace.
    Check the status of ETCD cluster.
    Successfully check the status of ETCD cluster.
    Successfully check cluster status.
    Filter out all valid hosts for replacing.
    Distributing configuration to remote host.
    Successfully distributed configuration to remote host.
    Configuring
    Stopping replace instances.
    Successfully stopped replace instances.
    Waiting for upgrading standby instances.
    Successfully upgraded standby instances.
    Configuring replacement instances.
    Delete broken instances for primary instances' raft.
    Successfully delete broken instances for primary instances' raft.
    Config replace instances.
    Successfully config replace instances.
    Add cover instances for primary instances' raft.
    Successfully add broken instances for primary instances' raft.
    Config standby DN of new instances.
    ..............................
    ..............................
    ..............................
    ..............................
    ..............................
    ..............................
    .............................
    Config standby DN of new instances successfully.
    Successfully configured replacement instances.
    Configuration succeeded.
    ==============================
    Time statistics:
    Config replacement nodes: 268s
    total: 268s

  • Start the new host.

    omm@plat1:/opt/software/gaussdb/script> gs_replace -t start -h plat3
    Starting.
    ==============================
    ..
    Start cm agent on new nodes.
    Successfully start cm agent for new nodes.
    Starting the cluster.
    ......
    Successfully started instance process.
    Waiting to become Normal.
    ==============================
    Successfully started cluster.
    ==============================
    Time statistics:
    Start replacement nodes: 35s
    total: 35s