
Huawei GaussDB A gs_lcctl

墨天轮 2019-10-12

gs_lcctl

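Background

GaussDB 200 provides the gs_lcctl tool to implement the logical cluster feature. It supports resource management and control within a logical cluster, and mutual data access between logical clusters.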

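Prerequisites

  • The logical cluster feature is optional.
  • Physical cluster mode and logical cluster mode cannot coexist.
  • The cluster status is normal.
  • When converting a physical cluster into a logical cluster, the physical cluster must not contain any Node Group other than the installation group.
  • No node in the cluster may be named all or ALL.
  • The logical cluster feature is not supported in single-node deployments or in one-primary-multiple-standby mode.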

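Notes

  • A logical cluster name may contain only digits, letters, and underscores, must not exceed 63 characters, and must not be one of the reserved names group_version1, group_version2, group_version3, installation, elastic_group, optimal, or query.
  • Logical cluster names created with gs_lcctl are case-sensitive. If the name you specify contains uppercase characters, you must preserve that case whenever you reference the logical cluster name in SQL statements.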
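
The naming rules above can be checked before gs_lcctl is ever invoked. The following is a minimal sketch, not the tool's own validation: the function name `is_valid_lc_name` is hypothetical, and comparing reserved names by exact, case-sensitive match is an assumption, since the text only lists the reserved names.

```python
import re

# Reserved names listed in the notes above.
RESERVED_NAMES = {
    "group_version1", "group_version2", "group_version3",
    "installation", "elastic_group", "optimal", "query",
}

# Letters, digits and underscores only, 1 to 63 characters.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,63}")


def is_valid_lc_name(name: str) -> bool:
    """Return True if name satisfies the naming rules described above.

    Assumption: reserved names are compared by exact match; the source
    does not say whether the tool compares them case-insensitively.
    """
    if _NAME_PATTERN.fullmatch(name) is None:
        return False
    return name not in RESERVED_NAMES
```

A wrapper script could call `is_valid_lc_name("group1")` and refuse to continue when it returns False, which avoids starting a create operation that the tool would reject anyway.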

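Syntax

  • Create a logical cluster
    gs_lcctl -t create --name=ngname {-h HOSTNAME | -f hostfile} [--upgrade] [--high-perform] [--non-interactive] [-l LOGFILE]
    Note:
    • Before creating a logical cluster, make sure the cluster status is normal and balanced, and that no redistribution is in progress.
    • The -h and -f parameters are mutually exclusive; specify only one of them.
    • --upgrade scenario: when you upgrade from a version earlier than 6.5.0 to the current version and want to use the logical cluster feature, you must specify this parameter to convert the entire physical cluster into a single logical cluster in one step.
    • --high-perform is optional. It checks whether the value of -h forms a ring, to guarantee high availability of the cluster.
    • --non-interactive is optional. If it is not specified, a logical cluster administrator user with the same name as the logical cluster is created, and you are prompted interactively for its password during creation. If you do not want the administrator user to be created, or do not want the command to be interrupted by an interactive prompt, specify this parameter.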
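  • Add nodes to a specified logical cluster
    gs_lcctl -t add --name=ngname {-h HOSTNAME | -f hostfile} [-l LOGFILE]
    Note:
    • If you want to add nodes from outside the current cluster to the specified logical cluster, first run the gs_expand tool to add those nodes to the elastic cluster.
    • This command takes nodes from the elastic cluster and adds them to the specified logical cluster.
    • The -h and -f parameters are mutually exclusive; specify only one of them.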
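  • Delete a logical cluster
    gs_lcctl -t delete --name=ngname [-l LOGFILE]
    Note:
    • Before running this operation, confirm that the cluster status is Normal and the redistribution status is No.
    • Before running this operation, confirm that all services in the logical cluster to be deleted have been stopped.
    • This operation can be run only when two or more logical clusters exist.
    • If any node of the logical cluster to be deleted hosts a CMServer, GTM, or ETCD component, the logical cluster cannot be deleted.
    • If the nodes of the logical cluster to be deleted host only CN and DN components, the logical cluster can be deleted once the CNs have been removed successfully.
    • Deleting a logical cluster cleans up the information on its nodes. In some cases a node may not be cleaned up properly, and you must clean it up manually.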
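  • Delete the elastic cluster
    gs_lcctl -t delete --elastic-group [-h HOSTNAME | -f hostfile] [-l LOGFILE]
    Note:
    • Before running this operation, confirm that the cluster status is Normal and the redistribution status is No.
    • To delete only some nodes of the elastic cluster, specify the node names with -h or -f. To delete all nodes of the elastic cluster, omit both -h and -f.
    • If any node of the elastic cluster hosts a CMServer, GTM, or ETCD component, you cannot delete all nodes of the elastic cluster. You can only use -h or -f to delete the nodes that do not host these components.
    • If the nodes of the elastic cluster host only CN and DN components, the deletion can proceed once the CNs have been removed successfully.
    • Deletion cleans up the information on the nodes. In some cases a node may not be cleaned up properly, and you must clean it up manually.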
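  • Roll back a logical cluster
    gs_lcctl -t rollback --name=ngname [-l LOGFILE]
    Note:

    This operation can only roll back a logical cluster that was created after an upgrade from another version to version 6.5.0. Other rollback scenarios are not supported yet.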

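  • Display logical cluster information
    gs_lcctl -t display [--lcname-only] [-l LOGFILE]
    Note:
    • Display format of the query result: logical cluster name---names of the nodes in the logical cluster.
    • When --lcname-only is specified, only the names of the logical clusters in the physical cluster are displayed.
  • Display help information
    gs_lcctl -? | --help
  • Display version information
    gs_lcctl -V | --version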
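
When gs_lcctl is driven from a script, the rule that -h and -f are mutually exclusive is easy to enforce before the command is launched. The sketch below only assembles the argument list for the create syntax shown above; `build_create_command` and its parameter names are hypothetical helpers, and only the flags themselves come from the syntax description.

```python
from typing import List, Optional, Sequence


def build_create_command(
    name: str,
    hosts: Optional[Sequence[str]] = None,
    hostfile: Optional[str] = None,
    upgrade: bool = False,
    high_perform: bool = False,
    non_interactive: bool = False,
    logfile: Optional[str] = None,
) -> List[str]:
    """Assemble the argv list for `gs_lcctl -t create` (hypothetical helper)."""
    # The syntax requires exactly one of -h and -f.
    if (hosts is None) == (hostfile is None):
        raise ValueError("specify exactly one of hosts (-h) or hostfile (-f)")
    argv = ["gs_lcctl", "-t", "create", f"--name={name}"]
    if hosts is not None:
        if not hosts:
            raise ValueError("hosts must not be empty")
        # -h takes the node names as one comma-separated value.
        argv += ["-h", ",".join(hosts)]
    else:
        argv += ["-f", hostfile]
    if upgrade:
        argv.append("--upgrade")
    if high_perform:
        argv.append("--high-perform")
    if non_interactive:
        argv.append("--non-interactive")
    if logfile is not None:
        argv += ["-l", logfile]
    return argv
```

Because the result is a list rather than a single string, it can be handed to `subprocess.run` without going through a shell, so characters in the name are not reinterpreted on the way to the tool.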

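Parameters

gs_lcctl parameters fall into the following categories:

  • Common parameters:
    • -t

      Operation type of the gs_lcctl command.

      Value range: create, rollback, add, display, delete.

    • -l

      Specifies the log file and the path where it is stored.

      Default value: $GAUSSLOG/om/gs_lcctl-YYYY-MM-DD_hhmmss.log

    • -?, --help

      Displays help information.

    • -V, --version

      Displays version information.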

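  • Parameters for creating a logical cluster:
    • --name=ngname

      Specifies the name of the logical cluster to create.

      Value range: a string without special characters.

      Note:
      • The logical cluster name must not contain any of the following special characters: "|",";","&","$","<",">","`","\\","'","\"","{","}","(",")","[","]","~","*","?","!","\n", "/".
      • If the specified name contains characters that the shell interprets, for example $#, the shell expands them automatically, so the group_name returned by a query differs from the name you specified. If you pass the argument as --name='$#', no expansion takes place, and the code validates the name against its special-character whitelist.

    • -h HOSTNAME

      Specifies the names of the nodes that the new logical cluster contains, separated by commas.

    • -f hostfile

      Specifies a file listing the nodes that the new logical cluster contains, one node name per line.

      Note:

      The -h and -f parameters are mutually exclusive; specify only one of them.

    • --upgrade

      Must be specified if the cluster was upgraded from another version to the current version.

      If it is not specified, the cluster is assumed to have been installed with the current version.

      Note:

      You are responsible for using this parameter correctly.

    • --high-perform

      Checks whether the value of the -h parameter forms a ring.

      If it is not specified, the -h value is not checked for a ring.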

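  • Parameters for adding nodes to a logical cluster:
    • --name=ngname

      Specifies the name of the logical cluster to which nodes are added.

      Value range: a string without special characters.

    • -h HOSTNAME

      Specifies the names of the nodes to add to the logical cluster, separated by commas.

    • -f hostfile

      Specifies a file listing the nodes to add to the logical cluster, one node name per line.

      Note:

      The -h and -f parameters are mutually exclusive; specify only one of them.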

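  • Parameters for rolling back a logical cluster:
    • --name=ngname

      Specifies the name of the logical cluster to roll back.

      Value range: a string without special characters.

      Note:

      This operation can only roll back a logical cluster that was created after an upgrade from another version to version 6.5.0. Other rollback scenarios are not supported yet.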

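  • Parameters for querying logical clusters:
    • --lcname-only

      Displays only the logical cluster names.

      If it is not specified, both the logical cluster names and the node names of each logical cluster are displayed.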

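  • Deleting a logical cluster:
    • --name=ngname

      Specifies the name of the logical cluster to delete.

  • Deleting the elastic cluster:
    • --elastic-group

      Indicates that the operation deletes the elastic cluster.

      If it is not specified, the operation is not an elastic cluster deletion.

    • -h HOSTNAME

      Specifies the names of the elastic cluster nodes to delete, separated by commas.

    • -f hostfile

      Specifies a file listing the elastic cluster nodes to delete, one node name per line.

      Note:

      The -h and -f parameters are mutually exclusive; specify only one of them.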
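
The --name description above lists forbidden special characters and warns that the shell rewrites an unquoted value such as $# before gs_lcctl sees it. The sketch below illustrates both points with Python's standard `shlex.quote`; the helper names `forbidden_chars_in` and `name_option` are hypothetical, and the character set is simply copied from the list above rather than taken from the tool's code.

```python
import shlex

# Characters that the --name description lists as not allowed.
FORBIDDEN_CHARS = set("|;&$<>`\\'\"{}()[]~*?!\n/")


def forbidden_chars_in(name: str) -> list:
    """Return the forbidden characters found in name, in sorted order."""
    return sorted(set(name) & FORBIDDEN_CHARS)


def name_option(name: str) -> str:
    """Render --name=... so that a POSIX shell passes the value through unchanged.

    shlex.quote leaves plain names untouched and wraps anything else in
    single quotes, which is the --name='$#' form mentioned above.
    """
    return "--name=" + shlex.quote(name)
```

For example, `name_option("$#")` yields `--name='$#'`, so the tool receives the literal characters and can reject them itself, while `forbidden_chars_in("$#")` reports the `$` up front.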

Examples

Before a logical cluster can be created, a physical cluster must be installed and the cluster status must be normal.

perfadm@SIA1000124312:~> gs_install -X ${BIGDATA_HOME}/FusionInsight_MPPDB_6.5.1/install/FusionInsight-MPPDB-6.5.1/clusterconfig_dilatation.xml
Parsing the configuration file.
Check preinstall on every node.
Successfully checked preinstall on every node.
Creating the backup directory.
Successfully created the backup directory.
Installing the cluster.
Checking the installation environment on all nodes.
Installing applications on all nodes.
Successfully installed APP.
Cluster installation is completed.
Configuring.
Deleting instances from all nodes.
Checking node configuration on all nodes.
Initializing instances on all nodes.
Updating instance configuration on all nodes.
Configuring pg_hba on all nodes.
Configuration is completed.
Starting cluster.
======================================================================
Successfully started primary instance. Wait for standby instance.
======================================================================
.
Successfully started cluster.
======================================================================
cluster_state : Normal
redistributing : No
node_count : 3
Coordinator State
normal : 3
abnormal : 0
GTM State
primary : 1
standby : 1
abnormal : 0
down : 0
Datanode State
primary : 3
standby : 3
secondary : 3
building : 0
abnormal : 0
down : 0
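
Several operations in this document require the cluster status to be Normal and the redistribution status to be No. A script can check those two fields in a status summary like the one printed above. This is a rough sketch that assumes each `key : value` pair sits on its own line; the function names are hypothetical, and the non-Normal value in the usage note is only an illustration.

```python
def parse_status(text: str) -> dict:
    """Collect `key : value` pairs from a status summary, one pair per line."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() and value.strip():
            # Keep the first occurrence so that per-component counters
            # repeated later (for example `abnormal`) do not overwrite it.
            fields.setdefault(key.strip(), value.strip())
    return fields


def ready_for_lc_operation(text: str) -> bool:
    """True when cluster_state is Normal and redistributing is No."""
    fields = parse_status(text)
    return (
        fields.get("cluster_state") == "Normal"
        and fields.get("redistributing") == "No"
    )


# Values copied from the summary above.
SAMPLE = """\
cluster_state : Normal
redistributing : No
node_count : 3
"""
```

With this sample, `ready_for_lc_operation(SAMPLE)` is True; a summary whose cluster_state holds any other value makes it False.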

Example 1: Create a logical cluster through the normal flow.

perfadm@SIA1000125810:~> gs_lcctl -t create --name=group1 -h SIA1000125810,SIA1000022034,SIA1000022033
Checking if the node name is in the physical cluster.
Checking host file.
Successfully checked host file.
Successfully checked if the node name is in the physical cluster.
Checking if the specified logic cluster has been created.
Successfully checked if the specified logic cluster has been created.
Preapring the backup directory.
Successfully preapred the backup directory.
Checking the cluster status.
Successfully checked the cluster status.
Checking the count of general nodegroup.
Successfully checked the general nodegroup.
Checking if cluster is locked.
Successfully checked if cluster is locked.
Creating a logical cluster administrator.
Obtaining the user password.
Please enter password for the logic cluster user [group2].
Please enter password for logic cluster user.
Password:
Please enter password for logic cluster user again.
Password:
Successfully obtained the user password.
Successfully created a logical cluster administrator.
Creating a node group.
Obtaining the IP list of the node through the hostname list.
Successfully obtained the IP list of the node through the hostname list.
Obtaining DN instance info through the IP list.
Successfully obtained DN instance info through the IP list.
Successfully created a node group.
Creating a static configuration file for the logic cluster.
Successfully create a static configuration file for the logic cluster.
Configuring cgroup.
Operating a logic cluster resource management configuration file.
Successfully operated a logic cluster resource management configuration file.
Loading the cgroup config file.
Successfully loaded the cgroup config file.
Successfully configured cgroup.
Successfully created a logic cluster.

Example 2: Press Ctrl+C during creation, re-enter the command, and press Ctrl+C again. Creation of the logical cluster fails.

perfadm@SIA1000124312:/opt/software/VC_ronghe/script> ./gs_lcctl -t create --name=group1 -h SIA1000125810,SIA1000022034,SIA1000022033
Checking if the node name is in the physical cluster.
Checking host file.
Successfully checked host file.
Successfully checked if the node name is in the physical cluster.
Checking if the specified logic cluster has been created.
Successfully checked if the specified logic cluster has been created.
Preapring the backup directory.
Successfully preapred the backup directory.
Checking the cluster status.
Successfully checked the cluster status.
Checking the count of general nodegroup.
Successfully checked the general nodegroup.
Checking if cluster is locked.
Successfully checked if cluster is locked.
Operating a logic cluster resource management configuration file.
Successfully operated a logic cluster resource management configuration file.
Creating a logical cluster administrator.
Obtaining the user password.
Please enter password for the logic cluster user [group1].
Please enter password for logic cluster user.
Password:
Please enter password for logic cluster user again.
Password:
Successfully obtained the user password.
Successfully created a logical cluster administrator.
Creating a node group.
Obtaining the IP list of the node through the hostname list.
Successfully obtained the IP list of the node through the hostname list.
Obtaining DN instance info through the IP list.
Successfully obtained DN instance info through the IP list.
^CTraceback (most recent call last):
  File "./gs_lcctl", line 3137, in <module>
    main()
  File "./gs_lcctl", line 3128, in main
    runner.run()
  File "./gs_lcctl", line 3099, in run
    self.doCreateLC()
  File "./gs_lcctl", line 2643, in doCreateLC
    self.createNodegroupForCreate()
  File "./gs_lcctl", line 1505, in createNodegroupForCreate
    ignoreError = False)
  File "/opt/software/VC_ronghe/script/gspylib/common/Common.py", line 3643, in remoteSQLCommand
    (status1, output1) = commands.getstatusoutput(cmd)
  File "/usr/lib64/python2.6/commands.py", line 56, in getstatusoutput
    text = pipe.read()
KeyboardInterrupt
perfadm@SIA1000124312:/opt/software/VC_ronghe/script> ./gs_lcctl -t create --name=group1 -h SIA1000125810,SIA1000022034,SIA1000022033
Checking if the node name is in the physical cluster.
Checking host file.
Successfully checked host file.
Successfully checked if the node name is in the physical cluster.
Checking whether the reentry command is consistent with the previous command.
Successfully checked whether the reentry command is consistent with the previous command.
Last time end with: Create node group.
Rolling back.
Deleting a logic cluster and cluster administrator.
Successfully deleted a logic cluster and cluster administrator.
Operating a logic cluster resource management configuration file.
Successfully operated a logic cluster resource management configuration file.
Rollback is completed.
Preapring the backup directory.
Successfully preapred the backup directory.
Checking the cluster status.
Successfully checked the cluster status.
Checking the count of general nodegroup.
Successfully checked the general nodegroup.
Checking if cluster is locked.
Successfully checked if cluster is locked.
^CTraceback (most recent call last):
  File "./gs_lcctl", line 3137, in <module>
    main()
  File "./gs_lcctl", line 3128, in main
    runner.run()
  File "./gs_lcctl", line 3099, in run
    self.doCreateLC()
  File "./gs_lcctl", line 2627, in doCreateLC
    self.writeOperateStep(step)
  File "/opt/software/VC_ronghe/script/gspylib/common/ParallelBaseOM.py", line 804, in writeOperateStep
    self.sshTool.scpFiles(self.operateStepFile, self.operateStepDir, nodes)
  File "/opt/software/VC_ronghe/script/gspylib/threads/SshTool.py", line 527, in scpFiles
    (status, output) = commands.getstatusoutput(scpCmd)
  File "/usr/lib64/python2.6/commands.py", line 56, in getstatusoutput
    text = pipe.read()
KeyboardInterrupt

Example 3: Press Ctrl+C during creation, then re-enter the command. The logical cluster is created successfully.

perfadm@SIA1000124312:/opt/software/VC_ronghe/script> ./gs_lcctl -t create --name=group1 -h SIA1000125810,SIA1000022034,SIA1000022033
Checking if the node name is in the physical cluster.
Checking host file.
Successfully checked host file.
Successfully checked if the node name is in the physical cluster.
Checking whether the reentry command is consistent with the previous command.
Successfully checked whether the reentry command is consistent with the previous command.
Last time end with: Cgroup lc.
Rolling back.
Operating a logic cluster resource management configuration file.
Successfully operated a logic cluster resource management configuration file.
Rollback is completed.
Preapring the backup directory.
Successfully preapred the backup directory.
Checking the cluster status.
Successfully checked the cluster status.
Checking the count of general nodegroup.
Successfully checked the general nodegroup.
Checking if cluster is locked.
Successfully checked if cluster is locked.
Operating a logic cluster resource management configuration file.
Successfully operated a logic cluster resource management configuration file.
Creating a logical cluster administrator.
Obtaining the user password.
Please enter password for the logic cluster user [group1].
Please enter password for logic cluster user.
Password:
Please enter password for logic cluster user again.
Password:
Successfully obtained the user password.
Successfully created a logical cluster administrator.
Creating a node group.
Obtaining the IP list of the node through the hostname list.
Successfully obtained the IP list of the node through the hostname list.
Obtaining DN instance info through the IP list.
Successfully obtained DN instance info through the IP list.
Successfully created a node group.
Creating static config file and map cgroup.
Creating a static configuration file for the logic cluster.
Successfully create a static configuration file for the logic cluster.
Loading the cgroup config file.
Successfully loaded the cgroup config file.
Successfully created static config file and map cgroup.
Successfully created a logic cluster.

Example 4: Add nodes to a specified logical cluster.

perfadm@SIA1000124312:/opt/software/VC_ronghe/script> ./gs_lcctl -t add --name=group1 -h SIA1000125810,SIA1000022034,SIA1000022033
Checking if the node name is in the physical cluster.
Checking host file.
Successfully checked host file.
Successfully checked if the node name is in the physical cluster.
Getting new and old nodes.
Successfully get new and old nodes.
Checking if the specified logic cluster has been created.
Successfully checked if the specified logic cluster has been created.
Checking whether the -h parameter is valid.
Successfully checked whether the -h parameter is valid.
Successfully checked if hostname is looped.
Completing the initialization before performing the addition.
Deleting the table remaining after redistribution.
Getting all the databases.
Successfully deleted the table remaining after redistribution.
Successfully completed the initialization.
Stopping DN of new nodes.
Successfully stopped DN of new nodes.
Rebuilding new nodes.
Locking cluster.
Successfully locked cluster.
Restoring new nodes.
Successfully restored new nodes.
Successfully rebuild new nodes.
Doing post add.
Starting new nodes.
Successfully started new nodes.
Waiting for the cluster status to become normal.
.......
The cluster status is normal.
Unlocking cluster.
Successfully unlocked cluster.
Successfully posted add.
Creating a node group for add.
Successfully created a node group for add.
Updating cluster configuration.
Creating a static configuration file for the logic cluster.
Successfully create a static configuration file for the logic cluster.
Successfully updated cluster configuration.
Configuring cgroup.
Operating a logic cluster resource management configuration file.
Successfully operated a logic cluster resource management configuration file.
Loading the cgroup config file.
Successfully loaded the cgroup config file.
Successfully configured cgroup.
Successfully add hostname to the logic cluster.

Example 5: Query logical cluster information.

perfadm@SIA1000124312:/opt/software/VC_ronghe/script> ./gs_lcctl -t display
logic cluster name | hostname
------------------------------------------------------------------------------------------------------
elastic_group | SIA1000124312
group2 | SIA1000124316 SIA1000124314

Example 6: Query logical cluster information, displaying only the logical cluster names.

perfadm@SIA1000124312:/opt/software/VC_ronghe/script> ./gs_lcctl -t display --lcname-only
logic cluster name
--------------------------------
group2
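
The name and hostname table shown in the display examples above can be turned into a mapping for use in scripts. The sketch below assumes the layout seen in that sample output: a header row, a dashed separator, then one `name | host host ...` row per logical cluster. The function name is hypothetical, and the real column padding may differ.

```python
def parse_display_table(text: str) -> dict:
    """Map each logical cluster name to the list of its node names."""
    clusters = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # skips the dashed separator and blank lines
        name, _, hosts = line.partition("|")
        name = name.strip()
        if name == "logic cluster name":
            continue  # header row
        # Node names within a row are separated by whitespace.
        clusters[name] = hosts.split()
    return clusters


# Rows copied from the sample output above; the separator is shortened.
SAMPLE = """\
logic cluster name | hostname
------------------------------
elastic_group | SIA1000124312
group2 | SIA1000124316 SIA1000124314
"""
```

Calling `parse_display_table(SAMPLE)` gives one entry for elastic_group with a single node and one for group2 with two nodes, which is convenient when choosing the -h value for a later add operation.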

Example 7: Delete a logical cluster.

perfadm@test1:~> gs_lcctl -t delete --name=lc2
Checking if the specified logic cluster has been created.
Successfully checked if the specified logic cluster has been created.
Delete the logic cluster.
Completing the initialization before performing the delete.
Checking deleted nodes.
Successfully checked deleted nodes.
Checking the count of the nodegroup.
Successfully checked the count of the nodegroup.
Checking if the specified logic cluster has been created.
Successfully checked if the specified logic cluster has been created.
Other user may be operating the lc2 now.
Checking if cluster is locked.
Successfully checked if cluster is locked.
Successfully completed the initialization.
Drop the node group database object.
Drop all database schema.
Getting all the databases.
Successfully drop all database schema.
Drop all database object.
Successfully drop all database object.
Successfully drop the node group database object.
Deleting the lc name text file for the logic cluster.
Successfully deleted a lc name text file for the logic cluster.
Creating a lc name text file for the cluster.
Successfully create a lc name text file for the logic cluster.
Updating the lc name text file for the logic cluster.
Successfully updated a lc name text file for the logic cluster.
Unlocking cluster.
Successfully unlocked cluster.
Locking cluster.
Successfully locked cluster.
Waiting for the cluster status to become normal.
..
The cluster status is normal.
Unlocking cluster.
Successfully unlocked cluster.
Deleting logical cluster nodes.
Warning: Failed to delete temporary directory.
Successfully deleted logical cluster nodes.
Deleting the static configuration file for the logic cluster.
Successfully deleted a static configuration file for the logic cluster.
Successfully delete the logic cluster.

Example 8: Delete the elastic cluster.

perfadm@test1:~> gs_lcctl -t delete --elastic-group
Checking if the node name is in the physical cluster.
Checking host file.
Successfully checked host file.
Successfully checked if the node name is in the physical cluster.
Successfully checked if hostname is looped.
Delete the logic cluster.
Completing the initialization before performing the delete.
Checking deleted nodes.
Successfully checked deleted nodes.
Checking the count of the nodegroup.
Successfully checked the count of the nodegroup.
Checking if the specified logic cluster has been created.
Successfully checked if the specified logic cluster has been created.
Other user may be operating the elastic_group now.
Checking if cluster is locked.
Successfully checked if cluster is locked.
Successfully completed the initialization.
Drop the node group database object.
Drop all database schema.
Getting all the databases.
Successfully drop all database schema.
Drop all database object.
Successfully drop all database object.
Successfully drop the node group database object.
Deleting the lc name text file for the logic cluster.
Successfully deleted a lc name text file for the logic cluster.
Creating a lc name text file for the cluster.
Successfully create a lc name text file for the logic cluster.
Updating the lc name text file for the logic cluster.
Successfully updated a lc name text file for the logic cluster.
Unlocking cluster.
Successfully unlocked cluster.
Locking cluster.
Successfully locked cluster.
Waiting for the cluster status to become normal.
...
The cluster status is normal.
Unlocking cluster.
Successfully unlocked cluster.
Deleting logical cluster nodes.
Warning: Failed to delete temporary directory.
Successfully deleted logical cluster nodes.
Successfully delete the logic cluster.

Example 9: Delete specified nodes from the elastic cluster.

perfadm@test1:~> gs_lcctl -t delete --elastic-group -h test4,test5,test6
Checking if the node name is in the physical cluster.
Checking host file.
Successfully checked host file.
Successfully checked if the node name is in the physical cluster.
Successfully checked if hostname is looped.
Delete the logic cluster.
Completing the initialization before performing the delete.
Checking deleted nodes.
Successfully checked deleted nodes.
Checking the count of the nodegroup.
Successfully checked the count of the nodegroup.
Checking if the specified logic cluster has been created.
Successfully checked if the specified logic cluster has been created.
Other user may be operating the elastic_group now.
Checking if cluster is locked.
Successfully checked if cluster is locked.
Successfully completed the initialization.
Drop the node group database object.
Drop all database schema.
Getting all the databases.
Successfully drop all database schema.
Drop all database object.
Successfully drop all database object.
Successfully drop the node group database object.
Deleting the lc name text file for the logic cluster.
Successfully deleted a lc name text file for the logic cluster.
Creating a lc name text file for the cluster.
Successfully create a lc name text file for the logic cluster.
Updating the lc name text file for the logic cluster.
Successfully updated a lc name text file for the logic cluster.
Unlocking cluster.
Successfully unlocked cluster.
Locking cluster.
Successfully locked cluster.
Waiting for the cluster status to become normal.
..
The cluster status is normal.
Unlocking cluster.
Successfully unlocked cluster.
Deleting logical cluster nodes.
Warning: Failed to delete temporary directory.
Successfully deleted logical cluster nodes.
Successfully delete the logic cluster.

Example 10: Stopping a DN fails while expanding a logical cluster with nodes from the elastic cluster.

1. Locate the problem: on the primary CMS node, check the log file $GAUSSLOG/om/gs_lcctl-2019-01-25_201359.log. The cause of the failed expansion is:

[2019-01-25 10:53:13.990792][gs_lcctl][ERROR]:[FAILURE] 172-31-0-241: Stopping instance.
=========================================
cm_ctl: stop the node: 3, datapath: /srv/BigData/mppdb/data1/dummyslave1.
............
cm_ctl: stop instance failed in (300)s!

2. Fix the fault: log in to the failed node as the cluster user omm and kill the DN process that cannot be stopped. The kill command is:

ps ux | grep '\<datanode\>' | grep '/srv/BigData/mppdb/data1/dummyslave1' | grep -v grep | awk '{print $2}' | xargs -r kill -9

Here /srv/BigData/mppdb/data1/dummyslave1 is the instance directory of the DN that failed to stop.

3. Re-enter the command to expand the logical cluster from the elastic cluster.

perfadm_R@testdb1:~> gs_lcctl -t add --name logic -h testdb6,testdb4,testdb5
Checking if the node name is in the physical cluster.
Checking host file.
Successfully checked host file.
Successfully checked if the node name is in the physical cluster.
Getting new and old nodes.
Successfully get new and old nodes.
Checking if the specified logic cluster has been created.
Successfully checked if the specified logic cluster has been created.
Checking whether the -h or -f parameter is valid.
Successfully checked whether the -h or -f parameter is valid.
Successfully checked if hostname is looped.
Completing the initialization before performing the addition.
Stopping DN of new nodes.
Successfully stopped DN of new nodes.
Rebuilding new nodes.
Locking cluster.
Successfully locked cluster.
Restoring new nodes.
Successfully restored new nodes.
Successfully rebuild new nodes.
Doing post add.
Starting new nodes.
Successfully started new nodes.
Waiting for the cluster status to become normal.
.
The cluster status is normal.
Unlocking cluster.
Successfully unlocked cluster.
Successfully posted add.
Creating a node group for add.
Successfully created a node group for add.
Updating cluster configuration.
Creating a static configuration file for the logic cluster.
Successfully create a static configuration file for the logic cluster.
Successfully updated cluster configuration.
Configuring cgroup.
Operating a logic cluster resource management configuration file.
Successfully operated a logic cluster resource management configuration file.
Loading the cgroup config file.
Successfully loaded the cgroup config file.
Successfully configured cgroup.
Successfully add hostname to the logic cluster.
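
The first step of Example 10 is to find the ERROR entry in the gs_lcctl log. The sketch below pulls such entries out of log text, assuming every entry starts with the `[timestamp][tool][LEVEL]:` prefix seen in the quoted log line. The pattern is inferred from that single line, so the real log may contain other shapes, and the function name is hypothetical.

```python
import re
from typing import List, Tuple

# Prefix inferred from the log line quoted in Example 10.
LOG_LINE = re.compile(
    r"^\[(?P<time>[^\]]+)\]\[(?P<tool>[^\]]+)\]\[(?P<level>[^\]]+)\]:(?P<message>.*)$"
)


def error_entries(log_text: str) -> List[Tuple[str, str]]:
    """Return (timestamp, message) for every line logged at ERROR level."""
    entries = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") == "ERROR":
            entries.append((match.group("time"), match.group("message").strip()))
    return entries


# Lines copied from Example 10; continuation lines carry no prefix and are skipped.
SAMPLE = """\
[2019-01-25 10:53:13.990792][gs_lcctl][ERROR]:[FAILURE] 172-31-0-241: Stopping instance.
cm_ctl: stop the node: 3, datapath: /srv/BigData/mppdb/data1/dummyslave1.
cm_ctl: stop instance failed in (300)s!
"""
```

Only prefixed lines are matched, so the cm_ctl detail that follows an entry still has to be read from the log itself; the function is meant to point at where to look, not to replace reading the file.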
