
0178. C Recovering cloudera-manager-agent after an accidental uninstall

rundba 2022-06-28

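A CDP test environment suddenly became unusable because of a service failure. This post analyzes the cause, walks through the recovery, and then reproduces the problem.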

0. ENV

CentOS 7.6;

cloudera-manager-agent-7.4.4 (CM 7.4.4).



1. Symptom

In a test environment the Cloudera services suddenly became abnormal, and the Cloudera Management Service could no longer start.


 

2. Cause

 

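Careful investigation showed that cloudera-manager-agent had disappeared from node 2 (nn02). Because node 2 runs the CMS (Cloudera Management Service), the service failed. The service is recovered first; the problem is then reproduced to find the root cause.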

 

3. Recovery

 

3.1 On the node that lost its agent, install the same agent version as on the other nodes

    [root@nn02 ~]# rpm -ivh cloudera-manager-agent-7.4.4-15850731.el7.x86_64.rpm
    Preparing... ################################# [100%]
    Updating / installing...
    1:cloudera-manager-agent-7.4.4-1585################################# [100%]
    Created symlink from /etc/systemd/system/multi-user.target.wants/cloudera-scm-agent.service to /usr/lib/systemd/system/cloudera-scm-agent.service.
    Created symlink from /etc/systemd/system/multi-user.target.wants/cloudera-scm-supervisord.service to /usr/lib/systemd/system/cloudera-scm-supervisord.service.


3.2 Point the configuration file at the CM Server

Set the CM Server address on line 15 to 192.168.80.122.

      [root@nn02 ~]# vim /etc/cloudera-scm-agent/config.ini      # same as on the other nodes: server_host must be set to the CM address
      13 [General]
      14 # Hostname of the CM server.
      15 server_host=192.168.80.122
      16
      17 # Port that the CM server is listening on.
      18 server_port=7182
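
The same edit can be made without opening an editor. The following is a minimal sketch, assuming GNU sed (as shipped with CentOS 7) and a single server_host= line in the file; get_server_host and set_server_host are hypothetical helper names, not part of Cloudera Manager.

```shell
# Sketch: read and set server_host in a cloudera-scm-agent style config.ini.
# get_server_host and set_server_host are hypothetical helper names.

# Print the value of the first server_host= line in the given file.
get_server_host() {
    sed -n 's/^server_host=//p' "$1" | head -n 1
}

# Rewrite every server_host= line in the file to point at the given host.
# Relies on GNU sed's in-place mode; the host must not contain "/".
set_server_host() {
    local file=$1 host=$2
    sed -i "s/^server_host=.*/server_host=${host}/" "$file"
}

# Example (run as root on the affected node):
#   set_server_host /etc/cloudera-scm-agent/config.ini 192.168.80.122
#   get_server_host /etc/cloudera-scm-agent/config.ini
```

The removal log in section 4.2 shows that rpm saved the old file as /etc/cloudera-scm-agent/config.ini.rpmsave. If that file still exists, running get_server_host on it is one way to double-check the address that was configured before the agent was removed.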


3.3 Start the agent and make sure the service is healthy

        [root@nn02 ~]# systemctl start cloudera-scm-agent
        [root@nn02 ~]# systemctl status cloudera-scm-agent
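
The agent may take a few seconds to come up, so a single status call can be misleading. As a rough sketch, the check can be wrapped in a small retry loop; retry is a hypothetical helper name, and the attempt count and delay in the example are arbitrary choices rather than values from the original procedure.

```shell
# retry is a hypothetical helper: run a command up to $1 times,
# sleeping $2 seconds between failed attempts.
retry() {
    local attempts=$1 delay=$2 i=1
    shift 2
    while :; do
        # Stop as soon as the command succeeds.
        "$@" && return 0
        # Give up once the attempt budget is used.
        [ "$i" -ge "$attempts" ] && return 1
        i=$((i + 1))
        sleep "$delay"
    done
}

# Example (run as root on the affected node):
#   if retry 10 3 systemctl is-active --quiet cloudera-scm-agent; then
#       echo "agent is active"
#   else
#       systemctl status cloudera-scm-agent
#   fi
```

systemctl is-active --quiet prints nothing and reports the state through its exit code, which is what makes it convenient inside a loop like this.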


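3.4 Restart CMS from the CM web UI

Restart the CMS service.


Other Runtime services that show as abnormal recover after a short wait.


If Runtime components are still abnormal, restarting those components is recommended.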


         

4. Reproducing the problem and finding the root cause

         

4.1 Confirm that the agent is currently installed on the node

          [root@nn02 ~]# rpm -qa | grep cloudera
          openjdk8-8.0+232_9-cloudera.x86_64
          cloudera-manager-daemons-7.4.4-15850731.el7.x86_64
          cloudera-manager-agent-7.4.4-15850731.el7.x86_64


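4.2 Uninstall the NFS-related packages

Removing the NFS packages also removes cloudera-manager-agent as a dependent package, which was unexpected. The output shows why: the agent requires portmap, a dependency that yum processes once rpcbind is marked for erasure.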


            [root@nn02 ~]# yum remove -y nfs-utils rpcbind
            Loaded plugins: fastestmirror, langpacks
            Resolving Dependencies
            There are unfinished transactions remaining. You might consider running yum-complete-transaction, or "yum-complete-transaction --cleanup-only" and "yum history redo last", first to finish them. If those don't work you'll have to try removing/installing packages by hand (maybe package-cleanup can help).
            --> Running transaction check
            ---> Package nfs-utils.x86_64 1:1.3.0-0.68.el7.2 will be erased
            ---> Package rpcbind.x86_64 0:0.2.0-49.el7 will be erased
            --> Processing Dependency: portmap for package: cloudera-manager-agent-7.4.4-15850731.el7.x86_64
            --> Processing Dependency: rpcbind for package: 1:quota-4.01-19.el7.x86_64
            --> Running transaction check
            ---> Package cloudera-manager-agent.x86_64 0:7.4.4-15850731.el7 will be erased
            ---> Package quota.x86_64 1:4.01-19.el7 will be erased
            --> Finished Dependency Resolution
            Dependencies Resolved


            ================================================================================
             Package                  Arch     Version               Repository    Size
            ================================================================================
            Removing:
             nfs-utils                x86_64   1:1.3.0-0.68.el7.2    @updates     1.1 M
             rpcbind                  x86_64   0.2.0-49.el7          @base        101 k
            Removing for dependencies:
             cloudera-manager-agent   x86_64   7.4.4-15850731.el7    installed    154 M
             quota                    x86_64   1:4.01-19.el7         @base        887 k

            Transaction Summary
            ================================================================================
            Remove 2 Packages (+2 Dependent packages)


            Installed size: 156 M
            Downloading packages:
            Running transaction check
            Running transaction test
            Transaction test succeeded
            Running transaction
            Warning: RPMDB altered outside of yum.
            Erasing : cloudera-manager-agent-7.4.4-15850731.el7.x86_64 1/4
            warning: /etc/cloudera-scm-agent/config.ini saved as /etc/cloudera-scm-agent/config.ini.rpmsave
            Erasing : 1:nfs-utils-1.3.0-0.68.el7.2.x86_64 2/4
            warning: file /var/lib/nfs/v4recovery: remove failed: No such file or directory
            warning: file /var/lib/nfs/statd/sm.bak: remove failed: No such file or directory
            warning: file /var/lib/nfs/statd/sm: remove failed: No such file or directory
            warning: file /var/lib/nfs/statd: remove failed: No such file or directory
            warning: directory /var/lib/nfs/rpc_pipefs: remove failed: Device or resource busy
            Erasing : 1:quota-4.01-19.el7.x86_64 3/4
            Erasing : rpcbind-0.2.0-49.el7.x86_64 4/4
            warning: file /var/lib/rpcbind: remove failed: No such file or directory
            Verifying : cloudera-manager-agent-7.4.4-15850731.el7.x86_64 1/4
            Verifying : 1:quota-4.01-19.el7.x86_64 2/4
            Verifying : 1:nfs-utils-1.3.0-0.68.el7.2.x86_64 3/4
            Verifying : rpcbind-0.2.0-49.el7.x86_64 4/4


            Removed:
            nfs-utils.x86_64 1:1.3.0-0.68.el7.2 rpcbind.x86_64 0:0.2.0-49.el7


            Dependency Removed:
            cloudera-manager-agent.x86_64 0:7.4.4-15850731.el7 quota.x86_64 1:4.01-19.el7


            Complete!
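
The "Processing Dependency: portmap for package: cloudera-manager-agent" line above suggests a check that could be run before touching rpcbind. The sketch below only inspects text in the format printed by rpm -q --requires; lists_capability is a hypothetical helper name, and the exact requirement list of the agent package is not shown in this post, so the example commands are meant to be tried on a node where the agent is still installed.

```shell
# lists_capability is a hypothetical helper: it reads "rpm -q --requires"
# style output on stdin (one requirement per line, optionally followed by
# a version constraint) and succeeds if the named capability is listed.
lists_capability() {
    awk -v cap="$1" '$1 == cap { found = 1 } END { exit !found }'
}

# Example, on a node where the agent is installed:
#   rpm -q --requires cloudera-manager-agent | lists_capability portmap \
#       && echo "agent requires portmap"
#   rpm -q --whatprovides portmap   # installed package that provides it
#   rpm -q --whatrequires portmap   # installed packages that need it
```

If --whatprovides names rpcbind and --whatrequires names cloudera-manager-agent, removing rpcbind will pull the agent out with it, which matches the transaction shown above.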


4.3 Confirm that the agent has been uninstalled from the node

              [root@nn02 ~]# rpm -qa | grep cloudera
              openjdk8-8.0+232_9-cloudera.x86_64
              cloudera-manager-daemons-7.4.4-15850731.el7.x86_64
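
The before-and-after comparison of 4.1 and 4.3 can also be scripted, for example as a periodic check on every node. This is a minimal sketch; missing_packages is a hypothetical helper name, and the matching is a simple "name followed by a dash and a digit" pattern that treats the package name as a basic regular expression.

```shell
# missing_packages is a hypothetical helper: it reads an "rpm -qa" style
# listing on stdin and prints each expected package name (given as
# arguments) that has no matching "name-<version>" line.
missing_packages() {
    local listing name
    listing=$(cat)
    for name in "$@"; do
        printf '%s\n' "$listing" | grep -q "^${name}-[0-9]" \
            || printf '%s\n' "$name"
    done
}

# Example:
#   rpm -qa | missing_packages cloudera-manager-agent cloudera-manager-daemons
```

Run against the listing in 4.1 the function prints nothing; run against the listing in 4.3 it prints cloudera-manager-agent.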

               

5. Recommendations

               

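cloudera-manager-agent depends on the NFS-related packages (the yum output in 4.2 shows that it requires portmap, which is resolved when rpcbind is erased), so uninstalling NFS also removed the agent, which was unexpected. In production, avoid yum remove -y unless it is really necessary. If a removal is required, drop the -y option: yum then lists the dependent packages it is about to remove and asks for confirmation, so the removal can proceed only after the list has been confirmed harmless.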
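
The dependency list can also be inspected without any risk of removing packages. The sketch below assumes yum's --assumeno option, which prints the transaction and answers "no" to the confirmation prompt, and parses the "Removing for dependencies:" section in the layout shown in 4.2. dependent_removals is a hypothetical helper name, and very long package names that yum wraps onto a second line are not handled.

```shell
# dependent_removals is a hypothetical helper: it reads yum's transaction
# listing on stdin and prints the names of packages listed under
# "Removing for dependencies:", stopping at the next blank line or at
# "Transaction Summary".
dependent_removals() {
    awk '
        /^[[:space:]]*Removing for dependencies:/ { in_deps = 1; next }
        in_deps && (/^[[:space:]]*$/ || /Transaction Summary/) { in_deps = 0 }
        in_deps { print $1 }
    '
}

# Example: nothing is removed because --assumeno answers "no".
#   yum remove nfs-utils rpcbind --assumeno | dependent_removals
```

Any package name printed here that belongs to the cluster stack, such as cloudera-manager-agent, is a signal to stop and reconsider the removal.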

