
Recovering an accidentally deleted cloudera-scm-agent

数据湖 2020-09-14

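A while ago, while experimenting with Cloudera Manager on a test cluster, I accidentally removed cloudera-scm-agent. I was uninstalling httpd and did not notice that cloudera-scm-agent depends on the HTTP service, so the uninstall took cloudera-scm-agent with it. At the time I reinstalled cloudera-manager-agent and tried again and again, but CM would not discover the host. Since it was only a test cluster, I gave up and reinstalled Cloudera Manager from scratch.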

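Thinking it over afterwards, this did not make sense. In a distributed cluster, when one worker node goes down and is then restored, it should be able to reconnect to the master node by IP or hostname; a full reinstall should not be necessary. So perhaps the problem was in the IP or hostname the agent used to connect.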

Later I read through this node's cloudera-scm-agent.log carefully and found that it really was an IP problem:

    [13/Sep/2020 05:01:33 +0800] 22503 MainThread agent        ERROR    Heartbeating to localhost:7182 failed.
    Traceback (most recent call last):
      File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/agent.py", line 1390, in _send_heartbeat
        self.cfg.master_port)
      File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/avro/ipc.py", line 469, in __init__
        self.conn.connect()
      File "/usr/lib64/python2.7/httplib.py", line 833, in connect
        self.timeout, self.source_address)
      File "/usr/lib64/python2.7/socket.py", line 571, in create_connection
        raise err
    error: [Errno 111] Connection refused
    [13/Sep/2020 05:01:55 +0800] 22503 MainThread heartbeat_tracker INFO HB stats (seconds): num:1 LIFE_MIN:0.00 min:0.00 mean:0.00 max:0.00 LIFE_MAX:0.00

When the reinstalled cloudera-scm-agent is started on its own, it connects to localhost:7182 rather than to the server's IP.
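
To spot this kind of misdirected heartbeat quickly, the log can be scanned for the address the agent is actually trying to reach. The following is a minimal Python 3 sketch and not part of the original troubleshooting. The function name `failed_heartbeat_targets` is hypothetical, and the pattern assumes the message format shown in the log excerpt above.

```python
import re

# Matches the error line from the excerpt above, e.g.
# "... ERROR    Heartbeating to localhost:7182 failed."
HEARTBEAT_FAILED = re.compile(r"Heartbeating to (\S+):(\d+) failed")


def failed_heartbeat_targets(log_text):
    """Return the distinct (host, port) pairs the agent failed to reach."""
    targets = {
        (match.group(1), int(match.group(2)))
        for match in HEARTBEAT_FAILED.finditer(log_text)
    }
    return sorted(targets)


if __name__ == "__main__":
    sample = (
        "[13/Sep/2020 05:01:33 +0800] 22503 MainThread agent        "
        "ERROR    Heartbeating to localhost:7182 failed.\n"
    )
    print(failed_heartbeat_targets(sample))  # [('localhost', 7182)]
```

If the only target reported is `localhost`, the agent is most likely running with a `server_host` that does not point at the CM server.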

So we need to change the cloudera-scm-server address in the cloudera-scm-agent configuration:

      [root@cdh2 cloudera-scm-agent]# vim /etc/cloudera-scm-agent/config.ini


      # Configuration file for cloudera-scm-agent.
      # Please note that this file supports multi-line values. Multi-line
      # values are indicated by indenting following lines with a space.
      #
      # If you have whitespace in front of a parameter name, it will be
      # read as a continuation of the previous parameter value. Please
      # be careful not to leave spaces in front of parameter names.
      #
      # To check if this file has spaces in front of parameters names
      # you can do a grep like this:
      # grep '^[[:blank:]]' etc/cloudera-scm-agent/config.ini


      [General]
      # Hostname of the CM server.
      server_host=192.168.0.171


      # Port that the CM server is listening on.
      server_port=7182
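
The same check can be scripted. The sketch below uses Python 3's standard `configparser` to read `server_host` and `server_port` from the `[General]` section and to flag a loopback address. The helper names `read_server_endpoint` and `points_at_loopback` are hypothetical, and the sample text only mirrors the fragment shown above, so a real config.ini may need extra handling.

```python
import configparser

# Addresses that would make the agent heartbeat to itself.
LOOPBACK_NAMES = {"localhost", "127.0.0.1", "::1"}


def read_server_endpoint(config_text):
    """Return (server_host, server_port) from the [General] section."""
    # interpolation=None keeps any literal '%' in values from raising errors.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    host = parser.get("General", "server_host").strip()
    port = parser.getint("General", "server_port")
    return host, port


def points_at_loopback(host):
    """True when the configured server host is the local machine."""
    return host.lower() in LOOPBACK_NAMES


if __name__ == "__main__":
    sample = (
        "[General]\n"
        "# Hostname of the CM server.\n"
        "server_host=192.168.0.171\n"
        "\n"
        "# Port that the CM server is listening on.\n"
        "server_port=7182\n"
    )
    host, port = read_server_endpoint(sample)
    print(host, port, points_at_loopback(host))  # 192.168.0.171 7182 False
```

To run it against the real file, the contents of /etc/cloudera-scm-agent/config.ini could be read and passed in as `config_text`.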


Then restart cloudera-scm-agent and that's it:

        systemctl restart cloudera-scm-agent
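
If heartbeats still fail after the restart, it may help to confirm that the server port is reachable from the agent host at all. The log showed `Connection refused`, which means nothing was listening at the address the agent tried. Below is a small Python 3 sketch of such a probe. The function name `can_reach` and the 3-second timeout are illustrative choices, and the address is the one from the config above.

```python
import socket


def can_reach(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError on refusal, timeout or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Values taken from the config.ini shown above.
    server_host, server_port = "192.168.0.171", 7182
    if can_reach(server_host, server_port):
        print("CM server port is reachable")
    else:
        print("cannot connect to %s:%d" % (server_host, server_port))
```

A successful connection only shows that something is listening on that port. Whether CM has accepted the host still needs to be confirmed in the agent log or in the Cloudera Manager UI.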