
Big Data Development Series 4: Configuring Kerberos Authentication for Hadoop & Flink

IT那活儿 2022-12-03



Preface

Earlier posts in this big data development series covered how to install and use Kerberos. Building on that existing Kerberos service, this article configures Kerberos authentication for the Hadoop and Flink components.

Environment Dependencies


| Type | Host | Hostname | Installed components |
| --- | --- | --- | --- |
| Kerberos server | 192.168.199.102 | bigdata-03 | krb5-server, krb5-workstation, krb5-libs, krb5-devel |
| Kerberos client | 192.168.199.104 | bigdata-05 | krb5-workstation, krb5-devel |
| Hadoop environment | 192.168.199.104 | bigdata-05 | hadoop-3.3.3 |


Hadoop Authentication Configuration

3.1 Create principals
Kerberos authentication for Hadoop generally uses three kinds of principals: hadoop, host, and HTTP.
The format is: username/hostname@HADOOP.COM.
If the existing HDFS and YARN daemons run under the same user identity, a single hadoop principal is sufficient.
kadmin.local -q "addprinc -randkey hadoop/bigdata-03@HADOOP.COM"
kadmin.local -q "addprinc -randkey hadoop/bigdata-05@HADOOP.COM"
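As an aside (not part of the original article), the principal naming scheme above can be sketched in a few lines of Python; the helper names here are hypothetical:

```python
import re

# A service principal is primary/instance@REALM; the instance part is optional
# for plain user principals such as gpadmin@HADOOP.COM.
PRINCIPAL_RE = re.compile(r"^[\w.-]+(/[\w.-]+)?@[A-Z0-9.-]+$")

def make_principal(primary, hostname=None, realm="HADOOP.COM"):
    """Assemble a Kerberos principal string; omit hostname for user principals."""
    if hostname:
        return f"{primary}/{hostname}@{realm}"
    return f"{primary}@{realm}"

def is_valid_principal(principal):
    """Loose syntactic check on the user/hostname@REALM shape."""
    return PRINCIPAL_RE.match(principal) is not None
```

For example, `make_principal("hadoop", "bigdata-03")` yields `hadoop/bigdata-03@HADOOP.COM`, matching the principals created above.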

Use listprincs (alias list_principals) to list all principals.
3.2 Create the keytab file
kadmin.local -q "xst -k /root/keytabs/kerberos/hadoop.keytab hadoop/bigdata-03@HADOOP.COM"
kadmin.local -q "xst -k /root/keytabs/kerberos/hadoop.keytab hadoop/bigdata-05@HADOOP.COM"

Verify:
klist -kt /root/keytabs/kerberos/hadoop.keytab
klist -kt /home/gpadmin/hadoop.keytab
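For keytabs with many entries, the `klist -kt` output can be summarized with a short script. This is an illustrative sketch assuming the typical MIT Kerberos output layout; the sample text below is made up to match the principals exported above:

```python
def principals_from_klist(output):
    """Extract distinct principal names from `klist -kt` output.

    Assumes the usual MIT Kerberos layout: a header, a separator row, then
    rows of `KVNO Timestamp Principal` with the principal as the last field.
    """
    principals = []
    for line in output.splitlines():
        parts = line.split()
        # Data rows start with a numeric KVNO; headers and separators do not.
        if parts and parts[0].isdigit() and "@" in parts[-1]:
            if parts[-1] not in principals:
                principals.append(parts[-1])
    return principals

sample = """Keytab name: FILE:/root/keytabs/kerberos/hadoop.keytab
KVNO Timestamp           Principal
---- ------------------- ------------------------------------------
   2 11/22/2022 10:01:02 hadoop/bigdata-03@HADOOP.COM
   2 11/22/2022 10:01:05 hadoop/bigdata-05@HADOOP.COM
"""
```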

3.3 Hadoop configuration changes
1) Additions to core-site.xml
<property>
<name>hadoop.security.authentication</name>
<value>kerberos</value>
</property>
<property>
<name>hadoop.security.authorization</name>
<value>true</value>
</property>

2) Additions to hdfs-site.xml (note: the realm must match the principals created in 3.1, i.e. HADOOP.COM)
<!-- Require Kerberos-backed block access tokens when reading DataNode blocks -->
<property>
  <name>dfs.block.access.token.enable</name>
  <value>true</value>
</property>
<property>
  <name>dfs.permissions.enabled</name>
  <value>false</value>
</property>
<!-- Kerberos principal of the NameNode service; _HOST resolves to the local hostname -->
<property>
  <name>dfs.namenode.kerberos.principal</name>
  <value>hadoop/_HOST@HADOOP.COM</value>
</property>
<!-- Keytab file of the NameNode service -->
<property>
  <name>dfs.namenode.keytab.file</name>
  <value>/home/gpadmin/hadoop.keytab</value>
</property>

<!-- Kerberos principal of the Secondary NameNode service -->
<property>
  <name>dfs.secondary.namenode.kerberos.principal</name>
  <value>hadoop/_HOST@HADOOP.COM</value>
</property>
<!-- Keytab file of the Secondary NameNode service -->
<property>
  <name>dfs.secondary.namenode.keytab.file</name>
  <value>/home/gpadmin/hadoop.keytab</value>
</property>

<!-- Kerberos principal of the WebHDFS REST service -->
<property>
  <name>dfs.web.authentication.kerberos.principal</name>
  <value>hadoop/_HOST@HADOOP.COM</value>
</property>
<!-- Keytab file for the Hadoop web UIs -->
<property>
  <name>dfs.web.authentication.kerberos.keytab</name>
  <value>/home/gpadmin/hadoop.keytab</value>
</property>

<!-- Kerberos principal of the DataNode service -->
<property>
  <name>dfs.datanode.kerberos.principal</name>
  <value>hadoop/_HOST@HADOOP.COM</value>
</property>
<!-- Keytab file of the DataNode service -->
<property>
  <name>dfs.datanode.keytab.file</name>
  <value>/home/gpadmin/hadoop.keytab</value>
</property>

<!-- Protect DataNode data transfer with authentication only -->
<property>
  <name>dfs.data.transfer.protection</name>
  <value>authentication</value>
</property>

<!-- Use HTTPS -->
<property>
  <name>dfs.http.policy</name>
  <value>HTTPS_ONLY</value>
  <description>All enabled web UIs use HTTPS; the details are configured in the ssl-server and ssl-client files.</description>
</property>
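Hadoop substitutes the `_HOST` placeholder in these principal values with the local host's name when each daemon starts. A minimal Python sketch of that substitution (an illustration, not Hadoop's actual implementation):

```python
import socket

def resolve_principal(principal, hostname=None):
    """Replace the _HOST placeholder with the (lowercased) local hostname,
    mirroring how Hadoop resolves service principals at daemon startup."""
    host = (hostname or socket.getfqdn()).lower()
    primary, _, rest = principal.partition("/")
    if rest.startswith("_HOST@"):
        return f"{primary}/{host}@{rest.split('@', 1)[1]}"
    return principal  # user principals and fixed hostnames pass through
```

On the bigdata-05 host, `resolve_principal("hadoop/_HOST@HADOOP.COM")` would therefore yield `hadoop/bigdata-05@HADOOP.COM`, which is why one configuration file works on every node.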

3) Additions to yarn-site.xml
<!-- Kerberos principal of the ResourceManager service -->
<property>
  <name>yarn.resourcemanager.principal</name>
  <value>hadoop/_HOST@HADOOP.COM</value>
</property>

<!-- Keytab file of the ResourceManager service -->
<property>
  <name>yarn.resourcemanager.keytab</name>
  <value>/home/gpadmin/hadoop.keytab</value>
</property>

<!-- Kerberos principal of the NodeManager service -->
<property>
  <name>yarn.nodemanager.principal</name>
  <value>hadoop/_HOST@HADOOP.COM</value>
</property>

<!-- Keytab file of the NodeManager service -->
<property>
  <name>yarn.nodemanager.keytab</name>
  <value>/home/gpadmin/hadoop.keytab</value>
</property>

4) Additions to mapred-site.xml
<!-- Kerberos principal of the JobHistory server -->
<property>
  <name>mapreduce.jobhistory.principal</name>
  <value>hadoop/_HOST@HADOOP.COM</value>
</property>

<!-- Keytab file of the JobHistory server -->
<property>
  <name>mapreduce.jobhistory.keytab</name>
  <value>/home/gpadmin/hadoop.keytab</value>
</property>
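The fragments above all pair a principal property with a keytab path. As a hedged illustration (not from the original article), a small script using Python's standard XML parser can load such a Hadoop-style configuration file and flag values that do not fit the expected pattern; the sample fragment is abbreviated:

```python
import xml.etree.ElementTree as ET

def load_hadoop_conf(xml_text):
    """Parse a Hadoop-style <configuration> document into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}

def check_kerberos_conf(conf, realm="HADOOP.COM"):
    """Return property names whose values look wrong: principal values must
    end in @<realm>, and keytab values must be absolute paths."""
    bad = []
    for name, value in conf.items():
        if "principal" in name and not value.endswith("@" + realm):
            bad.append(name)
        if "keytab" in name and "principal" not in name and not value.startswith("/"):
            bad.append(name)
    return bad

sample = """<configuration>
  <property><name>yarn.resourcemanager.principal</name><value>hadoop/_HOST@HADOOP.COM</value></property>
  <property><name>yarn.resourcemanager.keytab</name><value>/home/gpadmin/hadoop.keytab</value></property>
</configuration>"""
```

A realm mismatch between the configuration files and the principals created in kadmin is exactly the kind of error this catches early, before daemons fail to log in at startup.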

5) Additions to ssl-server.xml
<property>
<name>ssl.server.truststore.location</name>
<value>/home/gpadmin/kerberos_https/keystore</value>
<description>Truststore to be used by NN and DN. Must be specified.
</description>
</property>

<property>
<name>ssl.server.truststore.password</name>
<value>password</value>
<description>Optional. Default value is "".
</description>
</property>

<property>
<name>ssl.server.truststore.type</name>
<value>jks</value>
<description>Optional. The keystore file format, default value is "jks".
</description>
</property>

<property>
<name>ssl.server.truststore.reload.interval</name>
<value>10000</value>
<description>Truststore reload check interval, in milliseconds.
Default value is 10000 (10 seconds).
</description>
</property>

<property>
<name>ssl.server.keystore.location</name>
<value>/home/gpadmin/kerberos_https/keystore</value>
<description>Keystore to be used by NN and DN. Must be specified.
</description>
</property>

<property>
<name>ssl.server.keystore.password</name>
<value>password</value>
<description>Must be specified.
</description>
</property>

<property>
<name>ssl.server.keystore.keypassword</name>
<value>password</value>
<description>Must be specified.
</description>
</property>

<property>
<name>ssl.server.keystore.type</name>
<value>jks</value>
<description>Optional. The keystore file format, default value is "jks".
</description>
</property>

<property>
<name>ssl.server.exclude.cipher.list</name>
<value>TLS_ECDHE_RSA_WITH_RC4_128_SHA,SSL_DHE_RSA_EXPORT_WITH_DES40_CBC_SHA,
SSL_RSA_WITH_DES_CBC_SHA,SSL_DHE_RSA_WITH_DES_CBC_SHA,
SSL_RSA_EXPORT_WITH_RC4_40_MD5,SSL_RSA_EXPORT_WITH_DES40_CBC_SHA,
SSL_RSA_WITH_RC4_128_MD5</value>
<description>Optional. The weak security cipher suites that you want excluded
from SSL communication.</description>
</property>

3.4 HTTPS certificate configuration
keytool -genkey -keyalg RSA -keysize 2048 \
  -keystore /home/gpadmin/kerberos_https/keystore \
  -alias hadoop -validity 365000 \
  -dname "CN=hadoop, OU=shsnc, O=snc, L=hunan, ST=changsha, C=CN"

This generates the keystore certificate file referenced by ssl-server.xml.
3.5 认证测试
查看 hdfs 目录:hdfs  dfs  -ls  /
报错信息:2022-11-22 10:22:15,444 WARN ipc.Client: Exception encountered while connecting to the server
org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]
说明已加了认证后不能直接访问,客户端先进行认证才能正常访问目录结构。
kinit  -kt /home/gpadmin/hadoop.keytabhadoop/bigdata-05@HADOOP.COM


Flink Authentication Configuration

4.1 Authentication user configuration
Which principal to use depends on the dfs.permissions.enabled property in hdfs-site.xml:
<property>
<name>dfs.permissions.enabled</name>
<value>true</value>
</property>

  • When true, the credential must be created for the Hadoop installation user; e.g. if Hadoop was installed as gpadmin:
    kadmin.local -q "xst -k /root/keytabs/kerberos/hadoop.keytab gpadmin@HADOOP.COM"
  • When false, the credential need not belong to the Hadoop installation user:
    kadmin.local -q "xst -k /root/keytabs/kerberos/hadoop.keytab xx@HADOOP.COM"
Verify:
klist -kt /root/keytabs/kerberos/hadoop.keytab
4.2 Additions to flink-conf.yaml
security.kerberos.login.use-ticket-cache: true
security.kerberos.login.keytab: /home/gpadmin/hadoop.keytab
security.kerberos.login.principal: gpadmin@HADOOP.COM
security.kerberos.login.contexts: Client
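flink-conf.yaml uses flat `key: value` lines, so the Kerberos settings above are easy to sanity-check with a script. An illustrative sketch (the helper names are hypothetical, not part of Flink):

```python
# Keys that keytab-based login needs; both must be set together.
REQUIRED_KEYS = {
    "security.kerberos.login.keytab",
    "security.kerberos.login.principal",
}

def parse_flink_conf(text):
    """Parse flat `key: value` lines as used by flink-conf.yaml,
    skipping blanks and # comments; returns a {key: value} dict."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        conf[key.strip()] = value.strip()
    return conf

def missing_kerberos_keys(conf):
    """Required keytab-login keys that are absent from the parsed config."""
    return sorted(REQUIRED_KEYS - set(conf))

sample = """security.kerberos.login.use-ticket-cache: true
security.kerberos.login.keytab: /home/gpadmin/hadoop.keytab
security.kerberos.login.principal: gpadmin@HADOOP.COM
security.kerberos.login.contexts: Client
"""
```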

4.3 Authentication test
flink run -m yarn-cluster \
-p 1 \
-yjm 1024 \
-ytm 1024 \
-ynm amp_zabbix \
-c com.shsnc.fk.task.tokafka.ExtratMessage2KafkaTask \
-yt /home/gpadmin/jar_repo/config/krb5.conf \
-yD env.java.opts.jobmanager=-Djava.security.krb5.conf=krb5.conf \
-yD env.java.opts.taskmanager=-Djava.security.krb5.conf=krb5.conf \
-yD security.kerberos.login.keytab=/home/gpadmin/hadoop.keytab \
-yD security.kerberos.login.principal=gpadmin@HADOOP.COM \
$jarname

With these authentication parameters added to the flink run command, the job submits to the YARN cluster normally and the logs show no authentication errors, confirming that the authentication configuration works.


Author: Changyan Architecture Group (Shanghai Xinju, Wang Jian's team)

Source: the "IT那活儿" official WeChat account

