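By default, solr-infra stores Ranger audit data on a single host. On CDP Private Cloud Base 7.1.1 to 7.1.6, TTL-based deletion does not take effect, so the audit data ends up consuming a large amount of disk space. In the previous installment we did not upgrade; instead we set a policy that keeps 90 days of data by default and removed the older data. This installment tunes solr-infra itself: the Solr collection is sharded according to the number of machines, so the data is spread evenly across hosts and stored redundantly. Sharding is a good option when searches take too long or the index approaches the physical limits of its machine, and it avoids putting excessive load on a single node.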

0. ENV

CentOS 7.6
CDP Private Cloud Base 7.1.6
Kerberos authentication is enabled.

1. Log in to any Solr Server as the solr user with Kerberos authentication

The list of Solr Servers can be found under CM > CDP-INFRA-SOLR > Instances.
Change to the newest Solr Server process directory:
[root@dn08 ~]# cd /var/run/cloudera-scm-agent/process/8581-solr-SOLR_SERVER   # 8581 is the newest Solr process directory
Inspect the principals in the solr keytab:
[root@dn08 ~]# klist -ket solr.keytab
Log in to Kerberos as the solr user with the solr.keytab file:
[root@dn08 ~]# kinit -kt solr.keytab solr/dn08.rundba.com@RUNDBA.COM
Confirm that the current principal is solr:
[root@dn08 8581-solr-SOLR_SERVER]# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: solr/dn08.rundba.com@RUNDBA.COM

Valid starting       Expires              Service principal
02/16/2022 13:20:19  02/17/2022 13:20:19  krbtgt/RUNDBA.COM@RUNDBA.COM
        renew until 02/21/2022 13:20:19
02/16/2022 13:20:57  02/17/2022 13:20:19  HTTP/dn01.rundba.com@RUNDBA.COM
        renew until 02/21/2022 13:20:19
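
The process directory in this step was picked by hand. A minimal sketch of automating that choice follows. The name latest_solr_process_dir is a hypothetical helper, not part of solrctl or Cloudera Manager, and it assumes that the most recently modified *-solr-SOLR_SERVER directory is the one in use, which mirrors the ls -lrt check used later in this article.

```shell
# Hypothetical helper: print the most recently modified Solr Server
# process directory under the Cloudera Manager agent process path.
# Assumption: newest modification time means "current" process directory.
latest_solr_process_dir() {
    base="${1:-/var/run/cloudera-scm-agent/process}"
    # -d lists the directories themselves, -t sorts newest first by mtime;
    # errors are silenced so that "no match" simply prints nothing.
    ls -1dt "$base"/*-solr-SOLR_SERVER 2>/dev/null | head -n 1
}

# Usage (not executed here):
#   cd "$(latest_solr_process_dir)"
```

With this in place the hard-coded 8581 path could be replaced by the command substitution shown in the usage comment.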
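
2. Delete the existing ranger_audits collection

Check disk usage before the deletion:
[root@dn08 data]# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root  442G  199G  244G  45% /
...
Delete the existing Ranger audit collection:
[root@dn08 8581-solr-SOLR_SERVER]# solrctl collection --delete ranger_audits
Check disk usage after the deletion:
[root@dn08 data]# df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/mapper/centos-root  442G   45G  398G  11% /
devtmpfs                 252G     0  252G   0% /dev
A large amount of space has been released. If the Solr data was small to begin with, the deletion will make little visible difference to disk usage.
List the Solr collections.
The list no longer contains a ranger_audits line, which shows that the ranger_audits collection has been deleted.
[root@dn08 8581-solr-SOLR_SERVER]# solrctl collection --list
vertex_index (5)
edge_index (5)
fulltext_index (5)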
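
3. Configure the number of Solr shards and replicas according to the number of hosts

3.1 Set the shard count and the number of data copies according to the number of hosts
Assume there are 10 DataNode hosts in total. Ranger audits is configured with 5 Solr shards and 2 copies of each shard, which makes use of all 10 hosts as Solr Servers.
In CM > Ranger > Configuration, search for the keyword audit, then set Shards for Solr Collection of Ranger Audits to 5 and Replicas for Solr Collection of Ranger Audits to 2.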
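
The sizing rule above is simple arithmetic: shards multiplied by replicas gives the number of cores that Solr has to place on hosts. The sketch below works through it. check_shard_layout is a hypothetical helper, and the "at most one core per host" target is an assumption taken from the 10-host example, not a Solr requirement.

```shell
# Hypothetical helper: compare shards * replicas with the number of hosts.
# Arguments: <hosts> <shards> <replicas>
check_shard_layout() {
    hosts=$1
    shards=$2
    replicas=$3
    total=$((shards * replicas))   # total cores to place
    if [ "$total" -le "$hosts" ]; then
        echo "$shards shards x $replicas replicas = $total cores: at most one core per host on $hosts hosts"
    else
        echo "$shards shards x $replicas replicas = $total cores: some of the $hosts hosts hold several cores"
    fi
}

# Usage (not executed here):
#   check_shard_layout 10 5 2
```

For the example in this article, 5 shards with 2 replicas each gives 10 cores, one for each of the 10 hosts.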

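After a short wait, click the prompt to restart the Ranger service.

Click "Restart Stale Services".

Rolling restart is selected by default; click "Restart Now".

Click Continue.

3.2 List the Solr collections again from the command line
The list now contains a ranger_audits line, which shows that the ranger_audits collection has been recreated with the new settings.
[root@dn08 8581-solr-SOLR_SERVER]# solrctl collection --list
vertex_index (5)
edge_index (5)
fulltext_index (5)
ranger_audits (5)   # this line is now present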

4. Verify that the shard configuration has taken effect

4.1 View the shards from the command line
[root@dn08 8581-solr-SOLR_SERVER]# solrctl cluster --get-clusterstate /tmp/clusterstate_`date +%Y%m%d%H%M%S`.json
[root@dn08 8581-solr-SOLR_SERVER]# ls -lrt /tmp/*json
-rw-r--r-- 1 root root 8826 Feb 16 14:07 /tmp/clusterstate_20220216140706.json
[root@dn08 8581-solr-SOLR_SERVER]# view /tmp/clusterstate_20220216140706.json
There are 5 shards, each with 2 replicas. An excerpt (starting around line 91 of the file):
"ranger_audits":{
  "pullReplicas":"0",
  "replicationFactor":"2",
  "shards":{
    "shard1":{
      "range":"80000000-b332ffff",
      "state":"active",
      "replicas":{
        "core_node3":{
          "core":"ranger_audits_shard1_replica_n1",
          "base_url":"http://dn01.rundba.com:8993/solr",
          "node_name":"dn01.rundba.com:8993_solr",
          "state":"active",
          "type":"NRT",
          "force_set_state":"false"},
        "core_node5":{
          "core":"ranger_audits_shard1_replica_n2",
          "base_url":"http://dn07.rundba.com:8993/solr",
          "node_name":"dn07.rundba.com:8993_solr",
          "state":"active",
          "type":"NRT",
          "force_set_state":"false",
          "leader":"true"}}},
...
This can also be confirmed in the Solr web UI.
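
The same check can be scripted instead of read by eye. The following is a rough sketch, assuming the core naming pattern <collection>_shard<N>_replica_n<M> seen in the excerpt above. summarize_collection is a hypothetical helper, and it greps the text rather than parsing the JSON properly, so treat the result as a quick sanity check only.

```shell
# Hypothetical helper: count distinct shards and replica cores of one
# collection in a clusterstate JSON file.
# Arguments: <clusterstate.json> <collection>
summarize_collection() {
    file=$1
    collection=$2
    # Collect the unique core names that belong to the collection.
    cores=$(grep -o "\"core\"[[:space:]]*:[[:space:]]*\"${collection}_shard[0-9]*_replica_n[0-9]*\"" "$file" | sort -u)
    # Every core is one replica; grep -c . counts the non-empty lines.
    replicas=$(printf '%s\n' "$cores" | grep -c .)
    # Reduce each core name to its shard number and count the distinct ones.
    shards=$(printf '%s\n' "$cores" | sed 's/.*_shard\([0-9]*\)_replica.*/\1/' | sort -u | grep -c .)
    echo "shards=$shards replicas=$replicas"
}

# Usage (not executed here):
#   summarize_collection /tmp/clusterstate_20220216140706.json ranger_audits
```

With 5 shards and 2 replicas per shard, the expectation for this cluster would be 5 distinct shards and 10 cores in total.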

4.2 View the shards in the "Solr Server Web UI"
There are now 5 shards, each with 2 copies of the data.

5. Working around the bug

In CDP 7.1.1 to 7.1.6, Solr is configured to keep 90 days of Ranger audit data by default, but the TTL does not take effect. Without upgrading to CDP 7.1.7, this can be fixed by changing the order of the processors in the configuration file.
5.1 Download the Solr configuration to the /tmp/ranger_audits directory
[root@dn08 8581-solr-SOLR_SERVER]# solrctl instancedir --get ranger_audits /tmp/ranger_audits
Downloading configs from yw-namenode02.rundba.com:2181,yw-namenode01.rundba.com:2181,yw-cm.rundba.com:2181/solr-infra to /tmp/ranger_audits. This may take up to a minute.
[root@dn08 8581-solr-SOLR_SERVER]# pwd
/var/run/cloudera-scm-agent/process/8581-solr-SOLR_SERVER
5.2 Reorder the configuration
[root@dn08 8581-solr-SOLR_SERVER]# cd /tmp/ranger_audits/
[root@dn08 ranger_audits]# cd conf/
[root@dn08 conf]# ls
admin-extra.html              admin-extra.menu-top.html  LICENSE.txt     NOTICE.txt      solrconfig.xml.j2
admin-extra.menu-bottom.html  elevate.xml                managed-schema  solrconfig.xml  zknode.data
[root@dn08 conf]# vim solrconfig.xml
...
Move the following 3 lines (lines 1045 to 1047 in this file; delete them from their original position first):
<processor class="solr.LogUpdateProcessorFactory"/>
<processor class="solr.DistributedUpdateProcessorFactory"/>
<processor class="solr.RunUpdateProcessorFactory"/>
The section after the change (the moved lines end up at lines 1057 to 1059):
         processor="uuid,remove-blank,field-name-mutating,parse-boolean,parse-long,parse-double,parse-date,add-schema-fields">
  <!-- the three lines were removed from here -->
  <processor class="solr.DefaultValueUpdateProcessorFactory">
    <str name="fieldName">_ttl_</str>
    <str name="value">+90DAYS</str>
  </processor>
  <processor class="solr.processor.DocExpirationUpdateProcessorFactory">
    <int name="autoDeletePeriodSeconds">86400</int>
    <str name="ttlFieldName">_ttl_</str>
    <str name="expirationFieldName">_expire_at_</str>
  </processor>
  <processor class="solr.FirstFieldValueUpdateProcessorFactory">
    <str name="fieldName">_expire_at_</str>
  </processor>
  <!-- the three lines were moved to after this point -->
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
5.3 Upload the configuration
1) The first upload attempt fails
[root@dn08 conf]# solrctl instancedir --update ranger_audits /tmp/ranger_audits
Error: can't delete configuration
2) Find the latest jaas.conf configuration file
List the process directories; the 8581-solr-SOLR_SERVER entry is the most recent Solr configuration.
[root@dn08 conf]# ls -lrt /var/run/cloudera-scm-agent/process
total 0
...
drwxr-x--x 6 kudu   kudu   420 Feb 14 02:47 10225-KUDU-kudu-KUDU_TSERVER-7517310a47b84873982699b9f8a0f11c-RoleDiagnostics
drwxr-x--x 3 root   root   220 Feb 14 02:47 10197-host-inspector
drwxr-x--x 4 root   root   240 Feb 14 02:47 10147-collect-host-statistics
drwxr-x--x 7 impala impala 420 Feb 16 04:09 8703-impala-IMPALAD
drwxr-x--x 5 solr   solr   340 Feb 16 14:15 8581-solr-SOLR_SERVER   # this is the most recent configuration
3) Locate jaas.conf among the Solr configuration files in 8581
[root@dn08 conf]# ls -lrt /var/run/cloudera-scm-agent/process/8581-solr-SOLR_SERVER
total 60
-rw------- 1 solr solr 1474 Nov 24 19:45 solr.keytab
-rw------- 1 root root 5936 Nov 24 19:45 supervisor.conf
drwxr-x--x 2 solr solr   80 Nov 24 19:45 logs
-rw------- 1 root root 6070 Nov 24 19:45 proc.json
-rw------- 1 root root 8242 Nov 24 19:45 config.zip
-rw------- 1 solr solr    0 Nov 24 19:45 exit_code
-rw-r----- 1 solr solr  329 Nov 24 19:45 cloudera-monitor.properties
-rw-r----- 1 solr solr  250 Nov 24 19:45 jaas.conf
-rw-r----- 1 solr solr 3969 Nov 24 19:45 log4j2.properties
-rw-r----- 1 solr solr 1661 Nov 24 19:45 redaction-rules.json
drwxr-x--x 2 solr solr  180 Nov 24 19:45 hadoop-conf
-rw-r----- 1 solr solr  344 Nov 24 19:45 cloudera-stack-monitor.properties
drwxr-xr-x 2 solr solr   40 Nov 24 19:45 temp
-rw------- 1 root root 1470 Feb 16 14:16 krb5cc_cm_agent_solr
-rw------- 1 root root  549 Feb 16 14:18 supervisor_status
4) Upload again, passing the jaas.conf from 8581; this time it succeeds
[root@dn08 conf]# solrctl --jaas /var/run/cloudera-scm-agent/process/8581-solr-SOLR_SERVER/jaas.conf instancedir --update ranger_audits /tmp/ranger_audits
Uploading configs from /tmp/ranger_audits/conf to yw-namenode02.rundba.com:2181,yw-namenode01.rundba.com:2181,yw-cm.rundba.com:2181/solr-infra. This may take up to a minute.
5.4 Reload the Solr configuration
[root@dn08 conf]# solrctl collection --reload ranger_audits
5.5 Verify
Download the Solr configuration again into a new directory and inspect the downloaded copy:
[root@dn08 conf]# solrctl instancedir --get ranger_audits /tmp/ranger_audits_new
Downloading configs from yw-namenode02.rundba.com:2181,yw-namenode01.rundba.com:2181,yw-cm.rundba.com:2181/solr-infra to /tmp/ranger_audits_new. This may take up to a minute.
[root@dn08 conf]# view /tmp/ranger_audits_new/conf/solrconfig.xml
<!-- The update.autoCreateFields property can be turned to false to disable schemaless mode -->
<updateRequestProcessorChain name="add-unknown-fields-to-the-schema" default="${update.autoCreateFields:true}"
         processor="uuid,remove-blank,field-name-mutating,parse-boolean,parse-long,parse-double,parse-date,add-schema-fields">
  <processor class="solr.DefaultValueUpdateProcessorFactory">
    <str name="fieldName">_ttl_</str>
    <str name="value">+90DAYS</str>
  </processor>
  <processor class="solr.processor.DocExpirationUpdateProcessorFactory">
    <int name="autoDeletePeriodSeconds">86400</int>
    <str name="ttlFieldName">_ttl_</str>
    <str name="expirationFieldName">_expire_at_</str>
  </processor>
  <processor class="solr.FirstFieldValueUpdateProcessorFactory">
    <str name="fieldName">_expire_at_</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.DistributedUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
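
Solr's documentation notes that almost every update chain should end with solr.RunUpdateProcessorFactory, because that processor is the one that actually executes the update. That is presumably why the TTL processors had no effect while they sat after it. The sketch below checks the order in a downloaded solrconfig.xml. ttl_processor_order_ok is a hypothetical helper, and it assumes the one-element-per-line layout and the chain name add-unknown-fields-to-the-schema shown in this article.

```shell
# Hypothetical helper: succeed (exit 0) only if, inside the
# add-unknown-fields-to-the-schema chain, DocExpirationUpdateProcessorFactory
# appears on an earlier line than solr.RunUpdateProcessorFactory.
# Argument: <path to solrconfig.xml>
ttl_processor_order_ok() {
    awk '
        /<updateRequestProcessorChain/ && /add-unknown-fields-to-the-schema/ { in_chain = 1 }
        in_chain && /DocExpirationUpdateProcessorFactory/ { if (!ttl) ttl = NR }
        in_chain && /solr\.RunUpdateProcessorFactory/     { if (!run) run = NR }
        in_chain && /<\/updateRequestProcessorChain>/     { in_chain = 0 }
        END { exit !(ttl && run && ttl < run) }
    ' "$1"
}

# Usage (not executed here):
#   ttl_processor_order_ok /tmp/ranger_audits_new/conf/solrconfig.xml && echo "TTL processors run before the update"
```

This is a line-based text check rather than an XML parse, so a file with a different layout would need a different approach.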

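6. Summary

In a Kerberos environment, we sharded the Solr collection that holds the Ranger audit data, chose a shard count that matches the number of machines, and configured replicas for redundancy. This helps Solr performance when a large cluster produces a large volume of audit data. We also manually fixed the bug in CDP 7.1.1 to 7.1.6 that prevents historical data from being purged.

7. References

https://community.cloudera.com/t5/Community-Articles/split-shards-and-add-replicas-for-collection-quot-ranger/ta-p/245141
https://solr.apache.org/guide/6_6/introduction-to-scaling-and-distribution.html
https://solr.apache.org/guide/6_6/distributed-search-with-index-sharding.html
This article is shared for discussion; corrections and suggestions are welcome.
Author: Wang Kun, WeChat official account: rundba. Reposting is welcome; please credit the source.
To repost on a WeChat official account, please contact WeChat: landnow.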





