Tags: ELK scaling
Scaling Out the Compute Nodes of an ELK Log Platform
This article focuses on the steps for adding compute nodes to an ELK log platform.
1. Create a standalone cluster from the new nodes:
One thing to watch out for when scaling elasticsearch: because my cluster already uses the x-pack plugin, new nodes cannot be added to the current cluster directly. The three new machines must first form a standalone cluster of their own and have x-pack configured, and only then join the existing cluster; otherwise you will run into problems. The specific commands are as follows:
1. Initialization:
systemctl stop firewalld && systemctl disable firewalld && systemctl stop iptables && systemctl disable iptables && iptables -L -n
mkdir -p /software
wget -O elasticsearch-6.2.4.tar.gz ftp://bqjrftp:Pass123$%^@192.168.20.27:9020/software/elasticsearch-6.2.4.tar.gz
tar xzvf elasticsearch-6.2.4.tar.gz -C /usr/local/
cd /usr/local
ln -sv elasticsearch-6.2.4/ elasticsearch
cd elasticsearch/config/
sed -i '/'"$HOSTNAME"'/'d /etc/hosts
echo -e "192.168.1.210 SZ1PRDELK00AP0016\n192.168.1.211 SZ1PRDELK00AP0017\n192.168.1.212 SZ1PRDELK00AP0018" >> /etc/hosts
mkdir -p /data/es-data && mkdir -p /var/log/elasticsearch
mkdir -p /var/lib/elasticsearch
mkdir -p /var/run/elasticsearch
useradd elasticsearch
chown -R elasticsearch:elasticsearch /data/es-data && chown -R elasticsearch:elasticsearch /var/log/elasticsearch
chown -R elasticsearch:elasticsearch /usr/local/elasticsearch /usr/local/elasticsearch-6.2.4/
chown -R elasticsearch:elasticsearch /var/lib/elasticsearch/
chown -R elasticsearch:elasticsearch /var/run/elasticsearch/
# When starting the elasticsearch service, startup reports errors related to the #bootstrap.memory_lock: true block in the configuration file; fix as follows:
echo -e "fs.file-max = 65536\nvm.max_map_count = 655360" >> /etc/sysctl.conf
sysctl -p
echo -e "elasticsearch soft nofile 65539\nelasticsearch hard nofile 65539" >> /etc/security/limits.conf
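It's worth a quick sanity check of my own here to confirm the kernel and file-descriptor settings actually took effect (su - opens a fresh login shell, so limits.conf is re-read):
sysctl vm.max_map_count fs.file-max
su - elasticsearch -c 'ulimit -Sn; ulimit -Hn'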
# Configure the JVM heap parameters ES starts with; the default is 1G (keep the heap at or below half of the machine's physical RAM)
vim jvm.options
-Xms14g
-Xmx14g
2. Modify the standalone cluster's configuration files:
# First edit the elasticsearch.yml configuration file on the first new data node:
cluster.name: my-elk
node.name: data-7
node.data: true
node.ingest: true
node.master: true
path.data: /data/es-data
path.logs: /var/log/elasticsearch
network.host: 192.168.1.210
http.port: 9200
discovery.zen.ping.unicast.hosts: ["192.168.1.210", "192.168.1.211", "192.168.1.212"]
discovery.zen.minimum_master_nodes: 2
# Next, set up the elasticsearch service startup script on the first new data node, saved as /etc/init.d/es:
#!/bin/bash
#
# elasticsearch <summary>
#
# chkconfig: 2345 80 20
# description: Starts and stops a single elasticsearch instance on this system
#
### BEGIN INIT INFO
# Provides: Elasticsearch
# Required-Start: $network $named
# Required-Stop: $network $named
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: This service manages the elasticsearch daemon
# Description: Elasticsearch is a very scalable, schema-free and high-performance search solution supporting multi-tenancy and near realtime search.
### END INIT INFO
# the NumberOfGCLogFiles JVM flag (set in jvm.options) reliably identifies the ES java process
pid_num=$(ps aux | grep elasticsearch | grep NumberOfGCLogFiles | grep -v grep | awk '{print $2}')
start() {
    su - elasticsearch -c "nohup /usr/local/elasticsearch/bin/elasticsearch >/dev/null 2>&1 &"
}
stop() {
    if [ $(ps aux | grep elasticsearch | grep NumberOfGCLogFiles | grep -v grep | wc -l) -eq 1 ];then
        kill -9 ${pid_num}
    fi
}
status() {
    if [ $(ps aux | grep elasticsearch | grep NumberOfGCLogFiles | grep -v grep | wc -l) -eq 1 ];then
        echo "elasticsearch service is running"
    else
        echo "elasticsearch service is stopped"
    fi
}
case $1 in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart)
        stop
        sleep 2
        start
        ;;
    status)
        status
        ;;
    *)
        echo "service accepts arguments start|stop|restart|status"
esac
chmod +x /etc/init.d/es
service es start
# Start the es service and watch the logs to confirm the cluster is healthy;
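Optionally, since the script already carries a chkconfig header, you can register it so es also starts on boot (standard chkconfig usage on a CentOS/RHEL-style system):
chkconfig --add es
chkconfig es on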
The second and third compute nodes are configured just like the first one; only the IP address and the node name differ.
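Once all three nodes are up, a quick call to the cluster health API confirms the standalone cluster has actually formed (no authentication is needed yet, since x-pack is not installed at this stage):
curl -s 'http://192.168.1.210:9200/_cluster/health?pretty'
# expect "number_of_nodes" : 3 and "status" : "green"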
3. Configure X-Pack on the standalone cluster:
cd /software
wget ftp://bqjrftp:Pass123$%^@192.168.20.27:9020/software/ELK_201906/x-pack-6.2.4.zip
cp -r /software/x-pack-6.2.4.zip /home/elasticsearch/
chown -R elasticsearch:elasticsearch /home/elasticsearch/x-pack-6.2.4.zip
su - elasticsearch
/usr/local/elasticsearch/bin/elasticsearch-plugin install file:///home/elasticsearch/x-pack-6.2.4.zip
Press y twice to confirm the x-pack installation. Once it completes, an x-pack directory appears under plugins, and another x-pack directory under config:
ls /usr/local/elasticsearch/plugins/
x-pack
ls /usr/local/elasticsearch/config/x-pack/
log4j2.properties role_mapping.yml roles.yml users users_roles
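The installation can also be confirmed with elasticsearch-plugin itself:
su - elasticsearch -c '/usr/local/elasticsearch/bin/elasticsearch-plugin list'
x-pack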
After installation, restart the es service. After the restart, on each new node you need to swap in the cracked x-pack-core-6.2.4.jar, download the license.json file, disable x-pack first, and then restart the service; as for the cracking process itself, for copyright reasons please search for it online yourself.
cd /usr/local/elasticsearch/plugins/x-pack/x-pack-core
wget ftp://bqjrftp:Pass123$%^@192.168.20.27:9020/software/ELK_201906/x-pack-core-6.2.4.jar
wget -O /home/elasticsearch/license.json ftp://bqjrftp:Pass123$%^@192.168.20.27:9020/software/ELK_201906/license.json
chown -R elasticsearch:elasticsearch /home/elasticsearch/license.json
chown -R elasticsearch:elasticsearch x-pack-core-6.2.4.jar
vim /usr/local/elasticsearch/config/elasticsearch.yml
xpack.security.enabled: false
This setting must be added to elasticsearch-6.2.4/config/elasticsearch.yml on every elasticsearch node; it is required in order to upload the license file.
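If you don't want to edit each file by hand, here is a small sketch for pushing the setting to the three new nodes in one go; it assumes passwordless SSH as root and uses the hostnames written to /etc/hosts earlier (the existing production nodes need the same line as well):
for h in SZ1PRDELK00AP0016 SZ1PRDELK00AP0017 SZ1PRDELK00AP0018; do
    ssh root@$h 'grep -q "^xpack.security.enabled" /usr/local/elasticsearch/config/elasticsearch.yml || echo "xpack.security.enabled: false" >> /usr/local/elasticsearch/config/elasticsearch.yml'
done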
service es restart
With the cracked x-pack package in place and the x-pack feature disabled, replace the license with the prepared license.json. Do this on each of the new nodes:
su - elasticsearch
curl -XPUT -u elastic:654321 'http://192.168.1.210:9200/_xpack/license' -H "Content-Type: application/json" -d @license.json
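You can then verify the license took effect with a GET against the same endpoint (same elastic credentials):
curl -s -u elastic:654321 'http://192.168.1.210:9200/_xpack/license'
# "type" and "expiry_date" should match the imported license.json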
The x-pack features were covered in an earlier article. If you do not crack x-pack, you get a one-month free trial, during which the ES nodes do not need SSL-encrypted communication between them. But if you use the cracked x-pack package and enable xpack.security.enabled: true, the ES nodes must communicate over encrypted transport. Encrypted communication was covered in a previous article, so it is not repeated here.
Now we just need to copy the certificate directory (/usr/local/elasticsearch/config/certs) and the keystore file (/usr/local/elasticsearch/config/elasticsearch.keystore) from any node of the currently running production ELK cluster to the same locations on every new node;
cd /usr/local/elasticsearch/config
wget ftp://bqjrftp:Pass123$%^@192.168.20.27:9020/software/ELK_201906/certs.tar.gz
tar xzvf certs.tar.gz
rm -rf certs.tar.gz
wget ftp://bqjrftp:Pass123$%^@192.168.20.27:9020/software/ELK_201906/elasticsearch.keystore.tar.gz
rm -rf elasticsearch.keystore
tar xzvf elasticsearch.keystore.tar.gz
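After unpacking, make sure the elasticsearch user owns the copied files (my own addition; the wget above runs as root), and confirm the keystore is readable in place with the elasticsearch-keystore tool that ships with 6.x:
chown -R elasticsearch:elasticsearch certs elasticsearch.keystore
su - elasticsearch -c '/usr/local/elasticsearch/bin/elasticsearch-keystore list'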
Modify the configuration file on each new node so that it joins the existing production cluster, then restart the service;
cluster.name: my-elk
node.name: data-7
node.data: true
node.ingest: true
path.data: /data/es-data
path.logs: /var/log/elasticsearch
network.host: 192.168.1.210
http.port: 9200
discovery.zen.ping.unicast.hosts: ["192.168.1.17", "192.168.1.18", "192.168.1.25"]
discovery.zen.minimum_master_nodes: 2
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: certs/elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: certs/elastic-certificates.p12
http.cors.enabled: true
http.cors.allow-origin: "*"
http.cors.allow-headers: Authorization,X-Requested-With,Content-Length,Content-Type
The main additions are the authentication settings, namely the certificate paths, plus enabling HTTP access; the only difference from machine to machine is the IP address. Then just restart the service on each node one by one;
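To confirm the new nodes really joined, list the cluster's members from one of the existing production nodes (192.168.1.17 comes from the unicast hosts list above, credentials as before):
curl -s -u elastic:654321 'http://192.168.1.17:9200/_cat/nodes?v'
# data-7 and the other new node names should now appear in the output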
ES cluster expansion is actually not hard; the steps are much the same as a fresh installation, and my earlier articles covered installation and configuration in detail. The crux is that you cannot expand directly: joining directly requires SSL-encrypted communication between nodes, encrypted communication requires the cracked license to be in place first, yet the license can only be applied after the nodes have joined a cluster. A classic chicken-and-egg problem. It puzzled me for a long time, until one day it clicked: first form the machines to be added into a standalone cluster of their own; use that standalone cluster to set up the same x-pack package, certificate files, and license as the original cluster; and once the standalone cluster is working, change its configuration and join it to the existing ES cluster.
Besides this expansion technique, here is a small bit of ES tuning knowledge.
Tuning rationale:
An index is stored as shards, and each shard is made up of segments. New segments are produced constantly (roughly one per second under steady indexing), the system merges them automatically in the background, and every segment holds an open file handle.
Segment merging consumes CPU and memory.
Periodically force-merging the segments of cold data therefore helps the system;
I use a shell script to force-merge the previous day's indices' segments once a day.
This keeps the number of open file handles low.
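To see the effect of a merge, the per-shard segment counts can be inspected through the _cat API before and after (same host and credentials that the script below uses):
curl -s -u test:888888 '192.168.1.19:9200/_cat/segments/elk-*?v' | head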
[root@SZ1PRDELK00AP008 ~]# cat /.scripts/ES-forcemerge-segments-index.sh
#!/bin/bash
# Path of the script's log file
segments_LOG="/var/log/forcemerge_segments.log"
# Index name prefix
INDEX_PREFIX="elk-"
# elasticsearch host ip and port
SERVER_PORT="192.168.1.19:9200"
ES_USER="test"
ES_PW="888888"
# Build yesterday's date and read the matching indices from the cluster
segments_data=$(date -d "-1 days" +"%F")
INDEXS=$(curl -s -u ${ES_USER}:${ES_PW} -XGET "${SERVER_PORT}/_cat/indices/?v"|grep "${INDEX_PREFIX}" | awk '{print $3}'|grep ${segments_data})
# Force-merge the segments of each of yesterday's indices
echo "------------------------forcemerge segments time is $(date +%Y-%m-%d_%H:%M:%S)------------------------">>${segments_LOG}
for forcemerge_segments_index in ${INDEXS}
do
    curl -s -u ${ES_USER}:${ES_PW} -XPOST "${SERVER_PORT}/${forcemerge_segments_index}/_forcemerge?max_num_segments=1"
    echo "forcemerge segments time is $(date)" >> ${segments_LOG}
    echo "${forcemerge_segments_index} merge segments succeeded" >> ${segments_LOG}
done
[root@SZ1PRDELK00AP008 ~]#
crontab -e
59 01 * * * sh /.scripts/ES-forcemerge-segments-index.sh > /dev/null 2>&1
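One optional hardening of the cron entry: wrap the script in flock (part of util-linux) so an unusually long merge can never overlap the next scheduled run:
59 01 * * * flock -n /tmp/es-forcemerge.lock sh /.scripts/ES-forcemerge-segments-index.sh > /dev/null 2>&1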
Alright, that's it for today. The projects I've been responsible for at my company lately revolve around the ELK log platform, along with APM tracing, zabbix monitoring, log monitoring, and unified alerting with unified monitoring dashboards, so upcoming posts will cover more on monitoring. Welcome to keep following my personal tech WeChat account "云时代IT运维"; you can scan the QR code to follow.