
ES Cluster Planning and Optimization

IT那活儿 2023-01-11



Background



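1.1 Current monitoring load
More than 4,000 nodes are monitored, including 400+ network devices (about 800,000 monitoring items, collected once per minute), producing roughly 200 GB of data per day.
1.2 Problem
ES runs as an ordinary 6-node cluster with no hot/cold separation. Read and write pressure is very high, so writes are slow and queries time out.

1.3 Planned measures

  • Convert the conventional cluster into a hot/cold cluster: hot nodes hold the most recent 1 month of data, cold nodes hold 6 months of historical data;
  • Replace the mechanical disks on the hot nodes with SSDs to improve read/write performance.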



ES Configuration



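2.1 Hardware originally prepared

The production environment was prepared with:

  • 3 master nodes: 64 GB memory, 8 cores, 5 TB mechanical disk;
  • 3 hot data nodes: 64 GB memory, 16 cores, 5 TB SSD;
  • 3 cold data nodes: 64 GB memory, 8 cores, 50 TB mechanical disk.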

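2.2 Analysis of the current plan

1) Master node disks are underused

Dedicated master nodes are not data nodes and store no index data, so they do not need a 5 TB disk of their own.

2) Too many hosts

Each host has only 64 GB of memory, so it can run just one data node. All inter-node traffic therefore crosses hosts, which adds latency compared with communication between nodes on the same host. The more nodes the cluster needs, the more hosts it needs, which increases inter-node latency and scales poorly. We therefore recommend hosts with more memory, each running several nodes.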

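2.3 Planning logic

1) Master and data nodes share hosts

Running master and data nodes on the same host reduces the latency between them.

2) Planned data volume and the number of nodes needed to carry it

  • Hot nodes keep 30 days of data and cold nodes keep 6 months. Production currently generates about 200 GB per day, and data older than 30 days is migrated to the cold nodes.
  • Total hot data = 200 GB × 30 ≈ 6 TB. With a memory-to-storage ratio of 1:30, a data node with 64 GB of memory is planned to carry about 1.9 TB (64 GB × 30 = 1,920 GB), so at least 4 hot data nodes are needed.
  • Total cold data = 200 GB × 6 × 30 ≈ 36 TB. With a memory-to-storage ratio of 1:500, a data node with 64 GB of memory is planned to carry about 32 TB (64 GB × 500 = 32,000 GB), so at least 2 cold data nodes are needed.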
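
The sizing arithmetic above can be reproduced with a few lines of shell. This is only a sketch: the function names are made up for illustration, and the inputs (200 GB per day, 64 GB of memory per node, ratios of 30 and 500) are the figures quoted in the text.

```shell
#!/bin/sh
# Sketch of the node-count arithmetic; all figures come from the article.

# ceil_div A B: integer ceiling of A / B
ceil_div() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# nodes_needed DAILY_GB RETENTION_DAYS NODE_MEM_GB RATIO
# RATIO is the memory-to-storage ratio (30 for hot, 500 for cold).
nodes_needed() {
  total_gb=$(( $1 * $2 ))      # data to keep on this tier
  per_node_gb=$(( $3 * $4 ))   # planned capacity of one data node
  ceil_div "$total_gb" "$per_node_gb"
}

echo "hot data nodes:  $(nodes_needed 200 30 64 30)"
echo "cold data nodes: $(nodes_needed 200 180 64 500)"
```

Changing the daily volume or the retention period shows quickly when another data node becomes necessary.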

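2.4 Optimized plan

1) Master nodes and hot data nodes
Plan around machines with 256 GB of memory. One such machine can run at most 4 nodes: one acts as a master node and the others as data nodes. Two 256 GB machines are therefore enough to host the master nodes and hot data nodes together. To survive the failure of a single host, we recommend three 256 GB machines as the shared hosts for master and hot data nodes. Compared with the earlier plan, this saves 6 hosts.
2) Cold data nodes
The existing 3 hosts with 64 GB of memory already meet the minimum of 2 cold data nodes derived above, so they can stay unchanged for now. If resources allow, 2 machines with 128 GB or 256 GB of memory could be used instead, each running 2 cold data nodes, for 4 cold data nodes in total.
Combining the planning logic with the actual situation, the current plan is as follows; it can be adjusted later according to actual usage.
3) Actual production environment
Three physical machines are available, each with 512 GB of memory, 48 cores, 39 TB of mechanical disk and four 2.9 TB SSDs (if a machine is not split into virtual machines, its four SSDs can be combined). The plan therefore becomes: do not split the physical machines into virtual machines, and on each one deploy 1 master node (using no data disk), 2 hot data nodes (on SSD) and 1 cold data node (on mechanical disk).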
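
As a rough sanity check of this layout, the disk space needed per tier can be compared with what the three machines provide. The sketch below uses the figures from the text and assumes every index keeps one replica (two copies in total), which is an upper bound chosen for illustration rather than something stated in the plan.

```shell
#!/bin/sh
# Rough disk check for the 3-host layout; hardware figures are from the text.
hosts=3
ssd_per_host_gb=$(( 4 * 2900 ))    # four 2.9 TB SSDs per host
hdd_per_host_gb=39000              # 39 TB of mechanical disk per host
daily_gb=200                       # daily data volume
copies=2                           # assumption: 1 primary + 1 replica

hot_need_gb=$(( daily_gb * 30 * copies ))     # 30 days on the hot tier
cold_need_gb=$(( daily_gb * 180 * copies ))   # 6 months on the cold tier
hot_have_gb=$(( hosts * ssd_per_host_gb ))
cold_have_gb=$(( hosts * hdd_per_host_gb ))

echo "hot:  need ${hot_need_gb} GB, have ${hot_have_gb} GB"
echo "cold: need ${cold_need_gb} GB, have ${cold_have_gb} GB"
```

Under these assumptions both tiers fit with room to spare, so the binding constraint in this plan is the per-node memory-to-storage ratio rather than raw disk space.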

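2.5 Configuration files

1) Generate the certificates; run this once on node0 (X-Pack security enabled)

cd elasticsearch
export JAVA_HOME=/home/shsnc/snc_product/elasticsearch/jdk ## point JAVA_HOME at the JDK under the ES directory
./bin/elasticsearch-certutil ca ## press Enter at every prompt
./bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12 ## press Enter at every prompt
mv elastic-certificates.p12 ./config/    ## move the certificate into node0's config directory
mv elastic-stack-ca.p12 ./config/      ## move the CA file into node0's config directory
scp config/elastic-certificates.p12 config/elastic-stack-ca.p12 192.168.XXX.178:/home/shsnc/snc_product/elasticsearch/config/    ## copy the certificate files to node1
scp config/elastic-certificates.p12 config/elastic-stack-ca.p12 192.168.XXX.179:/home/shsnc/snc_product/elasticsearch/config/    ## copy the certificate files to node2
./bin/x-pack-env ## set up the X-Pack environment
./bin/x-pack-security-env ## set up the X-Pack security environment
./bin/elasticsearch-setup-passwords interactive ## set each built-in user's password by hand (the cluster must be running)
export JAVA_HOME=/home/shsnc/snc_product/jdk ## restore the original JAVA_HOME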

2) Master node configuration

cluster.name: shsnc
node.name: node0
network.host: 192.168.XXX.177
http.port: 9200
transport.tcp.port: 9300
node.master: true
node.data: false
node.ingest: true
bootstrap.memory_lock: true
cluster.routing.allocation.same_shard.host: true
xpack.license.self_generated.type: basic
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: /home/shsnc/snc_product/elasticsearch/config/elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: /home/shsnc/snc_product/elasticsearch/config/elastic-certificates.p12
http.cors.enabled: true
http.cors.allow-origin: "*"
bootstrap.system_call_filter: false
node.attr.box_type: hot
discovery.zen.ping.unicast.hosts: ["192.168.XXX.177:9300","192.168.XXX.178:9300","192.168.XXX.179:9300"]

3) Hot data node configuration

cluster.name: shsnc
node.name: node1
path.data: /home/shsnc/snc_product/elasticsearch/data
network.host: 192.168.XXX.178
http.port: 9200
transport.tcp.port: 9300
node.master: false
node.data: true
node.ingest: true
bootstrap.memory_lock: true
cluster.routing.allocation.same_shard.host: true
xpack.license.self_generated.type: basic
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: /home/shsnc/snc_product/elasticsearch/config/elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: /home/shsnc/snc_product/elasticsearch/config/elastic-certificates.p12
http.cors.enabled: true
http.cors.allow-origin: "*"
bootstrap.system_call_filter: false
node.attr.box_type: hot
discovery.zen.ping.unicast.hosts: ["192.168.XXX.177:9300","192.168.XXX.178:9300","192.168.XXX.179:9300"]

4) Cold data node configuration

cluster.name: shsnc
node.name: node2
path.data: /home/shsnc/snc_product/elasticsearch/data
network.host: 192.168.XXX.179
http.port: 9200
transport.tcp.port: 9300
node.master: false
node.data: true
node.ingest: true
bootstrap.memory_lock: true
cluster.routing.allocation.same_shard.host: true
xpack.license.self_generated.type: basic
xpack.security.enabled: true
xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: /home/shsnc/snc_product/elasticsearch/config/elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: /home/shsnc/snc_product/elasticsearch/config/elastic-certificates.p12
http.cors.enabled: true
http.cors.allow-origin: "*"
bootstrap.system_call_filter: false
node.attr.box_type: cold
discovery.zen.ping.unicast.hosts: ["192.168.XXX.177:9300","192.168.XXX.178:9300","192.168.XXX.179:9300"]

Note: when deploying by copying an existing installation, empty the data directory before starting the node.
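
Once the nodes are running, it is worth confirming that each one reports the box_type attribute set in its configuration. The sketch below queries the _cat/nodeattrs API and keeps only the box_type rows. ES_URL, ES_USER and the RUN switch are placeholders introduced for this example, not values from the deployment.

```shell
#!/bin/sh
# Assumption: ES_URL and ES_USER stand in for a reachable node and a valid user.
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

# URL of the _cat/nodeattrs API, limited to node name, attribute and value.
nodeattrs_url() {
  echo "${ES_URL}/_cat/nodeattrs?h=node,attr,value"
}

# Keep only the box_type rows and print "<node> <tier>".
filter_box_type() {
  awk '$2 == "box_type" { print $1, $3 }'
}

# Set RUN=1 to query the cluster; otherwise only the command is shown.
if [ "${RUN:-0}" = "1" ]; then
  curl -s -u "${ES_USER:-elastic}" "$(nodeattrs_url)" | filter_box_type
else
  echo "curl -s -u ${ES_USER:-elastic} '$(nodeattrs_url)' | filter_box_type"
fi
```

Every data node should appear once with either hot or cold; a data node missing from the output has no box_type attribute and will not receive indices that require one.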

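5) Example after deployment

The columns of the node list (as returned by _cat/nodes) mean the following:

  • ip: IP address of the node in the cluster;
  • heap.percent: percentage of heap memory in use;
  • ram.percent: percentage of total memory in use; this is not very accurate, because buff/cache and available memory are also counted as used;
  • cpu: CPU usage percentage;
  • load_1m: 1-minute load average;
  • load_5m: 5-minute load average;
  • load_15m: 15-minute load average;
  • node.role: the letters dilm stand for d = data node, i = ingest node, l = machine learning node, m = master-eligible node;
  • master: * marks the elected master node, - marks every other node;
  • name: name of the node.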
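
A node list with exactly these columns can be requested from the _cat/nodes API by naming them in the h parameter. The following is a minimal sketch; ES_URL, ES_USER and the RUN switch are placeholders for illustration.

```shell
#!/bin/sh
# Assumption: ES_URL and ES_USER stand in for a reachable node and a valid user.
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

# Build the _cat/nodes URL with the columns described in the list above;
# "v" adds a header row to the output.
cat_nodes_url() {
  cols="ip,heap.percent,ram.percent,cpu,load_1m,load_5m,load_15m,node.role,master,name"
  echo "${ES_URL}/_cat/nodes?v&h=${cols}"
}

# Set RUN=1 to query the cluster; otherwise only the command is shown.
if [ "${RUN:-0}" = "1" ]; then
  curl -s -u "${ES_USER:-elastic}" "$(cat_nodes_url)"
else
  echo "curl -s -u ${ES_USER:-elastic} '$(cat_nodes_url)'"
fi
```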




Modifying the ES Index Templates



3.1 Modify the dbl_template template

Add the setting "index.routing.allocation.require.box_type": "hot"
curl -XPUT "http://${node0_ip}:${node0_port}/_template/dbl_template?include_type_name=true" \
 -H "Content-Type: application/json;charset=UTF-8"  -d '
{
  "index_patterns": ["dbl*"],
         "settings" : {
                 "index" : {
                         "routing.allocation.require.box_type" : "hot",
                         "translog.durability": "async",
                         "translog.sync_interval":"30s",
                         "translog.flush_threshold_size":"1g",
                         "number_of_replicas" : 1,
                         "number_of_shards" : 3 ,
                         "lifecycle.name": "datastream_policy",
                         "max_result_window": 10000000,
                         "search.slowlog.threshold.fetch.info" : "1s",
                         "search.slowlog.threshold.fetch.warn" : "3s",
                         "search.slowlog.threshold.query.info" : "5s",
                         "search.slowlog.threshold.query.warn" : "10s",
                         "refresh_interval":"10s"
                 }
                },
  "mappings": {
     "values": {
        "properties": {
         "itemid": {
            "type": "long"
          },
          "clock": {
             "format": "epoch_second",
             "type": "date"
           },
          "value": {
             "type": "double"
          }
       }
    }
  }
}'

3.2 Modify the log_template template

Add the setting "index.routing.allocation.require.box_type": "hot"
curl -XPUT "http://${node0_ip}:${node0_port}/_template/log_template?include_type_name=true" \
 -H "Content-Type: application/json;charset=UTF-8"  -d '
{
  "index_patterns": ["log*"],
         "settings" : {
                 "index" : {
                         "routing.allocation.require.box_type" : "hot",
                         "translog.durability": "async",
                         "translog.sync_interval":"30s",
                         "translog.flush_threshold_size":"1g",
                         "number_of_replicas" : 1,
                         "number_of_shards" : 3,
                         "lifecycle.name": "datastream_policy",
                         "max_result_window": 10000000,
                         "search.slowlog.threshold.fetch.info" : "1s",
                         "search.slowlog.threshold.fetch.warn" : "3s",
                         "search.slowlog.threshold.query.info" : "5s",
                         "search.slowlog.threshold.query.warn" : "10s",
                         "refresh_interval":"10s"
                 }
                },
  "mappings": {
    "values": {
      "properties": {
        "itemid": {
          "type": "long"
        },
        "clock": {
          "format": "epoch_second",
          "type": "date"
        },
        "value": {
          "fields": {
            "analyzed": {
              "index": true,
              "type": "text",
              "analyzer": "standard"
            }
          },
          "index": false,
          "type": "text"
        }
      }
    }
  }
}'

3.3 Modify the text_template template

Add the setting "index.routing.allocation.require.box_type": "hot"
curl -XPUT "http://${node0_ip}:${node0_port}/_template/text_template?include_type_name=true" \
 -H "Content-Type: application/json;charset=UTF-8"  -d '
{
  "index_patterns": ["text*"],
         "settings" : {
                 "index" : {
                         "routing.allocation.require.box_type" : "hot",
                         "translog.durability": "async",
                         "translog.sync_interval":"30s",
                         "translog.flush_threshold_size":"1g",
                         "number_of_replicas" : 1,
                         "number_of_shards" : 3 ,
                         "lifecycle.name": "datastream_policy",
                         "max_result_window": 10000000,
                         "search.slowlog.threshold.fetch.info" : "1s",
                         "search.slowlog.threshold.fetch.warn" : "3s",
                         "search.slowlog.threshold.query.info" : "5s",
                         "search.slowlog.threshold.query.warn" : "10s",
                         "refresh_interval":"10s"
                 }
                },
    "mappings": {
        "values": {
            "properties": {
                "itemid": {
                    "type": "long"
                },
                "clock": {
                    "format": "epoch_second",
                    "type": "date"
                },
                "value": {
                    "fields": {
                        "analyzed": {
                            "index": true,
                            "type": "text",
                            "analyzer": "standard"
                        }
                    },
                    "index": false,
                    "type": "text"
                    }
                }
            }
    }
}'

3.4 Modify the uint_template template

Add the setting "index.routing.allocation.require.box_type": "hot"
curl -XPUT "http://${node0_ip}:${node0_port}/_template/uint_template?include_type_name=true" \
 -H "Content-Type: application/json;charset=UTF-8"  -d '
{
  "index_patterns": ["uint*"],
         "settings" : {
                 "index" : {
                         "routing.allocation.require.box_type" : "hot",
                         "translog.durability": "async",
                         "translog.sync_interval":"30s",
                         "translog.flush_threshold_size":"1g",
                         "number_of_replicas" : 1,
                         "number_of_shards" : 3 ,
                         "lifecycle.name": "datastream_policy",
                         "max_result_window": 10000000,
                         "search.slowlog.threshold.fetch.info" : "1s",
                         "search.slowlog.threshold.fetch.warn" : "3s",
                         "search.slowlog.threshold.query.info" : "5s",
                         "search.slowlog.threshold.query.warn" : "10s",
                         "refresh_interval":"10s"
                 }
                },
        "mappings": {
                "values": {
                        "properties": {
                                "itemid": {
                                        "type": "long"
                                },
                                "clock": {
                                        "format": "epoch_second",
                                        "type": "date"
                                },
                                "value": {
                                        "type": "long"
                                }
                        }
                }
        }
}'

3.5 Modify the str_template template

Add the setting "index.routing.allocation.require.box_type": "hot"
curl -XPUT "http://${node0_ip}:${node0_port}/_template/str_template?include_type_name=true" \
 -H "Content-Type: application/json;charset=UTF-8"  -d '
{
  "index_patterns": ["str*"],
         "settings" : {
                 "index" : {
                         "routing.allocation.require.box_type" : "hot",
                         "translog.durability": "async",
                         "translog.sync_interval":"30s",
                         "translog.flush_threshold_size":"1g",
                         "number_of_replicas" : 1,
                         "number_of_shards" : 3,
                         "lifecycle.name": "datastream_policy",
                         "max_result_window": 10000000,
                         "search.slowlog.threshold.fetch.info" : "1s",
                         "search.slowlog.threshold.fetch.warn" : "3s",
                         "search.slowlog.threshold.query.info" : "5s",
                         "search.slowlog.threshold.query.warn" : "10s",
                         "refresh_interval":"10s"
                 }
                },
   "mappings": {
      "values": {
         "properties": {
            "itemid": {
               "type": "long"
            },
            "clock": {
               "format": "epoch_second",
               "type": "date"
            },
            "value": {
               "fields": {
                  "analyzed": {
                     "index": true,
                     "type": "text",
                     "analyzer": "standard"
                  }
               },
               "index": false,
               "type": "text"
            }
         }
      }
   }
}'

3.6 Modify the baseline_template template

Add the setting "index.routing.allocation.require.box_type": "hot"
curl -X PUT "http://${node0_ip}:${node0_port}/_template/baseline_template" \
-H 'Content-Type: application/json' \
-d '{
  "index_patterns": [
    "baseline-*"
  ],
  "settings": {
    "index": {
      "routing.allocation.require.box_type": "hot",
      "translog.durability": "async",
      "translog.sync_interval": "300s",
      "translog.flush_threshold_size": "1g",
      "number_of_replicas": 0,
      "number_of_shards": 1,
      "max_result_window": 100000,
      "search.slowlog.threshold.fetch.warn": "3s",
      "search.slowlog.threshold.query.info": "5s",
      "search.slowlog.threshold.query.warn": "10s",
      "refresh_interval": "3600s"
    }
  },
  "mappings": {
    "dynamic" : false,
    "properties": {
      "itemid": {
         "type": "long"
      },
      "min": {
        "type": "double"
      },
      "avg": {
        "type": "double"
      },
      "max": {
        "type": "double"
      },
      "clock": {
        "format": "epoch_second",
        "type": "date"
      },
      "slot": {
        "type": "integer"
      },
      "oclock": {
        "type": "byte"
      }
    }
  }
}'
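
Template bodies of this length are easy to break with a stray or missing comma, in which case ES rejects the request with a parse error. One way to catch that before sending is to run the body through a JSON parser first. The sketch below assumes python3 is installed and uses its standard json.tool module purely as a syntax checker; the helper name and the shortened body are illustrative.

```shell
#!/bin/sh
# Assumption: python3 is available; json.tool is used only to check syntax
# and is not part of the article's setup.

# is_valid_json: read JSON on stdin, succeed only if it parses.
is_valid_json() {
  python3 -m json.tool >/dev/null 2>&1
}

# Shortened stand-in for a template body.
body='{
  "index_patterns": ["baseline-*"],
  "settings": { "index": { "number_of_replicas": 0, "number_of_shards": 1 } }
}'

if printf '%s' "$body" | is_valid_json; then
  echo "template body parses as JSON"
else
  echo "template body is not valid JSON" >&2
fi
```

Keeping each body in a shell variable or a file makes it possible to validate it and then pass the same text to curl with -d.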

3.7 Modify the trends_uint-template template

Add the setting "index.routing.allocation.require.box_type": "hot"
curl -X PUT "http://${node0_ip}:${node0_port}/_template/trends_uint-template" \
-H 'Content-Type: application/json' \
-d '{
  "order" : 0,
  "index_patterns" : [
    "trends_uint-*"
  ],
  "settings" : {
    "index" : {
      "routing.allocation.require.box_type" : "hot",
      "max_result_window" : "10000000",
      "search" : {
        "slowlog" : {
          "threshold" : {
            "fetch" : {
              "warn" : "3s",
              "info" : "1s"
            },
            "query" : {
              "warn" : "10s",
              "info" : "5s"
            }
          }
        }
      },
      "refresh_interval" : "5s",
      "number_of_shards" : "3",
      "translog" : {
        "flush_threshold_size" : "1g",
        "sync_interval" : "30s",
        "durability" : "async"
      },
      "number_of_replicas" : "1"
    }
  },
  "mappings" : {
    "properties" : {
      "itemid" : {
        "type" : "long"
      },
      "valueMax" : {
        "type" : "long"
      },
      "valueMin" : {
        "type" : "long"
      },
      "valueAvg" : {
        "type" : "long"
      },
      "num" : {
        "type" : "long"
      },
      "id" : {
        "type" : "keyword"
      },
      "clock" : {
        "format" : "epoch_second",
        "type" : "date"
      }
    }
  },
  "aliases" : { }
}'

3.8 Modify the trends-template template

Add the setting "index.routing.allocation.require.box_type": "hot"

curl -X PUT "http://${node0_ip}:${node0_port}/_template/trends-template" \
-H 'Content-Type: application/json' \
-d '{
  "order" : 1,
  "index_patterns" : [
    "trends-*"
  ],
  "settings" : {
    "index" : {
      "routing.allocation.require.box_type" : "hot",
      "max_result_window" : "10000000",
      "search" : {
        "slowlog" : {
          "threshold" : {
            "fetch" : {
              "warn" : "3s",
              "info" : "1s"
            },
            "query" : {
              "warn" : "10s",
              "info" : "5s"
            }
          }
        }
      },
      "refresh_interval" : "5s",
      "number_of_shards" : "3",
      "translog" : {
        "flush_threshold_size" : "1g",
        "sync_interval" : "30s",
        "durability" : "async"
      },
      "number_of_replicas" : "1"
    }
  },
  "mappings" : {
    "properties" : {
      "itemid" : {
        "type" : "long"
      },
      "valueMax" : {
        "type" : "double"
      },
      "valueMin" : {
        "type" : "double"
      },
      "valueAvg" : {
        "type" : "double"
      },
      "num" : {
        "type" : "long"
      },
      "id" : {
        "type" : "keyword"
      },
      "clock" : {
        "format" : "epoch_second",
        "type" : "date"
      }
    }
  },
  "aliases" : { }
}'




Index Migration



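Write a script that runs on a schedule and changes the index setting "index.routing.allocation.require.box_type".
Example: for the historical-data indices whose names start with uint, dbl, str, log, text or baseline, mark the index from 30 days ago as cold so that its data migrates to the cold nodes:
#!/bin/bash
# Node that receives the settings requests
node0_ip=192.168.XXX.180
node0_port=9200
# Date suffix of the indices that are exactly 30 days old (GNU date syntax)
index_date=$(date -d '-30 day' '+%Y-%m-%d')
# Index name prefixes to migrate
index_type="uint dbl str log text baseline"
for name in ${index_type}; do
  # Retag the index as cold; ES then relocates its shards to the cold nodes
  curl -XPUT "http://${node0_ip}:${node0_port}/${name}-${index_date}/_settings" \
    -H "Content-Type: application/json;charset=UTF-8" \
    -d '
{
  "index.routing.allocation.require.box_type": "cold"
}'
done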
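
Because the script only touches the indices dated exactly 30 days back, a day on which it does not run leaves those indices on the hot nodes. A dry-run variant that accepts an explicit date makes it easier to see what would be sent and to catch up on a missed day. The sketch below only prints the requests; ES_URL, the function name and the optional date argument are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: print the settings requests the migration would send,
# without contacting the cluster. ES_URL is a placeholder.
ES_URL="${ES_URL:-http://127.0.0.1:9200}"
PREFIXES="uint dbl str log text baseline"

# cold_index_names DATE: print one index name per prefix, e.g. uint-2023-01-11
cold_index_names() {
  for prefix in $PREFIXES; do
    echo "${prefix}-$1"
  done
}

# Default to the date 30 days ago (GNU date, as in the script above);
# pass an explicit YYYY-MM-DD as the first argument to redo a missed day.
target_date="${1:-$(date -d '-30 day' '+%Y-%m-%d')}"

for index in $(cold_index_names "$target_date"); do
  echo "PUT ${ES_URL}/${index}/_settings {\"index.routing.allocation.require.box_type\": \"cold\"}"
done
```

For the schedule itself, a crontab entry such as `0 1 * * * /path/to/migrate.sh` (the path is hypothetical) would run the migration once a day at 01:00.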

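Choose the indices to include in hot/cold separation according to your own situation. The commonly used ones are:

  • 1) Monitoring history indices: dbl-YYYY-MM-DD, uint-YYYY-MM-DD, str-YYYY-MM-DD, log-YYYY-MM-DD, text-YYYY-MM-DD;
  • 2) Monitoring baseline indices: baseline-YYYY-MM-DD.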
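
After an index has been retagged, its shards should end up on the cold nodes. One way to check is to list the index's shards with the _cat/shards API and count the started shards per node. The sketch below assumes the h=index,shard,prirep,state,node column order; ES_URL, ES_USER, the function names and the sample index name in the printed command are placeholders.

```shell
#!/bin/sh
# Assumption: ES_URL and ES_USER stand in for a reachable node and a valid user.
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

# shards_url INDEX: _cat/shards URL for one index with a fixed column order.
shards_url() {
  echo "${ES_URL}/_cat/shards/$1?h=index,shard,prirep,state,node"
}

# Count STARTED shards per node from the _cat/shards output on stdin.
started_shards_per_node() {
  awk '$4 == "STARTED" { count[$5]++ } END { for (n in count) print n, count[n] }' | sort
}

# Only the command is printed here; run it against a real index to use it.
echo "curl -s -u ${ES_USER:-elastic} '$(shards_url uint-YYYY-MM-DD)' | started_shards_per_node"
```

Once relocation has finished, only cold node names should appear in the result; shards that are still moving are not counted because their state is not STARTED.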





Author: 禹 栋 (上海新炬中北团队)

Source: the "IT那活儿" public account
