ClickHouse in Practice: Error Log Monitoring

大数据小黑屋 2021-04-06

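1. Components

  • ClickHouse: an open-source column-oriented OLAP database with very fast single-table queries and good scalability

  • Fluentd: an open-source real-time log collector with a low resource footprint, good extensibility, and a gentle learning curve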


2. Fluentd Overview

  • A lightweight log collection component that supports over a hundred kinds of data sources

  • Official documentation: https://docs.fluentd.org/

  • GitHub: https://github.com/fluent/fluentd/

2.1 Installation

  • CentOS 7: install the rpm package (td-agent)

curl -L https://toolbelt.treasuredata.com/sh/install-redhat-td-agent4.sh | sh

  • Other systems: see https://docs.fluentd.org/installation

2.2 Environment variables

# /etc/profile
export FLUENTD_HOME=/opt/td-agent
export PATH=$PATH:$FLUENTD_HOME/bin


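2.3 Installing plugins

  • Fluentd has a large plugin ecosystem: some core plugins are bundled, others must be installed by the user

  • Finding plugins: search GitHub for the keyword fluent-plugin

# Example: the mail output plugin
/opt/td-agent/bin/gem install fluent-plugin-mail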

2.4 Starting the service

sudo service td-agent start

2.5 Default file locations

# Configuration file
/etc/td-agent/td-agent.conf
# Log file
/var/log/td-agent/td-agent.log



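3. Basic Fluentd Configuration

3.1 Configuration directives

Directive    Description
source       Input sources
match        Output destinations
filter       Event processing pipeline
system       System-wide settings
@include     Include other configuration files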

3.2 Basic structure of a configuration file

  • A simple data flow:

Input --> Filter --> Output


  • Example configuration:

<source>
@type forward
</source>
<filter app.**>
@type record_transformer
<record>
hostname "#{Socket.gethostname}"
</record>
</filter>
<match app.**>
@type file
# ...
</match>
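
The `<filter app.**>` and `<match app.**>` patterns above select events by tag. As a rough illustration of that routing, here is a minimal Python sketch of tag matching, assuming the documented Fluentd rules that `*` matches exactly one tag part and `**` matches zero or more parts. The function name `tag_matches` is hypothetical, and other pattern forms such as `{a,b}` are not handled.

```python
def tag_matches(pattern, tag):
    """Simplified Fluentd-style tag matching (illustrative only).

    '*'  matches exactly one dot-separated tag part.
    '**' matches zero or more tag parts.
    """
    def match(p_parts, t_parts):
        if not p_parts:
            # Pattern exhausted: succeed only if the tag is exhausted too.
            return not t_parts
        head = p_parts[0]
        if head == "**":
            # Let '**' absorb 0..len(t_parts) tag parts.
            return any(match(p_parts[1:], t_parts[i:])
                       for i in range(len(t_parts) + 1))
        if not t_parts:
            return False
        if head == "*" or head == t_parts[0]:
            return match(p_parts[1:], t_parts[1:])
        return False

    return match(pattern.split("."), tag.split("."))
```

With this sketch, `tag_matches("app.**", "app.web.access")` is true while `tag_matches("app.**", "clickhouse.log")` is false, which is why the configurations later in this article use the exact tag `clickhouse.log` in both the filter and the match sections.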

3.3 Example: in_tail

  • Tail a given log file and parse its content

<source>
@type tail
path /var/log/clickhouse-server/clickhouse-server.err.log
pos_file /var/log/td-agent/test-log.pos
tag clickhouse.log
<parse>
@type regexp
expression /^(?<Date>\d{4}\.\d{2}\.\d{2})\s(?<Time>\d{2}:\d{2}:\d{2}\.\d{6})\s*\[\s(?<QueryThreadId>\d*)\s*\]\s*\{(?<QueryId>[^\}]*)\}\s*<(?<Level>[a-zA-Z]*)>\s(?<Content>[^\n]*)/
</parse>
</source>
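
The `pos_file` parameter is what lets in_tail resume where it left off after a restart. The following Python sketch illustrates the idea only: it stores a byte offset in a side file and returns the complete lines appended since the previous call. The names `read_new_lines` and `demo`, and the offset-only file format, are assumptions for illustration; the real plugin uses its own pos_file format and also handles log rotation.

```python
import os
import tempfile


def read_new_lines(log_path, pos_path):
    """Return complete lines appended to log_path since the last call."""
    offset = 0
    if os.path.exists(pos_path):
        with open(pos_path) as f:
            offset = int(f.read().strip() or 0)

    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()

    # Consume only up to the last newline so a half-written line is
    # left for the next call.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()

    with open(pos_path, "w") as f:
        f.write(str(offset + end))
    return lines


def demo():
    """Write to a temporary log twice and read it incrementally."""
    with tempfile.TemporaryDirectory() as tmp:
        log_path = os.path.join(tmp, "app.log")
        pos_path = os.path.join(tmp, "app.pos")
        with open(log_path, "w") as f:
            f.write("first\nsecond\n")
        batch1 = read_new_lines(log_path, pos_path)
        with open(log_path, "a") as f:
            f.write("third\npartial")
        batch2 = read_new_lines(log_path, pos_path)
        return batch1, batch2
```

The second call returns only the newly completed line, because the stored offset skips everything already delivered.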

3.4 Example: filter record_transformer

  • The record_transformer filter adds extra fields to each record coming from upstream

<filter  clickhouse.log>
@type record_transformer
<record>
Host "#{Socket.gethostname}"
IpAddress "#{Socket.ip_address_list.find { |ai| ai.ipv4? && !ai.ipv4_loopback? }.ip_address}"
</record>
</filter>

3.5 Example: out_webhdfs

  • Write upstream records to HDFS

<match clickhouse.log>
@type webhdfs
host java-166
port 50070
path "/user/hive/warehouse/jwy_hive.db/clickhouse_log/000000"
<buffer>
flush_interval 10s
</buffer>
</match>
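
The `<buffer>` section with `flush_interval 10s` means records are accumulated and written out in batches rather than one by one. A minimal Python sketch of that idea follows. The class name `TimedBuffer`, its parameters, and the injected `clock` are hypothetical, and unlike Fluentd (which flushes from a background thread) this simplified version only checks the interval when a record is appended.

```python
import time


class TimedBuffer:
    """Collect records and hand them to `write` once per flush interval."""

    def __init__(self, flush_interval, write, clock=time.monotonic):
        self.flush_interval = flush_interval  # seconds between flushes
        self.write = write                    # callable receiving a list of records
        self.clock = clock                    # injectable for testing
        self.records = []
        self.last_flush = clock()

    def append(self, record):
        self.records.append(record)
        self.flush_if_due()

    def flush_if_due(self):
        elapsed = self.clock() - self.last_flush
        if self.records and elapsed >= self.flush_interval:
            self.write(self.records)
            self.records = []          # start a fresh batch
            self.last_flush = self.clock()
```

Batching like this trades a delay of up to one interval for far fewer write requests to the destination.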


4. ClickHouse Overview

  • An open-source column-oriented OLAP database released by the Russian company Yandex

  • Official documentation: https://clickhouse.tech/docs/zh/

  • GitHub: https://github.com/ClickHouse/ClickHouse

4.1 Installation and configuration

4.2 ClickHouse log configuration

Log directories

# Find the error log path
sudo grep errorlog /etc/clickhouse-server/config.xml
# Find the regular log path
sudo grep '<log>' /etc/clickhouse-server/config.xml
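
The same two paths can also be read programmatically. The sketch below parses a small excerpt with Python's standard XML parser, assuming the `<log>` and `<errorlog>` elements sit under `<logger>` as they do in a stock config.xml. `SAMPLE_CONFIG` is an assumed minimal excerpt rather than a real file, and `log_paths` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed minimal excerpt of config.xml; the error log path is the one
# used in this article, the regular log path is the usual default.
SAMPLE_CONFIG = """
<yandex>
    <logger>
        <level>trace</level>
        <log>/var/log/clickhouse-server/clickhouse-server.log</log>
        <errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
    </logger>
</yandex>
"""


def log_paths(config_xml):
    """Return the 'log' and 'errorlog' paths found under <logger>."""
    root = ET.fromstring(config_xml.strip())
    logger = root.find("logger")
    if logger is None:
        return {}
    return {tag: logger.findtext(tag) for tag in ("log", "errorlog")}
```

To use it on a real server, pass in the content of /etc/clickhouse-server/config.xml instead of the sample string.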

Log format and parsing

2021.04.06 19:51:20.365212 [ 26354 ] {54aab079-34dc-47ed-9b9b-be07a95a238a} <Error> executeQuery: Code: 81, e.displayText() = DB::Exception: Database tables doesn't exist (version 20.8.6.6 (official build)) (from 192.168.1.100:10442) (in query: use tables;), Stack trace (when copying this message, always include the lines below):


  • The matching regular expression:

/^(\d{4}\.\d{2}\.\d{2})\s(\d{2}:\d{2}:\d{2}\.\d{6})\s*\[\s(\d*)\s*\]\s*\{([^\}]*)\}\s*<([a-zA-Z]*)>\s([^\n]*)/
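
To check the expression outside Fluentd, it can be ported to Python and run against the sample line above. This is only a sketch: Python spells named groups as `(?P<name>...)` where Ruby uses `(?<name>...)`, the group names are borrowed from the Fluentd configuration in this article, and `parse_line` is a hypothetical helper.

```python
import re

LOG_PATTERN = re.compile(
    r"^(?P<Date>\d{4}\.\d{2}\.\d{2})\s"
    r"(?P<Time>\d{2}:\d{2}:\d{2}\.\d{6})\s*"
    r"\[\s(?P<QueryThreadId>\d*)\s*\]\s*"
    r"\{(?P<QueryId>[^\}]*)\}\s*"
    r"<(?P<Level>[a-zA-Z]*)>\s"
    r"(?P<Content>[^\n]*)"
)

# The sample error line shown above, split only for readability.
SAMPLE = (
    "2021.04.06 19:51:20.365212 [ 26354 ] "
    "{54aab079-34dc-47ed-9b9b-be07a95a238a} <Error> "
    "executeQuery: Code: 81, e.displayText() = DB::Exception: "
    "Database tables doesn't exist (version 20.8.6.6 (official build)) "
    "(from 192.168.1.100:10442) (in query: use tables;), "
    "Stack trace (when copying this message, always include the lines below):"
)


def parse_line(line):
    """Return the named fields as a dict, or None if the line does not match."""
    m = LOG_PATTERN.match(line)
    return m.groupdict() if m else None


fields = parse_line(SAMPLE)
```

Lines that do not start with a timestamp, such as the stack trace lines that follow an error, do not match the pattern, so `parse_line` returns `None` for them.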


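5. Log Collection and Processing with Fluentd

5.1 Input plugin configuration

  • Use the in_tail plugin to read the log file at the given path and parse it line by line with a regular expression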

<source>
@type tail
path /var/log/clickhouse-server/clickhouse-server.err.log
pos_file /var/log/td-agent/test-log.pos
tag clickhouse.log
<parse>
@type regexp
expression /^(?<Date>\d{4}\.\d{2}\.\d{2})\s(?<Time>\d{2}:\d{2}:\d{2}\.\d{6})\s*\[\s(?<QueryThreadId>\d*)\s*\]\s*\{(?<QueryId>[^\}]*)\}\s*<(?<Level>[a-zA-Z]*)>\s(?<Content>[^\n]*)/
</parse>
</source>

5.2 Filter plugin configuration

  • Use the record_transformer plugin to add information missing from the log, such as the server IP address and hostname

<filter  clickhouse.log>
@type record_transformer
<record>
Host "#{Socket.gethostname}"
IpAddress "#{Socket.ip_address_list.find { |ai| ai.ipv4? && !ai.ipv4_loopback? }.ip_address}"
</record>
</filter>
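
What this filter does to each record can be sketched in Python as follows. The helper names `first_non_loopback_ipv4` and `enrich` are hypothetical, and the hostname and address list are passed in explicitly so the example stays deterministic; on a real host they could come from `socket.gethostname()` and the machine's interface addresses. Unlike the Ruby expression in the configuration, which would fail if no matching address exists, this sketch falls back to `None`.

```python
import ipaddress


def first_non_loopback_ipv4(addresses):
    """Return the first IPv4 address that is not a loopback address."""
    for addr in addresses:
        ip = ipaddress.ip_address(addr)
        if ip.version == 4 and not ip.is_loopback:
            return addr
    return None


def enrich(record, hostname, addresses):
    """Return a copy of the record with Host and IpAddress fields added."""
    enriched = dict(record)  # leave the input record untouched
    enriched["Host"] = hostname
    enriched["IpAddress"] = first_non_loopback_ipv4(addresses)
    return enriched
```

For example, `enrich({"Level": "Error"}, "ch-node-1", ["127.0.0.1", "::1", "192.168.1.100"])` keeps the `Level` field and adds `Host` and `IpAddress`, skipping the loopback and IPv6 entries.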

5.3 Sending error log alerts

  • Use the mail output plugin (fluent-plugin-mail) to email the captured error log records to the operations team

<match clickhouse.log>
@type mail
host mail.custom_mail.com # Change this to your SMTP server host
port 465 # Normally 25/587/465 are used for submission
user sender@custom_mail.com # Use your username to log in
password xxxxxxx # Use your login password
enable_tls true # Use this option to enable tls
from jwy_bigdata@custom_mail.com # Set the sender address
to AAA@custom_mail.com,BBB@custom_mail.com # Set the recipient address
subject 'ClickHouse cluster error log'
out_keys Host,Date,Time,QueryId,Level,Content
</match>
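
To make the role of `out_keys` concrete, the sketch below builds a comparable alert message in Python from a parsed record. It is not the plugin's implementation: the function name `build_alert` and the "key: value" body layout are assumptions, and the exact body format produced by fluent-plugin-mail is not reproduced here. The message is only constructed, not sent.

```python
from email.message import EmailMessage

# Same field selection as out_keys in the configuration above.
OUT_KEYS = ["Host", "Date", "Time", "QueryId", "Level", "Content"]


def build_alert(record, sender, recipients, subject, out_keys=OUT_KEYS):
    """Build an email containing only the selected record fields."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = subject
    # One "key: value" line per selected field; missing fields stay empty.
    body = "\n".join(f"{key}: {record.get(key, '')}" for key in out_keys)
    msg.set_content(body)
    return msg
```

Fields that are not listed, such as `QueryThreadId`, are left out of the body. Delivering the message would additionally need an SMTP client such as `smtplib.SMTP_SSL` with the host, port, and credentials from the configuration.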

5.4 Complete configuration file

sudo cat /etc/td-agent/td-agent.conf

  • Output:

<source>
@type tail
path /var/log/clickhouse-server/clickhouse-server.err.log
pos_file /var/log/td-agent/test-log.pos
tag clickhouse.log
<parse>
@type regexp
expression /^(?<Date>\d{4}\.\d{2}\.\d{2})\s(?<Time>\d{2}:\d{2}:\d{2}\.\d{6})\s*\[\s(?<QueryThreadId>\d*)\s*\]\s*\{(?<QueryId>[^\}]*)\}\s*<(?<Level>[a-zA-Z]*)>\s(?<Content>[^\n]*)/
</parse>
</source>
<filter clickhouse.log>
@type record_transformer
<record>
Host "#{Socket.gethostname}"
IpAddress "#{Socket.ip_address_list.find { |ai| ai.ipv4? && !ai.ipv4_loopback? }.ip_address}"
</record>
</filter>
<match clickhouse.log>
@type mail
host mail.custom_mail.com # Change this to your SMTP server host
port 465 # Normally 25/587/465 are used for submission
user sender@custom_mail.com # Use your username to log in
password xxxxxxx # Use your login password
enable_tls true # Use this option to enable tls
from jwy_bigdata@custom_mail.com # Set the sender address
to AAA@custom_mail.com,BBB@custom_mail.com # Set the recipient address
subject 'ClickHouse cluster error log'
out_keys Host,Date,Time,QueryId,Level,Content
</match>

5.5 Starting the services

# After editing /etc/td-agent/td-agent.conf, restart td-agent so the configuration is reloaded
sudo service td-agent restart
# If ClickHouse is not running yet, start it
sudo service clickhouse-server start


6. Testing

  • Log in to ClickHouse with an incorrect username or password, then check that the alert email is delivered


This article is reprinted from 大数据小黑屋.
