PostgreSQL 15 引入了 log_destination=jsonlog。“jsonlog”是一个新值,可以添加到 log_destination 来提供 JSON 格式记录,并将其输出写入文件,是继“stderr”和“csvlog”之后,此类的第三种类型。该格式便于将日志提供给其他应用程序。
其核心外部还有一个插件,使用elog.c 中的钩子,但这必须将“stderr”的输出覆盖到
才能生效,所以不可能同时做这两件事。这种日志格式生成的文件以“.json”为后缀,使用与其他两种格式相同的轮换策略,主要取决于后端配置。
这利用了之前在 ac7c807 中所做的重构工作,bed6ed3、8b76f89 和 2d77d83 用于后端,72b76f7 用于后端TAP 测试,添加任何新的基于文件的格式,而不是直截了当。
更新的文档通过列出所有键和值才能以这种新格式存在。 pg_current_logfile() 还需要一个刷新新选项。
我不喜欢在数据库中使用 JSON。根据我在 IRC 和 Slack 的经验,大约 95% 的案例是滥用 JSON 来处理应该重写的内容。但是在 DB 之外,对于集成数据之类的东西,它实际上运用效果相当不错。
我们已经有了 csvlog,但显然并非所有 csv 解析库都能解析与 CSV 相同的内容(多行值!)。
所以,我非常热衷于 NOT 必须找到专门的工具来处理 csvs,其方式与 PostgreSQL 处理它们的方式相同(显然,这是正确的方式:)。
让我们看看它是如何工作的。
我只更改了(在配置文件中)两个参数:
- logging_collector 需要设置为 on
- log_destination 需要设置为 jsonlog
其他日志相关的设置如下:
$ select name, setting from pg_settings where name ~ '^log' order by 1;
name │ setting
───────────────────────────────────┼────────────────────────────────
log_autovacuum_min_duration │ 600000
log_checkpoints │ on
log_connections │ on
log_destination │ jsonlog
log_directory │ log
log_disconnections │ on
log_duration │ off
log_error_verbosity │ default
log_executor_stats │ off
log_file_mode │ 0600
log_filename │ postgresql-%Y-%m-%d_%H%M%S.log
logging_collector │ on
log_hostname │ off
logical_decoding_work_mem │ 65536
log_line_prefix │ %m %u@%d %p %r
log_lock_waits │ on
log_min_duration_sample │ -1
log_min_duration_statement │ 0
log_min_error_statement │ error
log_min_messages │ warning
log_parameter_max_length │ -1
log_parameter_max_length_on_error │ 0
log_parser_stats │ off
log_planner_stats │ off
log_recovery_conflict_waits │ off
log_replication_commands │ off
log_rotation_age │ 1440
log_rotation_size │ 102400
log_startup_progress_interval │ 10000
log_statement │ none
log_statement_sample_rate │ 1
log_statement_stats │ off
log_temp_files │ 0
log_timezone │ Europe/Warsaw
log_transaction_sample_rate │ 0
log_truncate_on_rotation │ off
(36 rows)
重新启动后,我注意到在 log/ 目录中有一个名为 *.json 的新文件:
=$ ls -l log/
total 192
-rw------- 1 pgdba pgdba 178194 Jan 17 12:23 postgresql-2022-01-17_121022.log
-rw------- 1 pgdba pgdba 11240 Jan 17 12:26 postgresql-2022-01-17_122308.json
-rw------- 1 pgdba pgdba 175 Jan 17 12:23 postgresql-2022-01-17_122308.log
它的内容很难直接读取:
=$ head -n 5 log/postgresql-2022-01-17_122308.json
{"timestamp":"2022-01-17 12:23:08.437 CET","pid":1560406,"session_id":"61e5519c.17cf56","line_num":1,"session_start":"2022-01-17 12:23:08 CET","txid":0,"error_severity":"LOG","message":"ending log output to stderr","hint":"Future log output will go to log destination \"jsonlog\".","backend_type":"postmaster","query_id":0}
{"timestamp":"2022-01-17 12:23:08.437 CET","pid":1560406,"session_id":"61e5519c.17cf56","line_num":2,"session_start":"2022-01-17 12:23:08 CET","txid":0,"error_severity":"LOG","message":"starting PostgreSQL 15devel on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 11.2.0-7ubuntu2) 11.2.0, 64-bit","backend_type":"postmaster","query_id":0}
{"timestamp":"2022-01-17 12:23:08.437 CET","pid":1560406,"session_id":"61e5519c.17cf56","line_num":3,"session_start":"2022-01-17 12:23:08 CET","txid":0,"error_severity":"LOG","message":"listening on IPv4 address \"0.0.0.0\", port 5430","backend_type":"postmaster","query_id":0}
{"timestamp":"2022-01-17 12:23:08.437 CET","pid":1560406,"session_id":"61e5519c.17cf56","line_num":4,"session_start":"2022-01-17 12:23:08 CET","txid":0,"error_severity":"LOG","message":"listening on IPv6 address \"::\", port 5430","backend_type":"postmaster","query_id":0}
{"timestamp":"2022-01-17 12:23:08.444 CET","pid":1560406,"session_id":"61e5519c.17cf56","line_num":5,"session_start":"2022-01-17 12:23:08 CET","txid":0,"error_severity":"LOG","message":"listening on Unix socket \"/tmp/.s.PGSQL.5430\"","backend_type":"postmaster","query_id":0}
但幸运的是,我们有 jq 之类的工具:
=$ tail -n2 log/postgresql-2022-01-17_122308.json | jq .
{
"timestamp": "2022-01-17 12:28:08.549 CET",
"pid": 1560408,
"session_id": "61e5519c.17cf58",
"line_num": 1,
"session_start": "2022-01-17 12:23:08 CET",
"txid": 0,
"error_severity": "LOG",
"message": "checkpoint starting: time",
"backend_type": "checkpointer",
"query_id": 0
}
{
"timestamp": "2022-01-17 12:28:08.574 CET",
"pid": 1560408,
"session_id": "61e5519c.17cf58",
"line_num": 2,
"session_start": "2022-01-17 12:23:08 CET",
"txid": 0,
"error_severity": "LOG",
"message": "checkpoint complete: wrote 3 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.007 s, sync=0.002 s, total=0.025 s; sync files=2, longest=0.001 s, average=0.001 s; distance=0 kB, estimate=0 kB",
"backend_type": "checkpointer",
"query_id": 0
}
我的直接问题是:在多行查询的情况下它看起来如何?
=$ printf 'select\n relname,\n relkind\nfrom\n pg_class\nlimit 1;\n' | psql -aX
select
relname,
relkind
from
pg_class
limit 1;
relname | relkind
--------------+---------
pg_statistic | r
(1 row)
日志中的数据如下:
=$ grep relkind log/postgresql-2022-01-17_122308.json | tail -n1
{"timestamp":"2022-01-17 12:30:27.324 CET","user":"pgdba","dbname":"pgdba","pid":1563548,"remote_host":"[local]","session_id":"61e55353.17db9c","line_num":3,"ps":"SELECT","session_start":"2022-01-17 12:30:27 CET","vxid":"4/0","txid":0,"error_severity":"LOG","message":"duration: 0.693 ms statement: select\n relname,\n relkind\nfrom\n pg_class\nlimit 1;","application_name":"psql","backend_type":"client backend","query_id":6991982717157019333}
=$ grep relkind log/postgresql-2022-01-17_122308.json | tail -n1 | jq .
{
"timestamp": "2022-01-17 12:30:27.324 CET",
"user": "pgdba",
"dbname": "pgdba",
"pid": 1563548,
"remote_host": "[local]",
"session_id": "61e55353.17db9c",
"line_num": 3,
"ps": "SELECT",
"session_start": "2022-01-17 12:30:27 CET",
"vxid": "4/0",
"txid": 0,
"error_severity": "LOG",
"message": "duration: 0.693 ms statement: select\n relname,\n relkind\nfrom\n pg_class\nlimit 1;",
"application_name": "psql",
"backend_type": "client backend",
"query_id": 6991982717157020000
}
=$ grep relkind log/postgresql-2022-01-17_122308.json | tail -n1 | jq -r .message
duration: 0.693 ms statement: select
relname,
relkind
from
pg_class
limit 1;
Sweet. Looks like each query will be i
看起来每个查询都在单行中,无论它有多少行,并且标准工具(jq)可以完美地处理它。
我们还可以使用 jq 在日志中查找内容:
=$ jq 'select(.error_severity != "LOG" )' log/postgresql-2022-01-17_122308.json
{
"timestamp": "2022-01-17 12:41:48.112 CET",
"user": "depesz",
"dbname": "depesz",
"pid": 1561647,
"remote_host": "[local]",
"session_id": "61e55238.17d42f",
"line_num": 10,
"ps": "SELECT",
"session_start": "2022-01-17 12:25:44 CET",
"vxid": "3/29",
"txid": 0,
"error_severity": "ERROR",
"state_code": "22012",
"message": "division by zero",
"statement": "select 1/0;",
"application_name": "psql",
"backend_type": "client backend",
"query_id": -6975083880902025000
}




