原作者:陈坤
wal日志名解读
wal日志位置:
$PGDATA/pg_xlog
wal日志文件命名规则:
我们看到的wal日志是这样的:000000010000000100000034
其中前8位:00000001表示timeline;
中间8位:00000001表示logid;
最后8位:00000034表示logseg
wal日志LSN编号规则:
1/34B4FEE8(高32位/低32位)
对照关系:
1、wal日志的logseg前6位始终是0,后两位是LSN低32位/16MB(2*24),即LSN的前两位。如上例中logseg最后两位是34,LSN低32位前两位也是34。
2、LSN在wal日志中的偏移量即LSN低32位中后24位对应的十进制值。
例如当前wal日志偏移量为12027600
MogDB=# select pg_current_xlog_location(),pg_xlogfile_name_offset(pg_current_xlog_location());
pg_current_xlog_location | pg_xlogfile_name_offset
--------------------------±------------------------------------
1/34B786D0 | (000000010000000100000034,12027600)
(1 row)MogDB=# select x’B786D0’::int;
int4
----------
12027600
(1 row)
解析wal日志
接下来我们来看看wal日志里面究竟记录的是些什么内容。如果你直接查看wal日志,可能会收到下面这样的提示:
[omm@master137 pg_xlog]$ less 000000010000000100000034
“000000010000000100000034” may be a binary file. See it anyway?
因为wal日志是二进制格式的文件,使用mog_xlogdump这个工具可以解读xlog文件。
工具安装方法参考官方文档:
https://docs.mogdb.io/zh/mogdb/v5.0/mog_xlogdump
解读实验
先对数据库进行一些操作
MogDB=# select pg_current_xlog_location();
MogDB=# BEGIN;
BEGINpg_current_xlog_location
--------------------------
1/631D69D0
(1 row)
MogDB=# create table tbl (id int, msg varchar(20));
insert into tbl values(1,‘aaaa’);
MogDB=# insert into tbl values(2,‘ddd’);
INSERT 0 1
MogDB=# select pg_current_xlog_location();
pg_current_xlog_location
--------------------------
1/631D7F70
(1 row)MogDB=# update tbl set msg=‘llll’ where id =1;
UPDATE 1
MogDB=# select pg_current_xlog_location();
pg_current_xlog_location
--------------------------
1/631DDD68
(1 row)MogDB=# delete from tbl where id=2;
DELETE 0
MogDB=# select pg_current_xlog_location();
pg_current_xlog_location
--------------------------
1/631E0E08
(1 row)MogDB=# end;
select oid,relname from pg_class where relname=‘tbl’;
oid | relname
--------±--------
156569 | tbl
(1 row)
使用mog_xlogdump直接解析日志
./mog_xlogdump --start=1/631D69D0 --end=1/631E0E08 /data/mogdb/data/pg_xlog/000000010000000100000063
…
会显示一大堆内容
可以根据对象的oid 156569搜索
可以找到对应的insert,update等的记录
REDO @ 1/631D7170; LSN 1/631D71D0: prev 1/631D7120; xid 942228; term 1; len 11; total 89; crc 756483142; desc: Heap - XLOG_HEAP_INSERT insert: off 2, blkref #0: rel 1663/16133/156569 storage HEAP DISK blk 0 lastlsn 1/6313B078
…
REDO @ 1/631DD5E8; LSN 1/631DD648: prev 1/631DD588; xid 942228; term 1; len 13; total 92; crc 218032257; desc: Heap - XLOG_HEAP_HOT_UPDATE hot_update: off 1 new off 3, blkref #0: rel 1663/16133/156569 storage HEAP DISK blk 0 lastlsn 1/631D71D0
解读:
len 13:wal记录的长度。
xid 942228: 事务号。
LSN 1/631DD648:本条wal记录的lsn。
prev 1/631DD588:上条wal记录的lsn。
desc: Heap - XLOG_HEAP_HOT_UPDATE hot_update: off 1 new off 3:Heap表示堆表,hot_update 这是一条热更新类型的记录,旧数据offset为1,新数据offset为3。
blkref #0: rel 1663/16133/156569 storage HEAP DISK blk 0 :引用的第一个page(新tuple所在page)所属的堆表文件为1663/13543/16469,块号为0(即ctid的前半部分)。
使用oid和数据类型进行离线解析
./mog_xlogdump -o 156569 -R int,varchar /data/mogdb/data/pg_xlog/000000010000000100000063
[omm@master137 mog_xlogdump]$ ./mog_xlogdump -o 156569 -R int,varchar /data/mogdb/data/pg_xlog/000000010000000100000063
‘insert’,‘tuple’:{‘int’:‘1’,‘varchar’:‘bill’}
‘insert’,‘tuple’:{‘int’:‘2’,‘varchar’:‘ddd’}
‘update’,‘new_tuple’:{‘int’:‘1’,‘varchar’:‘llll’}
为什么需要oid 和数据类型,因为表的定义并不在wal中,所以不连接数据库就不知道表对象的相关信息,所以需要指定数据类型和oid进行解析
使用函数pg_logical_get_area_changes解析
pg_logical_get_area_changes
没有ddl的前提下,指定lsn区间进行解码,或者指定xlog文件进行解码。
select pg_logical_get_area_changes(‘1/631D69D0’,‘1/631E0E08’,NULL,‘mppdb_decoding’,NULL);
pg_logical_get_area_changes
(1/631D6A70,942225,“COMMIT 942225 (at 2023-11-02 23:40:55.681923+08) CSN 897446”)
(1/631D6DB0,942226,“COMMIT 942226 (at 2023-11-02 23:40:57.175212+08) CSN 897447”)
(1/631D70F0,942227,“COMMIT 942227 (at 2023-11-02 23:40:57.185853+08) CSN 897448”)
(1/631D7170,942228,"{"“table_name”":"“public.tbl”","“op_type”":"“INSERT”","“columns_name”":["“id”","“msg”"],"“columns_type”":["“integer”",""character v
arying""],"“columns_val”":["“2"”,""‘ddd’""],"“old_keys_name”":[],"“old_keys_type”":[],"“old_keys_val”":[]}")
(1/631D7270,942229,“COMMIT 942229 (at 2023-11-02 23:40:59.205513+08) CSN 897449”)
(1/631D75B0,942230,“COMMIT 942230 (at 2023-11-02 23:41:00.197818+08) CSN 897450”)
…
(1/631DC8E8,942256,“COMMIT 942256 (at 2023-11-02 23:41:36.479308+08) CSN 897476”)
(1/631DD5E8,942260,“COMMIT 942260 (at 2023-11-02 23:41:42.526125+08) CSN 897480”)
(1/631DD5E8,942228,"{"“table_name”":"“public.tbl”","“op_type”":"“UPDATE”","“columns_name”":["“id”","“msg”"],"“columns_type”":["“integer”","“character varying”"],"“columns_val”":["“1"”,""‘llll’""],"“old_keys_name”":[],"“old_keys_type”":[],"“old_keys_val”":[]}")
(1/631DD6E8,942261,“COMMIT 942261 (at 2023-11-02 23:41:42.825228+08) CSN 897481”)
…
(56 rows)
mog_xlogdump常用命令介绍
-n 查看几条数据
./mog_xlogdump -n 10 /data/mogdb/data/pg_xlog
查看指定lsn之间的数据
./mog_xlogdump --start=1/61193290 --end=1/6151C038 /data/mogdb/data/pg_xlog
从指定lsn开始往后查看10条
./mog_xlogdump --start=1/61193290 -n 10 /data/mogdb/data/pg_xlog
在线解析wal
需要使用-c 连接数据库
./mog_xlogdump -c postgres://ck:Enmo#2023@192.168.182.137:27000/postgres /data/mogdb/data/pg_xlog/000000010000000100000063
Connection to database Succeeded
parse of multi inserted record is not supported currently, transaction id: 0.
parse of multi inserted record is not supported currently, transaction id: 0.
parse of multi inserted record is not supported currently, transaction id: 0.
…
{‘table_name’:‘snap_global_os_runtime’, ‘schema_name’:‘snapshot’, ‘transaction id’:‘942688’, ‘action’:‘delete’,‘tuple’:{}tableOid:16397
{‘table_name’:‘snap_global_os_runtime’, ‘schema_name’:‘snapshot’, ‘transaction id’:‘942688’, ‘action’:‘delete’,‘tuple’:{}tableOid:16397
{‘table_name’:‘snap_global_os_runtime’, ‘schema_name’:‘snapshot’, ‘transaction id’:‘942688’, ‘action’:‘delete’,‘tuple’:{}
…
要在线解析wal日志必须:
1,把数据库配置文件postgres.conf里的wal_level设置成logical,
2,对表进行修改: alter table table_name replica identity full;
其中-c后面的备机链接URL为postgres://ck:Enmo#2023@192.168.182.137:27000/postgres
- postgres://为链接串标记头部
- ck为链接用户名
- Enmo#2023为链接用户的密码
- 192.168.182.137为ip地址
- 27000为备机链接端口
- postgres为链接备机的数据库名
- /data/mogdb/data/pg_xlog/000000010000000100000063为主机的wal日志文件




