暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

从静态表查询冲突到其原理

原创 liuzhilong62 2025-09-13
251

问题现象

现象

一个静态历史表,没有任何更新,在同城从库跑SQL稳定触发查询冲突:

ERROR: 40001: canceling statement due to conflict with recovery DETAIL: User query might have needed to see row versions that must be removed. LOCATION: ProcessInterrupts, postgres.c:3197 Time: 30534.973 ms (00:30.535)

为什么一个静态表查询冲突是一个值得关注的问题

我的理解中静态表是不应该发生冲突的(这个理解错了,后面会解释)。

官方文档 Conflict cases include:

  • Access Exclusive locks taken on the primary server, including both explicit LOCK commands and various DDL actions, conflict with table accesses in standby queries.
  • Dropping a tablespace on the primary conflicts with standby queries using that tablespace for temporary work files.
  • Dropping a database on the primary conflicts with sessions connected to that database on the standby.
  • Application of a vacuum cleanup record from WAL conflicts with standby transactions whose snapshots can still “see” any of the rows to be removed.
  • Application of a vacuum cleanup record from WAL conflicts with queries accessing the target page on the standby, whether or not the data to be removed is visible.

LOCK,DDL,drop tablespace ,drop database肯定没有。

vacuum,这也没有,这可以从pg_stat_all_tables.last_autovacuum和wal vacuum record可以看出来

官方文档的解释就到此为止了,我自己仔细排查了一下确认没有以上情况。

利用现有知识可能还有其他场景干掉从库查询的所持有快照的xmin。例如页内修剪会干掉page中行上的xmin,如果下游的查询的快照还依赖这些xmin,这理论上也会发生查询冲突。但是,page是属于一个表的,只查一个表只能持有这个表上的快照以及表上的xmin,所以,理论上A表的页内修剪,应该不会发生B表上的查询冲突(这个理解错了,后面会解释)。

pg官方文档对查询冲突场景的解释比较糊,没有很好的解释静态表发生冲突的场景和原因。加上自己联想的场景,也不应该有查询冲突。但我发现这个场景似乎在很多库上都有,所以得看下咋回事。

原因分析

因为是startup把查询掐掉的,所以看startup进程的pstack即可找到查询冲突函数

$ pstack 212012 #0 0x00002b283f63d783 in __select_nocancel () from /lib64/libc.so.6 #1 0x00000000008fcf5a in pg_usleep (microsec=<optimized out>) at pgsleep.c:56 #2 0x0000000000787905 in WaitExceedsMaxStandbyDelay (wait_event_info=134217762) at standby.c:208 #3 ResolveRecoveryConflictWithVirtualXIDs (waitlist=0x2398a50, reason=reason@entry=PROCSIG_RECOVERY_CONFLICT_SNAPSHOT, wait_event_info=wait_event_info@entry=134217762, report_waiting=report_waiting@entry=true) at standby.c:276 #4 0x0000000000787b33 in ResolveRecoveryConflictWithVirtualXIDs (report_waiting=true, wait_event_info=134217762, reason=PROCSIG_RECOVERY_CONFLICT_SNAPSHOT, waitlist=<optimized out>) at standby.c:333 #5 ResolveRecoveryConflictWithSnapshot (latestRemovedXid=<optimized out>, node=...) at standby.c:329 #6 0x00000000004c8ffe in heap_xlog_clean (record=0x2366978) at heapam.c:7764 #7 heap2_redo (record=0x2366978) at heapam.c:8917 #8 0x0000000000519e55 in StartupXLOG () at xlog.c:7411 #9 0x000000000072f211 in StartupProcessMain () at startup.c:204 #10 0x00000000005286b1 in AuxiliaryProcessMain (argc=argc@entry=2, argv=argv@entry=0x7ffeb7e39d70) at bootstrap.c:450 #11 0x000000000072c369 in StartChildProcess (type=StartupProcess) at postmaster.c:5494 #12 0x000000000072eb54 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x232edb0) at postmaster.c:1407 #13 0x00000000004892cf in main (argc=3, argv=0x232edb0) at main.c:210

XLOG_HEAP2_CLEAN

void heap2_redo(XLogReaderState *record) { uint8 info = XLogRecGetInfo(record) & ~XLR_INFO_MASK; switch (info & XLOG_HEAP_OPMASK) { case XLOG_HEAP2_CLEAN: heap_xlog_clean(record); break;

当redo是XLOG_HEAP2_CLEAN时,才会进入这个下一个函数heap_xlog_clean

pg 18已经没有XLOG_HEAP2_CLEAN这个东西了(15左右其实就没有了,这篇文章就只看13和18两个版本),但是可以在heapam_xlog.h中找到define

//pg13 #define XLOG_HEAP2_CLEAN 0x10 #define XLOG_HEAP2_FREEZE_PAGE 0x20 #define XLOG_HEAP2_CLEANUP_INFO 0x30
//pg18 * There's no difference between XLOG_HEAP2_PRUNE_ON_ACCESS, * XLOG_HEAP2_PRUNE_VACUUM_SCAN and XLOG_HEAP2_PRUNE_VACUUM_CLEANUP records. * They have separate opcodes just for debugging and analysis purposes, to * indicate why the WAL record was emitted. */ #define XLOG_HEAP2_PRUNE_ON_ACCESS 0x10 #define XLOG_HEAP2_PRUNE_VACUUM_SCAN 0x20 #define XLOG_HEAP2_PRUNE_VACUUM_CLEANUP 0x30

把pg18的源码掏出来,主要是因为pg13(当前生产库的版本)对这几个xl_info 的CLEAN宏定义一点解释没有,我看不懂。由于18这段宏换了个更容易理解的名字,而且加上了注释,我们可以从18的源码来理解13,看看这个wal record是干什么的。

这3个opcodes本质上都是为了PRUNE产生的wal record。从名字上看PRUNE ON ACCESS看着像访问产生PRUNE,另外两个跟VACUUM动作相关。

pg_waldump查看wal record的时候,rmgr: Heap2 CLEAN remxid这种record每几秒钟就生成几条,而且filenode差异很大,且跟静态表毫无关系:

$ pg_waldump 00000001000012FE00000001 |tail -200|egrep -i heap2 pg_waldump: fatal: error in WAL record at 12FE/F34F138: invalid resource manager ID 50 at 12FE/F34F168 rmgr: Heap2 len (rec/tot): 61/ 3520, tx: 0, lsn: 12FE/0F346ED0, prev 12FE/0F346EA0, desc: CLEAN remxid 1983744188, blkref #0: rel 1663/88121/1083807 blk 617606 FPW rmgr: Heap2 len (rec/tot): 66/ 66, tx: 0, lsn: 12FE/0F34BC60, prev 12FE/0F34BC30, desc: CLEAN remxid 1984090598, blkref #0: rel 1663/88121/504681 blk 1447147

这个就跟我们的现象有点匹配了:没有vacuum动作但是有PRUNE,可以走到后续heap_xlog_clean ResolveRecoveryConflictWithSnapshot等查询冲突主函数上去。

PRUNE动作产生rmgr: Heap2 CLEAN remxidwalrecord后面会测试复现。

先把源码撸完。

ResolveRecoveryConflictWithSnapshot

void ResolveRecoveryConflictWithSnapshot(TransactionId latestRemovedXid, RelFileNode node) { VirtualTransactionId *backends; /* * If we get passed InvalidTransactionId then we do nothing (no conflict). * * This can happen when replaying already-applied WAL records after a * standby crash or restart, or when replaying an XLOG_HEAP2_VISIBLE * record that marks as frozen a page which was already all-visible. It's * also quite common with records generated during index deletion * (original execution of the deletion can reason that a recovery conflict * which is sufficient for the deletion operation must take place before * replay of the deletion record itself). */ if (!TransactionIdIsValid(latestRemovedXid)) return; backends = GetConflictingVirtualXIDs(latestRemovedXid, node.dbNode); ResolveRecoveryConflictWithVirtualXIDs(backends, PROCSIG_RECOVERY_CONFLICT_SNAPSHOT, WAIT_EVENT_RECOVERY_CONFLICT_SNAPSHOT, true); }

查询冲突有几个类型,其中ResolveRecoveryConflictWithSnapshot人如其名就是跟快照冲突了,

其中GetConflictingVirtualXIDs是找到哪些backends跟快照冲突。ResolveRecoveryConflictWithVirtualXIDs主要是解决冲突,时间限制。

GetConflictingVirtualXIDs

GetConflictingVirtualXIDs是判断是否vxid backend发生查询冲突的关键函数,要稍微用点脑力。
关键函数解读所需要的基础知识:

  • limitXmin就是latestRemovedXid,也就是wal中的 CLEAN remxid,也就是需要被清理的xid(remxid我理解为remove xid)。/*limitXmin is supplied as either latestRemovedXid, or InvalidTransactionId*/
  • PGPROC是当前进程信息,有backend id,databas id,锁信息等等很多东西
  • PGXACT是当前进程持有的快照的事务信息,信息没有那么多,最关键的就是xmin——当前进程运行的最小xid
  • c中||的规则是:只要有一个操作数为真(非零),结果就为真(1)
  • TransactionIdIsValidxid!=0,0完全没有意义

关键函数GetConflictingVirtualXIDs解读:

VirtualTransactionId * GetConflictingVirtualXIDs(TransactionId limitXmin, Oid dbOid) { ... for (index = 0; index < arrayP->numProcs; index++) //所有本地进程循环处理 { int pgprocno = arrayP->pgprocnos[index]; PGPROC *proc = &allProcs[pgprocno]; //进程的PGPROC PGXACT *pgxact = &allPgXact[pgprocno]; //进程的PGXACT /* Exclude prepared transactions */ if (proc->pid == 0) //prepared transaction没有进程持有,处理不了 continue; if (!OidIsValid(dbOid) || //要注意全局的表,全局表的dbOid=0是invalid的,满足条件 proc->databaseId == dbOid) //只处理当前database。跨库是另一码事了,根本不会事务冲突。 { /* Fetch xmin just once - can't change on us, but good coding */ TransactionId pxmin = UINT32_ACCESS_ONCE(pgxact->xmin); //pgxact->xmin即是当前进程持有事务的最小xid,UINT32_ACCESS_ONCE只是为了保护原子操作,取xmin逻辑不变 /* * We ignore an invalid pxmin because this means that backend has * no snapshot currently. We hold a Share lock to avoid contention * with users taking snapshots. That is not a problem because the * current xmin is always at least one higher than the latest * removed xid, so any new snapshot would never conflict with the * test here. */ if (!TransactionIdIsValid(limitXmin) || //limitXmin=0还能出现?至少latestRemovedXid不可能有,我想不到有什么场景会出现wal日志写入无效xid (TransactionIdIsValid(pxmin) && !TransactionIdFollows(pxmin, limitXmin))) //TransactionIdIsValid(pxmin)也没有什么用。!TransactionIdFollows(pxmin, limitXmin)表示只要pxmin<=limitXmin即可 { VirtualTransactionId vxid; GET_VXID_FROM_PGPROC(vxid, *proc); if (VirtualTransactionIdIsValid(vxid)) vxids[count++] = vxid; } } }

最关键的是!TransactionIdFollows(pxmin, limitXmin)

所以这段判断是否存在查询冲突的代码,最主要判断逻辑是:

  1. 主库清理的remxid>=从库查询的快照持有的最小xid,即冲突。
  2. 只掐掉当前db的;没有db的全局系统表就不管了会无脑掐

所以即使被清理的主库表和从库查询的表不相干,也会冲突!!!

页内修剪

发生冲突的逻辑理完了,但wal CLEAN 怎么来的还没有弄清楚,这就需要先看下PRUNE怎么产生的。

README.HOT关于什么时候产生prune和defragment的描述-- When can/should we prune or defragment?

The currently planned heuristic is to prune and defrag when first accessing
a page that potentially has prunable tuples

prune和defragment确实是2个概念,但是很有可能一起做了。

  • prune指更新line pointers使HOT链更短,但不会释放空间
  • defragment指回收prune后page上已死的line pointers和行所占用的空间

We cannot prune or defragment unless we can get a “buffer cleanup lock”
on the target page; otherwise, pruning might destroy line pointers that
other backends have live references to, and defragmenting might move
tuples that other backends have live pointers to

page是"buffer cleanup lock"的,才会产生prune或者defragment

The worst-case consequence of this is only that an
UPDATE cannot be made HOT but has to link to a new tuple version placed on
some other page, for lack of centralized space on the original page.

其中一个典型场景是,HOT更新到了page外(容易测试)

space reclamation happens during tuple retrieval when the
page is nearly full (<10% free) and a buffer cleanup lock can be
acquired. This means that UPDATE, DELETE, and SELECT can trigger space
reclamation, but often not during INSERT … VALUES because it does
not retrieve a row.

SELECT/UPDATE/DELETE扫描行的都可能发生空间回收动作,INSERT不会,因为INSERT不会扫描行。

显然prune or defragment后对应的xid应该回收了,从readme就可以看出来通过HOT更新可以复现prune or defragment,然后产生CLEAN walrecord。见[测试:update产生页内修剪](## 测试:纯update产生页内修剪)

测试

以下测试只观察到冲突产生或者有CLEAN walrecord或者有页上lp的更新,不区分到底是prune还是defragment。很多场景应该是一起触发两者,区分两者有点费劲,maybe以后再说。这里主要是看CLEAN walrecord等有没有。

会用到的sql:

--sql for test --heap_page_items select t_ctid,lp, case lp_flags when 0 then '0:LP_UNUSED' when 1 then 'LP_NORMAL' when 2 then 'LP_REDIRECT' when 3 then 'LP_DEAD' end as lp_flags, t_xmin,t_xmax,t_field3 as t_cid, raw_flags, info.combined_flags,substring(t_data,0,40) from heap_page_items(get_raw_page('lzl',0)) item, LATERAL heap_tuple_infomask_flags(t_infomask, t_infomask2) info order by lp; --heap header select * from page_header(get_raw_page('lzl',0)); --bt_page_items SELECT itemoffset, ctid, itemlen, nulls, vars, dead, htid, tids[0:2] AS some_tids FROM bt_page_items('idxlzl',1); --create table create table lzl(a char(2000)); create index idxlzl on lzl(a); insert into lzl values('z'); update lzl set a=md5(random()::text); --非hot update lzl set a='z'; --hot --触发索引扫描 set enable_seqscan =off; set enable_indexonlyscan=off; --开启rr事务获取快照不释放,方便观察 BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;

测试:异表查询冲突

primary standby
create table lzl(a bigint primary key);
insert into lzl values(1);
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
select 1;
update lzl set a=2;
不阻塞
vacuum lzl;
#3 ResolveRecoveryConflictWithVirtualXIDs (waitlist=0x277c340, reason=reason@entry=PROCSIG_RECOVERY_CONFLICT_SNAPSHOT, wait_event_info=wait_event_info@entry=134217762, report_waiting=report_waiting@entry=true) at standby.c:276
#4 0x0000000000787b33 in ResolveRecoveryConflictWithVirtualXIDs (report_waiting=true, wait_event_info=134217762, reason=PROCSIG_RECOVERY_CONFLICT_SNAPSHOT, waitlist=) at standby.c:333
#5 ResolveRecoveryConflictWithSnapshot (latestRemovedXid=, node=…) at standby.c:329
#6 0x00000000004c8ffe in heap_xlog_clean (record=0x273a258) at heapam.c:7764

结论:只要查询产生,就会有快照,有快照就会有快照xmin,即使查询的表毫不相干,也可以发生查询冲突

测试:vacuum产生页内修剪

会修剪,会冲突。示例略,跟本案无关

测试:update产生页内修剪

--HOT,更新到页外defragment --一个8k的heap page,存储的行数是4-2xx条。这里规划了行大小,生成4条数据保证HOT,下一条更新即触发页外更新 create table lzl(a char(2000)); create table idxlzl on lzl(a); insert into lzl values('z'); update lzl set a='z'; --hot update lzl set a='z'; --hot update lzl set a='z'; --hot --heap page 4条数据,且HOT: t_ctid | lp | lp_flags | t_xmin | t_xmax | t_cid | raw_flags | combined_flags | --------+----+-----------+----------+----------+-------+----------------------------------------------------------------------------------------------------------+----------------+---------- (0,2) | 1 | LP_NORMAL | 34954161 | 34954162 | 0 | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMAX_COMMITTED,HEAP_HOT_UPDATED} | {} | \x501f000 (0,3) | 2 | LP_NORMAL | 34954162 | 34954163 | 0 | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMAX_COMMITTED,HEAP_UPDATED,HEAP_HOT_UPDATED,HEAP_ONLY_TUPLE} | {} | \x501f000 (0,4) | 3 | LP_NORMAL | 34954163 | 34954164 | 0 | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_UPDATED,HEAP_HOT_UPDATED,HEAP_ONLY_TUPLE} | {} | \x501f000 (0,4) | 4 | LP_NORMAL | 34954164 | 0 | 0 | {HEAP_HASVARWIDTH,HEAP_XMAX_INVALID,HEAP_UPDATED,HEAP_ONLY_TUPLE} | {} | \x501f000 (4 rows) --索引条目仅一个: itemoffset | ctid | itemlen | nulls | vars | dead | htid | some_tids ------------+-------+---------+-------+------+------+-------+----------- 1 | (0,1) | 48 | f | t | f | (0,1) | [null]
--再更新一条,触发页外更新 update lzl set a='z'; --page已满,hot不了了 --hot链改变。lp改变 t_ctid | lp | lp_flags | t_xmin | t_xmax | t_cid | raw_flags | combined_flags | --------+----+-------------+----------+----------+--------+--------------------------------------------------------------------------------------+----------------+--------------------------- [null] | 1 | LP_REDIRECT | [null] | [null] | [null] | [null] | [null] | [null] (0,2) | 2 | LP_NORMAL | 34954165 | 0 | 0 | {HEAP_HASVARWIDTH,HEAP_XMAX_INVALID,HEAP_UPDATED,HEAP_ONLY_TUPLE} | {} | \x501f00007a20202020202020 [null] | 3 | 0:LP_UNUSED | [null] | [null] | [null] | [null] | [null] | [null] (0,2) | 4 | LP_NORMAL | 34954164 | 34954165 | 0 | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_UPDATED,HEAP_HOT_UPDATED,HEAP_ONLY_TUPLE} | {} | \x501f00007a20202020202020 (4 rows) --索引条目仅一个,没变: itemoffset | ctid | itemlen | nulls | vars | dead | htid | some_tids ------------+-------+---------+-------+------+------+-------+----------- 1 | (0,1) | 48 | f | t | f | (0,1) | [null]

再次更新,row没有写到下个页,而是在本页发生页内修剪后写入本页,这样减少了一个页的访问。

wal产生CLEAN remxid ,也说明可以发生查询冲突:

rmgr: Heap2 len (rec/tot): 62/ 62, tx: 0, lsn: 3DB/F8017348, prev 3DB/F8017310, desc: CLEAN remxid 34954177, blkref #0: rel 1663/5893914/5893920 blk 0 rmgr: Heap len (rec/tot): 2070/ 2070, tx: 34954178, lsn: 3DB/F8017388, prev 3DB/F8017348, desc: HOT_UPDATE off 4 xmax 34954178 flags 0x10 ; new off 2 xmax 0, blkref #0: rel 1663/5893914/5893920 blk 0

结论:update语句可产生页内修剪,可产生查询冲突

测试:hintbit回写产生页内修剪?

primary standby
wal_log_hints=on
truncate table lzl;
insert into lzl values(‘z’);
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
select * from lzl;
delete from lzl where a=‘z’;
checkpoint;
select * from lzl;
–wal中生成FPI_FOR_HINT
–未发生查询冲突

standby的pageinspect:

t_ctid | lp | lp_flags | t_xmin | t_xmax | t_cid | raw_flags | combined_flags | substring --------+----+-----------+----------+----------+-------+------------------------------------------------------------------------------+----------------+------------------------------------------------- (0,1) | 1 | LP_NORMAL | 34954229 | 34954230 | 0 | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMAX_COMMITTED,HEAP_KEYS_UPDATED} | {} | \x501f00007a202020202020202020202020202020202020 (1 row)

结论:wal log hint只是同步hintbit,不会对xmin/xmax产生影响,wal也不会有CLEAN等东西的产生,所以hintbit回写不会查询冲突。

测试:查询产生页内修剪

查询正常不会修剪,但page被填满会:https://www.modb.pro/db/1683648157451362304

填满page测试pruning:

--还是刚才的表,4条记录且HOT,几乎填满 insert into lzl values('z'); update lzl set a='z'; update lzl set a='z'; update lzl set a='z'; --此时的page: t_ctid | lp | lp_flags | t_xmin | t_xmax | t_cid | raw_flags | combined_flags | --------+----+-----------+----------+----------+-------+----------------------------------------------------------------------------------------------------------+----------------+--------------------- (0,2) | 1 | LP_NORMAL | 34954232 | 34954233 | 0 | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMAX_COMMITTED,HEAP_HOT_UPDATED} | {} | \x501f00007a20202020 (0,3) | 2 | LP_NORMAL | 34954233 | 34954234 | 0 | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMAX_COMMITTED,HEAP_UPDATED,HEAP_HOT_UPDATED,HEAP_ONLY_TUPLE} | {} | \x501f00007a20202020 (0,4) | 3 | LP_NORMAL | 34954234 | 34954235 | 0 | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_UPDATED,HEAP_HOT_UPDATED,HEAP_ONLY_TUPLE} | {} | \x501f00007a20202020 (0,4) | 4 | LP_NORMAL | 34954235 | 0 | 0 | {HEAP_HASVARWIDTH,HEAP_XMAX_INVALID,HEAP_UPDATED,HEAP_ONLY_TUPLE} | {} | \x501f00007a20202020 (4 rows) -- 一个查询 select * from lzl; --此时的page产生页内修剪 t_ctid | lp | lp_flags | t_xmin | t_xmax | t_cid | raw_flags | combined_flags | sub --------+----+-------------+----------+--------+--------+---------------------------------------------------------------------------------------+----------------+--------------------------------------- [null] | 1 | LP_REDIRECT | [null] | [null] | [null] | [null] | [null] | [null] [null] | 2 | 0:LP_UNUSED | [null] | [null] | [null] | [null] | [null] | [null] [null] | 3 | 0:LP_UNUSED | [null] | [null] | [null] | [null] | [null] | [null] (0,4) | 4 | LP_NORMAL | 34954235 | 0 | 0 | {HEAP_HASVARWIDTH,HEAP_XMIN_COMMITTED,HEAP_XMAX_INVALID,HEAP_UPDATED,HEAP_ONLY_TUPLE} | {} | \x501f00007a20202020202020202020202020

结论:查询可以产生页内修剪,可以产生查询冲突

测试:共享表多database查询冲突

共享表是global的,之前在GetConflictingVirtualXIDs中有看到global的会无脑冲突。所以可以测一把。

共享表信息如下:

源码定义:IsSharedRelation 源码判断:shared ? InvalidOid : MyDatabaseId; 表:pg_class.relisshared 目录: global/

直接查pg_class.relisshared要方便一点:

select relname,relkind,relisshared from pg_class where relisshared is true and relkind='r'; relname | relkind | relisshared -----------------------+---------+------------- pg_authid | r | t pg_subscription | r | t pg_database | r | t pg_db_role_setting | r | t pg_tablespace | r | t pg_auth_members | r | t pg_shdepend | r | t pg_shdescription | r | t pg_replication_origin | r | t pg_shseclabel | r | t

pg_authid会保存角色/用户的信息。这里以改密来测试:

--测试,在primary库 非业务db跑 create user lzl; alter user lzl with password '1'; --多跑几次

产生CLEAN remxid:

rmgr: Heap len (rec/tot): 76/ 76, tx: 34954264, lsn: 3DB/F808D0F8, prev 3DB/F808D0B8, desc: HOT_UPDATE off 67 xmax 34954264 flags 0x20 ; new off 66 xmax 0, blkref #0: rel 1664/0/1260 blk 0 rmgr: Transaction len (rec/tot): 82/ 82, tx: 34954264, lsn: 3DB/F808D148, prev 3DB/F808D0F8, desc: COMMIT 2025-09-12 14:40:56.680782 CST; inval msgs: catcache 11 catcache 10 rmgr: Heap2 len (rec/tot): 60/ 60, tx: 0, lsn: 3DB/F808D1A0, prev 3DB/F808D148, desc: CLEAN remxid 34954264, blkref #0: rel 1664/0/1260 blk 0 rmgr: Heap2 len (rec/tot): 60/ 60, tx: 34954265, lsn: 3DB/F808D1E0,

从库业务db的SQLselect 1被掐掉。

结论:共享表可以发生跨db查询冲突。

不过这些共享表正常业务很难有大量更新。

结论

开发视野

查询冲突可以跟查询的表无关,也就是说完全静态的表是可以发生冲突的。

跨db意味着不同的业务和数据。跨db不会发生查询冲突。一个例外是共享表会,但是共享表只有几个系统表,一般不会有什么更新。

对于开发来说,可以关注以下点:

  • 请求重试:从库的查询可能会被掐掉,重试查询是必要的,重试有可能会成功
  • SQL时长:SQL时间越长越容易被掐掉
  • 其他从库:可能要选择其他容灾属性不高的从库

运维视野

因为查询冲突可以从“四面八方”来,一个简单的长时间单表查询,可能会被其他频繁更新的表的页内修剪给干掉,所有可以调大max_standby_streaming_delay以降低冲突的概率。

但是,max_standby_streaming_delay是以牺牲应用日志为代价的,相当于日志应用停止。这个参数的大小就代表了最大可能的从库复制延迟(限制不了网络延迟等其他因素造成的延迟)。

  • 查询时效:长时间的日志应用停止,从库的数据会严重落后(可能wal已经在从库磁盘上),会影响从库其他查询的数据时效要求。
  • RTO:如果主库灾难发生需要切换到从库,从库需要日志应用的时间。如果apply延迟太长比如几个小时,可能不能满足分钟级切换要求,这直接会影响SLA RTO承诺。

所以,调整max_standby_streaming_delay是一个精细活,需要从从库的角色定义,查询时效要求,甚至地域等来综合考虑。

「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论