暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

PostgreSQL源码分析 备库查询冲突 - User was holding shared buffer pin for too long

digoal 2016-08-15
935

作者

digoal

日期

2016-08-15

标签

PostgreSQL , standby , conflict , shared buffer pin , 源码分析 , 冲突


背景

PostgreSQL 的基于流复制的物理备库是基于redo的物理块复制备库,允许开放只读的功能,但是需要注意,由于主库可能不断的产生redo,这些redo可能会与备库的QUERY产生冲突。

什么情况下query会堵塞、或与恢复冲突?

当以下操作产生的REDO被复制到备库,并且备库准备拿这些REDO来恢复时。

  • Access Exclusive locks taken on the primary server, including both explicit LOCK commands and various DDL actions, conflict with table accesses in standby queries.

主库的访问排它锁,与备库对应的锁产生冲突。

例如主库truncate a表, 备库查询a表。

这种情况的冲突面很窄。

  • Dropping a tablespace on the primary conflicts with standby queries using that tablespace for temporary work files.

主库删除表空间,备库使用这个表空间产生临时文件。 例如主库删除TBS,备库的一个大的查询需要写临时文件,并且这个临时文件是写到这个表空间的。

这种情况非常少见,也很容易规避,新建一个临时表空间不要删除即可。

  • Dropping a database on the primary conflicts with sessions connected to that database on the standby.

主库删除数据库,备库刚好连在这个数据库上。

这种情况也非常的少见。

  • Application of a vacuum cleanup record from WAL conflicts with standby transactions whose snapshots can still "see" any of the rows to be removed.

主库回收dead tuple的REDO,同事备库当前的query snapshot需要看到这些记录。

这种情况可以通过参数控制,恢复优先,或查询优先。 可以配置时间窗口。

而且这种冲突出现的概率也非常的小,除非用户在备库使用repeatable read,同时是非常大的事务。

而通常用户用的都是read committed.

  • Application of a vacuum cleanup record from WAL conflicts with queries accessing the target page on the standby, whether or not the data to be removed is visible.

同上,但是当query访问的页就是要清理垃圾的页时,也是有冲突的。

这是物理复制与逻辑复制唯一有差别的地方,但是对现实场景来说,这种情况出现的概率也不大。

案例

最近收到了一个冲突的例子,是这样的,在备库查询一些函数,导致了这样的报错。

< 2016-08-09 22:03:39.534 CST >STATEMENT: SELECTxxx('32980770','-1','0','0',20,100,10) < 2016-08-09 22:03:39.534 CST >ERROR: canceling statement due to conflict with recovery < 2016-08-09 22:03:39.534 CST >DETAIL: User was holding shared buffer pin for too long. < 2016-08-09 22:03:39.534 CST >CONTEXT: PL/pgSQL function xxx(integer,bigint[],integer,integer) line 7 at RETURN QUERY PL/pgSQL function xxx(integer,integer,bigint,bigint,integer,integer,integer) line 15 at RETURN QUERY

是什么原因造成的呢?

PostgreSQL 备库apply与query之间可能存在的冲突种类如下

src/include/storage/procsignal.h

/* Recovery conflict reasons */ PROCSIG_RECOVERY_CONFLICT_DATABASE, PROCSIG_RECOVERY_CONFLICT_TABLESPACE, PROCSIG_RECOVERY_CONFLICT_LOCK, PROCSIG_RECOVERY_CONFLICT_SNAPSHOT, PROCSIG_RECOVERY_CONFLICT_BUFFERPIN, PROCSIG_RECOVERY_CONFLICT_STARTUP_DEADLOCK,

例子的这个报错的冲突类型属于 PROCSIG_RECOVERY_CONFLICT_BUFFERPIN

它来自这个个函数

/* * errdetail_recovery_conflict * * Add an errdetail() line showing conflict source. */ static int errdetail_recovery_conflict(void) { switch (RecoveryConflictReason) { case PROCSIG_RECOVERY_CONFLICT_BUFFERPIN: errdetail("User was holding shared buffer pin for too long."); break;

备库apply redo时,产生的冲突,调用如下

screenshot

screenshot

源码

备库在apply redo时,如果遇到vacuum操作,则会触发以下。

而vacuum必须要拿到block的exclusive lock才能继续下去。

void LockBufferForCleanup(Buffer buffer) { ... for (;;) { /* Try to acquire lock */ LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE); LockBufHdr(bufHdr); ... /* Wait to be signaled by UnpinBuffer() */ if (InHotStandby) { /* Publish the bufid that Startup process waits on */ SetStartupBufferPinWaitBufId(buffer - 1); /* Set alarm and then wait to be signaled by UnpinBuffer() */ ResolveRecoveryConflictWithBufferPin(); /* Reset the published bufid */ SetStartupBufferPinWaitBufId(-1); }

如果是standby,则判断是否超时,如果没有超时,vacuum redo会继续等待。

```
void
ResolveRecoveryConflictWithBufferPin(void)
{
TimestampTz ltime;

    Assert(InHotStandby);

    ltime = GetStandbyLimitTime();

    if (ltime == 0)  
    {  
            /*  
             * We're willing to wait forever for conflicts, so set timeout for  
             * deadlock check only  
             */  
            enable_timeout_after(STANDBY_DEADLOCK_TIMEOUT, DeadlockTimeout);  
    }  
    else if (GetCurrentTimestamp() >= ltime)  
    {  
            /*  
             * We're already behind, so clear a path as quickly as possible.  
             */  
            SendRecoveryConflictWithBufferPin(PROCSIG_RECOVERY_CONFLICT_BUFFERPIN);  
    }

```

冲突超时时间获取,超时则触发 SendRecoveryConflictWithBufferPin , 导致这个CASE的错误。

```
/
* Determine the cutoff time at which we want to start canceling conflicting
* transactions. Returns zero (a time safely in the past) if we are willing
* to wait forever.
/
static TimestampTz
GetStandbyLimitTime(void)
{
TimestampTz rtime;
bool fromStream;

    /*  
     * The cutoff time is the last WAL data receipt time plus the appropriate  
     * delay variable.  Delay of -1 means wait forever.  
     */  
    GetXLogReceiptTime(&rtime, &fromStream);  
    if (fromStream)  
    {  
            if (max_standby_streaming_delay < 0)  
                    return 0;                       /* wait forever */  
            return TimestampTzPlusMilliseconds(rtime, max_standby_streaming_delay);  
    }  
    else  
    {  
            if (max_standby_archive_delay < 0)  
                    return 0;                       /* wait forever */  
            return TimestampTzPlusMilliseconds(rtime, max_standby_archive_delay);  
    }

}
```

如何避免或降低冲突

PostgreSQL提供了3种解决备库上查询与恢复冲突的办法

  • 在主库配置vacuum_defer_cleanup_age来解决以上最后两种冲突。

vacuum_defer_cleanup_age = 0 # number of xacts by which cleanup is delayed

  • 在备库配置recovery延迟来解决以上所有冲突,给备库的QUERY设置一个执行窗口

```
max_standby_archive_delay = 30s # max delay before canceling queries

when reading WAL from archive;

-1 allows indefinite delay

max_standby_streaming_delay = 30s # max delay before canceling queries

when reading streaming WAL;

-1 allows indefinite delay

```

  • 在备库配置hot_standby_feedback,备库会将备库的xmin反馈给上游

从而主库知道备库还需要哪些记录,在cleanup dead tuple时,会考虑备库的情况,防止冲突。

``` feedback 协议如下

Hot Standby feedback message (F) Byte1('h') Identifies the message as a Hot Standby feedback message.

Int64 The client's system clock at the time of transmission, as microseconds since midnight on 2000-01-01.

Int32 The standby's current xmin. This may be 0, if the standby is sending notification that Hot Standby feedback will no longer be sent on this connection. Later non-zero messages may reinitiate the feedback mechanism.

Int32 The standby's current epoch. ```

参数如下

```
hot_standby_feedback = off # send info from standby to prevent query conflicts

wal_retrieve_retry_interval = 1s
```

  • 通过pg_stat_database.conflicts可以了解冲突次数

  • 通过pg_stat_database_conflicts 可以了解每种冲突类型的次数

  • 源码

``` / -------- * pgstat_report_recovery_conflict() - * * Tell the collector about a Hot Standby recovery conflict. * -------- / void pgstat_report_recovery_conflict(int reason) { PgStat_MsgRecoveryConflict msg;

    if (pgStatSock == PGINVALID_SOCKET || !pgstat_track_counts)
            return;

    pgstat_setheader(&msg.m_hdr, PGSTAT_MTYPE_RECOVERYCONFLICT);
    msg.m_databaseid = MyDatabaseId;
    msg.m_reason = reason;
    pgstat_send(&msg, sizeof(msg));

} ```

祝大家玩得开心,欢迎随时来 阿里云促膝长谈业务需求 ,恭候光临

阿里云的小伙伴们加油,努力 做好内核与服务,打造最贴地气的云数据库

PostgreSQL 许愿链接

您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.

9.9元购买3个月阿里云RDS PostgreSQL实例

PostgreSQL 解决方案集合

德哥 / digoal's github - 公益是一辈子的事.

digoal's wechat

文章转载自digoal,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论