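A Brief Introduction to the Buffer Acquisition Strategy in PolarDB for PostgreSQL
About PolarDB for PostgreSQL
PolarDB for PostgreSQL is a cloud-native relational database product developed in-house by Alibaba Cloud. It is 100% compatible with PostgreSQL and highly compatible with Oracle syntax. It uses a Shared-Storage architecture that separates storage from compute, and offers enterprise-grade database features such as extreme elasticity, millisecond-level latency, HTAP, Ganos full-spatial data processing, high reliability, high availability, and elastic scaling. PolarDB for PostgreSQL also supports large-scale parallel computing, so it can handle mixed OLTP and OLAP workloads.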
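Getting a buffer from the RING BUFFER
The buffer acquisition strategy is implemented mainly in the StrategyGetBuffer function, which BufferAlloc calls to obtain the next suitable candidate buffer from the buffer pool.
When acquiring a buffer, StrategyGetBuffer first checks whether the caller is using a ring buffer, that is, whether a strategy object was passed in.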
/*
 * If given a strategy object, see whether it can select a buffer. We
 * assume strategy objects don't need buffer_strategy_lock.
 */
if (strategy != NULL)
{
    buf = GetBufferFromRing(strategy, buf_state);
    if (buf != NULL)
        return buf;
}
If so, GetBufferFromRing looks in the ring buffer array for a usable buffer. The key structure of the ring buffer is:
typedef struct BufferAccessStrategyData
{
    ...
    /* Number of elements in buffers[] array */
    int         ring_size;
    /*
     * Index of the "current" slot in the ring, ie, the one most recently
     * returned by GetBufferFromRing.
     */
    int         current;
    /*
     * True if the buffer just returned by StrategyGetBuffer had been in the
     * ring already.
     */
    bool        current_was_in_ring;
    /*
     * Array of buffer numbers. InvalidBuffer (that is, zero) indicates we
     * have not yet selected a buffer for this ring slot. For allocation
     * simplicity this is palloc'd together with the fixed fields of the
     * struct.
     */
    Buffer      buffers[FLEXIBLE_ARRAY_MEMBER];
} BufferAccessStrategyData;
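Here ring_size is the size of the ring buffer, current is the slot most recently handed out from the ring, current_was_in_ring records whether the buffer just returned was already in the ring, and buffers is the array of buffer numbers that makes up the ring. A ring buffer resembles the buffer pool, but instead of the whole pool it is a small set of buffers arranged in a "ring". Getting a usable buffer from the ring buffer amounts to stepping around this ring and checking whether the next slot holds a usable buffer.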
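The struct comment notes that the slot array is allocated together with the fixed fields. A minimal sketch of that layout in plain C follows. The names SketchRing, SketchBuffer, and sketch_ring_create are hypothetical, and calloc stands in for PostgreSQL's own allocator, so this illustrates the idea rather than the actual allocation code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef int SketchBuffer;               /* hypothetical stand-in for Buffer */
#define SKETCH_INVALID_BUFFER 0         /* zero marks a slot not yet filled */

typedef struct SketchRing
{
    int          ring_size;             /* number of slots in buffers[] */
    int          current;               /* slot handed out most recently */
    bool         current_was_in_ring;   /* was the last buffer already in the ring? */
    SketchBuffer buffers[];             /* flexible array member */
} SketchRing;

/*
 * Allocate the fixed fields and the slot array as one zeroed block.
 * Zeroing leaves every slot equal to SKETCH_INVALID_BUFFER.
 * Returns NULL if the allocation fails.
 */
static SketchRing *
sketch_ring_create(int ring_size)
{
    SketchRing *ring;

    assert(ring_size > 0);
    ring = calloc(1, offsetof(SketchRing, buffers) +
                  (size_t) ring_size * sizeof(SketchBuffer));
    if (ring == NULL)
        return NULL;
    ring->ring_size = ring_size;
    return ring;
}
```

Because an unfilled slot is represented by zero, a zeroed allocation is already a valid empty ring, and the caller releases everything with a single free.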
The lookup in the ring buffer proceeds as follows:
/* Advance to next ring slot */
if (++strategy->current >= strategy->ring_size)
    strategy->current = 0;
/*
 * If the slot hasn't been filled yet, tell the caller to allocate a new
 * buffer with the normal allocation strategy. He will then fill this
 * slot by calling AddBufferToRing with the new buffer.
 */
bufnum = strategy->buffers[strategy->current];
if (bufnum == InvalidBuffer)
{
    strategy->current_was_in_ring = false;
    return NULL;
}
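The function advances current and reads the buffer stored in that slot. If current reaches or exceeds the size of the ring buffer, it wraps around to the start of the array (hence the "ring"). If the slot does not yet hold a valid buffer, current_was_in_ring is set to false and NULL is returned, so the caller falls back to the buffer pool; the buffer chosen from the pool is later stored in the current ring slot. If the slot does hold a valid buffer: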
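The slot advance and the empty-slot case can be sketched in isolation. In this sketch the ring has a fixed size, and SketchRing, sketch_ring_next, and sketch_ring_fill_current are hypothetical names; the last one only models the role the text assigns to AddBufferToRing.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_RING_SIZE      4
#define SKETCH_INVALID_BUFFER 0     /* zero marks a slot not yet filled */

typedef struct SketchRing
{
    int  current;                   /* slot handed out most recently */
    bool current_was_in_ring;
    int  buffers[SKETCH_RING_SIZE]; /* buffer numbers, 0 = empty slot */
} SketchRing;

/*
 * Advance to the next slot, wrapping at the end of the array, and return
 * the buffer number stored there. An empty slot yields
 * SKETCH_INVALID_BUFFER, which tells the caller to allocate from the pool.
 */
static int
sketch_ring_next(SketchRing *ring)
{
    int bufnum;

    if (++ring->current >= SKETCH_RING_SIZE)
        ring->current = 0;

    bufnum = ring->buffers[ring->current];
    if (bufnum == SKETCH_INVALID_BUFFER)
        ring->current_was_in_ring = false;
    return bufnum;
}

/* Store a buffer obtained from the pool into the current slot. */
static void
sketch_ring_fill_current(SketchRing *ring, int bufnum)
{
    ring->buffers[ring->current] = bufnum;
}
```

A caller that receives SKETCH_INVALID_BUFFER would pick a buffer by the normal path and then call sketch_ring_fill_current, so the same slot returns that buffer on the next trip around the ring.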
if (BUF_STATE_GET_REFCOUNT(local_buf_state) == 0 &&
    BUF_STATE_GET_USAGECOUNT(local_buf_state) <= 1)
{
    strategy->current_was_in_ring = true;
    *buf_state = local_buf_state;
    return buf;
}
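The first condition checks whether the buffer's REFCOUNT is 0, that is, whether the buffer is pinned; a pinned buffer may still be in use. The second condition checks how often the buffer has been used recently. A buffer that is only accessed through the ring buffer never has a USAGECOUNT above 1, however many times it is accessed, so a value greater than 1 means the buffer has also been used outside this ring. If the buffer satisfies both conditions it is returned directly; otherwise a buffer is taken from the buffer pool.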
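The two conditions read both counters out of a single packed state word. The sketch below assumes a layout with the reference count in the low 18 bits and the usage count in the next 4 bits; the SK_ macros and helper names are hypothetical and the bit widths are an assumption of this example, not something the text above specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout for this sketch only: refcount low, usage count above it. */
#define SK_REFCOUNT_BITS    18
#define SK_REFCOUNT_MASK    ((1U << SK_REFCOUNT_BITS) - 1)
#define SK_USAGECOUNT_ONE   (1U << SK_REFCOUNT_BITS)
#define SK_USAGECOUNT_MASK  (0xFU << SK_REFCOUNT_BITS)

/* Number of pins currently held on the buffer. */
static uint32_t
sk_refcount(uint32_t state)
{
    return state & SK_REFCOUNT_MASK;
}

/* Recent-use counter of the buffer. */
static uint32_t
sk_usagecount(uint32_t state)
{
    return (state & SK_USAGECOUNT_MASK) >> SK_REFCOUNT_BITS;
}

/*
 * A ring slot can be reused only if nobody has the buffer pinned and
 * nothing outside the ring has touched it (usage count of at most 1).
 */
static bool
sk_ring_can_reuse(uint32_t state)
{
    return sk_refcount(state) == 0 && sk_usagecount(state) <= 1;
}
```

Packing both counters into one word lets the check run against a single consistent snapshot of the buffer header.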
Getting a buffer from the BUFFER POOL
If no usable buffer can be obtained from the ring buffer, or the caller is not using a ring buffer at all, a usable buffer is taken from the buffer pool.
if (StrategyControl->firstFreeBuffer >= 0)
{
    ...
    StrategyControl->firstFreeBuffer = buf->freeNext;
    buf->freeNext = FREENEXT_NOT_IN_LIST;
    ...
    local_buf_state = LockBufHdr(buf);
    if (BUF_STATE_GET_REFCOUNT(local_buf_state) == 0
        && BUF_STATE_GET_USAGECOUNT(local_buf_state) == 0)
    {
        if (strategy != NULL)
            AddBufferToRing(strategy, buf);
        *buf_state = local_buf_state;
        return buf;
    }
    ...
}
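The freelist branch above unlinks the head of a list that is threaded through the buffer descriptors by index. A minimal single-threaded sketch of that pop operation follows. SkFreeList, SkBufferDesc, and the helper functions are hypothetical names, the end-of-list marker value is an assumption of this sketch, and the locking and the pin and usage checks of the real code are left out.

```c
#include <assert.h>

#define SK_NBUFFERS              4
#define SK_FREENEXT_END_OF_LIST  (-1)   /* assumed marker: last list entry */
#define SK_FREENEXT_NOT_IN_LIST  (-2)   /* buffer is not on the freelist */

typedef struct SkBufferDesc
{
    int free_next;                      /* index of next free buffer */
} SkBufferDesc;

typedef struct SkFreeList
{
    int          first_free;            /* head index, negative when empty */
    SkBufferDesc bufs[SK_NBUFFERS];
} SkFreeList;

/* Link every buffer into the freelist in index order. */
static void
sk_freelist_init(SkFreeList *fl)
{
    for (int i = 0; i < SK_NBUFFERS; i++)
        fl->bufs[i].free_next = i + 1;
    fl->bufs[SK_NBUFFERS - 1].free_next = SK_FREENEXT_END_OF_LIST;
    fl->first_free = 0;
}

/* Unlink and return the head of the freelist, or -1 if it is empty. */
static int
sk_freelist_pop(SkFreeList *fl)
{
    int id;

    if (fl->first_free < 0)
        return -1;

    id = fl->first_free;
    fl->first_free = fl->bufs[id].free_next;
    fl->bufs[id].free_next = SK_FREENEXT_NOT_IN_LIST;
    return id;
}
```

Once the list is empty, sk_freelist_pop keeps returning -1, which corresponds to the point where the caller has to fall back to another selection method.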
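StrategyGetBuffer first checks whether the freelist of the buffer pool still has entries. If it does, a usable buffer is taken directly from the freelist; otherwise a usable buffer is chosen with the "clock sweep" algorithm.
The "clock sweep" algorithm relies on the ClockSweepTick function, which advances the clock hand. The key structure: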
/*
 * The shared freelist control information.
 */
typedef struct
{
    ...
    pg_atomic_uint32 nextVictimBuffer;
    ...
    uint32      completePasses; /* Complete cycles of the clock sweep */
    ...
} BufferStrategyControl;
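The shared-memory variable StrategyControl holds nextVictimBuffer, which records the current eviction position in the buffer pool, and completePasses, which records how many complete passes over the buffer pool have been made. The "clock sweep" algorithm treats the buffer pool as a ring and keeps traversing it until it finds a usable buffer, or raises an error once all buffers turn out to be pinned.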
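The relationship between the eviction position and the pass count can be illustrated with a single ever-increasing ticket counter. This is a simplified sketch using C11 atomics: SkSweep, sk_clock_tick, and sk_complete_passes are hypothetical names, and the sketch deliberately ignores overflow of the counter, which a real implementation has to handle.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

typedef struct SkSweep
{
    atomic_uint next_victim;    /* ticket counter, only ever incremented */
    uint32_t    nbuffers;       /* number of buffers in the pool */
} SkSweep;

/*
 * Take the next ticket and map it onto a buffer index. The atomic
 * fetch-and-add gives each concurrent caller a distinct ticket.
 */
static uint32_t
sk_clock_tick(SkSweep *s)
{
    uint32_t ticket = atomic_fetch_add(&s->next_victim, 1);

    return ticket % s->nbuffers;
}

/* Full trips around the pool, derived from the tickets handed out so far. */
static uint32_t
sk_complete_passes(SkSweep *s)
{
    return atomic_load(&s->next_victim) / s->nbuffers;
}
```

With a pool of three buffers, successive ticks return 0, 1, 2, 0, 1, and so on, and the pass count grows by one each time the hand returns to index 0.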
if (BUF_STATE_GET_REFCOUNT(local_buf_state) == 0)
{
    if (BUF_STATE_GET_USAGECOUNT(local_buf_state) != 0)
    {
        local_buf_state -= BUF_USAGECOUNT_ONE;
        trycounter = NBuffers;
    }
    else
    {
        /* Found a usable buffer */
        if (strategy != NULL)
            AddBufferToRing(strategy, buf);
        *buf_state = local_buf_state;
        return buf;
    }
}
else if (--trycounter == 0)
{
    UnlockBufHdr(buf, local_buf_state);
    elog(ERROR, "no unpinned buffers available");
}
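The candidate buffer is selected by nextVictimBuffer. If it is not pinned and has not been used recently (its USAGECOUNT is 0), it is returned immediately. Otherwise the clock hand advances (nextVictimBuffer++) to the next buffer. Recent use is detected from USAGECOUNT: when it is nonzero, the sweep subtracts BUF_USAGECOUNT_ONE, which lowers USAGECOUNT by 1, and then examines the next buffer, so a buffer becomes eligible once its USAGECOUNT has dropped to 0. trycounter is reset to NBuffers whenever a usage count is decremented and counts down on every pinned buffer. When it reaches 0, all buffers are pinned, no buffer can be obtained, and an ERROR is raised.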
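The whole sweep can be modeled with a small single-threaded simulation. This is a sketch under simplifying assumptions: SkPool, SkBuf, and sk_clock_sweep are hypothetical names, the counters are plain integers instead of a packed state word, there is no locking, and a return value of -1 stands in for raising the error.

```c
#include <assert.h>

#define SK_NBUFFERS 4

typedef struct SkBuf
{
    int refcount;       /* greater than 0 means the buffer is pinned */
    int usage_count;    /* recent-use counter, decremented by the sweep */
} SkBuf;

typedef struct SkPool
{
    SkBuf bufs[SK_NBUFFERS];
    int   next_victim;  /* clock hand: index examined next */
} SkPool;

/*
 * Sweep until an unpinned buffer with usage_count 0 is found and return
 * its index. Returns -1 after SK_NBUFFERS consecutive pinned buffers.
 */
static int
sk_clock_sweep(SkPool *pool)
{
    int trycounter = SK_NBUFFERS;

    for (;;)
    {
        int    id = pool->next_victim;
        SkBuf *buf = &pool->bufs[id];

        /* Advance the hand, wrapping around at the end of the pool. */
        pool->next_victim = (pool->next_victim + 1) % SK_NBUFFERS;

        if (buf->refcount == 0)
        {
            if (buf->usage_count != 0)
            {
                buf->usage_count--;         /* give it one more lap */
                trycounter = SK_NBUFFERS;   /* progress made, reset */
            }
            else
                return id;                  /* found a usable buffer */
        }
        else if (--trycounter == 0)
            return -1;                      /* nothing unpinned */
    }
}
```

Resetting trycounter after every decrement keeps the failure case narrow: the sweep gives up only when a full lap sees nothing but pinned buffers, while any unpinned buffer is guaranteed to reach a usage count of 0 after enough laps.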




