
Case study: database startup failure and SysV shared memory

Original by liuzhilong62 · 2026-03-09

Symptoms

The database instance's RSS memory was exhausted, the logs showed OOM messages, and the instance went down. The OOM cause itself is not analyzed here.

But subsequent startup attempts failed. The logs show four or five attempts in total, all failing:

2026-02-12 09:15:21 CST::@:[578272]: FATAL: pre-existing shared memory block (key 2048, ID 1328250881) is still in use
2026-02-12 09:15:21 CST::@:[578272]: HINT: Terminate any old server processes associated with data directory "/data".
2026-02-12 09:15:21 CST::@:[578272]: LOG: database system is shut down
2026-02-12 09:21:03 CST::@:[658824]: FATAL: pre-existing shared memory block (key 2048, ID 1328250881) is still in use
2026-02-12 09:21:03 CST::@:[658824]: HINT: Terminate any old server processes associated with data directory "/data".
2026-02-12 09:21:03 CST::@:[658824]: LOG: database system is shut down
2026-02-12 09:31:12 CST::@:[794791]: LOG: redirecting log output to logging collector process
2026-02-12 09:31:12 CST::@:[794791]: HINT: Future log output will appear in directory "/data/pg_log".
2026-02-12 09:31:37 CST::@:[801049]: FATAL: lock file "postmaster.pid" already exists
2026-02-12 09:31:37 CST::@:[801049]: HINT: Is another postmaster (PID 794791) running in data directory "/data"?
2026-02-12 09:32:34 CST::@:[814396]: FATAL: lock file "postmaster.pid" already exists
2026-02-12 09:32:34 CST::@:[814396]: HINT: Is another postmaster (PID 794791) running in data directory "/data"?

The instance eventually came up only because the DBA ran ipcrm -m xxx before starting it.

That resolved the incident quickly, but left many questions:

  • Why is this scenario fairly rare in practice?
  • The startup errors in start.log fall into two categories; what operations and logic does each correspond to?
  • Can the shared memory still exist when even the postmaster (PM) is gone?
  • How do we locate and clean up this shared memory segment?
  • PostgreSQL has several shared memory segments; which one is this?
  • Besides ipcrm -m, are there other ways to get the database started?

Error analysis: pre-existing shared memory block

Three shared memory segments

Normally, a running PostgreSQL instance has three shared memory segments.

Take the default configuration, shared_memory_type='mmap' without huge pages, as an example:

# Inspect the shared memory PG actually uses via its mapped virtual memory
cat /proc/`head -1 $PGDATA/postmaster.pid`/smaps | grep -E "\-s"
2b61b0563000-2b61b0564000 rw-s 00000000 00:04 116293664 /SYSV00001000 (deleted)
2b61b057f000-2b61b05b3000 rw-s 00000000 00:12 1501001168 /dev/shm/PostgreSQL.1193490778
2b61bbac2000-2b61fa67a000 rw-s 00000000 00:04 1500999610 /dev/zero (deleted)

As shown above, from top to bottom these are: the SysV shared memory used at startup, the shared memory used by parallel query, and the shared memory used by shared_buffers.

If shared_buffers uses huge pages, or shared_memory_type is sysv instead of mmap, the output differs slightly.

With huge pages:

2aaaaac00000-2aba9ca00000 rw-s 00000000 00:0e 48453452 /anon_hugepage (deleted)
2b08f2eea000-2b08f2eeb000 rw-s 00000000 00:04 50692152 /SYSV00001000 (deleted)
2b08f2f05000-2b08f302d000 rw-s 00000000 00:12 48436142 /dev/shm/PostgreSQL.1345689218

shared_memory_type = 'sysv':

2b03b3ceb000-2b03b3d1f000 rw-s 00000000 00:12 1572332304 /dev/shm/PostgreSQL.2883611352
2b03bf0c2000-2b03fdc7a000 rw-s 00000000 00:04 143917075 /SYSV00001000 (deleted)

Summarized:

PG shared memory configuration          | smaps segments | shared_buffers in smaps | sysv in smaps
shared_memory_type=mmap, no huge pages  | 3 segments     | /dev/zero               | /SYSV00001000
shared_memory_type=sysv, no huge pages  | 2 segments     | /SYSV00001000           | /SYSV00001000
shared_memory_type=mmap, huge pages     | 3 segments     | /anon_hugepage          | /SYSV00001000
shared_memory_type=sysv, huge pages     | not supported  | not supported           | not supported

Now the question: which segment does the "pre-existing shared memory block" error refer to?

Source code analysis

Searching the source for the error message quickly locates the key code: src/backend/port/sysv_shmem.c

First, what is the SysV shmem for? The following is excerpted from scattered comments in the source:

We still require a SysV shmem block to
 * exist, though, because mmap'd shmem provides no way to find out how
 * many processes are attached, which we need for interlocking purposes.
 
 * As of PostgreSQL 9.3, we normally allocate only a very small amount of
 * System V shared memory, and only for the purposes of providing an
 * interlock to protect the data directory.  The real shared memory block
 * is allocated using mmap().  This works around the problem that many
 * systems have very low limits on the amount of System V shared memory
 * that can be allocated.  Even a limit of a few megabytes will be enough
 * to run many copies of PostgreSQL without needing to adjust system settings.
  • SysV shmem can tell whether the shared memory is still attached to any process; mmap cannot do this
  • This SysV segment exists to protect the data directory; shared_buffers uses mmap (by default), not SysV
  • This SysV segment is tiny (the virtual address range 2b61b0563000-2b61b0564000 shows a 4 KB allocation)

Now look at the shared memory state enum:

typedef enum
{
    SHMSTATE_ANALYSIS_FAILURE,  /* unexpected failure to analyze the ID */
    SHMSTATE_ATTACHED,          /* pertinent to DataDir, has attached PIDs */
    SHMSTATE_ENOENT,            /* no segment of that ID */
    SHMSTATE_FOREIGN,           /* exists, but not pertinent to DataDir */
    SHMSTATE_UNATTACHED         /* pertinent to DataDir, no attached PIDs */
} IpcMemoryState;

We mainly care about ATTACHED, FOREIGN, and UNATTACHED.

The SysV shmem protects the data directory; the typical goal is to guarantee that two instances never run on the same directory. Since it is just a shared memory segment, for assorted odd reasons it may not belong to this directory or process at all, in which case it is FOREIGN. If the segment maps to this data directory but no process is attached, it is UNATTACHED; if a process is attached, it is ATTACHED.
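To make these states concrete, here is a minimal sketch, not PostgreSQL code: it classifies a segment only by its attach count via shmctl(IPC_STAT), omitting PG's FOREIGN check against the data directory's device/inode stored in the segment header; the shmid value is a hypothetical example.

#include <stdio.h>
#include <errno.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Rough analogue of IpcMemoryState for a given shmid. */
static const char *classify(int shmid)
{
    struct shmid_ds ds;

    if (shmctl(shmid, IPC_STAT, &ds) < 0)
        return (errno == EINVAL || errno == EIDRM)
            ? "ENOENT (no segment with that ID)"
            : "FOREIGN (exists, but we cannot inspect it)";
    /* PG also attaches and compares the header's device/inode with DataDir
     * to detect FOREIGN segments; that part is omitted here. */
    return (ds.shm_nattch > 0)
        ? "ATTACHED (still in use -> startup error)"
        : "UNATTACHED (safe to remove and recreate)";
}

int main(void)
{
    int shmid = 143917078;  /* hypothetical shmid, e.g. from line 7 of postmaster.pid */

    printf("shmid %d: %s\n", shmid, classify(shmid));
    return 0;
}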

With that, look at the error thrown by PGSharedMemoryCreate():

PGShmemHeader *
PGSharedMemoryCreate(Size size, PGShmemHeader **shim)
{
    ...
    for (;;)    /* infinite loop */
    {
        ...
        /* shmget looks up the shmem segment and returns its shmid */
        shmid = shmget(NextShmemSegID, sizeof(PGShmemHeader), 0);
        if (shmid < 0)
        {
            oldhdr = NULL;
            state = SHMSTATE_FOREIGN;
        }
        else
            /* determine the state of this shmem segment */
            state = PGSharedMemoryAttach(shmid, NULL, &oldhdr);

        switch (state)  /* act on the segment's state */
        {
            ...         /* only two cases shown: attached and unattached */
            case SHMSTATE_ATTACHED:
                /* still attached: raise the error seen in the incident */
                ereport(FATAL,
                        (errcode(ERRCODE_LOCK_FILE_EXISTS),
                         errmsg("pre-existing shared memory block (key %lu, ID %lu) is still in use",
                                (unsigned long) NextShmemSegID,
                                (unsigned long) shmid),
                         errhint("Terminate any old server processes associated with data directory \"%s\".",
                                 DataDir)));
                break;
            ...
            case SHMSTATE_UNATTACHED:

                /*
                 * The segment pertains to DataDir, and every process that had
                 * used it has died or detached. Zap it, if possible, and any
                 * associated dynamic shared memory segments, as well. This
                 * shouldn't fail, but if it does, assume the segment belongs
                 * to someone else after all, and try the next candidate.
                 * Otherwise, try again to create the segment. That may fail
                 * if some other process creates the same shmem key before we
                 * do, in which case we'll try the next key.
                 */
                /* the segment belongs to DataDir and no process holds it */
                if (oldhdr->dsm_control != 0)
                    dsm_cleanup_using_control_segment(oldhdr->dsm_control);
                if (shmctl(shmid, IPC_RMID, NULL) < 0)
                    NextShmemSegID++;   /* note: the shmem seg ID increments and we loop */
                break;
        }
        ...
    }
    ...
}

As you can see, the error is raised when the shmem is still attached. If nothing is attached, the loop keeps going: it cleans up the segment, and bumps the shmem seg ID to request a fresh one when removal fails.

  • The first case corresponds to this incident
  • The second case is why a crashed instance can normally be started again

sysv shmem

PostgreSQL 10 heavily reworked the postmaster.pid and sysv_shmem logic, and it has barely changed since; this article only covers the post-10 logic.

pidfile.h:

#define LOCK_FILE_LINE_SHMEM_KEY 7

sysv_shmem.c, InternalIpcMemoryCreate():

{
    char        line[64];

    sprintf(line, "%9lu %9lu",
            (unsigned long) memKey, (unsigned long) shmid);
    AddToDataDirLockFile(LOCK_FILE_LINE_SHMEM_KEY, line);
}

The source shows that the shmem information lives on line 7 of postmaster.pid: the shmkey, then the shmid.

> cat postmaster.pid
242712
/data
1772698474
8531
/tmp
0.0.0.0
4096 143917078    # <----here
ready

What are shmkey and shmid?

The PostgreSQL source calls it like this, in InternalIpcMemoryCreate():

shmid = shmget(memKey, 0, IPC_CREAT | IPC_EXCL | IPCProtection);

PostgreSQL uses the shmkey/memKey as a seed key to request shmem from the kernel, which returns the unique identifier shmid.

The shmid depends heavily on the state of the server, or rather of its memory. For PG, a quick instance restart may produce the same shmid or shmid+1, which is down to Linux kernel mechanics; after a server reboot, it will be entirely different.

One way to internalize this: whether or not the server reboots, the shmkey/memKey can stay a fixed value, since it is caller input (i.e., chosen by PG); but across a server reboot, passing the same shmkey is unlikely to yield the same shmid.
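A minimal sketch of that key-to-id relationship, not PostgreSQL code; the toy key 0x1000 is an assumption (though note 0x1000 is 4096, which is also why the smaps entries above are named /SYSV00001000: the name encodes the key in hex). Run it twice across a reboot: the key you print is fixed, the shmid generally is not.

#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void)
{
    key_t key = 0x1000;     /* fixed, caller-chosen seed key */
    int shmid = shmget(key, 64, IPC_CREAT | 0600);  /* kernel assigns the shmid */

    if (shmid < 0)
    {
        perror("shmget");
        return 1;
    }
    printf("key=0x%lx -> shmid=%d\n", (unsigned long) key, shmid);
    shmctl(shmid, IPC_RMID, NULL);  /* remove the demo segment */
    return 0;
}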

How PG obtains the shmkey

PGSharedMemoryCreate():

/*
 * We use the data directory's ID info (inode and device numbers) to
 * positively identify shmem segments associated with this data dir, and
 * also as seeds for searching for a free shmem key.
 */
if (stat(DataDir, &statbuf) < 0)
    ereport(FATAL,
            (errcode_for_file_access(),
             errmsg("could not stat data directory \"%s\": %m",
                    DataDir)));
...

/*
 * Loop till we find a free IPC key. Trust CreateDataDirLockFile() to
 * ensure no more than one postmaster per data directory can enter this
 * loop simultaneously. (CreateDataDirLockFile() does not entirely ensure
 * that, but prefer fixing it over coping here.)
 */
NextShmemSegID = statbuf.st_ino;

for (;;)
{
    IpcMemoryId shmid;
    PGShmemHeader *oldhdr;
    IpcMemoryState state;

    /* Try to create new segment */
    memAddress = InternalIpcMemoryCreate(NextShmemSegID, sysvsize);
    if (memAddress)
        break;              /* successful create and attach */

    /* Check shared memory and possibly remove and recreate */

    /*
     * shmget() failure is typically EACCES, hence SHMSTATE_FOREIGN.
     * ENOENT, a narrow possibility, implies SHMSTATE_ENOENT, but one can
     * safely treat SHMSTATE_ENOENT like SHMSTATE_FOREIGN.
     */
    shmid = shmget(NextShmemSegID, sizeof(PGShmemHeader), 0);

PG stats the data directory, which yields (among other things) its inode, and uses the datadir inode directly as the shmkey.

In PG the shmem key is strongly tied to the datadir inode: in the normal case, shmem key = datadir inode.

Verification:

> ls -id $PGDATA
4096 /lzlcloud/pg8574/data
> cat postmaster.pid |head -7|tail -1
4096 143917090

As shown, datadir inode = shmkey = 4096.
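The derivation can be sketched in a few lines, not PostgreSQL code; the /data path is a hypothetical stand-in for $PGDATA:

#include <stdio.h>
#include <sys/stat.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void)
{
    const char *datadir = "/data";  /* hypothetical stand-in for $PGDATA */
    struct stat statbuf;

    if (stat(datadir, &statbuf) < 0)
    {
        perror("stat");
        return 1;
    }
    key_t key = (key_t) statbuf.st_ino;     /* the inode number becomes the shmkey */
    int shmid = shmget(key, 64, IPC_CREAT | 0600);

    if (shmid < 0)
    {
        perror("shmget");
        return 1;
    }
    printf("inode=%lu -> key=%lu -> shmid=%d\n",
           (unsigned long) statbuf.st_ino, (unsigned long) key, shmid);
    shmctl(shmid, IPC_RMID, NULL);          /* remove the demo segment */
    return 0;
}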

PG's shmkey in a cloud environment

We just said that normally shmkey = datadir inode; in our cloud environment that is mostly not the case.

Our cloud environment:

> ls -id /lzlcloud/pg8298/data
4096 /lzlcloud/pg8298/data
> ls -id /lzlcloud/pg8388/data
4096 /lzlcloud/pg8388/data
> ls -id /lzlcloud/pg8095/data
4096 /lzlcloud/pg8095/data
> cat /lzlcloud/pg8298/data/postmaster.pid|head -7|tail -1
4096 971833391
> cat /lzlcloud/pg8388/data/postmaster.pid|head -7|tail -1
4097 62128161
> cat /lzlcloud/pg8095/data/postmaster.pid|head -7|tail -1
4098 143163441

Every data directory's inode is 4096, yet the shmkeys are 4096, 4097, and 4098.

why?

The inode side of this is a property of the disk's filesystem:

  • Each filesystem has its own independent inode space
  • Filesystems reserve some inodes; the first few are unusable. With our mount layout, real inodes on the data disk start at 4096

In other words, datadir inode = 4096 is simply the default behavior of how disks are mounted in our cloud environment. Other environments may differ; we have not dug deeper. Still, with the same filesystem type mounted the same way for the PG datadir, equal inode numbers remain quite possible.

The shmkey side of this comes from the PG source, PGSharedMemoryCreate():

NextShmemSegID = statbuf.st_ino;

for (;;)
{
    ...
    shmid = shmget(NextShmemSegID, sizeof(PGShmemHeader), 0);
    ...
    switch (state)
    {
        case SHMSTATE_FOREIGN:
            NextShmemSegID++;
            break;

The shmkey starts out as the datadir inode, but since the segment obtained may be FOREIGN, the key is incremented and the request retried.

For example, the instance whose postmaster.pid shows shmkey=4097 began startup with shmkey=4096, found that segment's shmid in use by another instance (the PG instance holding shmkey=4096), and bumped the key to obtain a different shmem segment.

Likewise, the shmkey=4098 instance had to increment twice before finding a free key and its shmid.
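A minimal sketch of the probing idea, not PostgreSQL's exact logic (which also removes unattached segments along the way); the seed 4096 matches our environment's datadir inode:

#include <stdio.h>
#include <errno.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void)
{
    key_t key = 4096;   /* seed key: the datadir inode in our environment */
    int shmid;

    /* IPC_EXCL makes shmget fail with EEXIST when the key is already taken;
     * that is where PG would inspect the old segment and, if it is FOREIGN
     * or cannot be removed, bump the key and try again. */
    while ((shmid = shmget(key, 64, IPC_CREAT | IPC_EXCL | 0600)) < 0)
    {
        if (errno != EEXIST)
        {
            perror("shmget");
            return 1;
        }
        key++;          /* key occupied: probe the next one */
    }
    printf("created shmid=%d at key=%lu\n", shmid, (unsigned long) key);
    shmctl(shmid, IPC_RMID, NULL);  /* remove the demo segment */
    return 0;
}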

Correlating the shmid

The SysV shmid can be found in all of: the startup error log, line 7 of postmaster.pid, and the virtual memory mappings in smaps. It can be inspected with the SysV shared memory command ipcs and removed with ipcrm.

Example (note shmid=143917078 throughout):

Startup error log:

pg_ctl: another server might be running; trying to start server anyway
waiting for server to start....2026-03-05 16:02:19 CST::@:[262388]: FATAL: pre-existing shared memory block (key 4096, ID 143917078) is still in use

Line 7 of postmaster.pid:

> cat postmaster.pid |head -7|tail -1
4096 143917078

The smaps virtual memory mapping:

cat /proc/`head -1 $PGDATA/postmaster.pid`/smaps | grep -E "\-s"
2ad2b5189000-2ad2b518a000 rw-s 00000000 00:04 143917078 /SYSV00001000 (deleted)

Inspecting and cleaning up by SysV shmid:

ipcs -m -i 143917078    # cleanup: ipcrm -m <shmid>

Shared memory Segment shmid=143917078
uid=6001  gid=6001  cuid=6001  cgid=6001
mode=0600  access_perms=0600
bytes=56  lpid=242712  cpid=242712  nattch=10
att_time=Thu Mar 5 16:14:51 2026
det_time=Thu Mar 5 16:14:49 2026
change_time=Thu Mar 5 16:14:34 2026

Tests

Reproducing the production issue

Hold one backend process so it never exits, then kill -9 the PM:

> cat postmaster.pid
4096 143917076
> ipcs -m -i 143917076    # shmem id
Shared memory Segment shmid=143917076
uid=6001  gid=6001  cuid=6001  cgid=6001
mode=0600  access_perms=0600
bytes=56  lpid=241567  cpid=64757  nattch=23
> kill -stop 107648       # any one backend
> kill -9 64757           # the postmaster (or others)
> ipcs -m -i 143917076
Shared memory Segment shmid=143917076
uid=6001  gid=6001  cuid=6001  cgid=6001
mode=0600  access_perms=0600
bytes=56  lpid=252283  cpid=64757  nattch=1    # nattch != 0
> pg_ctl start -D $PGDATA
pg_ctl: another server might be running; trying to start server anyway
waiting for server to start....2026-03-05 16:02:19 CST::@:[262388]: FATAL: pre-existing shared memory block (key 4096, ID 143917076) is still in use
2026-03-05 16:02:19 CST::@:[262388]: HINT: Terminate any old server processes associated with data directory "/data".
 stopped waiting
pg_ctl: could not start server

With nattch=1, the instance cannot start.

Instance crash, then a normal startup

This is simply killing the instance and starting it again:

> cat postmaster.pid
4096 143917077
> ipcs -m -i 143917077    # shmem id
Shared memory Segment shmid=143917077
uid=6001  gid=6001  cuid=6001  cgid=6001
mode=0600  access_perms=0600
bytes=56  lpid=154800  cpid=134329  nattch=18
> kill -9 134329          # the postmaster (or others)
> cat postmaster.pid
4096 143917077
> ipcs -m -i 143917077    # shmem id unchanged; the segment still exists
Shared memory Segment shmid=143917077
uid=6001  gid=6001  cuid=6001  cgid=6001
mode=0600  access_perms=0600
bytes=56  lpid=169360  cpid=134329  nattch=0    # nattch=0
> pg_ctl start -D $PGDATA # startup succeeds
pg_ctl: another server might be running; trying to start server anyway
waiting for server to start....2026-03-05 16:14:34 CST::@:[242712]: LOG: redirecting log output to logging collector process
2026-03-05 16:14:34 CST::@:[242712]: HINT: Future log output will appear in directory "/data/pg_log".
 done
server started
> ipcs -m -i 143917077    # the leftover shmem was cleaned up at startup
ipcs: id 143917077 not found
> ipcs -m -i 143917078    # the shmid was incremented at startup
Shared memory Segment shmid=143917078
uid=6001  gid=6001  cuid=6001  cgid=6001
mode=0600  access_perms=0600
bytes=56  lpid=273571  cpid=242712  nattch=26
> cat postmaster.pid      # shmkey unchanged, shmid+1
4096 143917078

A plain kill -9 followed by a start works: the leftover shmem is cleaned up during startup. The shmkey stays the same because inode=4096 and key 4096 was not occupied; the shmid incrementing by 1 is Linux kernel behavior, which at minimum shows a different segment is now in use.

Holding a file but not the shmem

Startup is tied to the datadir inode, and the inode to the shmid; fundamentally, startup checks whether the shmem is held by another process, not whether a file descriptor is. So here we test with the logger process, which holds file descriptors but no shared memory.

$ cat /proc/77300/smaps | grep -E "\-s"   # the logger process; confirm it uses no shared memory
$ kill -stop 77300       # stop the logger
$ kill -9 77076          # kill -9 the PM
$ cat postmaster.pid     # the file is still there
77076
/lzlcloud/pg8531/data
1772700343
8531
/tmp
0.0.0.0
4096 143917080
ready
$ ipcs -m -i 143917080   # the shared memory is still there
Shared memory Segment shmid=143917080
uid=6001  gid=6001  cuid=6001  cgid=6001
mode=0600  access_perms=0600
bytes=56  lpid=77319  cpid=77076  nattch=0
att_time=Thu Mar 5 17:27:11 2026
det_time=Thu Mar 5 17:27:15 2026
change_time=Thu Mar 5 16:45:43 2026
$ ps -ef|grep 77300      # the process is still there
postgres  77300      1  0 16:45 ?        00:00:00 postgresql: lzldb: logger
postgres 135246  46622  0 17:27 pts/1    00:00:00 grep --color=auto 77300
$ pg_ctl start -D $PGDATA # startup succeeds
pg_ctl: another server might be running; trying to start server anyway
waiting for server to start....2026-03-05 17:27:55 CST::@:[140497]: LOG: redirecting log output to logging collector process
2026-03-05 17:27:55 CST::@:[140497]: HINT: Future log output will appear in directory "/data/pg_log".
 done
server started

The logger holds files under the data directory but has no attachment to the shared memory, so it does not block startup.

Deleting postmaster.pid: startup still fails

The procedure is much like the one above: hold one backend process, kill -9 the PM, delete the postmaster.pid file, then start.

The session is omitted; the result is a failed startup with:

waiting for server to start....2026-03-06 15:29:48 CST::@:[22475]: FATAL: pre-existing shared memory block (key 4098, ID 171868173) is still in use
2026-03-06 15:29:48 CST::@:[22475]: HINT: Terminate any old server processes associated with data directory "/data".
2026-03-06 15:29:48 CST::@:[22475]: LOG: database system is shut down

Evidently, while a leftover process still holds the shmem, PG can find the shmid even after the postmaster.pid file that records it has been deleted.

Shutting down another instance to start the current one

PG checks two places to decide whether a shmid belongs to the current instance:

  1. the shmid obtained by using the datadir inode as the shmkey, possibly after shmkey++
  2. the shmid recorded in postmaster.pid

So even with postmaster.pid deleted outright, PG can still tell whether the shmem is held by another process. But we can exploit the datadir-inode and shmkey++ behavior to get it started.

Per the earlier analysis, every datadir inode in our cloud environment is 4096, and the shmkeys differ only because of the shmkey++ logic in the source. So we can start or stop another PG instance whose datadir inode is 4096, making the current instance do one more (or one fewer) shmkey++ at startup and land on a different shmid.

$ kill -stop 165245
$ kill -9 164411                        # stop the current instance, keeping one of its backends
$ pg_ctl stop -D /pg8531/data           # stop another instance
waiting for server to shut down.... done
server stopped
$ pg_ctl start -D /pg8574/data          # start the current instance: fails, postmaster.pid was not removed
pg_ctl: another server might be running; trying to start server anyway
waiting for server to start....2026-03-05 18:22:35 CST::@:[196209]: FATAL: pre-existing shared memory block (key 4097, ID 143917087) is still in use
2026-03-05 18:22:35 CST::@:[196209]: HINT: Terminate any old server processes associated with data directory "/pg8574/data".
 stopped waiting
pg_ctl: could not start server
Examine the log output.
$ mv /lzlcloud/pg8574/data/postmaster.pid{,.bak}  # remove the current instance's postmaster.pid
$ pg_ctl start -D /lzlcloud/pg8574/data # start the current instance again: succeeds
2026-03-05 18:23:09 CST::@:[207725]: LOG: redirecting log output to logging collector process
2026-03-05 18:23:09 CST::@:[207725]: HINT: Future log output will appear in directory "/lzlcloud/pg8574/data/pg_log".
 done
server started
$ ipcs -m -i 143917087                  # the old shmid's SysV segment is still held by us
Shared memory Segment shmid=143917087
uid=6001  gid=6001  cuid=6001  cgid=6001
mode=0600  access_perms=0600
bytes=56  lpid=196209  cpid=164411  nattch=1
att_time=Thu Mar 5 18:22:35 2026
det_time=Thu Mar 5 18:22:35 2026
change_time=Thu Mar 5 18:21:04 2026

It starts: the current instance allocated a different segment, and the previous one was never cleaned up. That is the shut-down-another-instance trick for cloud environments.

One small precondition: the other instance must not only share the current instance's inode, its shmkey must also be smaller than the current instance's.

Error analysis: lock file "postmaster.pid" already exists

This one is far simpler than the pre-existing shared memory case.

Startup itself checks the lock file and the PID recorded inside it, in CreateLockFile():

if (other_pid != my_pid && other_pid != my_p_pid &&
    other_pid != my_gp_pid)
{
    if (kill(other_pid, 0) == 0 ||
        (errno != ESRCH && errno != EPERM))
    {
        /* lockfile belongs to a live process */
        ereport(FATAL,
                (errcode(ERRCODE_LOCK_FILE_EXISTS),
                 errmsg("lock file \"%s\" already exists",
                        filename),
                 isDDLock ?
                 (encoded_pid < 0 ?
                  errhint("Is another postgres (PID %d) running in data directory \"%s\"?",
                          (int) other_pid, refName) :
                  errhint("Is another postmaster (PID %d) running in data directory \"%s\"?",
                          (int) other_pid, refName)) :
                 (encoded_pid < 0 ?
                  errhint("Is another postgres (PID %d) using socket file \"%s\"?",
                          (int) other_pid, refName) :
                  errhint("Is another postmaster (PID %d) using socket file \"%s\"?",
                          (int) other_pid, refName))));
    }
}
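The core of that check is the kill(pid, 0) liveness probe, sketched standalone below; this is not PostgreSQL code, and the PID is a hypothetical value read from a lock file:

#include <stdio.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>

int main(void)
{
    pid_t other_pid = 255500;   /* hypothetical PID read from postmaster.pid */

    /* kill(pid, 0) delivers no signal; it only checks existence/permission.
     * ESRCH: no such process (stale lock). EPERM: the PID exists but belongs
     * to another user, so it cannot be our old server. Anything else: treat
     * the lock as live, as CreateLockFile() does. */
    if (kill(other_pid, 0) == 0 || (errno != ESRCH && errno != EPERM))
        printf("PID %d looks alive: refuse to start\n", (int) other_pid);
    else
        printf("PID %d is gone: lock file is stale\n", (int) other_pid);
    return 0;
}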

Testing is even simpler: start the instance again while it is already running:

$ pg_ctl start -D /pg8531/data
pg_ctl: another server might be running; trying to start server anyway
waiting for server to start....2026-03-06 15:59:05 CST::@:[89145]: FATAL: lock file "postmaster.pid" already exists
2026-03-06 15:59:05 CST::@:[89145]: HINT: Is another postmaster (PID 255500) running in data directory "/pg8531/data"?
 stopped waiting
pg_ctl: could not start server
Examine the log output.

So the trailing errors in the incident's start.log simply mean the instance was already up and was started a few more times.

Summary

At startup, PG first allocates a SysV shmem segment (not the mmap segment backing shared_buffers) to lock the datadir. The lock is requested via shmget with the datadir's inode number as the shmkey, and the call returns the segment's unique identifier, the shmid. Because the requested segment may be in use by another process, PG loops with shmkey++ until it obtains an unoccupied segment. Line 7 of postmaster.pid stores the shmkey and shmid. In cloud environments you will often see PG instances with consecutive shmkeys: identically mounted data disks share the same inode, and shmkey++ does the rest.

If a PG instance is killed unexpectedly, its shmem is not cleaned up. In the normal case no leftover process still holds it, so the next startup removes the segment and proceeds; in the abnormal case a leftover process does hold it, startup fails, and manual intervention is needed.

Recommended remedies:

  1. ipcrm -m (most recommended)
  2. find the leftover processes with lsof and kill them
  3. reboot the host

Not recommended, but these also get the database started:

  1. mv postmaster.pid, plus shutting down another PG instance (whose shmkey is smaller than the current instance's)
  2. mv postmaster.pid, plus remounting the data disk so the inode changes

Finally, back to the opening questions:

  • Why is this scenario fairly rare in practice?

It takes an abnormal crash plus leftover processes that were never cleaned up. Often a crash leaves no such processes, and a normal startup just works.

  • The startup errors in start.log fall into two categories; what operations and logic does each correspond to?

The shared-memory-in-use error means an abnormal crash plus leftover processes; the postmaster.pid error means the instance was started repeatedly.

  • Can the shared memory still exist when even the postmaster (PM) is gone?

Yes. With the PM gone the shared memory can persist, since PG's other processes will not necessarily die on their own or be reaped by the OS. Even with every process gone, the crash test above showed the SysV segment lingering with nattch=0 until the next startup cleaned it up.

  • How do we locate and clean up this shared memory segment?

The shmid appears in the startup errors in start.log; ipcrm -m $shmid removes it.

  • PG has several shared memory segments; which one is this?

The SysV shmem, which protects the datadir and always exists; see the "Three shared memory segments" section. It is a different thing from the mmap-backed shared_buffers.

  • Can the corresponding shmem be found from an inode or a file?

Linux offers no user-space interface for finding a shmem segment from an inode or file (this statement is 100% AI-sourced, cross-checked across several models). PG uses the datadir inode as a seed shmkey when requesting the shmem; it does not truly look the segment up by inode. PG has its own search mechanism, and the correspondence is not absolute: shmkey++ is the compromise in the startup logic.
