A Small Experiment: Simulating a Crash of the PostgreSQL Main Process

Original · 瓶盖吃芥末 · 2025-11-03

Simulating a crash of the PostgreSQL main process
Possible approaches:

  • Memory exhaustion + OOM killer
  • File descriptor exhaustion
  • fork()/thread creation failure
  • Problems with the shared memory region
  • fsync errors or filesystem errors
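
Before trying any of these approaches, it can help to see how much headroom the corresponding resources currently have. The following is a minimal sketch, assuming a Linux host and a POSIX shell; the function name show_limits is made up for illustration and is not part of PostgreSQL.

```shell
# show_limits: print limits that some of the approaches above would exhaust.
# Assumption: run as the OS user that starts PostgreSQL, on Linux.
show_limits() {
    # Per-process cap on open file descriptors for this shell and its children.
    echo "open files (ulimit -n): $(ulimit -n)"
    # MemAvailable is only present on reasonably recent Linux kernels.
    if [ -r /proc/meminfo ]; then
        awk '/^MemAvailable:/ { print "memory available (kB): " $2 }' /proc/meminfo
    fi
}

show_limits
```

The values apply to the shell that runs the function; a server started by a service manager may run with different limits.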

Example: point pg_wal at a small tmpfs and fill it with WAL, so that PostgreSQL raises a PANIC and the postmaster process terminates.


Procedure

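  1. Stop the database
  2. Back up the original pg_wal directory (run from the data directory)
# as root, on a disposable test machine only: give postgres passwordless sudo
echo "postgres ALL=(ALL) NOPASSWD: ALL" >>/etc/sudoers
sudo mv pg_wal pg_wal_backup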
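  3. Create a small tmpfs and mount it
sudo mkdir -p /tmp/pg_wal_tmpfs
sudo mount -t tmpfs -o size=64M tmpfs /tmp/pg_wal_tmpfs
# chown after mounting, so it applies to the tmpfs root rather than the hidden mount point
sudo chown postgres:postgres /tmp/pg_wal_tmpfs/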

Verify that the mount succeeded:

df -h | grep tmpfs
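
For a check that can also be polled while the test runs, the same df data can be reduced to single numbers. This is a rough sketch in POSIX shell; the helper names fs_free_kb and fs_used_pct are hypothetical, and the parsing assumes the filesystem name contains no spaces.

```shell
# fs_free_kb DIR: print the free space, in KiB, of the filesystem holding DIR.
fs_free_kb() {
    # -P forces one line per filesystem; -k reports sizes in 1024-byte blocks.
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# fs_used_pct DIR: print the used percentage (without the % sign).
fs_used_pct() {
    df -Pk "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}
```

For example, `fs_used_pct /tmp/pg_wal_tmpfs` can be run repeatedly during the insert to watch the tmpfs fill up.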
  4. Replace the pg_wal directory with a symlink pointing to the tmpfs
sudo ln -s /tmp/pg_wal_tmpfs /var/lib/postgresql/15/main/pg_wal

Confirm that the symlink is in place:

ls -l /var/lib/postgresql/15/main/pg_wal
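
Copy the existing WAL segments into the tmpfs so that the server can find its last checkpoint record at startup:

cp -r /var/lib/postgresql/15/main/pg_wal_backup/* /tmp/pg_wal_tmpfs/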
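
The symlink check can also be scripted so that a mistake is caught before the server is started. A minimal sketch, assuming readlink is available; wal_link_ok is a hypothetical helper name, and the paths in the usage line are the ones used in the steps above.

```shell
# wal_link_ok DATADIR TARGET
# Succeed only if DATADIR/pg_wal is a symlink whose target is exactly TARGET.
wal_link_ok() {
    [ -L "$1/pg_wal" ] && [ "$(readlink "$1/pg_wal")" = "$2" ]
}

# Usage sketch:
#   wal_link_ok /var/lib/postgresql/15/main /tmp/pg_wal_tmpfs && echo "pg_wal points at the tmpfs"
```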
  5. Restart the PostgreSQL service
... restart postgresql
  6. Log in to PostgreSQL and run a write that fills the WAL directory
psql -U postgres -d postgres

Once connected, run:

DROP TABLE IF EXISTS bloater;
CREATE TABLE bloater(id serial primary key, content text);
INSERT INTO bloater(content)
SELECT repeat('x', 10000)
FROM generate_series(1, 500000);
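
To see why a single bulk insert is enough: WAL is stored in fixed-size segment files, 16 MB each by default, and the default max_wal_size of 1 GB is far larger than the tmpfs, so the disk fills long before WAL volume would trigger a checkpoint and free old segments. The arithmetic below is a rough sketch that ignores filesystem overhead; segments_that_fit is a hypothetical helper, and the 33 MB figure is the usage shown in the full session later in this article.

```shell
# segments_that_fit TOTAL_MB USED_MB SEG_MB
# Print how many more whole WAL segments fit in the remaining space.
segments_that_fit() {
    echo $(( ($1 - $2) / $3 ))
}

segments_that_fit 64 0 16    # empty 64 MB tmpfs: prints 4
segments_that_fit 64 33 16   # after copying 33 MB of existing WAL: prints 1
```

With only about one more segment of headroom, the server runs out of space almost as soon as the insert starts generating WAL.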
  7. Observe:
  • The insert slows down
  • Eventually PostgreSQL reports an error:
    PANIC:  could not write to log file ...
    
  • Or psql loses its connection and prints:
    server closed the connection unexpectedly
    

You can also check the server log to confirm the crash:

sudo grep PANIC .../log/postgresql/postgresql-15-main.log
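ps -ef | grep postmaster # if no postmaster shows up, the server has really crashed (on some installations the server process is listed as postgres instead)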
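
Counting the PANIC lines is a slightly more script-friendly variant of the grep above. This is a sketch only; count_panics is a hypothetical helper and the log path has to be supplied by the caller.

```shell
# count_panics LOGFILE: print the number of lines in LOGFILE that contain "PANIC:".
count_panics() {
    # grep -c prints 0 but exits non-zero when nothing matches, so mask the status.
    grep -c 'PANIC:' "$1" || true
}
```

A result greater than zero after the insert confirms that the server hit a PANIC rather than an ordinary error.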
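  8. Restore PostgreSQL to normal operation
    Stop the database, remove the symlink, and unmount the tmpfs (this discards any WAL written during the test, so do it only on a disposable instance)
sudo rm /var/lib/postgresql/15/main/pg_wal
sudo umount /tmp/pg_wal_tmpfs
sudo rmdir /tmp/pg_wal_tmpfs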

Restore the original pg_wal directory:

sudo mv /var/lib/postgresql/15/main/pg_wal_backup /var/lib/postgresql/15/main/pg_wal
  9. Start PostgreSQL again
... start postgresql
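
After the restart, pg_isready (shipped with PostgreSQL) reports whether the server accepts connections again, and its exit status can be turned into a readable message. A minimal sketch; describe_ready_status is a hypothetical helper, and the usage line assumes pg_isready is on PATH with default connection settings.

```shell
# describe_ready_status CODE: translate a pg_isready exit status into text.
describe_ready_status() {
    case "$1" in
        0) echo "accepting connections" ;;
        1) echo "rejecting connections (for example, during startup)" ;;
        2) echo "no response" ;;
        3) echo "no attempt made (for example, invalid parameters)" ;;
        *) echo "unknown status $1" ;;
    esac
}

# Usage sketch:
#   pg_isready -q; describe_ready_status $?
```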

Full session:

[root@jerometestbase pg_data]# echo "postgres   ALL=(ALL)   NOPASSWD: ALL" >>/etc/sudoers
[root@jerometestbase pg_data]# su - postgres
[postgres@jerometestbase ~]$ pg_ctl stop
[postgres@jerometestbase pg_data]$ mv pg_wal/ pg_wal_bk
[postgres@jerometestbase pg_data]$  mkdir -p /tmp/pg_wal_tmpfs
[postgres@jerometestbase pg_data]$ sudo mount -t tmpfs -o size=64M tmpfs /tmp/pg_wal_tmpfs
[postgres@jerometestbase pg_data]$ sudo chown postgres.postgres /tmp/pg_wal_tmpfs/
[postgres@jerometestbase pg_data]$ ln -s /tmp/pg_wal_tmpfs /data/pg/pg_data/pg_wal
[postgres@jerometestbase pg_data]$ cp -r /data/pg/pg_data/pg_wal_bk/*  /tmp/pg_wal_tmpfs/
[postgres@jerometestbase pg_data]$ ll /tmp/pg_wal_tmpfs/
total 33M
[postgres@jerometestbase pg_data]$ pg_ctl start

postgres=# DROP TABLE IF EXISTS bloater;
NOTICE:  table "bloater" does not exist, skipping
DROP TABLE
postgres=# CREATE TABLE bloater(id serial primary key, content text);
INSERT INTO bloater(content)
SELECT repeat('x', 10000)
FROM generate_series(1, 500000);

HINT:  In a moment you should be able to reconnect to the database and repeat your command.
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!?> ^C
!?> \q


-- Finally, check the log
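[Copyright notice] This article is original content by a 墨天轮 user. Reposts must cite the source (墨天轮), the article link, and the author; otherwise the author and 墨天轮 reserve the right to pursue liability. To report suspected plagiarism or infringement on 墨天轮, send an email with supporting evidence to contact@modb.pro; once verified, 墨天轮 will remove the content immediately.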