以下生产案例,将表名等都做了替换:
业务反应sql执行慢,查看执行计划为全表扫描,count表行数很少,表膨胀比较大所以比较慢
查看表自动vacuum已经做了,但是还是有很多死元组
testdb=# select * from pg_stat_all_tables where relname='test';
-[ RECORD 1 ]-------+------------------------------
relid | 23620
schemaname | sch1
relname | test
seq_scan | 69510137
seq_tup_read | 8810306691
idx_scan | 3883729
idx_tup_fetch | 3883855
n_tup_ins | 1
n_tup_upd | 36157173
n_tup_del | 0
n_tup_hot_upd | 3375745
n_live_tup | 127
n_dead_tup | 3298921
n_mod_since_analyze | 0
last_vacuum |
last_autovacuum | 2024-09-03 15:16:36.754106+08
last_analyze |
last_autoanalyze | 2024-09-03 15:15:12.431527+08
vacuum_count | 0
autovacuum_count | 141814
analyze_count | 0
autoanalyze_count | 131518手动vacuum也未能清理死元组
testdb=# vacuum verbose sch1.test;
INFO: vacuuming "sch1.test"
INFO: "test": found 0 removable, 3299567 nonremovable row versions in 101152 out of 101155 pages
DETAIL: 3299177 dead row versions cannot be removed yet, oldest xmin: 2539223913
There were 6095 unused item identifiers.
Skipped 3 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.58 s, system: 0.00 s, elapsed: 0.93 s.
VACUUM检查是否是长事务引起,发现备库有几个idle in transaction状态会话已经持续了21天
postgres=# select usename,pid,client_addr,state,wait_event,now()::timestamp(0)-query_start::timestamp(0) exec_time,SUBSTR(query,1,80) as query from pg_stat_activity where state <> 'idle' and pid <> pg_backend_pid() order by query_start;
usename | pid | client_addr | state | wait_event | exec_time | query
-----------+-------+---------------+---------------------+------------+------------------+----------------------------------------------------------------------------------
xx_userxx | 28411 | xxx.xxx.xx.xx | idle in transaction | ClientRead | 21 days 04:12:46 | /* dynamic native SQL query */ select t.xxxxxxxx from xxxxxxxx t where t.xxxxxx
xx_userxx | 18125 | xxx.xxx.xx.xx | idle in transaction | ClientRead | 21 days 04:03:32 | /* dynamic native SQL query */ select t.xxxxxxxx from xxxxxxxx t where t.xxxxxx
xx_userxx | 33239 | xxx.xxx.xx.xx | idle in transaction | ClientRead | 21 days 04:01:26 | /* dynamic native SQL query */ select t.xxxxxxxx from xxxxxxxx t where t.xxxxxx
xx_userxx | 25032 | xxx.xxx.xx.xx | idle in transaction | ClientRead | 20 days 23:53:40 | /* dynamic native SQL query */ select t.xxxxxxxx from xxxxxxxx t where t.xxxxxx
(4 rows)把上面查出来的几个会话杀掉,再次手动执行vacuum,可以看到死元组被清理了
testdb=# vacuum verbose sch1.test;
INFO: vacuuming "sch1.test"
INFO: scanned index "pk_test" to remove 65699 row versions
DETAIL: CPU: user: 0.45 s, system: 0.01 s, elapsed: 0.76 s
INFO: "test": removed 65699 row versions in 2145 pages
DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO: index "pk_test" now contains 2742283 row versions in 19284 pages
DETAIL: 65699 index row versions were removed.
430 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: "test": found 89 removable, 3229345 nonremovable row versions in 101164 out of 101165 pages
DETAIL: 3228979 dead row versions cannot be removed yet, oldest xmin: 2539442736
There were 10985 unused item identifiers.
Skipped 1 page due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 1.08 s, system: 0.02 s, elapsed: 1.58 s.
VACUUM
testdb=# vacuum verbose sch1.test;
INFO: vacuuming "sch1.test"
INFO: scanned index "pk_test" to remove 2518051 row versions
DETAIL: CPU: user: 1.78 s, system: 0.02 s, elapsed: 1.82 s
INFO: "test": removed 2518051 row versions in 90577 pages
DETAIL: CPU: user: 0.28 s, system: 0.00 s, elapsed: 0.29 s
INFO: index "pk_test" now contains 224239 row versions in 19284 pages
DETAIL: 2518051 index row versions were removed.
12739 index pages have been deleted, 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO: "test": found 7 removable, 715 nonremovable row versions in 90600 out of 101165 pages
DETAIL: 324 dead row versions cannot be removed yet, oldest xmin: 2566902560
There were 437287 unused item identifiers.
Skipped 8426 pages due to buffer pins, 2139 frozen pages.
0 pages are entirely empty.
CPU: user: 2.22 s, system: 0.05 s, elapsed: 2.30 s.
VACUUM如何避免这个问题的再次发生,设置idle_in_transaction_session_timeout参数,idle_in_transaction状态会话超过时长自动被杀掉
postgres=# show idle_in_transaction_session_timeout;
idle_in_transaction_session_timeout
-------------------------------------
0
(1 row)
postgres=# alter system set idle_in_transaction_session_timeout='12h'; --idle_in_transaction状态超过时长12小时自动被杀掉
ALTER SYSTEM
postgres=# select pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)
postgres=# show idle_in_transaction_session_timeout;
idle_in_transaction_session_timeout
-------------------------------------
12h
(1 row)「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




