作者
digoal
日期
2019-07-20
标签
PostgreSQL , 膨胀点
背景
PostgreSQL 11以前的版本,默认存储引擎为HEAP,删除更新操作产生的垃圾版本在数据文件内,所有有autovacuum或vacuum来清理垃圾,但是什么垃圾能清理,什么垃圾不能清理有一定的讲究,如下分析:
《膨胀点解释 - 全局catalog,库级catalog,普通表,wal文件 - 哪些垃圾(dead tuple), wal文件不能被回收reuse - 什么情况下可能膨胀》
如上文章介绍,其中最重要的一个是GetOldestXmin:系统中存在的最老事务,甭管你是2pc还是,空闲中的事务,又或者是在执行中的SQL,也甭管你的事务的隔离级别,总之它只看最老的。
那么问题来了:
1、对于一个最常用的隔离级别rc,在这种事务中的sql,sql快照实际是指的是sql发起时的状态,sql发起之前已提交的事务所产生的垃圾版本对于这个sql来说已经不需要看到了。
但是GetOldestXmin可不管你这么多,总之它只看事务启动后获得的第一个快照,在这个快照之后产生的垃圾tuple都不会被清理。所以一个曾经获得过事务快照的空闲事务,就会变得比较危险。
危险是什么?请看前面的文章分析。
ps: 空闲中的只读事务不影响,因为它没有backend_xid, backend_xmin。
2、已经prepare transaction 但是未commit或rollback 的2pc事务,也同样危险,问题类似,GetOldestXmin里面也包含了2pc开启时的事务快照。所以迟迟不结束的2pc也会有影响。
但是实际上prepare transaction后,这个事务实际上也是处于类似空闲的状态。
影响例子
first, 空闲中事务
```
postgres=# begin;
BEGIN
postgres=# select 1;
?column?
1
(1 row)
postgres=# select pg_backend_pid();
pg_backend_pid
38295
(1 row)
```
postgres=# select * from pg_stat_activity where pid=38295;
-[ RECORD 1 ]----+------------------------------
datid | 13591
datname | postgres
pid | 38295
usesysid | 10
usename | postgres
application_name | psql
client_addr |
client_hostname |
client_port | -1
backend_start | 2019-07-20 08:59:56.960177+08
xact_start | 2019-07-20 10:15:33.683904+08
query_start | 2019-07-20 10:16:00.818965+08
state_change | 2019-07-20 10:16:00.819302+08
wait_event_type | Client
wait_event | ClientRead
state | idle in transaction
backend_xid |
backend_xmin |
query | select pg_backend_pid();
backend_type | client backend
```
postgres=# select txid_current();
txid_current
800
(1 row)
```
```
-[ RECORD 1 ]----+------------------------------
datid | 13591
datname | postgres
pid | 38295
usesysid | 10
usename | postgres
application_name | psql
client_addr |
client_hostname |
client_port | -1
backend_start | 2019-07-20 08:59:56.960177+08
xact_start | 2019-07-20 10:15:33.683904+08
query_start | 2019-07-20 10:16:28.371439+08
state_change | 2019-07-20 10:16:28.371717+08
wait_event_type | Client
wait_event | ClientRead
state | idle in transaction
backend_xid | 800
backend_xmin |
query | select txid_current();
backend_type | client backend
postgres=# create table tmp1(id int);
CREATE TABLE
postgres=# insert into tmp1 values (1);
INSERT 0 1
postgres=# delete from tmp1;
DELETE 1
postgres=# vacuum verbose tmp1;
psql: INFO: vacuuming "public.tmp1"
psql: INFO: "tmp1": found 0 removable, 1 nonremovable row versions in 1 out of 1 pages
DETAIL: 1 dead row versions cannot be removed yet, oldest xmin: 800
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM
```
postgres=# end;
COMMIT
postgres=# vacuum verbose tmp1;
psql: INFO: vacuuming "public.tmp1"
psql: INFO: "tmp1": removed 1 row versions in 1 pages
psql: INFO: "tmp1": found 1 removable, 0 nonremovable row versions in 1 out of 1 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 804
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
psql: INFO: "tmp1": truncated 1 to 0 pages
DETAIL: CPU: user: 0.09 s, system: 0.00 s, elapsed: 0.09 s
VACUUM
second,未结束2pc
```
postgres=# begin;
BEGIN
postgres=# select txid_current();
txid_current
806
(1 row)
postgres=# prepare transaction 'a';
PREPARE TRANSACTION
```
postgres=# truncate tmp1;
TRUNCATE TABLE
postgres=# insert into tmp1 values (1);
INSERT 0 1
postgres=# delete from tmp1;
DELETE 1
postgres=# vacuum verbose tmp1;
psql: INFO: vacuuming "public.tmp1"
psql: INFO: "tmp1": found 0 removable, 1 nonremovable row versions in 1 out of 1 pages
DETAIL: 1 dead row versions cannot be removed yet, oldest xmin: 806
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
VACUUM
postgres=# commit prepared 'a';
COMMIT PREPARED
postgres=# vacuum verbose tmp1;
psql: INFO: vacuuming "public.tmp1"
psql: INFO: "tmp1": removed 1 row versions in 1 pages
psql: INFO: "tmp1": found 1 removable, 0 nonremovable row versions in 1 out of 1 pages
DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 810
There were 0 unused item identifiers.
Skipped 0 pages due to buffer pins, 0 frozen pages.
0 pages are entirely empty.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
psql: INFO: "tmp1": truncated 1 to 0 pages
DETAIL: CPU: user: 0.09 s, system: 0.00 s, elapsed: 0.09 s
VACUUM
如何优化?
1 内核优化
1、简单一点,GetOldestXmin时,遇到空闲事务、2pc未结束事务,这类事务的oldestxmin使用当前系统最小未分配事务号代替(因为确确实实他们如果再发起请求的话,它们的快照是这样的)。
```
postgres=# select * from txid_current_snapshot();
txid_current_snapshot
811:811:
(1 row)
```
优化后,2pc和空闲中事务对膨胀就没有影响了。
2 设置参数优化
1、快照过旧,影响所有的慢请求(包括慢sql,长事务,2pc等)。
```
postgres=# show old_snapshot_threshold ;
old_snapshot_threshold
-1
(1 row)
参数
old_snapshot_threshold = -1 # 1min-60d; -1 disables; 0 is immediate
# (change requires restart)
```
2、事务空闲超时,只影响空闲事务。不影响其他包括2pc.
postgres=# set idle_in_transaction_session_timeout ='1s';
SET
postgres=# begin;
BEGIN
postgres=# select txid_current();
psql: FATAL: terminating connection due to idle-in-transaction timeout
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
```
postgres=# set idle_in_transaction_session_timeout ='10s';
SET
postgres=# begin;
BEGIN
postgres=# select txid_current();
txid_current
812
(1 row)
postgres=# prepare transaction 'a';
PREPARE TRANSACTION
postgres=# select * from pg_prepared_xacts ;
transaction | gid | prepared | owner | database
-------------+-----+-------------------------------+----------+----------
812 | a | 2019-07-20 10:31:16.706549+08 | postgres | postgres
(1 row)
10秒后,可以提交.
postgres=# commit prepared 'a';
COMMIT PREPARED
```
小节
优化方法如上,但是使用新的存储引擎zheap或zedstore,这些都不是问题了。
PostgreSQL 许愿链接
您的愿望将传达给PG kernel hacker、数据库厂商等, 帮助提高数据库产品质量和功能, 说不定下一个PG版本就有您提出的功能点. 针对非常好的提议,奖励限量版PG文化衫、纪念品、贴纸、PG热门书籍等,奖品丰富,快来许愿。开不开森.
9.9元购买3个月阿里云RDS PostgreSQL实例
PostgreSQL 解决方案集合
德哥 / digoal's github - 公益是一辈子的事.





