PostgreSQL中没有UNDO,MVCC是通过每次修改产生新版本记录的方式实现。如update,每次更新产生一行新版本的数据,不像ORACLE,MYSQL为inplace update,因此会导致表膨胀。这也是PG的痛点之一。
经自己验证,如果PG中单表不是太大,vacuum时不会有瓶颈,表膨胀问题并没有想象中那么严重。当然在PG中使用需要大量UPDATE的表时,还是有考虑表膨胀的问题。设计针对性的应对策略。
如有不同观点,欢迎指正!
下面简单测试一下。
--创建测试表
create table tab1(id int,c varchar(100));
create index idx_tab1 on tab1(id);
---insert 5000条测试记录
可以用generate_series函数。
这里想简单学习下plsql,所以没有使用generate_series
do $$
declare
v_idx integer := 1;
begin
while v_idx <= 5000 loop
v_idx = v_idx+1;
insert into tab1 values (v_idx,'test'||v_idx);
end loop;
end $$;
--查看表大小
test=# \d+
List of relations
Schema | Name | Type | Owner | Size | Description
--------+------------------+----------+----------+------------+-------------
public | tab1 | table | test | 248 kB |
--创建pgbench测试角本
--testupdate.sql
\set v_id random(1,5000)
begin;
select * from tab1 where id=:v_id;
update tab1 set c='test'||:v_id where id=:v_id;
end;
--多次使用pgbench进行压测
pgbench -c 10 -j 10 -M prepared -n -s 1428 -T 60 -r -U test test -f testupdate.sql
transaction type: testupdate.sql
scaling factor: 1428
query mode: prepared
number of clients: 10
number of threads: 10
duration: 60 s
number of transactions actually processed: 159418
latency average = 3.764 ms
tps = 2656.485954 (including connections establishing)
tps = 2659.159694 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
0.002 \set v_id random(1,5000)
0.100 begin;
0.160 select * from tab1 where id=:v_id;
0.266 update tab1 set c='test'||:v_id where id=:v_id;
3.232 end;
---频繁update后,表大小稳定在416K不再增长。膨胀也就1倍
test=# \d+
List of relations
Schema | Name | Type | Owner | Size | Description
--------+------------------+----------+----------+------------+-------------
public | tab1 | table | test | 416 kB |
---同时日志中可以看到,auto vacuum 会清理不需要的旧版本tuple,之后空间可以被重用
2020-06-21 07:18:12.806 EDT,,,3911,,5eef41f4.f47,1,,2020-06-21 07:18:12 EDT,5/33777,0,LOG,00000,"automatic vacuum of table ""test.public.tab1"": index scans: 1
pages: 0 removed, 48 remain, 0 skipped due to pins, 0 skipped frozen
tuples: 1244 removed, 5000 remain, 0 are dead but not yet removable, oldest xmin: 384819
buffer usage: 182 hits, 0 misses, 0 dirtied
avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s",,,,,,,,,""
---把表记录增长到50W后重新测试
--初化大小
test=# \d+
List of relations
Schema | Name | Type | Owner | Size | Description
--------+------------------+----------+----------+------------+-------------
public | tab1 | table | test | 21 MB |
(15 rows)
---十几次pgbench压测update
-bash-4.2$ pgbench -c 10 -j 10 -M prepared -n -s 1428 -T 60 -r -U test test -f testupdate.sql
transaction type: testupdate.sql
scaling factor: 1428
query mode: prepared
number of clients: 10
number of threads: 10
duration: 60 s
number of transactions actually processed: 142533
latency average = 4.211 ms
tps = 2374.980907 (including connections establishing)
tps = 2376.384063 (excluding connections establishing)
script statistics:
- statement latencies in milliseconds:
0.002 \set v_id random(1,500000)
0.109 begin;
0.224 select * from tab1 where id=:v_id;
0.344 update tab1 set c='test'||:v_id where id=:v_id;
3.528 end;
---test 十几次后,膨胀不算严重
test=# \d+
List of relations
Schema | Name | Type | Owner | Size | Description
--------+------------------+----------+----------+------------+-------------
public | tab1 | table | test | 26 MB |
test=# select * from pg_stat_user_tables where relname='tab1';
-[ RECORD 1 ]-------+------------------------------
relid | 50922
schemaname | public
relname | tab1
seq_scan | 30009
seq_tup_read | 301165050
idx_scan | 4106239
idx_tup_fetch | 3121727
n_tup_ins | 1516224
n_tup_upd | 1615853
n_tup_del | 861013
n_tup_hot_upd | 1416579
n_live_tup | 505018
n_dead_tup | 26307
n_mod_since_analyze | 0
last_vacuum |
last_autovacuum | 2020-06-21 03:35:15.758202-08
last_analyze |
last_autoanalyze | 2020-06-21 03:39:16.186562-08
vacuum_count | 0
autovacuum_count | 32
analyze_count | 0
autoanalyze_count | 61
最后修改时间:2020-06-22 19:11:39
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




