暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

PostgreSQL 学习笔记,大量update场景测试

原创 范计杰 2020-06-22
1466

PostgreSQL中没有UNDO,MVCC是通过每次修改产生新版本记录的方式实现。如update,每次更新产生一行新版本的数据,不像ORACLE,MYSQL为inplace update,因此会导致表膨胀。这也是PG的痛点之一。
经自己验证,如果PG中单表不是太大,vacuum时不会有瓶颈,表膨胀问题并没有想象中那么严重。当然在PG中使用需要大量UPDATE的表时,还是有考虑表膨胀的问题。设计针对性的应对策略。
如有不同观点,欢迎指正!
下面简单测试一下。

--创建测试表
create table tab1(id int,c varchar(100));
create index idx_tab1 on tab1(id);

---insert 5000条测试记录
可以用generate_series函数。
这里想简单学习下plsql,所以没有使用generate_series
do $$
declare
v_idx integer := 1;
begin
while v_idx <= 5000 loop
v_idx = v_idx+1;
insert into tab1 values (v_idx,'test'||v_idx);
end loop;
end $$;

--查看表大小
test=# \d+
                             List of relations
 Schema |       Name       |   Type   |  Owner   |    Size    | Description 
--------+------------------+----------+----------+------------+-------------
 public | tab1             | table    | test     | 248 kB     |

--创建pgbench测试角本
--testupdate.sql 
\set  v_id random(1,5000)
begin;
select * from tab1 where id=:v_id;
update tab1 set c='test'||:v_id where id=:v_id;
end;


--多次使用pgbench进行压测
pgbench -c 10 -j 10 -M prepared -n -s 1428 -T 60 -r  -U test test -f testupdate.sql
transaction type: testupdate.sql
scaling factor: 1428
query mode: prepared
number of clients: 10
number of threads: 10
duration: 60 s
number of transactions actually processed: 159418
latency average = 3.764 ms
tps = 2656.485954 (including connections establishing)
tps = 2659.159694 (excluding connections establishing)
script statistics:
 - statement latencies in milliseconds:
         0.002  \set  v_id random(1,5000)
         0.100  begin;
         0.160  select * from tab1 where id=:v_id;
         0.266  update tab1 set c='test'||:v_id where id=:v_id;
         3.232  end;

---频繁update后,表大小稳定在416K不再增长。膨胀也就1倍
test=# \d+
                             List of relations
 Schema |       Name       |   Type   |  Owner   |    Size    | Description 
--------+------------------+----------+----------+------------+-------------
 public | tab1             | table    | test     | 416 kB     |            

---同时日志中可以看到,auto vacuum 会清理不需要的旧版本tuple,之后空间可以被重用
2020-06-21 07:18:12.806 EDT,,,3911,,5eef41f4.f47,1,,2020-06-21 07:18:12 EDT,5/33777,0,LOG,00000,"automatic vacuum of table ""test.public.tab1"": index scans: 1
pages: 0 removed, 48 remain, 0 skipped due to pins, 0 skipped frozen
tuples: 1244 removed, 5000 remain, 0 are dead but not yet removable, oldest xmin: 384819
buffer usage: 182 hits, 0 misses, 0 dirtied
avg read rate: 0.000 MB/s, avg write rate: 0.000 MB/s
system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s",,,,,,,,,""


---把表记录增长到50W后重新测试

--初化大小
test=# \d+
                             List of relations
 Schema |       Name       |   Type   |  Owner   |    Size    | Description 
--------+------------------+----------+----------+------------+------------- 
 public | tab1             | table    | test     | 21 MB      |  
(15 rows)

---十几次pgbench压测update
-bash-4.2$ pgbench -c 10 -j 10 -M prepared -n -s 1428 -T 60 -r  -U test test -f testupdate.sql

transaction type: testupdate.sql
scaling factor: 1428
query mode: prepared
number of clients: 10
number of threads: 10
duration: 60 s
number of transactions actually processed: 142533
latency average = 4.211 ms
tps = 2374.980907 (including connections establishing)
tps = 2376.384063 (excluding connections establishing)
script statistics:
 - statement latencies in milliseconds:
         0.002  \set  v_id random(1,500000)
         0.109  begin;
         0.224  select * from tab1 where id=:v_id;
         0.344  update tab1 set c='test'||:v_id where id=:v_id;
         3.528  end;

---test 十几次后,膨胀不算严重
test=# \d+
                             List of relations
 Schema |       Name       |   Type   |  Owner   |    Size    | Description 
--------+------------------+----------+----------+------------+-------------
 public | tab1             | table    | test     | 26 MB      | 
 
test=# select * from pg_stat_user_tables where relname='tab1';
-[ RECORD 1 ]-------+------------------------------
relid               | 50922
schemaname          | public
relname             | tab1
seq_scan            | 30009
seq_tup_read        | 301165050
idx_scan            | 4106239
idx_tup_fetch       | 3121727
n_tup_ins           | 1516224
n_tup_upd           | 1615853
n_tup_del           | 861013
n_tup_hot_upd       | 1416579
n_live_tup          | 505018
n_dead_tup          | 26307
n_mod_since_analyze | 0
last_vacuum         | 
last_autovacuum     | 2020-06-21 03:35:15.758202-08
last_analyze        | 
last_autoanalyze    | 2020-06-21 03:39:16.186562-08
vacuum_count        | 0
autovacuum_count    | 32
analyze_count       | 0
autoanalyze_count   | 61


最后修改时间:2020-06-22 19:11:39
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论