暂无图片
暂无图片
7
暂无图片
暂无图片
暂无图片

PostgreSQL 扩展之 pg_dirtyread

pg_dirtyread是一个脏读插件,对于误操作删除的数据或者列,在未对表执行vacuum之前可以读取出来并进行恢复,但是对truncate操作是没有效果的,使用此插件需要让vacuum相关参数不敏感,或者调整关键表的autovacuum/vacuum存储参数设置,另外使用此参数,也要考虑对性能的影响和维护成本。

PostgreSQL 版本:PG14.2
pg_dirtyread 版本:2.4
链接地址:https://github.com/ChristophBerg/pg_dirtyread

安装部署

软件获取

--下载
wget https://github.com/df7cb/pg_dirtyread/archive/refs/tags/2.4.tar.gz

--将pg_dirtyread放到数据库源码的contrib目录下,并给777权限
# pwd
/opt/software/postgresql-14.2/contrib

# ll |grep -i pg_dirtyread
drwxrwxrwx 8  777 root 4096 Mar 15 14:31 pg_dirtyread

编译安装

--配置pg_config
# vim Makefile
PG_CONFIG = /opt/postgresql-14.2/bin/pg_config

--编译
# make && make install

--安装
postgres=# select * from pg_available_extensions() where name='pg_dirtyread';
     name     | default_version |                 comment
--------------+-----------------+------------------------------------------
 pg_dirtyread | 2               | Read dead but unvacuumed rows from table

postgres=# create extension pg_dirtyread;
CREATE EXTENSION
postgres=# \dx
                          List of installed extensions
     Name     | Version |   Schema   |               Description
--------------+---------+------------+------------------------------------------
 pg_dirtyread | 2       | public     | Read dead but unvacuumed rows from table
 plpgsql      | 1.0     | pg_catalog | PL/pgSQL procedural language
(2 rows)

场景测试

--创建表
postgres=# create table fb_dirty(id int,col1 varchar(8)) with(autovacuum_enabled = false, toast.autovacuum_enabled = false);
CREATE TABLE
postgres=# \d+ fb_dirty
                                               Table "public.fb_dirty"
 Column |         Type         | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
--------+----------------------+-----------+----------+---------+----------+-------------+--------------+-------------
 id     | integer              |           |          |         | plain    |             |              |
 col1   | character varying(8) |           |          |         | extended |             |              |
Access method: heap
Options: autovacuum_enabled=false

--插入数据
postgres=# insert into fb_dirty values(1,'test1'),(2,'test2');
postgres=# select xmin,xmax,ctid,* from fb_dirty;
 xmin | xmax | ctid  | id | col1
------+------+-------+----+-------
  765 |    0 | (0,1) |  1 | test1
  765 |    0 | (0,2) |  2 | test2
(2 rows)

--查看脏读数据,如果表在非public模式下,直接用schema.table即可
postgres=# select * from pg_dirtyread('fb_dirty') as foo(tableoid oid, ctid tid, xmin xid, xmax xid, cmin cid, cmax cid, dead boolean,id int,col1 varchar(8));
 tableoid | ctid  | xmin | xmax | cmin | cmax | dead | id | col1
----------+-------+------+------+------+------+------+----+-------
    16394 | (0,1) |  765 |    0 |    0 |    0 | f    |  1 | test1
    16394 | (0,2) |  765 |    0 |    0 |    0 | f    |  2 | test2
(2 rows)

删除数据

postgres=# delete from fb_dirty where id=2;
DELETE 1
postgres=# select xmin,xmax,ctid,* from fb_dirty;
 xmin | xmax | ctid  | id | col1
------+------+-------+----+-------
  765 |    0 | (0,1) |  1 | test1
(1 row)

postgres=# select * from pg_dirtyread('fb_dirty') as foo(tableoid oid, ctid tid, xmin xid, xmax xid, cmin cid, cmax cid, dead boolean,id int,col1 varchar(8));
 tableoid | ctid  | xmin | xmax | cmin | cmax | dead | id | col1
----------+-------+------+------+------+------+------+----+-------
    16394 | (0,1) |  765 |    0 |    0 |    0 | f    |  1 | test1
    16394 | (0,2) |  765 |  766 |    0 |    0 | t    |  2 | test2
(2 rows)

当我们通过delete操作将id=2的这条数据删除后,在表里我们已经看不到这条记录,但是在脏读中可以看到这条数据,但是dead那列标示这条数据已经是t,说明这条数据被删除。

更新数据

postgres=# update fb_dirty set col1='t1' where id=1;
UPDATE 1
postgres=# select xmin,xmax,ctid,* from fb_dirty;
 xmin | xmax | ctid  | id | col1
------+------+-------+----+------
  767 |    0 | (0,3) |  1 | t1
(1 row)

postgres=# select * from pg_dirtyread('fb_dirty') as foo(tableoid oid, ctid tid, xmin xid, xmax xid, cmin cid, cmax cid, dead boolean,id int,col1 varchar(8));
 tableoid | ctid  | xmin | xmax | cmin | cmax | dead | id | col1
----------+-------+------+------+------+------+------+----+-------
    16394 | (0,1) |  765 |  767 |    0 |    0 | t    |  1 | test1
    16394 | (0,2) |  765 |  766 |    0 |    0 | t    |  2 | test2
    16394 | (0,3) |  767 |    0 |    0 |    0 | f    |  1 | t1
(3 rows)

当我们更新id=1这条数据之后,在脏读中我们可以看到id=1的事务765已经被事务767覆盖,事务765的两个操作都已经dead,事务767中的col1这列的值’test1’替换成了’t1’,同时我们也能看到这个表中的历史操作流水。

删除列

postgres=# alter table fb_dirty drop column col1;
ALTER TABLE
postgres=# select xmin,xmax,ctid,* from fb_dirty;
 xmin | xmax | ctid  | id
------+------+-------+----
  767 |    0 | (0,3) |  1
(1 row)

postgres=# select * from pg_dirtyread('fb_dirty') as foo(tableoid oid, ctid tid, xmin xid, xmax xid, cmin cid, cmax cid, dead boolean,id int,dropped_2 varchar(8));
 tableoid | ctid  | xmin | xmax | cmin | cmax | dead | id | dropped_2
----------+-------+------+------+------+------+------+----+-----------
    16394 | (0,1) |  765 |  767 |    0 |    0 | t    |  1 | test1
    16394 | (0,2) |  765 |  766 |    0 |    0 | t    |  2 | test2
    16394 | (0,3) |  767 |    0 |    0 |    0 | f    |  1 | t1
(3 rows)

我们删除列col1后,在脏读中依然能看到之前的数据,但是这里要注意一个变更,就是col1已经没有了,需要用dropped_n来替代,n代表表中的第几列。

truncate数据

postgres=# select xmin,xmax,ctid,* from fb_dirty;
 xmin | xmax | ctid  | id
------+------+-------+----
  767 |    0 | (0,3) |  1
(1 row)

postgres=# truncate fb_dirty;
TRUNCATE TABLE
postgres=# select xmin,xmax,ctid,* from fb_dirty;
 xmin | xmax | ctid | id
------+------+------+----
(0 rows)

postgres=# select * from pg_dirtyread('fb_dirty') as foo(tableoid oid, ctid tid, xmin xid, xmax xid, cmin cid, cmax cid, dead boolean,id int,dropped_2 varchar(8));
 tableoid | ctid | xmin | xmax | cmin | cmax | dead | id | dropped_2
----------+------+------+------+------+------+------+----+-----------
(0 rows)

truncate 这个操作很危险,它不但将表中的数据清除,同时也将脏读的历史流水也同步清理。

性能测试

我们直接用pgbench对同一个实例下不通的数据库做压测,其中一个数据库安装插件,看一下两个数据库大小以及pgbench的效率

环境准备

dirtyread=# \l+
                                                                    List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   |  Size   | Tablespace |                Description
-----------+----------+----------+-------------+-------------+-----------------------+---------+------------+--------------------------------------------
 dirtyread | postgres | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 |                       | 8569 kB | pg_default |
 postgres  | postgres | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 |                       | 8569 kB | pg_default | default administrative connection database

postgres=# \dx
                          List of installed extensions
     Name     | Version |   Schema   |               Description
--------------+---------+------------+------------------------------------------
 pg_dirtyread | 2       | public     | Read dead but unvacuumed rows from table
 plpgsql      | 1.0     | pg_catalog | PL/pgSQL procedural language
(2 rows)

postgres=# \c dirtyread
You are now connected to database "dirtyread" as user "postgres".
dirtyread=# \dx
                 List of installed extensions
  Name   | Version |   Schema   |         Description
---------+---------+------------+------------------------------
 plpgsql | 1.0     | pg_catalog | PL/pgSQL procedural language
(1 row)

dirtyread压测

$ pgbench -i dirtyread -p 1432 -s 10
$ pgbench -n -r -T 60 -P 1 -c 32 -j 4 -p 1432 dirtyread
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 10
query mode: simple
number of clients: 32
number of threads: 4
duration: 60 s
number of transactions actually processed: 272811
latency average = 7.008 ms
latency stddev = 4.515 ms
initial connection time = 17.247 ms
tps = 4546.851288 (without initial connection time)
statement latencies in milliseconds:
         0.001  \set aid random(1, 100000 * :scale)
         0.001  \set bid random(1, 1 * :scale)
         0.001  \set tid random(1, 10 * :scale)
         0.000  \set delta random(-5000, 5000)
         0.354  BEGIN;
         0.460  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
         0.423  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
         1.293  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
         2.880  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
         0.413  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
         1.183  END;

postgres压测

$ pgbench -i postgres -p 1432 -s 10
$ pgbench -n -r -T 60 -P 1 -c 32 -j 4 -p 1432 postgres
transaction type: <builtin: TPC-B (sort of)>
scaling factor: 10
query mode: simple
number of clients: 32
number of threads: 4
duration: 60 s
number of transactions actually processed: 276637
latency average = 6.915 ms
latency stddev = 4.418 ms
initial connection time = 15.362 ms
tps = 4610.552273 (without initial connection time)
statement latencies in milliseconds:
         0.001  \set aid random(1, 100000 * :scale)
         0.000  \set bid random(1, 1 * :scale)
         0.001  \set tid random(1, 10 * :scale)
         0.000  \set delta random(-5000, 5000)
         0.344  BEGIN;
         0.450  UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;
         0.414  SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
         1.275  UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;
         2.849  UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;
         0.404  INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);
         1.176  END;

结果对比

postgres=# \l+
                                                                    List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges   |  Size   | Tablespace |                Description
-----------+----------+----------+-------------+-------------+-----------------------+---------+------------+--------------------------------------------
 dirtyread | postgres | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 |                       | 188 MB  | pg_default |
 postgres  | postgres | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 |                       | 189 MB  | pg_default | default administrative connection database
 template0 | postgres | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 | =c/postgres          +| 8409 kB | pg_default | unmodifiable empty database
           |          |          |             |             | postgres=CTc/postgres |         |            |
 template1 | postgres | UTF8     | zh_CN.UTF-8 | zh_CN.UTF-8 | =c/postgres          +| 8409 kB | pg_default | default template for new databases
           |          |          |             |             | postgres=CTc/postgres |         |            |
(4 rows)

postgres=# select count(*) from pg_dirtyread('pgbench_accounts') as foo(tableoid oid, ctid tid, xmin xid, xmax xid, cmin cid, cmax cid, dead boolean,aid int,bid int,abalance int,filler char(84));
  count
---------
 1016703

同样的命令,同一个实例,数据库的总数据量相差不大,pgbench工具用32c,在cpu跑满的情况下,相差也不是很大,从这个例子可以看出pg_dirtyread对性能影响较小。

注:此次测试仅作参考

「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

文章被以下合辑收录

评论