Whoami:6年+金融、政府、医疗领域工作经验的DBACertificate:PGCM、OCP、YCPSkill:Oracle、Mysql、PostgreSQL、国产数据库Platform:CSDN、墨天轮、公众号(呆呆的私房菜)
环境声明:本测试库环境为pg11版本。 1. 创建范围分区:
CREATE TABLE orders (order_id serial,customer_id int NOT NULL,order_status char(1) NOT NULL CHECK (order_status IN ('P', 'S', 'C')),order_date date NOT NULL) PARTITION BY RANGE (order_date);
2. 创建分区表:
-- 2023年订单CREATE TABLE orders_2023 PARTITION OF ordersFOR VALUES FROM ('2023-01-01') TO ('2024-01-01');-- 2024年订单CREATE TABLE orders_2024 PARTITION OF ordersFOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
3. 插入数据
testdb=> insert into orders(customer_id,order_status,order_date)testdb-> values(1,'P',date '2023-12-01'),(1,'S',date '2024-12-01'),(2,'S',date '2023-10-01');INSERT 0 3testdb=>testdb=> select * from orders;order_id | customer_id | order_status | order_date----------+-------------+--------------+------------1 | 1 | P | 2023-12-013 | 2 | S | 2023-10-012 | 1 | S | 2024-12-01(3 行记录)
1. 会话1:创建新的分区:
-- 开启事务testdb=> begin;BEGINtestdb=> CREATE TABLE orders_2022 PARTITION OF orders FOR VALUES FROM ('2022-01-01') TO ('2023-01-01');CREATE TABLE
2. 会话2:执行pg_dump导出其他任意表,这里示例 t1:
## 执行备份,此时进程hung住[pg11@host-01 ~]$ pg_dump -h 127.0.0.1 -U test -d testdb -t orders -s...
完蛋,pg_dump竟然hung住了。。。赶紧分析下锁源:
-- 查看等待进程testdb=> select pid,application_name,xact_start,state,wait_event,wait_event_type,left(query,50) from pg_stat_activity where state<>'idle' and pid != pg_backend_pid() order by xact_start;pid | application_name | xact_start | state | wait_event | wait_event_type | left-------+------------------+-------------------------------+---------------------+------------+-----------------+----------------------------------------------------20660 | psql | 2024-08-15 11:29:47.381227+08 | idle in transaction | ClientRead | Client | CREATE TABLE orders_2022 PARTITION OF orders FOR V21530 | pg_dump | 2024-08-15 11:33:18.38258+08 | active | relation | Lock | SELECT c.tableoid, c.oid, c.relname, (SELECT pg_ca21977 | psql | 2024-08-15 11:34:56.267895+08 | active | relation | Lock | select * from orders;(3 行记录)-- 看看谁阻塞了pg_dumptestdb=> select * from pg_locks where pid = 21530 AND granted IS NOT true;locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid | mode | granted | fastpath----------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+-----------------+---------+----------relation | 33027 | 33117 | | | | | | | | 4/572 | 21530 | AccessShareLock | f | f(1 行记录)-- 找到持有该现有锁的进程testdb=> select * from pg_locks where relation = 33117 and granted IS true;locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid | mode | granted | fastpath----------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+---------------------+---------+----------relation | 33027 | 33117 | | | | | | | | 3/155 | 20660 | AccessShareLock | t | frelation | 33027 | 33117 | | | | | | | | 3/155 | 20660 | RowExclusiveLock | t | frelation | 33027 | 33117 | | | | | | | | 3/155 | 20660 | AccessExclusiveLock | t | f(3 行记录)-- 查找阻塞进程正在执行的操作testdb=> select pid, state, wait_event, wait_event_type, left (query, 40) from pg_stat_activity where pid = 20660;pid | state | wait_event | wait_event_type | left-------+---------------------+------------+-----------------+------------------------------------------20660 | idle in transaction | ClientRead | Client | CREATE TABLE orders_2022 PARTITION OF or(1 行记录)
好的,抓到罪魁祸首,创建分区语句引发卡住。 问题又来了,为什么会导致pg_dump等待呢?pg_dump执行了什么操作呢?
# 设置log_statement = all,查看执行卡住语句;SELECTc.tableoid,c.oid,c.relname,(SELECTpg_catalog.array_agg(aclORDER BYrow_n)FROM(SELECTacl,row_nFROMpg_catalog.unnest(coalesce(c.relacl,pg_catalog.acldefault(CASE WHEN c.relkind = 'S' THEN 's' ELSE 'r' END ::& quot; char & quot;,c.relowner))) WITH ORDINALITY AS perm(acl, row_n)WHERENOT EXISTS (SELECT1FROMpg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault(CASE WHEN c.relkind = 'S' THEN 's' ELSE 'r' END ::& quot; char & quot;,c.relowner))) AS init(init_acl)WHEREacl = init_acl)) as foo) AS relacl,(SELECTpg_catalog.array_agg(aclORDER BYrow_n)FROM(SELECTacl,row_nFROMpg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault(CASE WHEN c.relkind = 'S' THEN 's' ELSE 'r' END ::& quot; char & quot;,c.relowner))) WITH ORDINALITY AS initp(acl, row_n)WHERENOT EXISTS (SELECT1FROMpg_catalog.unnest(coalesce(c.relacl,pg_catalog.acldefault(CASE WHEN c.relkind = 'S' THEN 's' ELSE 'r' END ::& quot; char & quot;,c.relowner))) AS permp(orig_acl)WHEREacl = orig_acl)) as foo) as rrelacl,NULL AS initrelacl,NULL as initrrelacl,c.relkind,c.relnamespace,(SELECTrolnameFROMpg_catalog.pg_rolesWHEREoid = c.relowner) AS rolname,c.relchecks,c.relhastriggers,c.relhasindex,c.relhasrules,c.relhasoids,c.relrowsecurity,c.relforcerowsecurity,c.relfrozenxid,c.relminmxid,tc.oid AS toid,tc.relfrozenxid AS tfrozenxid,tc.relminmxid AS tminmxid,c.relpersistence,c.relispopulated,c.relreplident,c.relpages,CASE WHEN c.reloftype & lt;& gt; 0 THEN c.reloftype : :pg_catalog.regtype ELSE NULL END AS reloftype,d.refobjid AS owning_tab,d.refobjsubid AS owning_col,(SELECTspcnameFROMpg_tablespace tWHEREt.oid = c.reltablespace) AS reltablespace,array_remove(array_remove(c.reloptions, 'check_option=local'),'check_option=cascaded') AS reloptions,CASE WHEN 'check_option=local' = ANY (c.reloptions) THEN 'LOCAL' : :text WHEN 'check_option=cascaded' = ANY (c.reloptions) THEN 'CASCADED' : :text ELSE NULL END AS checkoption,tc.reloptions AS toast_reloptions,c.relkind = 'S'AND EXISTS (SELECT1FROMpg_dependWHEREclassid = 'pg_class' : :regclassAND objid = c.oidAND objsubid = 0AND refclassid = 'pg_class' : :regclassAND deptype = 'i') AS is_identity_sequence,EXISTS (SELECT1FROMpg_attribute atLEFT JOIN pg_init_privs pip ON (c.oid = pip.objoidAND pip.classoid = 'pg_class' : :regclassAND pip.objsubid = at.attnum)WHEREat.attrelid = c.oidAND ((SELECTpg_catalog.array_agg(aclORDER BYrow_n)FROM(SELECTacl,row_nFROMpg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c', c.relowner))) WITH ORDINALITY AS perm(acl, row_n)WHERENOT EXISTS (SELECT1FROMpg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c', c.relowner))) AS init(init_acl)WHEREacl = init_acl)) as foo) IS NOT NULLOR (SELECTpg_catalog.array_agg(aclORDER BYrow_n)FROM(SELECTacl,row_nFROMpg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault('c', c.relowner))) WITH ORDINALITY AS initp(acl, row_n)WHERENOT EXISTS (SELECT1FROMpg_catalog.unnest(coalesce(at.attacl,pg_catalog.acldefault('c', c.relowner))) AS permp(orig_acl)WHEREacl = orig_acl)) as foo) IS NOT NULLOR NULL IS NOT NULLOR NULL IS NOT NULL)) AS changed_acl,pg_get_partkeydef(c.oid) AS partkeydef,c.relispartition AS ispartition,pg_get_expr(c.relpartbound, c.oid) AS partboundFROMpg_class cLEFT JOIN pg_depend d ON (c.relkind = 'S'AND d.classid = c.tableoidAND d.objid = c.oidAND d.objsubid = 0AND d.refclassid = c.tableoidAND d.deptype IN ('a', 'i'))LEFT JOIN pg_class tc ON (c.reltoastrelid = tc.oid)LEFT JOIN pg_init_privs pip ON (c.oid = pip.objoidAND pip.classoid = 'pg_class' : :regclassAND pip.objsubid = 0)WHEREc.relkind in ('r', 'S', 'v', 'c', 'm', 'f', 'p')ORDER BYc.oid
ok,问题其实在于pg_get_expr 和 pg_get_partkeydef 这两个分区函数。
pg_get_expr:将存储在系统表中的表达式(以 pg_node_tree 类型存储的抽象语法树)反编译为可读的SQL表达式;pg_get_partkeydef:生成分区表的分区键定义语句。
从上边分析锁源的时候,由于order表正在执行ddl,表加上了最高级别锁 AccessExclusiveLock ,因此此时 order 表是不可访问的,分区函数也不可以执行。
1. 普通表执行DDL时,执行pg_dump导出其他表没问题;2. 分区表正在执行DDL时,执行pg_dump导出任意表都会卡住,pg15及之后版本已优化。3. 执行pg_dump时可以优化下,加个–lock-wait-timeout=TIMEOUT ,设置超时时间,超时后pg_dump退出。
本文内容就到这啦!PostgreSQL不同版本间还是存在一些小坑的~希望本篇内容能给你带来帮助。我们下篇再见!

文章转载自呆呆的私房菜,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




