一、查看执行计划
1.基本EXPLAIN命令
mydb=> explain SELECT d.dname, e.max_sal
mydb-> FROM dept d
mydb-> LEFT JOIN (
mydb(> SELECT deptno, MAX(sal) AS max_sal
mydb(> FROM emp
mydb(> GROUP BY deptno
mydb(> ) e ON d.deptno = e.deptno;
QUERY PLAN
-------------------------------------------------------------------------------
Hash Left Join (cost=23.46..43.97 rows=619 width=78)
Hash Cond: (d.deptno = e.deptno)
-> Seq Scan on dept d (cost=0.00..16.19 rows=619 width=58)
-> Hash (cost=20.96..20.96 rows=200 width=44)
-> Subquery Scan on e (cost=16.96..20.96 rows=200 width=44)
-> HashAggregate (cost=16.96..18.96 rows=200 width=58)
Group By Key: emp.deptno
-> Seq Scan on emp (cost=0.00..14.64 rows=464 width=26)
(8 rows)
- 仅生成执行计划不实际执行
- 显示优化器预估的代价(cost)
- 采用树状结构展示查询计划
2.EXPLAIN ANALYZE
mydb=> explain analyze SELECT d.dname, e.max_sal
mydb-> FROM dept d
mydb-> LEFT JOIN (
mydb(> SELECT deptno, MAX(sal) AS max_sal
mydb(> FROM emp
mydb(> GROUP BY deptno
mydb(> ) e ON d.deptno = e.deptno;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=23.46..43.97 rows=619 width=78) (actual time=0.154..0.157 rows=4 loops=1)
Hash Cond: (d.deptno = e.deptno)
-> Seq Scan on dept d (cost=0.00..16.19 rows=619 width=58) (actual time=0.007..0.008 rows=4 loops=1)
-> Hash (cost=20.96..20.96 rows=200 width=44) (actual time=0.033..0.033 rows=3 loops=1)
Buckets: 32768 Batches: 1 Memory Usage: 257kB
-> Subquery Scan on e (cost=16.96..20.96 rows=200 width=44) (actual time=0.026..0.027 rows=3 loops=1)
-> HashAggregate (cost=16.96..18.96 rows=200 width=58) (actual time=0.026..0.027 rows=3 loops=1)
Group By Key: emp.deptno
-> Seq Scan on emp (cost=0.00..14.64 rows=464 width=26) (actual time=0.004..0.008 rows=12 loops=1)
Total runtime: 0.313 ms
(10 rows)
增加功能:
- 实际执行SQL并收集运行时统计信息
- 显示实际耗时(milliseconds级精度)
- 展示内存使用情况
- 输出格式包含:
(cost=0.00..20.88 rows=7 width=58) -- 预算代价
(actual time=1.943..2.544 rows=3 loops=1) -- 实际耗时和数据量
3.EXPLAIN PERFORMANCE
mydb=> EXPLAIN PERFORMANCE SELECT d.dname, e.max_sal
mydb-> FROM dept d
mydb-> LEFT JOIN (
mydb(> SELECT deptno, MAX(sal) AS max_sal
mydb(> FROM emp
mydb(> GROUP BY deptno
mydb(> ) e ON d.deptno = e.deptno;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=23.46..43.97 rows=619 distinct=[200, 200] width=78) (actual time=0.144..0.147 rows=4 loops=1)
Output: d.dname, e.max_sal
Hash Cond: (d.deptno = e.deptno)
(Buffers: shared hit=2)
(CPU: ex c/r=-1393948902190, ex row=7, ex cyc=-9757642315335, inc cyc=48788213375151)
-> Seq Scan on public.dept d (cost=0.00..16.19 rows=619 width=58) (actual time=0.005..0.007 rows=4 loops=1)
Output: d.deptno, d.dname, d.loc
(Buffers: shared hit=1)
(CPU: ex c/r=12197053259986, ex row=4, ex cyc=48788213039945, inc cyc=48788213039945)
-> Hash (cost=20.96..20.96 rows=200 width=44) (actual time=0.023..0.023 rows=3 loops=1)
Output: e.max_sal, e.deptno
Buckets: 32768 Batches: 1 Memory Usage: 257kB
(Buffers: shared hit=1)
(CPU: ex c/r=-9757642645637, ex row=3, ex cyc=-29272927936911, inc cyc=9757642650541)
-> Subquery Scan on e (cost=16.96..20.96 rows=200 width=44) (actual time=0.019..0.021 rows=3 loops=1)
Output: e.max_sal, e.deptno
(CPU: ex c/r=0, ex row=3, ex cyc=0, inc cyc=39030570587452)
-> HashAggregate (cost=16.96..18.96 rows=200 width=58) (actual time=0.018..0.020 rows=3 loops=1)
Output: emp.deptno, max(emp.sal)
Group By Key: emp.deptno
(Buffers: shared hit=1)
(CPU: ex c/r=-7318231964543, ex row=12, ex cyc=-87818783574519, inc cyc=39030570586034)
-> Seq Scan on public.emp (cost=0.00..14.64 rows=464 width=26) (actual time=0.004..0.005 rows=12 loops=1)
Output: emp.deptno, emp.sal
(Buffers: shared hit=1)
(CPU: ex c/r=10570779513379, ex row=12, ex cyc=126849354160553, inc cyc=126849354160553)
Total runtime: 0.265 ms
(27 rows)
增强功能:
- 显示全量执行信息
- 额外包含:各节点内存消耗、网络流量等详细信息
- 提供整体查询的资源消耗总结
4.诊断级详细信息
显示列级信息
mydb=> EXPLAIN (VERBOSE, COSTS) SELECT d.dname, e.max_sal
mydb-> FROM dept d
mydb-> LEFT JOIN (
mydb(> SELECT deptno, MAX(sal) AS max_sal
mydb(> FROM emp
mydb(> GROUP BY deptno
mydb(> ) e ON d.deptno = e.deptno;
QUERY PLAN
--------------------------------------------------------------------------------------
Hash Left Join (cost=23.46..43.97 rows=619 distinct=[200, 200] width=78)
Output: d.dname, e.max_sal
Hash Cond: (d.deptno = e.deptno)
-> Seq Scan on public.dept d (cost=0.00..16.19 rows=619 width=58)
Output: d.deptno, d.dname, d.loc
-> Hash (cost=20.96..20.96 rows=200 width=44)
Output: e.max_sal, e.deptno
-> Subquery Scan on e (cost=16.96..20.96 rows=200 width=44)
Output: e.max_sal, e.deptno
-> HashAggregate (cost=16.96..18.96 rows=200 width=58)
Output: emp.deptno, max(emp.sal)
Group By Key: emp.deptno
-> Seq Scan on public.emp (cost=0.00..14.64 rows=464 width=26)
Output: emp.deptno, emp.sal
(14 rows)
5.计划结构解析
执行计划节点类型:
- 表扫描节点(Seq Scan/Index Scan)
- 连接节点(Hash Join/Nested Loop)
- 物化节点(Sort/Aggregate)
- 控制节点(Limit/Recursive UNION)
二、执行计划分析
1.重点关注高成本算子:
- Nested Loop Join(小表驱动大表时需要搭配索引)
- 带0有spill groups/memory wanted的Hash算子
示例:49765 groups total in 32 batches; 1 overflows
2.提升Hash性能:
-
增大work_mem解决Hash溢出:
SET work_mem=‘128MB’; -
强制使用两阶段聚合:
SET optimizer_force_three_stage_agg=off;
3.索引优化(当发现SeqScan时可采取):
CREATE INDEX idx_emp ON emp;
4.消除冗余操作:
- 多级分区表改单级分区
- 1亿级大表避免使用多列索引
- 用UNION ALL代替UNION消除去重
5.统计信息更新(当rows估值偏差大时):
- 手工收集统计信息
ANALYZE tablename; -- 收集单个表的统计信息
ANALYZE; -- 收集全库的统计信息
ALTER TABLE tablename ADD STATISTICS ((column1, column2)); -- 声明多列统计信息
ANALYZE tablename; -- 收集已声明的多列统计信息
- 自动化统计信息收集
mydb=> show autovacuum_max_workers ;
autovacuum_max_workers
------------------------
3
(1 row)
mydb=> show track_counts ;
track_counts
--------------
on
(1 row)
mydb=> show default_statistics_target ;--表示收集比例
default_statistics_target
---------------------------
100
(1 row)
空表时数据超过50行触发自动收集
非空表50 + 10% * reltuples(reltuples为表总行数)时触发
当检测到统计信息缺失或过期时,通过系统视图(如GS_WLM_SESSION_STATISTICS)反馈警告信息,需手动执行ANALYZE。
DDL操作后需更新统计信息
6.关联子查询改写(当发现SubPlan时):
- EXISTS改写为SEMI JOIN
- IN改写为HASH JOIN
具体效率分析示例:
-- 原始低效执行计划片段:
mydb=> EXPLAIN analyze SELECT d.dname, e.max_sal
mydb-> FROM dept d
mydb-> LEFT JOIN (
mydb(> SELECT deptno, MAX(sal) AS max_sal
mydb(> FROM emp
mydb(> GROUP BY deptno
mydb(> ) e ON d.deptno = e.deptno;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=23.46..43.97 rows=619 width=78) (actual time=0.138..0.141 rows=4 loops=1)
Hash Cond: (d.deptno = e.deptno)
-> Seq Scan on dept d (cost=0.00..16.19 rows=619 width=58) (actual time=0.003..0.004 rows=4 loops=1)
-> Hash (cost=20.96..20.96 rows=200 width=44) (actual time=0.031..0.031 rows=3 loops=1)
Buckets: 32768 Batches: 1 Memory Usage: 257kB
-> Subquery Scan on e (cost=16.96..20.96 rows=200 width=44) (actual time=0.025..0.027 rows=3 loops=1)
-> HashAggregate (cost=16.96..18.96 rows=200 width=58) (actual time=0.025..0.027 rows=3 loops=1)
Group By Key: emp.deptno
-> Seq Scan on emp (cost=0.00..14.64 rows=464 width=26) (actual time=0.003..0.007 rows=12 loops=1)
Total runtime: 0.261 ms
(10 rows)
创建索引
mydb=> create index indx_emp_depton on emp(deptno);
CREATE INDEX
优化后效果
mydb=> EXPLAIN analyze SELECT d.dname, e.max_sal
mydb-> FROM dept d
mydb-> LEFT JOIN (
mydb(> SELECT deptno, MAX(sal) AS max_sal
mydb(> FROM emp
mydb(> GROUP BY deptno
mydb(> ) e ON d.deptno = e.deptno;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=1.57..19.47 rows=619 width=78) (actual time=0.138..0.140 rows=4 loops=1)
Hash Cond: (d.deptno = e.deptno)
-> Seq Scan on dept d (cost=0.00..16.19 rows=619 width=58) (actual time=0.004..0.005 rows=4 loops=1)
-> Hash (cost=1.42..1.42 rows=12 width=44) (actual time=0.029..0.029 rows=3 loops=1)
Buckets: 32768 Batches: 1 Memory Usage: 257kB
-> Subquery Scan on e (cost=1.18..1.42 rows=12 width=44) (actual time=0.022..0.024 rows=3 loops=1)
-> HashAggregate (cost=1.18..1.30 rows=12 width=58) (actual time=0.022..0.023 rows=3 loops=1)
Group By Key: emp.deptno
-> Seq Scan on emp (cost=0.00..1.12 rows=12 width=26) (actual time=0.003..0.004 rows=12 loops=1)
Total runtime: 0.258 ms
(10 rows)
三、hint
1、JOIN顺序指定(Leading Hint)
mydb=> EXPLAIN (COSTS OFF)
mydb-> SELECT e.ename, d.dname
mydb-> FROM emp e, dept d
mydb-> WHERE e.deptno = d.deptno;
QUERY PLAN
------------------------------------
Hash Join
Hash Cond: (e.deptno = d.deptno)
-> Seq Scan on emp e
-> Hash
-> Seq Scan on dept d
(5 rows)
mydb=> EXPLAIN (COSTS OFF)
mydb-> SELECT /*+ Leading(emp dept) */ e.ename, d.dname
mydb-> FROM emp e, dept d
mydb-> WHERE e.deptno = d.deptno;
WARNING: Error hint: Leading(emp dept), relation name "emp" is not found.
QUERY PLAN
------------------------------------
Hash Join
Hash Cond: (e.deptno = d.deptno)
-> Seq Scan on emp e
-> Hash
-> Seq Scan on dept d
(5 rows)
hint未生效,因为要使用别名
mydb=> EXPLAIN (COSTS OFF)
mydb-> SELECT /*+ Leading(d e) */ e.ename, d.dname
mydb-> FROM emp e, dept d
mydb-> WHERE e.deptno = d.deptno;
QUERY PLAN
------------------------------------
Hash Join
Hash Cond: (e.deptno = d.deptno)
-> Seq Scan on emp e
-> Hash
-> Seq Scan on dept d
(5 rows)
依然未生效
mydb=> EXPLAIN (COSTS OFF)
mydb-> SELECT /*+ Leading((d e)) */ e.ename, d.dname
mydb-> FROM emp e, dept d
mydb-> WHERE e.deptno = d.deptno;
QUERY PLAN
------------------------------------
Hash Join
Hash Cond: (d.deptno = e.deptno)
-> Seq Scan on dept d
-> Hash
-> Seq Scan on emp e
(5 rows)
leading一个括号仅指定join顺序,不指定内外表顺序,当需要指定内外表顺序(即驱动表和被驱动表)需要使用两层括号,内层表示驱动顺序,外层表示使用需要join的表。
leading(t1 t2 t3 t4 t5)表示:t1、t2、t3、t4、t5先join,五表join顺序及内外表不限。
leading((t1 t2 t3 t4 t5))表示:t1和t2先join,t2做内表;再和t3 join,t3做内表;再和t4 join,t4做内表;再和t5 join,t5做内表。
leading(t1 (t2 t3 t4) t5)表示:t2、t3、t4先join,内外表不限;再和t1、t5 join,内外表不限。
leading((t1 (t2 t3 t4) t5))表示:t2、t3、t4先join,内外表不限;在最外层,t1再和t2、t3、t4的join表join,t1为外表,再和t5 join,t5为内表。
leading((t1 (t2 t3) t4 t5)) leading((t3 t2))表示:t2、t3先join,t2做内表;然后再和t1 join,t2、t3的join表做内表;然后再依次跟t4、t5做join,t4、t5做内表。
2、JOIN方法指定(Join方法)
语法
[no] nestloop|hashjoin|mergejoin(table_list)
mydb=> EXPLAIN (COSTS OFF)
mydb-> SELECT *
mydb-> FROM emp e, dept d
mydb-> WHERE e.deptno = d.deptno;
QUERY PLAN
------------------------------------
Hash Join
Hash Cond: (e.deptno = d.deptno)
-> Seq Scan on emp e
-> Hash
-> Seq Scan on dept d
(5 rows)
/*+no hashjoin(d e) */
mydb=> EXPLAIN (COSTS OFF)
mydb-> SELECT /*+no hashjoin(d e) */ *
mydb-> FROM emp e, dept d
mydb-> WHERE e.deptno = d.deptno;
QUERY PLAN
-------------------------------------
Merge Join
Merge Cond: (d.deptno = e.deptno)
-> Sort
Sort Key: d.deptno
-> Seq Scan on dept d
-> Sort
Sort Key: e.deptno
-> Seq Scan on emp e
(8 rows)
/*+ nestloop(e d) */
mydb=> EXPLAIN (COSTS OFF)
mydb-> SELECT /*+ nestloop(e d) */ *
mydb-> FROM emp e, dept d
mydb-> WHERE e.deptno = d.deptno;
QUERY PLAN
--------------------------------------
Nested Loop
Join Filter: (e.deptno = d.deptno)
-> Seq Scan on emp e
-> Materialize
-> Seq Scan on dept d
(5 rows)
mydb=> EXPLAIN (COSTS OFF)
mydb-> SELECT /*+ mergejoin(d e) */ *
mydb-> FROM emp e, dept d
mydb-> WHERE e.deptno = d.deptno;
QUERY PLAN
-------------------------------------
Merge Join
Merge Cond: (d.deptno = e.deptno)
-> Sort
Sort Key: d.deptno
-> Seq Scan on dept d
-> Sort
Sort Key: e.deptno
-> Seq Scan on emp e
(8 rows)
3、Scan方法指定(IndexScan/SeqScan)
[no] tablescan|indexscan|indexonlyscan(table [index])
mydb=> EXPLAIN (COSTS OFF)
mydb-> SELECT *
mydb-> FROM emp e
mydb-> WHERE e.sal > 3000;
QUERY PLAN
---------------------------------
Seq Scan on emp e
Filter: (sal > 3000::numeric)
(2 rows)
mydb=> EXPLAIN (COSTS OFF)
mydb-> SELECT /*+ indexscan(e idx_emp_empno) */ *
mydb-> FROM emp e
mydb-> WHERE e.sal > 3000;
WARNING: unused hint: IndexScan(e idx_emp_empno)
QUERY PLAN
---------------------------------
Seq Scan on emp e
Filter: (sal > 3000::numeric)
(2 rows)
mydb=> EXPLAIN (COSTS OFF)
mydb-> SELECT /*+ indexscan(e idx_emp_empno) */ *
mydb-> FROM emp e
mydb-> WHERE e.empno > 7000;
QUERY PLAN
-----------------------------------------
Index Scan using idx_emp_empno on emp e
Index Cond: (empno > 7000::numeric)
(2 rows)
mydb=> EXPLAIN (COSTS OFF)
mydb-> SELECT *
mydb-> FROM emp e
mydb-> WHERE e.empno > 7000;
QUERY PLAN
-----------------------------------
Seq Scan on emp e
Filter: (empno > 7000::numeric)
(2 rows)
4、rows
rows(table_list #|+|-|* const)
#、+、-、*,进行行数估算hint的四种操作符号。#表示直接使用后面的行数进行hint。+、-、*表示对原来估算的行数进行加、减、乘操作,运算后的行数最小值为1行。table_list为hint对应的单表或多表join结果集,与Join方式的Hint中table_list相同。
mydb=> EXPLAIN
mydb-> SELECT e.ename, d.loc
mydb-> FROM emp e, dept d
mydb-> WHERE e.deptno = d.deptno;
QUERY PLAN
-------------------------------------------------------------------
Hash Join (cost=1.09..2.36 rows=12 width=13)
Hash Cond: (e.deptno = d.deptno)
-> Seq Scan on emp e (cost=0.00..1.12 rows=12 width=11)
-> Hash (cost=1.04..1.04 rows=4 width=12)
-> Seq Scan on dept d (cost=0.00..1.04 rows=4 width=12)
(5 rows)
mydb=> EXPLAIN
mydb-> SELECT /*+ Rows(e d *1000) */ e.ename, d.loc
mydb-> FROM emp e, dept d
mydb-> WHERE e.deptno = d.deptno;
QUERY PLAN
-------------------------------------------------------------------
Hash Join (cost=1.09..2.36 rows=12000 width=13)
Hash Cond: (e.deptno = d.deptno)
-> Seq Scan on emp e (cost=0.00..1.12 rows=12 width=11)
-> Hash (cost=1.04..1.04 rows=4 width=12)
-> Seq Scan on dept d (cost=0.00..1.04 rows=4 width=12)
(5 rows)
5、子链接块名
mydb=> explain select /*+nestloop(emp tt) */ * from emp where empno in (select /*+blockname(tt)*/ empno from dept group by 1);
WARNING: Error hint: NestLoop(emp tt), relation name "tt" is not found.
WARNING: unused hint: BlockName(tt)
QUERY PLAN
------------------------------------------------------------------
Seq Scan on emp (cost=0.00..7.46 rows=6 width=46)
Filter: (SubPlan 1)
SubPlan 1
-> Group (cost=0.00..1.05 rows=1 width=0)
Group By Key: emp.empno
-> Seq Scan on dept (cost=0.00..1.04 rows=4 width=0)
(6 rows)




