暂无图片
暂无图片
1
暂无图片
暂无图片
暂无图片

错用分页语句引发性能问题案例解析

原创 磐维数据库 3天前
30

前言

客户环境反应有一慢查询语句,语句发来后我一看语句有点怪,因为该语句是分页模式,但这页数不对。 为了脱敏,我用TPCH环境的表进行模拟分析。

TPCH环境的搭建参见opengauss相关blog

https://opengauss.org/zh/blogs/2022/openGauss-MogDB-TPCH%E6%80%A7%E8%83%BD%E6%B5%8B%E8%AF%95%E6%AD%A5%E9%AA%A4.html

模拟环境

模拟语句

SELECT *

FROM (SELECT rownum r, t.*

FROM orders t

WHERE o_orderstatus = 'F'

AND rownum <= 1005000) m

WHERE r > 1004000;

表信息

tpch=> \dt+ orders

List of relations

Schema | Name | Type | Owner | Size | Storage | Description

--------+----------+-------+-------+---------+-------------------------------------------------------------------+-------------

tpch | orders | table | tpch | 6106 MB | {orientation=row,compression=no,fillfactor=80,parallel_workers=1} |

(1 row)

tpch=> \d+ orders;

Table "tpch.orders"

Column | Type | Modifiers | Storage | Stats target | Description

-----------------+---------------+-----------+----------+--------------+-------------

o_orderkey | integer | not null | plain | |

o_custkey | integer | not null | plain | |

o_orderstatus | character(1) | not null | extended | |

o_totalprice | numeric(15,2) | not null | main | |

o_orderdate | date | not null | plain | |

o_orderpriority | character(15) | not null | extended | |

o_clerk | character(15) | not null | extended | |

o_shippriority | integer | not null | plain | |

o_comment | varchar(79) | not null | extended | |

Indexes:

"orders_pkey" PRIMARY KEY, btree (o_orderkey) TABLESPACE pg_default

"orders_n1" btree (o_custkey) TABLESPACE pg_default

Foreign-key constraints:

"orders_o_custkey_fkey" FOREIGN KEY (o_custkey) REFERENCES customer(c_custkey)

Referenced by:

TABLE "lineitem" CONSTRAINT "lineitem_l_orderkey_fkey" FOREIGN KEY (l_orderkey) REFERENCES orders(o_orderkey)

Has OIDs: no

Options: orientation=row, compression=no, fillfactor=80, parallel_workers=1

分析

执行计划

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------

Subquery Scan on m (cost=0.00..85647.71 rows=335000 width=119) (actual time=1147.690..1148.639 rows=1000 loops=1)

Filter: (m.r > 1004000)

Rows Removed by Filter: 1004000

-> Limit (cost=0.00..73085.21 rows=1005000 width=111) (actual time=0.030..1026.762 rows=1005000 loops=1)

-> Seq Scan on orders t (cost=0.00..267116.55 rows=3673139 width=111) (actual time=0.028..940.799 rows=1005000 loops=1)

Filter: (o_orderstatus = 'F'::bpchar)

Rows Removed by Filter: 1059512

Total runtime: 1148.749 ms

(8 rows)

问题分析

该语句之所以慢是因为访问了大量的数据,虽然最终只返回1000行,但访问的行为 1,059,512 + 1,00,5000 很明显该语句不是一个正常的分页模式语句,我们知道一般分页语句是在页面上用的,常访问的只有前面几页,正常分页语句应该如下所示

explain analyze

SELECT *

FROM (SELECT rownum r, t.*

FROM orders t

WHERE o_orderstatus = 'F'

AND rownum <= 200) m

WHERE r > 100;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------

Subquery Scan on m (cost=0.00..17.04 rows=67 width=119) (actual time=0.120..0.239 rows=100 loops=1)

Filter: (m.r > 100)

Rows Removed by Filter: 100

-> Limit (cost=0.00..14.54 rows=200 width=111) (actual time=0.012..0.214 rows=200 loops=1)

-> Seq Scan on orders t (cost=0.00..267116.55 rows=3673139 width=111) (actual time=0.011..0.196 rows=200 loops=1)

Filter: (o_orderstatus = 'F'::bpchar)

Rows Removed by Filter: 209

Total runtime: 0.289 ms

(8 rows)

那该语句到底是用于什么业务呢,经与客户沟通(沟通..沟通..),原来该语句非页面查询用,而是一个导出语句。 该环境有限制,不允许一次导出所有语句,只能分批导出,于是就有了我们看到的奇怪分页语句。 我们知道分页模式并非是个稳定的高性能语句,如同我们翻书一样,越后面越慢。

explain analyze

SELECT *

FROM (SELECT rownum r, t.*

FROM orders t

WHERE o_orderstatus = 'F'

AND rownum <= 1000) m

WHERE r > 0;

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------

Subquery Scan on m (cost=0.00..85.22 rows=333 width=119) (actual time=0.014..5.873 rows=1000 loops=1)

Filter: (m.r > 0)

-> Limit (cost=0.00..72.72 rows=1000 width=111) (actual time=0.012..5.678 rows=1000 loops=1)

-> Seq Scan on orders t (cost=0.00..267116.55 rows=3673139 width=111) (actual time=0.011..5.585 rows=1000 loops=1)

Filter: (o_orderstatus = 'F'::bpchar)

Rows Removed by Filter: 1015

Total runtime: 5.998 ms

(7 rows)

explain analyze

SELECT *

FROM (SELECT rownum r, t.*

FROM orders t

WHERE o_orderstatus = 'F'

AND rownum <= 1005000) m

WHERE r > 1004000;

QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------

Subquery Scan on m (cost=0.00..85647.71 rows=335000 width=119) (actual time=821.165..821.947 rows=1000 loops=1)

Filter: (m.r > 1004000)

Rows Removed by Filter: 1004000

-> Limit (cost=0.00..73085.21 rows=1005000 width=111) (actual time=0.027..728.000 rows=1005000 loops=1)

-> Seq Scan on orders t (cost=0.00..267116.55 rows=3673139 width=111) (actual time=0.025..658.500 rows=1005000 loops=1)

Filter: (o_orderstatus = 'F'::bpchar)

Rows Removed by Filter: 1057326

Total runtime: 822.100 ms

(8 rows)

解决方案

既然问题找到了,客户的需求是分批导出数据,而非一定要分页,那我们分批导出数据就可以了。我们可以借用前面的分页方式,不过改为每次只取一页。 为了达到这个目的,需要进行交互,用主键列作为分批标定值。

优化方案

\timing on

set serveroutput on;

DECLARE

v_orderkey orders.o_orderkey%TYPE := 0;

v_total_count NUMBER := 0;

v_batch_count NUMBER;

v_row_orders orders%ROWTYPE;

v_cur_key orders.o_orderkey%TYPE;

TYPE t_ref_cursor IS REF CURSOR;

v_cursor t_ref_cursor;

BEGIN

LOOP

v_batch_count := 0;

OPEN v_cursor FOR

'SELECT *

FROM orders

WHERE o_orderkey > ' || v_orderkey || '

AND o_orderstatus = ''F''

ORDER BY o_orderkey

LIMIT 1000';

LOOP

FETCH v_cursor INTO v_row_orders;

v_cur_key := v_row_orders.o_orderkey;

EXIT WHEN v_cursor%NOTFOUND;

v_orderkey := v_cur_key;

v_batch_count := v_batch_count + 1;

v_total_count := v_total_count + 1;

-- 业务处理逻辑,只使用 v_orderkey

END LOOP;

CLOSE v_cursor;

IF v_batch_count > 0 THEN

dbms_output.put_line('批次: ' || v_batch_count || ' 行 | 累计: ' || v_total_count);

END IF;

EXIT WHEN v_batch_count = 0;

END LOOP;

dbms_output.put_line('总处理行数: ' || v_total_count);

END;

/

该方式查询速度稳定,优化后合计执行时间如下:

批次: 501 行 | 累计: 3654501

总处理行数: 3654501

ANONYMOUS BLOCK EXECUTE

Time: 205890.506 ms

行数与直接查询一致:

tpch=> SELECT COUNT(*) FROM orders WHERE o_orderstatus = 'F';

count

---------

3654501

(1 row)

原处理逻辑模拟

\timing on

set serveroutput on;

DECLARE

v_start_row NUMBER := 0; -- 起始行号

v_page_size NUMBER := 1000; -- 每批处理行数

v_total_count NUMBER := 0; -- 总行数计数器

v_batch_count NUMBER; -- 每批处理行数

v_sql VARCHAR2(2000);

v_row_orders orders%ROWTYPE;

-- 声明游标变量

TYPE t_ref_cursor IS REF CURSOR;

v_cursor t_ref_cursor;

BEGIN

LOOP

-- 重置批次计数器

v_batch_count := 0;

-- 构建动态SQL

v_sql := 'SELECT o_orderkey,o_custkey,o_orderstatus,o_totalprice,o_orderdate,o_orderpriority,o_clerk,o_shippriority,o_comment

FROM (

SELECT rownum AS r, t.*

FROM orders t

WHERE o_orderstatus = ''F''

AND rownum <= ' || v_start_row + v_page_size || '

) m

WHERE r > ' || v_start_row;

-- 打开游标并执行动态SQL

OPEN v_cursor FOR v_sql;

-- 循环处理结果

LOOP

-- 获取一行数据

FETCH v_cursor INTO v_row_orders;

-- 如果没有更多数据,退出循环

EXIT WHEN v_cursor%NOTFOUND;

-- 递增计数器

v_batch_count := v_batch_count + 1;

v_total_count := v_total_count + 1;

-- 这里可以添加实际业务处理逻辑

END LOOP;

-- 关闭游标

CLOSE v_cursor;

-- 打印本批次和累计行数

IF v_batch_count > 0 THEN

dbms_output.put_line('批次处理: ' || v_batch_count || ' 行 | 累计: ' || v_total_count || ' 行');

-- 更新起始行位置

v_start_row := v_start_row + v_batch_count;

END IF;

-- 如果本批次没有数据,退出循环

EXIT WHEN v_batch_count = 0;

END LOOP;

-- 处理完成,打印总计

dbms_output.put_line('=== 处理完成 ===');

dbms_output.put_line('总处理行数: ' || v_total_count);

END;

/

执行时间,可以观察,该方式如前面分析开始很快,后面非常慢,未能等待执行完毕。

批次处理: 1000 行 | 累计: 3654000 行

批次处理: 501 行 | 累计: 3654501 行

=== 处理完成 ===

总处理行数: 3654501

ANONYMOUS BLOCK EXECUTE

Time: 3558759.797 ms

总结

两种方案对比 分页模式:3,558,759 ms 主键定位方式:205,890 ms 17倍的差距

  1. 分页语句的特性,先快后慢,当查询的页码过大时不要使用分页语句。
  2. 当需要遍历表中所有数据时不要使用分页语句
最后修改时间:2026-06-01 00:00:13
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论