一、问题背景
在从MySQL迁移至OpenGauss数据库的过程中,遇到了一个严重的性能问题。一条在MySQL中运行正常的关联查询语句,在OpenGauss中执行时间达到了570秒,严重影响了系统性能。本文将详细记录优化过程,分享从问题诊断到最终解决的过程。
二、环境准备
2.1 表结构定义
-- 创建 t1 表(文件信息表)
CREATE TABLE t1 (
object_id character varying(255) NOT NULL,
file_name character varying(255),
md5 character varying(255),
size bigint,
valid smallint DEFAULT 1,
PRIMARY KEY (object_id)
);
-- 创建索引
CREATE INDEX t1_idx_md5 ON t1 USING btree (md5);
CREATE INDEX t1_idx_valid ON t1 USING btree (valid);
-- 创建 t2 表(版本关联表)
CREATE TABLE t2 (
id bigserial PRIMARY KEY,
version_id bigint NOT NULL,
valid smallint DEFAULT 1,
asset_version_format_id integer DEFAULT 0,
object_id varbinary(255)
);
2.2 测试数据生成
-- 插入 t1 数据(约10万行有效数据)
INSERT INTO t1 (object_id, file_name, md5, size, valid)
SELECT
'file_' || i,
'file_' || i || '.jpg',
md5(random()::text),
(random() * 100000)::bigint,
1
FROM generate_series(1, 100000) i;
-- 插入100条无效数据
INSERT INTO t1 (object_id, file_name, md5, size, valid)
SELECT
'invalid_' || i,
'invalid_' || i || '.jpg',
md5(random()::text),
(random() * 100000)::bigint,
0
FROM generate_series(1, 100) i;
-- 插入 t2 数据(100条特定版本数据)
INSERT INTO t2 (version_id, valid, asset_version_format_id, object_id)
SELECT
48725,
1,
0,
('file_' || (i % 1000 + 1))
FROM generate_series(1, 100) i;
-- 插入其他版本数据(10万条)
INSERT INTO t2 (version_id, valid, asset_version_format_id, object_id)
SELECT
(random() * 100000)::bigint,
1,
floor(random() * 5)::integer,
('file_' || (random() * 1000)::integer)::bytea
FROM generate_series(1, 100000) i;
三、性能问题初现
3.1 原始查询语句
SELECT t1.object_id, t2.object_id, t1.size, t1.md5, t1.file_name
FROM t1 t1
JOIN t2 t2 ON t2.object_id = t1.object_id
WHERE t2.version_id = 48725
AND t2.valid = 1
AND t2.asset_version_format_id = 0
AND t1.valid = 1;
执行结果:耗时约570秒
3.2 数据分布分析
-- 查看t2表符合条件的数据量
select count(*) from t2
WHERE t2.version_id = 48725
AND t2.valid = 1
AND t2.asset_version_format_id = 0;
-- 结果:100行(仅占总数据量的0.1%)
-- t2表总数据量
select count(*) from t2;
-- 结果:100100行
四、执行计划深度分析
4.1 初始执行计划
EXPLAIN
SELECT t1.object_id, t2.object_id, t1.size, t1.md5, t1.file_name
FROM t1 t1
JOIN t2 t2 ON t2.object_id = t1.object_id
WHERE t2.version_id = 48725
AND t2.valid = 1
AND t2.asset_version_format_id = 0
AND t1.valid = 1;
执行计划输出:
QUERY PLAN
------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..2393.45 rows=1 width=1588)
Join Filter: (t2.object_id = (t1.object_id)::text)
-> Index Scan using t1_idx_valid on t1 (cost=0.00..144.84 rows=34 width=1556)
Index Cond: (valid = 1)
-> Materialize (cost=0.00..2239.68 rows=1 width=32)
-> Seq Scan on t2 (cost=0.00..2239.68 rows=1 width=32)
Filter: ((version_id = 48725) AND (valid = 1) AND (asset_version_format_id = 0))
(7 rows)
4.2 问题诊断
- 缺失关键索引:t2表缺少针对
version_id的过滤索引 - 连接顺序不当:执行计划选择t1作为驱动表,但实际t2过滤后数据量更小
五、优化步骤实施
5.1 第一步:创建缺失索引
-- 为t1表的连接字段创建索引
create index idx_t1_objectid on t1 (object_id);
-- 为t2表创建复合索引(覆盖version_id和object_id)
create index idx_t2_version_objectid on t2 (version_id, object_id);
5.2 第二步:更新统计信息
-- 收集最新的统计信息
analyze table t1;
analyze table t2;
5.3 第三步:分析优化后执行计划
添加索引后的执行计划:
QUERY PLAN
----------------------------------------------------------------------------------------------------
Nested Loop (cost=5.18..659088.83 rows=40 width=73)
Join Filter: (t2.object_id = (t1.object_id)::text)
-> Seq Scan on t1 (cost=0.00..2585.25 rows=99987 width=65)
Filter: (valid = 1)
-> Materialize (cost=5.18..338.96 rows=25 width=8)
-> Bitmap Heap Scan on t2 (cost=5.18..338.83 rows=25 width=8)
Recheck Cond: (version_id = 48725)
Filter: ((valid = 1) AND (asset_version_format_id = 0))
-> Bitmap Index Scan on idx_t2_version_objectid (cost=0.00..5.17 rows=123 width=0)
Index Cond: (version_id = 48725)
t2过滤后数据较少,由它当做驱动表才是正确的。
使用hint提示优化器调整连接顺序:
EXPLAIN
SELECT /*+ leading((t2 t1)) */ t1.object_id, t2.object_id, t1.size, t1.md5, t1.file_name
FROM t1
JOIN t2 ON t2.object_id = t1.object_id
WHERE t2.version_id = 48725
AND t2.valid = 1
AND t2.asset_version_format_id = 0
AND t1.valid = 1;
调整后执行计划如下:
QUERY PLAN
-------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..710741.89 rows=40 width=73)
Join Filter: (t2.object_id = (t1.object_id)::text)
-> Index Scan using idx_t2_version_objectid on t2 (cost=0.00..466.99 rows=25 width=8)
Index Cond: (version_id = 48725)
Filter: ((valid = 1) AND (asset_version_format_id = 0))
-> Materialize (cost=0.00..5136.18 rows=99987 width=65)
-> Seq Scan on t1 (cost=0.00..2585.25 rows=99987 width=65)
Filter: (valid = 1)
执行计划有所改善,但仍不是最优。
问题依旧存在:虽然t2使用了索引,但连接条件中的隐式类型转换问题,t2.object_id为varbinary类型,t1.object_id为varchar类型,连接时存在隐式类型转换,导致t1还是使用全表扫描。
六、终极优化方案
6.1 根本问题解决:消除类型转换
消除类型转换有多种方式,可以将数据类型都调整成一样(t2.object_id可以由varbinary改成varchar类型),也可以改写sql,显示处理类型转换:
-- 改写SQL,显式处理类型转换
EXPLAIN
SELECT t1.object_id, t2.object_id, t1.size, t1.md5, t1.file_name
FROM t2
JOIN t1 ON convert_from(t2.object_id, 'UTF8') = t1.object_id
WHERE t2.version_id = 48725
AND t2.valid = 1
AND t2.asset_version_format_id = 0
AND t1.valid = 1;
优化后的执行计划:
QUERY PLAN
----------------------------------------------------------------------------------------------
Nested Loop (cost=5.18..546.02 rows=40 width=73)
-> Bitmap Heap Scan on t2 (cost=5.18..338.83 rows=25 width=8)
Recheck Cond: (version_id = 48725)
Filter: ((valid = 1) AND (asset_version_format_id = 0))
-> Bitmap Index Scan on idx_t2_version_objectid (cost=0.00..5.17 rows=123 width=0)
Index Cond: (version_id = 48725)
-> Index Scan using idx_t1_objectid on t1 (cost=0.00..8.28 rows=1 width=65)
Index Cond: ((object_id)::text = convert_from((t2.object_id)::bytea, 'UTF8'::name))
Filter: (valid = 1)
6.2 最终执行效果
mydb[15400]=#explain analyze SELECT t1.object_id, t2.object_id, t1.size, t1.md5, t1.file_name
mydb-# FROM t2
mydb-# JOIN t1 ON convert_from(t2.object_id, 'UTF8') = t1.object_id
mydb-# WHERE t2.version_id = 48725
mydb-# AND t2.valid = 1
mydb-# AND t2.asset_version_format_id = 0
mydb-# AND t1.valid = 1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=5.18..546.02 rows=40 width=73) (actual time=0.350..4.377 rows=100 loops=1)
-> Bitmap Heap Scan on t2 (cost=5.18..338.83 rows=25 width=8) (actual time=0.183..0.367 rows=100 loops=1)
Recheck Cond: (version_id = 48725)
Filter: ((valid = 1) AND (asset_version_format_id = 0))
Rows Removed by Filter: 2
Heap Blocks: exact=3
-> Bitmap Index Scan on idx_t2_version_objectid (cost=0.00..5.17 rows=123 width=0) (actual time=0.097..0.097 rows=102 loops=1)
Index Cond: (version_id = 48725)
-> Index Scan using idx_t1_objectid on t1 (cost=0.00..8.28 rows=1 width=65) (actual time=3.175..3.326 rows=100 loops=100)
Index Cond: ((object_id)::text = convert_from((t2.object_id)::bytea, 'UTF8'::name))
Filter: (valid = 1)
Total runtime: 4.823 ms
(12 rows)
最终执行时间:5 ms左右
七、总结
本次优化案例展示了数据库迁移过程中可能遇到的性能陷阱,通过深入分析执行计划,识别出隐式类型转换这一根本问题,并采用针对性的优化策略,最终实现了从570秒到5毫秒的性能飞跃。
| 优化阶段 | 执行时间 | 性能提升 | 关键改进点 |
|---|---|---|---|
| 原始SQL | 570,000 ms | 基准 | - |
| 仅添加索引 | 未显著改善 | 不明显 | 添加缺失索引 |
| 调整连接顺序 | 有所改善 | 一般 | 使用hint调整连接顺序 |
| 最终优化SQL | 5 ms | 114000倍 | 消除隐式类型转换 |
核心优化要点:
- 数据类型一致性:确保连接条件两边的数据类型完全一致
- 显式类型转换:使用
convert_from()等函数显式处理数据 - 复合索引设计:针对高频查询条件设计覆盖索引
- 统计信息更新:数据变化后及时更新统计信息
- 执行计划分析:定期分析关键SQL的执行计划
最后修改时间:2025-12-30 10:00:51
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




