
[Kingbase Database Product Experience Officer] It's the same database, so why are someone else's queries 17x faster?

Original article by shunwah, 2025-12-16

KingbaseES performance tuning in practice, from Docker deployment to curing slow SQL: five field-tested techniques covering indexes, partitioned tables, and SQL rewriting

Author: ShunWah (顺华)
WeChat public account: creator of "顺华星辰运维栈".

Certifications: OceanBase OBCA/OBCP, MySQL OCP, openGauss, YashanDB DBCA, Kingbase KCA/KCP, KaiwuDB KWCA/KWCP, AsiaInfo AntDBCA, Highgo HDCA, GBase 8a/8c/8s, Galaxybase GBCA, Neo4j Science Certification, NebulaGraph NGCI/NGCP, TongTech TCPE, and other credentials.

Awards: YashanDB YVP, KaiwuDB (Inspur) MVP, Modb (墨天轮) MVP, and Kingbase KVA; serves as an OceanBase community moderator and evangelist. Has won first, second, and third prizes in writing contests held by OceanBase & Modb, openGauss, TiDB, YashanDB, Kingbase, KWDB, and Navicat Premium × Kingbase; original technical articles are regularly featured on the front pages of Modb, CSDN, and ITPUB.

  • CSDN ID: shunwahma
  • Modb (墨天轮) ID: shunwah
  • ITPUB ID: shunwah
  • IFClub ID: shunwah

Preface

With domestic Chinese databases being deployed at scale, performance tuning has become a core skill for keeping business systems stable. On a project running Kingbase KingbaseES V9R2C13 (Oracle-compatible edition), I hit a complex query that took 5.2 seconds to execute. Through systematic optimization (tool-assisted diagnosis + index design + SQL rewriting + table partitioning), execution time was cut to 0.3 seconds, a 17x improvement.

This article replays that effort end to end, from Docker deployment to the individual tuning techniques. Every step comes with copy-paste-ready commands and real execution screenshots, balancing theory with hands-on practice; it should be useful to both DBAs and development engineers.

1. Environment Preparation and Database Deployment

1.1 Environment overview

  • Operating system: CentOS 7.9
  • Container engine: Docker 20.10.12
  • Database version: KingbaseES V9R2C13 (Oracle-compatible edition)
  • Deployment: Docker container with persistent data storage

We use CentOS 7.9 as the base system; make sure the required dependency packages are installed.

1.2 Downloading the KingbaseES Package

The KingbaseES V9R2C13 (Oracle-compatible) installation package must be obtained from the official Kingbase site:

  1. Visit the Kingbase download page: https://www.kingbase.com.cn/download.html
  2. Select the matching edition (KingbaseES V9R2C13, Oracle-compatible), fill in the download form, and obtain the Docker image package.

(Screenshot: filling in the download form)
(Screenshot: downloading the image package)

This article uses the Docker edition: the KingbaseES Docker image (Oracle-compatible) V9R2C13.

This release strengthens Oracle compatibility: it adds pipelined table functions and partition-table row movement and optimizes subprogram calls, noticeably improving large-data query efficiency and development flexibility. On the security side it hardens user-group access control, MAC auditing, and transparent encryption. Stability is improved through HA-cluster health checks, automated failure analysis, and enhanced backup and restore, alongside broader query-performance improvements and better KWR reports.
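To make the "pipelined table functions" mentioned above concrete, here is a minimal Oracle-style sketch. It assumes KingbaseES accepts standard Oracle PL/SQL pipelined-function syntax in Oracle-compatible mode; `num_tab` and `gen_nums` are hypothetical names, not from this article:

```sql
-- Hypothetical sketch of a pipelined table function (Oracle-style syntax).
CREATE TYPE num_tab AS TABLE OF NUMBER;
/
CREATE OR REPLACE FUNCTION gen_nums(n IN NUMBER)
RETURN num_tab PIPELINED IS
BEGIN
  FOR i IN 1 .. n LOOP
    PIPE ROW(i);   -- each row streams to the caller as soon as it is produced
  END LOOP;
  RETURN;
END;
/
-- Consume the function as if it were a table:
SELECT * FROM TABLE(gen_nums(3));
```

Pipelining lets large generated result sets start flowing to the client immediately instead of being fully materialized first, which is the "large-data query efficiency" gain the release notes refer to.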

1.3 Deploying Kingbase with Docker

1.3.1 Environment preparation (on the host)

A Docker quick-start: this section installs KingbaseES on Linux using the docker command line.

[root@instance2 data]# docker version
Client:
 Version:           20.10.12

(Screenshot: verifying the Docker version)
Hardware requirements: the KingbaseES Docker package supports generic x86_64 as well as domestic CPU architectures such as Loongson, Phytium, and Kunpeng.

1.3.2 Creating the install path and granting permissions

As root on the host, create the persistent storage path for the database:

[root@instance2 data]# mkdir -p /data/kingbase
[root@instance2 data]# mkdir -p /data/kingbase/data
[root@instance2 data]# chmod -R 755 /data/kingbase/data
[root@instance2 data]#

(Screenshot: creating the data directory)
(Screenshot: setting directory permissions)

1.3.3 Importing the image and starting the container

Importing the image: you can download the platform-specific package from the Kingbase official site, or obtain it from Kingbase sales, support staff, or resellers.
Get the Docker image for your platform, e.g. kingbase.tar, and load the tarball into the local Docker image store. Upload the downloaded image package to /data/kingbase on the host, then run:

[root@instance2 data]# cd kingbase/
[root@instance2 kingbase]# ls
data  KingbaseES_V009R002C013B0005_x86_64_Docker.tar
[root@instance2 kingbase]# docker load -i KingbaseES_V009R002C013B0005_x86_64_Docker.tar
96723fcc60ed: Loading layer  309.9MB/309.9MB
79530bed5100: Loading layer  22.53kB/22.53kB
c9dc75266939: Loading layer  7.168kB/7.168kB
64f56a40df2b: Loading layer  8.192kB/8.192kB
c21792357b2a: Loading layer  9.728kB/9.728kB
6b4f1c43e334: Loading layer  443.4MB/443.4MB
13d998c71fbc: Loading layer  5.632kB/5.632kB
e179ba82f04a: Loading layer  3.072kB/3.072kB
2f0da1584cda: Loading layer  5.12kB/5.12kB
Loaded image: kingbase_v009r002c013b0005_single_x86:v1
[root@instance2 kingbase]# docker tag kingbase_v009r002c013b0005_single_x86:v1 kingbase:v1
[root@instance2 kingbase]# docker images kingbase
REPOSITORY   TAG   IMAGE ID       CREATED        SIZE
kingbase     v1    bb73197cf8e1   3 months ago   742MB
[root@instance2 kingbase]#

(Screenshot: inspecting the image package)
(Screenshot: importing the image)

1.3.4 Rename the image: use docker tag to retag the original image as kingbase:v1

[root@instance2 kingbase]# docker tag kingbase_v009r002c013b0005_single_x86:v1 kingbase:v1
[root@instance2 kingbase]#

(Screenshot: renaming the image)

1.3.5 Verify the rename: run docker images to list the images

[root@instance2 kingbase]# docker images kingbase
REPOSITORY   TAG   IMAGE ID       CREATED        SIZE
kingbase     v1    bb73197cf8e1   3 months ago   742MB
[root@instance2 kingbase]#

(Screenshot: verifying the image)

1.3.6 Start the container (persistent data + Oracle-compatible mode):

Starting the instance: the Kingbase server listens on TCP port 54321. Use -p to publish the container's port 54321 on the host (the command below maps it to host port 4321).

[root@instance2 kingbase]# docker run -tid --privileged \
>   -p 4321:54321 \
>   -v /data/kingbase/data:/home/kingbase/userdata/ \
>   --restart=always \
>   -e NEED_START=yes \
>   -e DB_USER=kingbase \
>   -e DB_PASSWORD=Kingbase#123 \
>   -e DB_MODE=oracle \
>   --name kingbase \
>   kingbase:v1 /usr/sbin/init
8791b5a1ee549ddff272cca5ba86b11a8ae984b4c4d26b13dcb429addf583fc2
[root@instance2 kingbase]#

(Screenshot: starting the container)
Directory layout:

  • /home/kingbase/userdata: default data-file directory of the Kingbase database inside the container
  • /data/kingbase/data: persistent storage directory on the host; change it to match your own storage layout

1.3.7 Verify the database service status:

[root@instance2 kingbase]# docker ps
CONTAINER ID   IMAGE                 COMMAND                  CREATED              STATUS              PORTS                                                  NAMES
8791b5a1ee54   kingbase:v1           "/bin/bash /home/kin…"   About a minute ago   Up About a minute   0.0.0.0:4321->54321/tcp, :::4321->54321/tcp            kingbase
960a0ba8dcb4   mysql:8.0.42-debian   "docker-entrypoint.s…"   5 months ago         Up 3 days           33060/tcp, 0.0.0.0:3325->3306/tcp, :::3325->3306/tcp   mysql-8

(Screenshot: checking the container status)

1.4 Connecting to and Verifying the Database

1.4.1 Enter the container and start a bash shell:

[root@instance2 kingbase]# docker exec -it kingbase /bin/bash
Welcome to 3.10.0-693.el7.x86_64

System information as of time:  Sun 14 Dec 2025 08:42:03 AM UTC

System load:    0.24
Processes:      19
Memory used:    34.2%
Swap used:      0.0%
Usage On:       98%
Users online:   0
[kingbase@8791b5a1ee54 ~]$

(Screenshot: entering the container)

1.4.2 Confirm the current server status:

[kingbase@8791b5a1ee54 ~]$ sys_ctl -D /home/kingbase/userdata/data/ status
sys_ctl: server is running (PID: 89)
/home/kingbase/install/kingbase/bin/kingbase "-D" "/home/kingbase/userdata/data"

(Screenshot: service status)

1.4.3 Connect to the Kingbase database (verify accessibility):

[kingbase@8791b5a1ee54 ~]$ ksql -U kingbase -d test
License Type: oracle兼容版.
Type "help" for help.

test=#

(Screenshot: local connection)

1.4.4 Or specify host and port explicitly (equivalent form):

[kingbase@8791b5a1ee54 ~]$ ksql -h 127.0.0.1 -p 54321 -U kingbase -d test
Password for user kingbase:
License Type: oracle兼容版.
Type "help" for help.

test=#

(Screenshot: connecting with an explicit port)

1.4.5 On success you enter the ksql interactive terminal (prompt test=#); run \l to list the databases and verify:

test=# \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
 kingbase  | kingbase | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 security  | kingbase | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0 | kingbase | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/kingbase          +
           |          |          |             |             | kingbase=CTc/kingbase
 template1 | kingbase | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/kingbase          +
           |          |          |             |             | kingbase=CTc/kingbase
 test      | kingbase | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
(5 rows)

test=#

(Screenshot: database list)

1.4.6 Verify data-volume persistence (on the host):

Verifying persistence is key: on the host, check that the mounted directory contains the database's data files, confirming the data will survive container removal:

[root@instance2 kingbase]# ls -l /data/kingbase/data/
total 4
drwx------ 23 admin admin 4096 Dec 14 17:18 data
drwxrwxr-x  2 admin admin   25 Dec 14 16:38 etc
[root@instance2 kingbase]# cd data/
[root@instance2 data]# ls
data  etc
[root@instance2 data]# pwd
/data/kingbase/data
[root@instance2 data]# cd data/
[root@instance2 data]# ls
base                current_logfiles  kingbase.auto.conf  kingbase.conf   kingbase.opts  kingbase.pid
initdb.conf         logfile           sys_aud             sys_bulkload    sys_commit_ts  sys_csnlog
sys_dynshmem        sys_hba.conf      sys_ident.conf      sys_log         sys_logical    sys_multixact
sys_notify          sys_replslot      sys_serial          sys_snapshots   sys_stat       sys_stat_tmp
sys_tblspc          sys_twophase      SYS_VERSION         sys_wal         sys_xact
[root@instance2 data]#

(Screenshot: verifying data persistence)
Seeing files and directories such as kingbase.conf (configuration), base (data directory), and sys_log (log directory) confirms the volume mount took effect.

1.5 Viewing Database Logs

1.5.1 List the log directory inside the container:

[kingbase@8791b5a1ee54 ~]$ ls -l /home/kingbase/userdata/data/sys_log
total 16
-rwx------ 1 kingbase kingbase  271 Dec 14 08:38 kingbase-2025-12-14_083816.log
-rw------- 1 kingbase kingbase 2914 Dec 14 09:13 kingbase-2025-12-14_083954.log
-rw------- 1 kingbase kingbase 2648 Dec 14 09:18 kingbase-2025-12-14_091346.log
-rw------- 1 kingbase kingbase  271 Dec 14 09:18 kingbase-2025-12-14_091855.log

(Screenshot: log directory)

1.5.2 Tail the latest log in real time:

[kingbase@8791b5a1ee54 ~]$ tail -f /home/kingbase/userdata/data/sys_log/kingbase-$(date +%Y-%m-%d)*.log
==> /home/kingbase/userdata/data/sys_log/kingbase-2025-12-14_083816.log <==
2025-12-14 08:38:16.400 UTC [111] LOG:  received SIGHUP, reloading configuration files
2025-12-14 08:38:16.402 UTC [113] LOG:  database system was shut down at 2025-12-14 08:38:16 UTC
2025-12-14 08:38:16.409 UTC [111] LOG:  database system is ready to accept connections

(Screenshot: real-time log)

1.5.3 Check the log-directory setting:

test=# SHOW log_directory;
 log_directory
---------------
 sys_log
(1 row)

(Screenshot: log-directory setting)

1.5.4 Check the log-filename format:

test=# SHOW log_filename;
         log_filename
------------------------------
 kingbase-%Y-%m-%d_%H%M%S.log
(1 row)

(Screenshot: log-filename format)

2. A Deep Dive into Performance Management

Kingbase's performance-management features are rich and practical. The three most important are per-statement parameter-value statistics, the database time-model dynamic performance views, and the SQL tuning advisor.

2.1 Preparing the diagnostic tools

Kingbase provides two core tools, sys_stat_statements (SQL execution statistics) and sys_sqltune (tuning advisor); both need to be configured and enabled in advance:

2.1.1 Configuring sys_stat_statements (inside the container)

[kingbase@8791b5a1ee54 ~]$ vi /home/kingbase/userdata/data/kingbase.conf
# change the following settings
shared_preload_libraries = 'sys_stat_statements,sys_sqltune'
sys_stat_statements.max = 10000
sys_stat_statements.track = all
sys_stat_statements.save = on
[kingbase@8791b5a1ee54 ~]$ sys_ctl -D /home/kingbase/userdata/data restart
waiting for server to shut down.... done
server stopped
waiting for server to start....2025-12-14 09:49:35.376 UTC [478] LOG:  load license from local file: /home/kingbase/install/kingbase/bin/license.dat.
2025-12-14 09:49:35.376 UTC [478] LOG:  config the real archive_command string as soon as possible to archive WAL files
2025-12-14 09:49:35.382 UTC [478] LOG:  sepapower extension initialized
2025-12-14 09:49:35.391 UTC [478] LOG:  starting KingbaseES V009R002C013
2025-12-14 09:49:35.391 UTC [478] LOG:  listening on IPv4 address "0.0.0.0", port 54321
2025-12-14 09:49:35.391 UTC [478] LOG:  listening on IPv6 address "::", port 54321
2025-12-14 09:49:35.393 UTC [478] LOG:  listening on Unix socket "/tmp/.s.KINGBASE.54321"
2025-12-14 09:49:35.522 UTC [478] LOG:  redirecting log output to logging collector process
2025-12-14 09:49:35.522 UTC [478] HINT:  Future log output will appear in directory "sys_log".
 done
server started
[kingbase@8791b5a1ee54 ~]$

(Screenshot: editing the config file)
(Screenshot: restarting the database)

2.1.2 Creating the tuning extension (inside the database)

Connect to the Kingbase database:

[kingbase@8791b5a1ee54 ~]$ ksql -h 127.0.0.1 -p 54321 -U kingbase -d test
Password for user kingbase:
License Type: oracle兼容版.
Type "help" for help.

test=#

(Screenshot: connecting with an explicit port)

2.2 Creating a Test Database and Table

Before starting, create a test database and table. The \c command switches the current database connection.

2.2.1 Create a dedicated test database, using UTF8 encoding to avoid garbled Chinese text:

test=# CREATE DATABASE perf_test WITH ENCODING 'UTF8';
CREATE DATABASE
test=#

(Screenshot: creating the test database)

2.2.2 In ksql, switch to the newly created database with \c:

test=# \c perf_test
You are now connected to database "perf_test" as userName "kingbase".
perf_test=#

(Screenshot: switching databases)

2.2.3 Create the test table:

perf_test=# CREATE TABLE test_table (
    id INTEGER PRIMARY KEY,
    name VARCHAR(100),
    create_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE
perf_test=#

(Screenshot: creating the test table)

2.2.4 Insert 100,000 rows of test data:

perf_test=# INSERT INTO test_table(id, name)
SELECT generate_series(1,100000), '用户_' || generate_series(1,100000);
INSERT 0 100000
perf_test=#

(Screenshot: inserting test data)

2.2.5 Create the extension:

perf_test=# CREATE EXTENSION sys_sqltune;
CREATE EXTENSION
perf_test=#

(Screenshot: creating the extension)

2.3 Locating and Analyzing Slow SQL

2.3.1 Run the target slow SQL and analyze it

Manual tuning of the target SQL: even when the tool offers no suggestion, the combination name LIKE '用户_1%' + ORDER BY create_time DESC can be served by a composite index, avoiding a full-table scan plus a file sort:

perf_test=# SELECT * FROM test_table WHERE name LIKE '用户_1%' ORDER BY create_time DESC;
   id   |    name     |        create_time
--------+-------------+----------------------------
      1 | 用户_1      | 2025-12-14 09:44:45.948861
     10 | 用户_10     | 2025-12-14 09:44:45.948861
     11 | 用户_11     | 2025-12-14 09:44:45.948861
     12 | 用户_12     | 2025-12-14 09:44:45.948861
--More--
perf_test=# EXPLAIN ANALYZE SELECT * FROM test_table WHERE name LIKE '用户_1%' ORDER BY create_time DESC;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=2727.64..2755.42 rows=11111 width=24) (actual time=21.740..22.350 rows=11112 loops=1)
   Sort Key: create_time DESC
   Sort Method: quicksort  Memory: 1253kB
   ->  Seq Scan on test_table  (cost=0.00..1981.00 rows=11111 width=24) (actual time=0.025..19.656 rows=11112 loops=1)
         Filter: ((name)::text ~~ '用户_1%'::text)
         Rows Removed by Filter: 88888
 Planning Time: 0.781 ms
 Execution Time: 22.926 ms
(8 rows)

perf_test=#

(Screenshot: running the SQL)
(Screenshot: analyzing the execution plan)

2.4 Configuring and Using sys_stat_statements

sys_stat_statements must be preloaded via shared_preload_libraries; otherwise SQL execution statistics cannot be tracked:

Edit kingbase.conf (inside the container):

# enter the container
docker exec -it kingbase /bin/bash
# edit the config file (the data directory is mounted from the host)
vi /home/kingbase/userdata/data/kingbase.conf

Set or confirm the following:

# 1. Preload the extension (make sure sys_stat_statements is included)
shared_preload_libraries = 'sys_stat_statements'   # separate multiple extensions with commas
# 2. Enable SQL tracking (optionally enlarge the tracking capacity)
sys_stat_statements.max = 10000    # track up to 10000 statements
sys_stat_statements.track = all    # track SQL from all users
sys_stat_statements.save = on      # keep statistics across restarts

(Screenshot: the configuration file)

Restart Kingbase to apply the configuration:

[kingbase@8791b5a1ee54 ~]$ sys_ctl -D /home/kingbase/userdata/data restart
waiting for server to shut down.... done
server stopped

(Screenshot: restarting to apply the change)

2.4.1 Profiling SQL performance with sys_stat_statements

perf_test=# SELECT sys_stat_statements_reset();
 sys_stat_statements_reset
---------------------------

(1 row)

perf_test=# SELECT * FROM test_table WHERE name LIKE '用户_1%' ORDER BY create_time DESC LIMIT 100;
   id   |    name     |        create_time
--------+-------------+----------------------------
      1 | 用户_1      | 2025-12-14 09:44:45.948861
     10 | 用户_10     | 2025-12-14 09:44:45.948861
--More--
perf_test=# SELECT
perf_test-#     query AS "SQL语句",
perf_test-#     calls AS "执行次数",
perf_test-#     total_exec_time AS "总耗时(秒)",
perf_test-#     mean_exec_time AS "单次平均耗时(秒)",
perf_test-#     rows AS "返回行数"
perf_test-# FROM sys_stat_statements
perf_test-# WHERE query LIKE '%test_table%用户_1%'
perf_test-# ORDER BY total_exec_time DESC;
       queryid        |                                     SQL语句                                     | 执行次数 | 总耗时(秒) | 单次平均耗时(秒) | 返回行数
----------------------+---------------------------------------------------------------------------------+----------+------------+------------------+----------
  7286319268194985248 | SELECT * FROM test_table WHERE name LIKE $1 ORDER BY create_time DESC           |        1 |  31.322722 |        31.322722 |    11112
  8722522005108429308 | SELECT * FROM test_table WHERE name LIKE $1 ORDER BY create_time DESC LIMIT $2 |        1 |   19.64413 |         19.64413 |      100
(2 rows)

perf_test=#

Note: total_exec_time and mean_exec_time are reported in milliseconds (the values here match the 20-30 ms timings measured earlier), even though the column aliases above say seconds.

(Screenshot: resetting the statistics)
(Screenshot: running the SQL with LIMIT)
(Screenshot: SQL performance statistics)

3. Core Optimization Techniques: from SQL Rewriting to Low-Level Tuning

3.1 Composite-index optimization in practice

Manual tuning of the target SQL: even when the tool offers no suggestion, name LIKE '用户_1%' combined with ORDER BY create_time DESC can be optimized with a composite index, avoiding a full-table scan plus a file sort:

3.1.1 Analyze the execution plan (confirm the current bottleneck first):

-- inspect the SQL execution plan
perf_test=# EXPLAIN (ANALYZE) SELECT * FROM test_table WHERE name LIKE '用户_1%' ORDER BY create_time DESC;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=2727.64..2755.42 rows=11111 width=24) (actual time=21.740..22.350 rows=11112 loops=1)
   Sort Key: create_time DESC
   Sort Method: quicksort  Memory: 1253kB
   ->  Seq Scan on test_table  (cost=0.00..1981.00 rows=11111 width=24) (actual time=0.025..19.656 rows=11112 loops=1)
         Filter: ((name)::text ~~ '用户_1%'::text)
         Rows Removed by Filter: 88888
 Planning Time: 0.781 ms
 Execution Time: 22.926 ms
(8 rows)

(Screenshot: analyzing the execution plan)

After switching to the perf_test database the plan is visible, and the core bottleneck is clear:

  • Seq Scan on test_table: a full-table scan (11112 + 88888 = 100,000 rows touched); its cost grows linearly with data volume.
  • Sort Method: quicksort Memory: 1253kB: an in-memory sort. Harmless at this size, but once the data exceeds the memory threshold it degrades to an on-disk sort and the cost explodes.
  • Execution Time: 22.926 ms: looks modest now, but the full-scan-plus-sort pattern multiplies in cost as the row count doubles (200k, 1M rows).

3.1.2 Create the composite index (the best fix): for WHERE name LIKE 'prefix%' plus ORDER BY create_time DESC, create a composite index on (name, create_time DESC):

-- create the index (the prefix match on name can use it; create_time DESC avoids the sort)
perf_test=# CREATE INDEX idx_test_table_name_createtime ON test_table (name, create_time DESC);
CREATE INDEX
perf_test=#

(Screenshot: creating the composite index)

✅ Once the index takes effect, the plan should change to:
Index Scan using idx_test_table_name_createtime on test_table (an index scan), with no Sort step (the index already stores create_time in descending order).

3.1.3 Collect table statistics (supporting step): if the data volume changes a lot, stale statistics can lead the optimizer to a bad plan; collect them manually:

-- collect statistics for test_table
perf_test=# ANALYZE test_table;
ANALYZE
-- collect statistics for the whole database (optional)
perf_test=# ANALYZE;
ANALYZE
perf_test=#

(Screenshot: collecting statistics)

3.2 sys_stat_statements in Detail

With the configuration in effect, use the following to analyze slow SQL from real workloads:

3.2.1 Confirm the extension is loaded:

perf_test=# SELECT * FROM pg_extension WHERE extname = 'sys_stat_statements';
  oid  |       extname       | extowner | extnamespace | extrelocatable | extversion | extconfig | extcondition
-------+---------------------+----------+--------------+----------------+------------+-----------+--------------
 13632 | sys_stat_statements |       10 |         2200 | t              | 1.11       |           |
(1 row)

(Screenshot: confirming the extension)

3.2.2 Reset the statistics (clear history to focus on the current workload):

perf_test=# SELECT sys_stat_statements_reset();
 sys_stat_statements_reset
---------------------------

(1 row)

(Screenshot: resetting the statistics)

3.2.3 Run your workload (or wait a while so slow SQL gets captured):

perf_test=# SELECT * FROM test_table WHERE name LIKE '用户_1%' ORDER BY create_time DESC;

(Screenshot: running the query)

3.2.4 Execute the target SQL several times (to accumulate statistics):

perf_test=# SELECT * FROM test_table WHERE name LIKE '用户_1%' ORDER BY create_time DESC LIMIT 100;

(Screenshot: running the LIMIT query)

3.2.5 List the five most expensive statements (call count, average time, rows returned; as noted above, the *_exec_time columns are in milliseconds despite the "(秒)" aliases):

perf_test=# SELECT
    queryid,
    query AS "SQL语句",
    calls AS "执行次数",
    total_exec_time AS "总执行耗时(秒)",
    mean_exec_time AS "单次平均耗时(秒)",
    max_exec_time AS "单次最大耗时(秒)",
    rows AS "返回行数"
FROM sys_stat_statements
WHERE query NOT LIKE '%sys_stat_statements%'      -- exclude the statistics queries themselves
  AND query NOT LIKE '%\d sys_stat_statements%'   -- exclude meta-command queries
ORDER BY total_exec_time DESC
LIMIT 5;

(Screenshot: slow-SQL analysis)

3.3 NOT IN Subquery Optimization (a Patented Feature)

NOT IN subquery optimization is one of Kingbase's highlights; the company has even patented it. It can substantially speed up SQL containing NOT IN and correlated subqueries.

Traditional NOT IN subqueries perform poorly at scale, particularly when the subquery is correlated. Kingbase's optimizer recognizes the pattern and rewrites it into a more efficient anti-semi-join.

The feature needs no manual switch; it is built into the KingbaseES optimizer. We can observe the effect with a comparison.

3.3.1 Create two related test tables:

perf_test=# CREATE TABLE orders (
    order_id INT PRIMARY KEY,
    customer_id INT,
    amount DECIMAL(10,2)
);
CREATE TABLE
perf_test=# CREATE TABLE blacklist (
    customer_id INT PRIMARY KEY,
    reason TEXT
);
CREATE TABLE
perf_test=#

(Screenshot: creating the related tables)

3.3.2 Insert simulated data:

perf_test=# INSERT INTO orders
SELECT generate_series(1,100000), (random()*10000)::int, (random()*1000);
INSERT 0 100000
perf_test=# INSERT INTO blacklist
SELECT generate_series(1, 1000), '测试原因';
INSERT 0 1000
perf_test=#

(Screenshot: inserting test data)

3.3.3 Run the traditional NOT IN query and inspect the plan with EXPLAIN:

perf_test=# EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders
WHERE customer_id NOT IN (SELECT customer_id FROM blacklist);
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Hash Anti LSNA Join  (cost=38.58..2730.06 rows=87041 width=14) (actual time=0.436..36.452 rows=89999 loops=1)
   Hash Cond: (orders.customer_id = blacklist.customer_id)
   Buffers: shared hit=551
   ->  Seq Scan on orders  (cost=0.00..1544.00 rows=100000 width=14) (actual time=0.011..10.779 rows=100000 loops=1)
         Buffers: shared hit=544
   ->  Hash  (cost=22.70..22.70 rows=1270 width=4) (actual time=0.411..0.412 rows=1000 loops=1)
         Buckets: 2048  Batches: 1  Memory Usage: 52kB
         Buffers: shared hit=7
         ->  Seq Scan on blacklist  (cost=0.00..22.70 rows=1270 width=4) (actual time=0.013..0.208 rows=1000 loops=1)
               Buffers: shared hit=7
 Planning Time: 0.291 ms
 Execution Time: 41.912 ms
(12 rows)

(Screenshot: the NOT IN plan)

Reading the plan: if the output contains a node such as Hash Anti Join or Merge Anti Join (here the variant Hash Anti LSNA Join), the NOT IN has been successfully promoted to the more efficient anti join.

3.4 OR-Condition Optimization: Rewriting as UNION ALL

OR conditions in a WHERE clause often hurt query performance, especially when the branches reference different columns. Kingbase provides the ability to convert OR conditions into UNION ALL.

When a WHERE clause contains several OR branches over different columns, you can also rewrite by hand to exploit indexes.

3.4.1 Create a test table with an index on column c1:

perf_test=# CREATE TABLE t1 (
    c1 INT,
    c2 INT,
    c3 VARCHAR(50)
);
CREATE TABLE
perf_test=# INSERT INTO t1
SELECT (random()*100)::int, (random()*100)::int, 'data'
FROM generate_series(1,50000);
INSERT 0 50000
perf_test=# CREATE INDEX idx_t1_c1 ON t1(c1);
CREATE INDEX
perf_test=#

(Screenshot: creating the test table)

3.4.2 Before-and-after comparison

perf_test=# EXPLAIN (ANALYZE)
perf_test-# SELECT * FROM t1 WHERE c1 = 1 OR c2 = 5;
                                                QUERY PLAN
--------------------------------------------------------------------------------------------------------
 Seq Scan on t1  (cost=0.00..1022.00 rows=1010 width=13) (actual time=0.013..12.164 rows=1023 loops=1)
   Filter: ((c1 = 1) OR (c2 = 5))
   Rows Removed by Filter: 48977
 Planning Time: 0.150 ms
 Execution Time: 12.287 ms
(5 rows)

perf_test=# EXPLAIN (ANALYZE)
perf_test-# SELECT * FROM t1 WHERE c1 = 1
perf_test-# UNION ALL
perf_test-# SELECT * FROM t1 WHERE c2 = 5 AND c1 <> 1;
                                                           QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Append  (cost=11.89..1335.51 rows=1010 width=13) (actual time=0.104..13.033 rows=1023 loops=1)
   ->  Bitmap Heap Scan on t1  (cost=11.89..298.36 rows=465 width=13) (actual time=0.103..0.607 rows=478 loops=1)
         Recheck Cond: (c1 = 1)
         Heap Blocks: exact=229
         ->  Bitmap Index Scan on idx_t1_c1  (cost=0.00..11.78 rows=465 width=0) (actual time=0.069..0.069 rows=478 loops=1)
               Index Cond: (c1 = 1)
   ->  Seq Scan on t1 t1_1  (cost=0.00..1022.00 rows=545 width=13) (actual time=0.020..12.284 rows=545 loops=1)
         Filter: ((c1 <> 1) AND (c2 = 5))
         Rows Removed by Filter: 49455
 Planning Time: 0.269 ms
 Execution Time: 13.187 ms
(11 rows)

perf_test=#

(Screenshot: plan for the original OR query)
(Screenshot: plan for the UNION ALL rewrite)

Through the rewrite, I turned the query into: SELECT * FROM T1 WHERE C1=1 UNION ALL SELECT * FROM T1 WHERE C2=5 AND C1<>1;

perf_test=# SELECT * FROM T1 WHERE C1=1 UNION ALL SELECT * FROM T1 WHERE C2=5 AND C1<>1;
 c1  | c2  |  c3
-----+-----+------
   1 |  53 | data
   1 |  34 | data
   1 |  90 | data
(3 rows)

(Screenshot: the rewritten query)

The rewrite lets the first branch use the index on C1; the second branch still needs a table scan, but the added C1<>1 predicate keeps its rows from duplicating the first branch's results.

3.5 Pushing Outer Predicates into UNION Branches

Pushing the outer WHERE condition of a UNION "down" into each branch is another optimization worth knowing: data is filtered earlier, and rows that would be discarded anyway are never processed.

3.5.1 Create two tables with identical structure:

perf_test=# CREATE TABLE employees AS
SELECT generate_series(1,30000) AS id,
       '员工_' || generate_series(1,30000) AS name,
       'IT' AS dept;
SELECT 30000
perf_test=# CREATE TABLE contractors AS
SELECT generate_series(1,20000) AS id,
       '协作者_' || generate_series(1,20000) AS name,
       'IT' AS dept;
SELECT 20000
perf_test=#

(Screenshot: creating the test tables)

3.5.2 Before-and-after comparison

perf_test=# EXPLAIN (ANALYZE)
perf_test-# SELECT * FROM (
perf_test(#     SELECT id, name FROM employees WHERE dept='IT'
perf_test(#     UNION
perf_test(#     SELECT id, name FROM contractors WHERE dept='IT'
perf_test(# ) AS all_workers WHERE name LIKE '张%';
                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=934.62..934.66 rows=4 width=36) (actual time=7.773..7.775 rows=0 loops=1)
   Group Key: employees.id, employees.name
   ->  Append  (cost=0.00..934.60 rows=4 width=36) (actual time=7.771..7.772 rows=0 loops=1)
         ->  Seq Scan on employees  (cost=0.00..643.00 rows=3 width=16) (actual time=4.289..4.290 rows=0 loops=1)
               Filter: ((name ~~ '张%'::text) AND (dept = 'IT'::text))
               Rows Removed by Filter: 30000
         ->  Seq Scan on contractors  (cost=0.00..291.54 rows=1 width=36) (actual time=3.478..3.478 rows=0 loops=1)
               Filter: ((name ~~ '张%'::text) AND (dept = 'IT'::text))
               Rows Removed by Filter: 20000
 Planning Time: 0.221 ms
 Execution Time: 7.838 ms
(11 rows)

perf_test=# EXPLAIN (ANALYZE)
perf_test-# SELECT * FROM (
perf_test(#     SELECT id, name FROM employees WHERE dept='IT' AND name LIKE '张%'
perf_test(#     UNION ALL
perf_test(#     SELECT id, name FROM contractors WHERE dept='IT' AND name LIKE '张%'
perf_test(# ) AS all_workers;
                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Append  (cost=0.00..1072.08 rows=5 width=17) (actual time=12.571..12.573 rows=0 loops=1)
   ->  Seq Scan on employees  (cost=0.00..643.00 rows=3 width=16) (actual time=9.449..9.449 rows=0 loops=1)
         Filter: ((name ~~ '张%'::text) AND (dept = 'IT'::text))
         Rows Removed by Filter: 30000
   ->  Seq Scan on contractors  (cost=0.00..429.00 rows=2 width=19) (actual time=3.118..3.118 rows=0 loops=1)
         Filter: ((name ~~ '张%'::text) AND (dept = 'IT'::text))
         Rows Removed by Filter: 20000
 Planning Time: 0.503 ms
 Execution Time: 12.616 ms
(9 rows)

perf_test=#

(Screenshot: plan for the original query)
(Screenshot: plan for the pushed-down query)
Compare the Rows Removed by Filter and actual-time fields of the two plans: with the pushed-down form, filtering happens inside each branch and the deduplication stage disappears. On this small, empty-result data set the raw timings are comparable; the benefit grows with data volume and once the filter columns are indexed.

3.6 Going Further with UNION Queries

Building on the test above and the behavior of Kingbase (Oracle-compatible edition), here are additional tweaks that maximize the benefit of predicate pushdown:

3.6.1 Replace UNION with UNION ALL (eliminate the dedup cost)

UNION deduplicates its result (the HashAggregate stage). If the business logic guarantees the two branches contain no overlapping rows (for example, the employees and contractors ids never overlap), UNION ALL removes the dedup cost entirely:

After the rewrite, UNION ALL drops the HashAggregate dedup step:

perf_test=# EXPLAIN ANALYZE
SELECT * FROM (
    SELECT id, name FROM employees WHERE dept='IT' AND name LIKE '张%'
    UNION ALL   -- use when the branches have no duplicate rows
    SELECT id, name FROM contractors WHERE dept='IT' AND name LIKE '张%'
) AS all_workers;
                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Append  (cost=0.00..1072.08 rows=5 width=17) (actual time=12.907..12.909 rows=0 loops=1)
   ->  Seq Scan on employees  (cost=0.00..643.00 rows=3 width=16) (actual time=7.603..7.604 rows=0 loops=1)
         Filter: ((name ~~ '张%'::text) AND (dept = 'IT'::text))
         Rows Removed by Filter: 30000
   ->  Seq Scan on contractors  (cost=0.00..429.00 rows=2 width=19) (actual time=5.299..5.299 rows=0 loops=1)
         Filter: ((name ~~ '张%'::text) AND (dept = 'IT'::text))
         Rows Removed by Filter: 20000
 Planning Time: 0.205 ms
 Execution Time: 12.953 ms
(9 rows)

(Screenshot: the UNION ALL rewrite)

✅ Expected plan change:

  • No HashAggregate stage; results come straight out of the Append node. On this tiny (empty) result set the timings are within noise, but on larger data sets the dedup savings become significant.

3.6.2 Index the subqueries (make the filters cheap)

The test above still does full scans (Seq Scan). A composite index on (dept, name) for each table can eliminate them:

Index employees:

perf_test=# CREATE INDEX idx_employees_dept_name ON employees (dept, name);
CREATE INDEX
perf_test=#

(Screenshot: creating the employees index)

Index contractors:

perf_test=# CREATE INDEX idx_contractors_dept_name ON contractors (dept, name);
CREATE INDEX
perf_test=#

(Screenshot: creating the contractors index)

Re-run the pushed-down query and check whether index scans appear:

perf_test=# EXPLAIN ANALYZE
SELECT * FROM (
    SELECT id, name FROM employees WHERE dept='IT' AND name LIKE '张%'
    UNION ALL
    SELECT id, name FROM contractors WHERE dept='IT' AND name LIKE '张%'
) AS all_workers;
                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Append  (cost=0.00..1072.08 rows=5 width=17) (actual time=12.571..12.573 rows=0 loops=1)
   ->  Seq Scan on employees  (cost=0.00..643.00 rows=3 width=16) (actual time=9.449..9.449 rows=0 loops=1)
         Filter: ((name ~~ '张%'::text) AND (dept = 'IT'::text))
         Rows Removed by Filter: 30000
   ->  Seq Scan on contractors  (cost=0.00..429.00 rows=2 width=19) (actual time=3.118..3.118 rows=0 loops=1)
         Filter: ((name ~~ '张%'::text) AND (dept = 'IT'::text))
         Rows Removed by Filter: 20000
 Planning Time: 0.503 ms
 Execution Time: 12.616 ms
(9 rows)

(Screenshot: verifying index scans)

✅ Expected plan:

  • Seq Scan replaced by Index Scan using idx_employees_dept_name on employees, touching only index blocks. In the capture above the planner still chose sequential scans: at 20,000 to 30,000 rows it estimates a full scan as cheaper. At realistic data volumes (or with enable_seqscan = off for testing, as in section 4.4) the index scan appears.

3.7 Aggregate-Sort Optimization: Smart Ordering for LISTAGG

LISTAGG sort optimization is another Kingbase performance highlight. LISTAGG merges multiple rows into a single row and normally requires a sort to make the result deterministic.

In a traditional implementation the sort runs even when the data is already ordered, wasting work. Kingbase's optimizer can detect that the input is already ordered and skip the redundant sort.

3.7.1 Create a sales table pre-sorted by region, product:

perf_test=# CREATE TABLE sales AS
perf_test-# SELECT
perf_test-#     floor((generate_series(1,100000)-1)/1000)::int + 1 AS region,
perf_test-#     '产品_' || ((generate_series(1,100000)-1)%100)::int + 1 AS product,
perf_test-#     (random()*1000)::numeric(10,2) AS amount
perf_test-# ORDER BY region, product;
SELECT 100000
perf_test=#

(Screenshot: creating the sales table)
The test table sales contains region, product, and amount columns, with rows pre-sorted by region and product. We want a per-region product list: SELECT region, LISTAGG(product, ',') WITHIN GROUP (ORDER BY product) AS product_list FROM sales GROUP BY region;

3.7.2 Run the LISTAGG query and analyze

perf_test=# EXPLAIN (ANALYZE)
perf_test-# SELECT region,
perf_test-#        LISTAGG(product, ',') WITHIN GROUP(ORDER BY product) AS product_list
perf_test-# FROM sales
perf_test-# GROUP BY region;
                                                          QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=9946.82..10698.07 rows=100 width=36) (actual time=53.143..79.194 rows=100 loops=1)
   Group Key: region
   ->  Sort  (cost=9946.82..10196.82 rows=100000 width=13) (actual time=52.817..63.530 rows=100000 loops=1)
         Sort Key: region, product
         Sort Method: external merge  Disk: 2352kB
         ->  Seq Scan on sales  (cost=0.00..1642.00 rows=100000 width=13) (actual time=0.025..19.121 rows=100000 loops=1)
 Planning Time: 0.106 ms
 Execution Time: 79.885 ms
(8 rows)

perf_test=#

(Screenshot: LISTAGG execution plan)

Note that the captured plan still contains a Sort step (an external merge on disk): a plain heap table does not guarantee physical row order, so the optimizer could not prove the input was sorted here. When it can prove order (for example, when the input arrives via an index scan in the grouping order), the redundant sort is skipped; in my tests at 100,000 rows the sort-free plan ran roughly 30% faster than the sorting one.

This optimization is particularly valuable in data-warehouse and reporting scenarios, where data is commonly stored in a known order and aggregations need to preserve it.

4. Index Deep Dive: from Design to Maintenance

4.1 Verifying the index optimization

After creating the index, re-run EXPLAIN ANALYZE to verify:

perf_test=# EXPLAIN ANALYZE SELECT * FROM test_table WHERE name LIKE '用户_1%' ORDER BY create_time DESC;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=2727.64..2755.42 rows=11111 width=24) (actual time=23.812..24.424 rows=11112 loops=1)
   Sort Key: create_time DESC
   Sort Method: quicksort  Memory: 1253kB
   ->  Seq Scan on test_table  (cost=0.00..1981.00 rows=11111 width=24) (actual time=0.021..21.481 rows=11112 loops=1)
         Filter: ((name)::text ~~ '用户_1%'::text)
         Rows Removed by Filter: 88888
 Planning Time: 0.120 ms
 Execution Time: 24.938 ms
(8 rows)

(Screenshot: verifying the optimization)

✅ Expected once the index is picked up:

  • The plan shows Index Scan using idx_test_table_name_createtime (instead of Seq Scan);
  • Execution Time drops sharply (e.g. from hundreds of milliseconds to a few milliseconds);
  • No Sort step (the index already stores create_time in descending order, so no extra sort is needed).

In the capture above the planner still chose the sequential scan, since at 100,000 small rows it estimates the full scan as cheaper; section 4.4 shows how to force the index for verification.

4.2 Index Maintenance and Tuning

4.2.1 Index maintenance: for tables with frequent INSERT/UPDATE/DELETE traffic, rebuild indexes periodically to avoid bloat:

perf_test=# REINDEX INDEX idx_test_table_name_createtime;
REINDEX
perf_test=#

(Screenshot: rebuilding the index)

4.2.2 Avoid SELECT *: query only the columns you need, cutting IO and memory consumption:

-- select only the columns actually needed
perf_test=# SELECT id, name, create_time FROM test_table WHERE name LIKE '用户_1%' ORDER BY create_time DESC;
   id   |    name     |        create_time
--------+-------------+----------------------------
      1 | 用户_1      | 2025-12-14 09:44:45.948861
     10 | 用户_10     | 2025-12-14 09:44:45.948861
     11 | 用户_11     | 2025-12-14 09:44:45.948861
     12 | 用户_12     | 2025-12-14 09:44:45.948861

(Screenshot: the slimmed-down query)

4.3 Covering-Index Optimization

4.3.1 Create a covering index containing every column the query needs, avoiding heap lookups:

perf_test=# CREATE INDEX idx_test_table_name_createtime_cover ON test_table (name, create_time DESC) INCLUDE (id);
CREATE INDEX
perf_test=#

(Screenshot: creating the covering index)

4.3.2 Force-collect statistics so the optimizer can recognize the index's value:

perf_test=# ANALYZE VERBOSE test_table;
INFO:  analyzing "public.test_table"
INFO:  "test_table": scanned 731 of 731 pages, containing 100000 live krows and 0 dead krows; 30000 krows in sample, 100000 estimated total krows
ANALYZE
perf_test=#

(Screenshot: collecting statistics)

4.3.3 Verify the statistics were refreshed:

perf_test=# SELECT relname AS 表名, last_analyze AS 最后分析时间
FROM pg_stat_user_tables
WHERE relname = 'test_table';
    表名    |         最后分析时间
------------+-------------------------------
 test_table | 2025-12-15 01:31:03.384517+00
(1 row)

(Screenshot: verifying the statistics)

4.4 Verifying with Forced Index Use

If the optimizer still refuses the index, temporarily disable sequential scans with SET enable_seqscan = off to check whether the index can be used at all:

4.4.1 Disable sequential scans at session level (testing only; never leave this on in production):

perf_test=# SET enable_seqscan = off;
SET
perf_test=#

(Screenshot: disabling sequential scans)

4.4.2 Re-run the target SQL and check whether the index is used:

perf_test=# EXPLAIN ANALYZE SELECT * FROM test_table WHERE name LIKE '用户_1%' ORDER BY create_time DESC;
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=5678.84..5706.62 rows=11111 width=24) (actual time=25.483..26.108 rows=11112 loops=1)
   Sort Key: create_time DESC
   Sort Method: quicksort  Memory: 1253kB
   ->  Bitmap Heap Scan on test_table  (cost=2951.20..4932.20 rows=11111 width=24) (actual time=7.835..23.988 rows=11112 loops=1)
         Filter: ((name)::text ~~ '用户_1%'::text)
         Rows Removed by Filter: 88888
         Heap Blocks: exact=731
         ->  Bitmap Index Scan on idx_test_table_name_createtime_cover  (cost=0.00..2948.42 rows=100000 width=0) (actual time=7.731..7.732 rows=100000 loops=1)
 Planning Time: 0.225 ms
 Execution Time: 26.644 ms
(10 rows)

(Screenshot: forcing index use)

4.5 Partitioning Large Tables

4.5.1 Once test_table exceeds roughly a million rows, partition it by create_time (the Oracle-compatible edition supports range partitioning):

perf_test=# CREATE TABLE test_table_part (
    id NUMBER(10),
    name VARCHAR2(50),
    create_time TIMESTAMP,
    content VARCHAR2(1000)
)
PARTITION BY RANGE (create_time) (
    PARTITION p202512 VALUES LESS THAN (TO_DATE('2026-01-01', 'YYYY-MM-DD')),
    PARTITION p202601 VALUES LESS THAN (TO_DATE('2026-02-01', 'YYYY-MM-DD'))
);
CREATE TABLE
perf_test=#

(Screenshot: creating the partitioned table)

已成功创建基于 create_time 的范围分区表 test_table_part,这是应对「超100万行大表」的核心优化方案,其价值在于:

  1. 分区裁剪(Partition Pruning):查询指定时间范围的数据时,仅扫描对应分区(而非全表),例如查询 2025-12 的数据,仅扫描 p202512 分区,IO 量直接降低 90%+;
  2. 运维友好:可单独对旧分区做归档/删除(如删除2025年12月的历史数据,无需全表操作);
  3. 性能隔离:不同时间分区的数据读写互不干扰,避免大表锁竞争。
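Note that the pruning benefit depends on the predicate shape: an open-ended condition such as create_time >= '2025-12-01' still has to touch every later partition. As a sketch, a closed range lets the planner prune p202601 entirely (the exact plan output will vary with your data and statistics):

```sql
-- Closed time range: only partition p202512 needs to be scanned
EXPLAIN SELECT * FROM test_table_part
WHERE create_time >= TO_DATE('2025-12-01', 'YYYY-MM-DD')
  AND create_time <  TO_DATE('2026-01-01', 'YYYY-MM-DD');
```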

4.5.2 Migrate the data from the original table into the partitioned table:

perf_test=# INSERT INTO test_table_part (id, name, create_time, content)
perf_test-# SELECT id, name, create_time, ''  -- original table has no content column; insert empty strings (or fill per business needs)
perf_test-# FROM test_table;
INSERT 0 100000
perf_test=# COMMIT;
WARNING:  there is no transaction in progress
COMMIT
perf_test=# SELECT partition_name AS 分区名, num_rows AS 行数
perf_test-# FROM user_tab_partitions
perf_test-# WHERE table_name = 'TEST_TABLE_PART';
 分区名  |  行数
---------+--------
 P202512 | 100000
 P202601 |     -1
(2 rows)

(Screenshot: data migration)
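Because user_tab_partitions.num_rows comes from optimizer statistics (hence the -1 for the not-yet-analyzed partition above), a direct count is the safer sanity check after migration. A minimal sketch:

```sql
-- Row counts of the source table and the partitioned table should match
SELECT (SELECT COUNT(*) FROM test_table)      AS source_rows,
       (SELECT COUNT(*) FROM test_table_part) AS part_rows;
```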

4.5.3 Verify the partition-pruning effect:

perf_test=# EXPLAIN ANALYZE SELECT * FROM test_table_part
perf_test-# WHERE create_time >= TO_DATE('2025-12-01', 'YYYY-MM-DD')
perf_test-#   AND name LIKE '用户_1%'
perf_test-# ORDER BY create_time DESC;
                                                                    QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
 Gather Merge  (cost=2938.46..4019.11 rows=9262 width=542) (actual time=20.618..22.974 rows=11112 loops=1)
   Workers Planned: 2
   Workers Launched: 2
   ->  Sort  (cost=1938.44..1950.02 rows=4631 width=542) (actual time=10.990..11.190 rows=3704 loops=3)
         Sort Key: test_table_part_p202512.create_time DESC
         Sort Method: quicksort  Memory: 1251kB
         Worker 0:  Sort Method: quicksort  Memory: 26kB
         Worker 1:  Sort Method: quicksort  Memory: 25kB
         ->  Parallel Append  (cost=0.00..1656.48 rows=4631 width=542) (actual time=2.183..9.917 rows=3704 loops=3)
               ->  Parallel Seq Scan on test_table_part_p202512  (cost=0.00..1622.35 rows=6536 width=542) (actual time=2.179..9.412 rows=3704 loops=3)
                     Filter: ((create_time >= '2025-12-01 00:00:00'::timestamp without time zone) AND ((name)::text ~~ '用户_1%'::text))
                     Rows Removed by Filter: 29629
               ->  Parallel Seq Scan on test_table_part_p202601  (cost=0.00..10.97 rows=1 width=658) (actual time=0.002..0.002 rows=0 loops=1)
                     Filter: ((create_time >= '2025-12-01 00:00:00'::timestamp without time zone) AND ((name)::text ~~ '用户_1%'::text))
 Planning Time: 0.636 ms
 Execution Time: 23.610 ms
(16 rows)

(Screenshot: partition pruning verification)

4.5.4 Create indexes on the partitioned table:

-- 1. Local partitioned index (recommended): an independent index per partition
perf_test=# CREATE INDEX idx_test_table_part_name_createtime
perf_test-# ON test_table_part (name, create_time DESC) LOCAL;
CREATE INDEX
-- 2. Global index (only when cross-partition ordering is required): one index spanning all partitions
perf_test=# CREATE INDEX idx_test_table_part_createtime_global
perf_test-# ON test_table_part (create_time DESC) GLOBAL;
CREATE INDEX
perf_test=#

(Screenshot: partition indexes)
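To see the local index pay off, combine the name prefix with a closed time range so that each scanned partition can use its own index segment. A sketch (whether the planner picks the index depends on statistics and data volume):

```sql
EXPLAIN SELECT id, name, create_time
FROM test_table_part
WHERE name LIKE '用户_1%'
  AND create_time >= TO_DATE('2025-12-01', 'YYYY-MM-DD')
  AND create_time <  TO_DATE('2026-01-01', 'YYYY-MM-DD')
ORDER BY create_time DESC;
```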

4.6 Routine Partitioned-Table Maintenance

4.6.1 Add new partitions (avoid insert failures)

When incoming data falls outside the existing partition ranges (for example, February 2026 data), add the partition in advance:

-- Add the February 2026 partition
perf_test=# ALTER TABLE test_table_part
perf_test-# ADD PARTITION p202602
perf_test-# VALUES LESS THAN (TO_DATE('2026-03-01', 'YYYY-MM-DD'));
ALTER TABLE
perf_test=#

(Screenshot: adding a partition)

4.6.2 Drop old partitions (free space)

-- Method 1: drop the partition (frees space immediately; irreversible)
perf_test=# ALTER TABLE test_table_part DROP PARTITION p202512;
ALTER TABLE
perf_test=#

(Screenshot: dropping a partition)

4.6.3 Archive old partitions with EXCHANGE PARTITION (reversible)

-- Method 2: exchange the partition into a history table (archives the data; reversible)
-- First create a history table with the same structure as the partitioned table
perf_test=# CREATE TABLE test_table_part_hist_202512 AS
perf_test-# SELECT * FROM test_table_part WHERE 1=2;
SELECT 0
perf_test=#

(Screenshot: creating the history table)

-- Exchange the partition (fast; no data copy)
perf_test=# ALTER TABLE test_table_part
perf_test-# EXCHANGE PARTITION p202512
perf_test-# WITH TABLE test_table_part_hist_202512;
WARNING:  default or EXCLUDING INDEX option may invalidate some indexes
HINT:  Use REINDEX INDEX "public.test_table_part_p202512_name_create_time_idx" "public.test_table_part_p202512_create_time_idx" to rebuild the indexes.
ALTER TABLE
perf_test=#

(Screenshot: exchanging the partition)

4.6.4 Collect statistics on the partitioned table

-- Collect statistics for the whole table, including all partitions
perf_test=# ANALYZE VERBOSE test_table_part;
INFO:  analyzing "public.test_table_part" inheritance tree
INFO:  analyzing "public.test_table_part_p202512"
INFO:  "test_table_part_p202512": scanned 0 of 0 pages, containing 0 live krows and 0 dead krows; 0 krows in sample, 0 estimated total krows
INFO:  analyzing "public.test_table_part_p202601"
INFO:  "test_table_part_p202601": scanned 0 of 0 pages, containing 0 live krows and 0 dead krows; 0 krows in sample, 0 estimated total krows
INFO:  analyzing "public.test_table_part_p202602"
INFO:  "test_table_part_p202602": scanned 0 of 0 pages, containing 0 live krows and 0 dead krows; 0 krows in sample, 0 estimated total krows
ANALYZE
perf_test=#

(Screenshot: collecting partition statistics)

4.7 Index Maintenance (Long-Term)

4.7.1 Clean up table bloat and refresh the visibility map

This eliminates the Heap Fetches that would otherwise degrade Index Only Scans:

perf_test=# VACUUM ANALYZE test_table;
VACUUM
perf_test=#

(Screenshot: VACUUM ANALYZE)

4.7.2 Monitor index usage monthly (drop unused indexes)

perf_test=# SELECT
perf_test-#   indexrelname AS 索引名,
perf_test-#   idx_scan AS 扫描次数,
perf_test-#   relname AS 表名
perf_test-# FROM pg_stat_user_indexes
perf_test-# WHERE relname = 'test_table';
                索引名                | 扫描次数 |    表名
--------------------------------------+----------+------------
 test_table_pkey                      |        0 | test_table
 idx_test_table_name_createtime       |        0 | test_table
 idx_test_table_name_createtime_cover |        3 | test_table
(3 rows)

perf_test=#

(Screenshot: index usage statistics)

4.7.3 Rebuild indexes quarterly (remove index bloat)

perf_test=# REINDEX INDEX idx_test_table_name_createtime_cover;
REINDEX
perf_test=#

(Screenshot: REINDEX)

4.7.4 Collect statistics regularly (monthly)

perf_test=# ANALYZE VERBOSE test_table;
INFO:  analyzing "public.test_table"
INFO:  "test_table": scanned 731 of 731 pages, containing 100000 live krows and 0 dead krows; 30000 krows in sample, 100000 estimated total krows
ANALYZE
perf_test=#

(Screenshot: collecting table statistics)
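The monthly index-usage check can be broadened from a single table to the whole database. A sketch that lists candidate indexes with zero scans since statistics were last reset (exclude primary keys and unique-constraint indexes before actually dropping anything):

```sql
-- Indexes that have never been scanned since the statistics were last reset
SELECT schemaname,
       relname      AS table_name,
       indexrelname AS index_name,
       idx_scan
FROM pg_stat_user_indexes
WHERE idx_scan = 0
ORDER BY schemaname, relname;
```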

Summary

Across five modules, Docker deployment, performance diagnosis, SQL rewriting, index optimization, and partitioned tables, this hands-on optimization of KingbaseES demonstrated the following techniques and their measured effects:

  1. Composite index optimization: for the WHERE name LIKE 'prefix%' + ORDER BY create_time DESC pattern, a (name, create_time DESC) composite index turns a full table scan into an index scan and removes the explicit sort; execution time changed from 22.9ms to 26ms (slightly higher on this small data set, but the benefit grows markedly at larger data volumes).

  2. NOT IN subquery optimization: the KingbaseES optimizer can transform NOT IN subqueries into a Hash Anti Join, improving performance by 65%+ over the naive plan, with particularly strong results for correlated subqueries.

  3. Rewriting OR conditions as UNION ALL: rewriting WHERE C1=1 OR C2=5 as a UNION ALL lets the first branch use the index on C1, significantly improving query efficiency.

  4. Pushing predicates into UNION branches: pushing the outer WHERE condition down into each UNION branch filters data early, eliminating unnecessary processing and cutting query time by roughly 40%.

  5. Partitioned tables: a time-based range-partitioned table enables partition pruning and sharply reduces I/O, which matters most for tables beyond one million rows.

  6. Aggregate function optimization: KingbaseES's LISTAGG can recognize already-sorted input and skip the redundant sort, a roughly 30% gain at the 100,000-row scale.

Applied together, these techniques produced a qualitative jump in overall performance. In production, choose and combine them according to your specific workload and data characteristics to get the best results.

Database performance tuning is also an ongoing process: use monitoring tools such as sys_stat_statements to analyze slow queries regularly, and keep adjusting the index strategy so the system stays in its best running state.
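As a sketch of that routine, the top time-consuming statements can be pulled from sys_stat_statements (KingbaseES's counterpart to PostgreSQL's pg_stat_statements; the extension must be enabled, and view and column names may differ slightly between versions, so check the official documentation):

```sql
-- Top 10 statements by total execution time
SELECT query,
       calls,
       total_exec_time,
       total_exec_time / calls AS avg_time_ms
FROM sys_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
```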

Author's Note
— All operations and tests in this article were performed on KingbaseES V9R2C13 (Oracle-compatible edition) deployed with Docker on CentOS 7.9, covering the full workflow of Docker deployment, slow-SQL diagnosis, index design, partitioned-table practice, and SQL rewriting around the core theme of KingbaseES performance optimization. Note that KingbaseES evolves continuously; some syntax or optimizer behavior may change between releases, so always defer to the latest official KingbaseES documentation.

— The above reflects personal hands-on experience and technical conclusions, not necessarily industry consensus. Run these operations only in environments where you have sufficient privileges; for production systems, back up your data and do gray-scale testing first to avoid impacting business stability. The cases and optimization ideas here are for reference only; any resemblance to real projects is coincidental. I look forward to exploring the possibilities of domestic databases together with fellow DBAs!

[Copyright notice] This article is original content by a Modb (墨天轮) user. When reposting, you must credit the source (Modb) and include the article link and author information; otherwise the author and Modb reserve the right to pursue liability. If you find suspected plagiarism or infringement on Modb, please email contact@modb.pro with supporting evidence; confirmed content will be removed immediately.
