暂无图片
暂无图片
1
暂无图片
暂无图片
暂无图片

OceanBase实操笔记-使用 SQL_AUDIT 分析 SQL 监控视图

原创 董小姐 2025-07-15
331

有2种办法,分别是:

  • 通过 SQL_AUDIT 分析性能问题
  • 通过 OCP 白屏化监控界面分析性能问题

GV$OB_SQL_AUDIT是最常用的 SQL 监控视图,能够记录每一次 SQL 请求的来源、执行状态、资源消耗及等待事件,无论 SQL执行是成功还是失败。除此之外,还记录了 SQL 文本、执行计划等关键信息,是诊断 SQL 问题的利器。使用 OceanBase 的GV$OB_SQL_AUDIT可查询常见的监控指标,来分析不合预期的性能问题。

学习测验

不定项选择

1. 以下关于动态性能视图的说法,哪些是正确的?

A. 动态性能视图展示系统动态变化的状态信息。命名特点以 GV$和 V$ 开头,V$ 只展示登录到的节点上的信息,GV$ 展示所有节点的信息。

B. 用户租户下的动态性能视图可以查看本租户的信息,比如 GV$OBUNITS 视图可以在用户租户内查询到本租户的 Unit 资源分配信息。系统租户的动态性能视图可以查看整个集群所有租户的信息。

C. GV$OB_SQLAUDIT 是一个常用的 SQL监控视图,能够记录每一次SQL请求的来源、执行状态、资源消耗及等待事件,无论 SQL执行是成功还是失败。

D. GV$OB_SQL AUDIT 视图的数据有自动淘汰机制,在排査 SQL性能抖动问题时,往往第一件事就是关闭 SQLAudit 功能以保存现场,避免抖动现场的监控数据被淘汰。

正确答案:ABCD

通过 SQL_AUDIT 分析性能问题实操练习

SQL 审计视图 gv$ob_sql_audit 可以查询在 OceanBase 中已执行的所有 SQL,会记录所有执行详情(成功/失败)。这对开发同学了解自己的业务 SQL 和定位问题细节非常有帮助。如当数据库出现 CPU 使用率突然增高,很多时候是因为出现了部分消耗资源较多的 SQL 执行,解决此类问题,首先需要定位耗时 SQL 。

该审计表只记录执行完成的 SQL(同时会记录执行出错详情,对应的 RET_CODE 不为 0 即执行错误 ),且该视图在记录数或占用内存达到阈值时会触发淘汰机制。

连接数据库

使用 root 用户登录集群的 sys 租户。

obclient -h127.0.0.1 -uroot@sys -P2881 -Dtestdb -A

--参数说明
testdb 数据库已预建完成,不需要手动创建。

查询 SQL 审计表信息

SQL 审计表 gv$ob_sql_audit 记录了最近执行的 SQL 详细信息,可以用于分析 SQL 执行情况。在使用 gv$ob_sql_audit主要关注耗时信息和特征信息

查询基本信息

基本信息

说明

SVR_IP

SQL 执行所在 OBServer 节点 IP。

USER_NAME

执行 SQL 的数据库账号。

DB_NAME

执行 SQL 所属数据库。

TRACE_ID

关联同一个 SQL 执行的全链路执行记录,可根据 trace_id 查询 SQL 所经历的执行分布式路径。

PLAN_ID

根据 plan_id 查询 gv$ob_plan_cache_plan_explain

视图获取当前执行计划。

注意

查询 gv$ob_ob_plan_cache_plan_explain 视图时仅支持 GET 操作,查询条件必须包含 ip, port, tenant_id, plan_id 四个字段。

示例
SELECT svr_ip, trace_id, user_name, db_name, sql_id, query_sql, plan_id FROM  oceanbase.gv$ob_sql_audit LIMIT 1;

输出如下:

obclient [testdb]> SELECT svr_ip, trace_id, user_name, db_name, sql_id, query_sql, plan_id FROM  oceanbase.gv$ob_sql_audit LIMIT 1;
+-----------+-----------------------------------+-----------+-----------+----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+
| svr_ip    | trace_id                          | user_name | db_name   | sql_id                           | query_sql                                                                                                                                                                                                                                           | plan_id |
+-----------+-----------------------------------+-----------+-----------+----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+
| 127.0.0.1 | YB427F000001-000638DA3A444276-0-0 | root      | oceanbase | 24D486F20861EBD9E0217692DB0E9C61 |      update __all_weak_read_service set min_version=1751359990766355721, max_version=1751359990766355721     where tenant_id = 1 and level_id = 0 and level_value = '' and min_version = 1751359990566166671 and max_version = 1751359990566166671  |     663 |
+-----------+-----------------------------------+-----------+-----------+----------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+---------+
1 row in set (0.003 sec)

查询耗时信息

耗时信息

说明

ELAPSED_TIME

SQL 请求到响应的总耗时,单位微秒。

  • QUEUE_TIME:在队列等待的时间,此值过高代表着租户 CPU 不足或者服务器负载过高。
  • EXECUTE_TIME:表示 CPU 执行时间。

TOTAL_WAIT_TIME_MICRO

总等待耗时,贯穿于整个 SQL 执行周期里。

EVENT

等待事件。

示例
SELECT svr_ip, query_sql, elapsed_time, queue_time, execute_time, total_wait_time_micro, event FROM  oceanbase.gv$ob_sql_audit LIMIT 1;

输出如下:

obclient [testdb]> SELECT svr_ip, query_sql, elapsed_time, queue_time, execute_time, total_wait_time_micro, event FROM  oceanbase.gv$ob_sql_audit LIMIT 1;
+-----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------+------------+--------------+-----------------------+-------+
| svr_ip    | query_sql                                                                                                                                                                                                                                           | elapsed_time | queue_time | execute_time | total_wait_time_micro | event |
+-----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------+------------+--------------+-----------------------+-------+
| 127.0.0.1 |      update __all_weak_read_service set min_version=1751359990766355721, max_version=1751359990766355721     where tenant_id = 1 and level_id = 0 and level_value = '' and min_version = 1751359990566166671 and max_version = 1751359990566166671  |          223 |          0 |          161 |                     0 |       |
+-----------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------+------------+--------------+-----------------------+-------+
1 row in set (0.003 sec)

查询特征信息

特征信息

说明

IS_EXECUTOR_RPC

是否 RPC,1 表示 RPC 调用,0 表示非 RPC 调用。

IS_INNER_SQL

是否内部 SQL,1 表示内部 SQL,0 表示非内部 SQL。

PLAN_TYPE

计划类型,1 表示本地执行,2 表示远程调用,3 表示分布式调用。

RPC_COUNT

RPC 调用次数。

IS_HIT_PLAN

是否命中 plancache,1 表示命中,0 表示未命中。

REQUEST_TYPE

请求类型,2 表示本地请求,3 表示远程请求。

示例
SELECT sql_id, query_sql, is_executor_rpc, is_inner_sql, ret_code, plan_type, rpc_count, is_hit_plan, request_type FROM  oceanbase.gv$ob_sql_audit LIMIT 1;

输出如下:

obclient [testdb]> SELECT sql_id, query_sql, is_executor_rpc, is_inner_sql, ret_code, plan_type, rpc_count, is_hit_plan, request_type FROM  oceanbase.gv$ob_sql_audit LIMIT 1;
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+--------------+----------+-----------+-----------+-------------+--------------+
| sql_id                           | query_sql                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | is_executor_rpc | is_inner_sql | ret_code | plan_type | rpc_count | is_hit_plan | request_type |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+--------------+----------+-----------+-----------+-------------+--------------+
| 91A25C0C8FEE16A27E3F3700CA8F05FC | INSERT INTO __all_column (tenant_id, table_id, column_id, column_name, rowkey_position, index_position, partition_key_position, data_type, data_length, data_precision, data_scale, zero_fill, nullable, autoincrement, is_hidden, on_update_current_timestamp, orig_default_value_v2, cur_default_value_v2, cur_default_value, order_in_rowkey, collation_type, comment, schema_version, column_flags, extended_type_info, prev_column_id, srs_id, udt_set_id, sub_data_type, gmt_create, gmt_modified) VALUES (0, 21465, 41, X'434F4D4D454E54', 0, 0, 0, 30, 536870911, -1, -1, 0, 1, 0, 0, 0, NULL, X'', NULL, 0, 45, X'', 1751360418887968, 0, NULL, 40, -32, 0, 0, now(6), now(6)) ON DUPLICATE KEY UPDATE column_name = X'434F4D4D454E54', rowkey_position = 0, index_position = 0, partition_key_position = 0, data_type = 30, data_length = 536870911, data_precision = -1, data_scale = -1, zero_fill = 0, nullable = 1, autoincrement = 0, is_hidden = 0, on_update_current_timestamp = 0, orig_default_value_v2 = NULL, cur_default_value_v2 = X'', cur_default_value = NULL, order_in_rowkey = 0, collation_type = 45, comment = X'', schema_version = 1751360418887968, column_flags = 0, extended_type_info = NULL, prev_column_id = 40, srs_id = -32, udt_set_id = 0, sub_data_type = 0, gmt_create = now(6), gmt_modified = now(6) |               0 |            1 |        0 |         1 |         0 |           1 |            1 |
+----------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------------+--------------+----------+-----------+-----------+-------------+--------------+
1 row in set (0.003 sec)

出错的 SQL 语句

模拟 SQL 执行出错

SELECT wrong syntax FROM table_not_exists;

输出如下:

obclient [testdb]> SELECT wrong syntax FROM table_not_exists;
ERROR 1146 (42S02): Table 'testdb.table_not_exists' doesn't exist

查询最近 5 分钟出错的语句

SELECT substr(usec_to_time(s.request_time),1,19) request_time, 
                   s.svr_ip, s.user_name, s.db_name, s.sql_id, s.query_sql, s.ret_code, s.elapsed_time, s.queue_time, s.execute_time, 
                   round(s.request_memory_used/1024/1024/1024,2) req_mem_mb, plan_type, is_executor_rpc, is_inner_sql, trace_id 
                   FROM oceanbase.gv$ob_sql_audit s
                   WHERE ret_code < 0 
                   AND db_name=DATABASE()
                   AND request_time >= time_to_usec(DATE_SUB(current_timestamp, INTERVAL 5 MINUTE) )
                   ORDER BY request_time DESC
                   LIMIT 100;

输出如下:

+---------------------+-----------+-----------+---------+----------------------------------+-------------------------------------------+----------+--------------+------------+--------------+------------+-----------+-----------------+--------------+-----------------------------------+
| request_time        | svr_ip    | user_name | db_name | sql_id                           | query_sql                                 | ret_code | elapsed_time | queue_time | execute_time | req_mem_mb | plan_type | is_executor_rpc | is_inner_sql | trace_id                          |
+---------------------+-----------+-----------+---------+----------------------------------+-------------------------------------------+----------+--------------+------------+--------------+------------+-----------+-----------------+--------------+-----------------------------------+
| 2025-07-01 17:03:16 | 127.0.0.1 | root      | testdb  | E137511E3EC5BF34A1F0C8134944B7AA | SELECT wrong syntax FROM table_not_exists |    -5019 |          312 |         24 |           22 |       0.00 |         0 |               0 |            0 | YB427F000001-000638DA39043EA1-0-0 |
+---------------------+-----------+-----------+---------+----------------------------------+-------------------------------------------+----------+--------------+------------+--------------+------------+-----------+-----------------+--------------+-----------------------------------+
1 row in set (0.010 sec)


慢 SQL

通过 sleep() 函数模拟慢 SQL

SELECT sleep(5) as slow_test_query_5_seconds FROM dual;

输出如下:

obclient [testdb]> SELECT sleep(5) as slow_test_query_5_seconds FROM dual;
+---------------------------+
| slow_test_query_5_seconds |
+---------------------------+
|                         0 |
+---------------------------+
1 row in set (5.001 sec)

匹配 SQL

通过 query_sql like '%slow_test%' 查询条件匹配上面通过 sleep() 函数模拟慢 SQL

SELECT svr_ip, query_sql,
                      elapsed_time, queue_time, execute_time, total_wait_time_micro, event
                   FROM  oceanbase.gv$ob_sql_audit WHERE query_sql LIKE '%slow_test%' LIMIT 10;

输出结果如下:

+-----------+--------------------------------------------------------+--------------+------------+--------------+-----------------------+------------+
| svr_ip    | query_sql                                              | elapsed_time | queue_time | execute_time | total_wait_time_micro | event      |
+-----------+--------------------------------------------------------+--------------+------------+--------------+-----------------------+------------+
| 127.0.0.1 | SELECT sleep(5) as slow_test_query_5_seconds FROM dual |      5000723 |         19 |      5000104 |                900411 | sleep wait |
+-----------+--------------------------------------------------------+--------------+------------+--------------+-----------------------+------------+
1 row in set (0.023 sec)

查询耗时最长的 SQL

elapsed_time 统计了 SQL 执行的总耗时,按照 elapsed_time 逆序排序查询耗时最长的 10 个 SQL 执行记录。

注意

使用 is_executor_rpc=0 and is_inner_sql=0 条件排除内部请求和 RPC 请求

SELECT svr_ip, query_sql,
                      elapsed_time, queue_time, execute_time, total_wait_time_micro, event
                   FROM oceanbase.gv$ob_sql_audit 
                   WHERE is_executor_rpc=0 AND is_inner_sql=0 
                   AND request_time >= time_to_usec(DATE_SUB(current_timestamp, INTERVAL 5 MINUTE) )
                   ORDER BY elapsed_time DESC LIMIT 10;

输出如下:

+-----------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------+------------+--------------+-----------------------+------------+
| svr_ip    | query_sql                                                                                                                                                                                                           | elapsed_time | queue_time | execute_time | total_wait_time_micro | event      |
+-----------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------+------------+--------------+-----------------------+------------+
| 127.0.0.1 | SELECT sleep(5) as slow_test_query_5_seconds FROM dual                                                                                                                                                              |      5000723 |         19 |      5000104 |                900411 | sleep wait |
| 127.0.0.1 | SELECT svr_ip, query_sql,
                      elapsed_time, queue_time, execute_time, total_wait_time_micro, event
                   FROM  oceanbase.gv$ob_sql_audit WHERE query_sql LIKE '%slow_test%' LIMIT 10 |        23077 |         39 |        20323 |                     0 |            |
+-----------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------+------------+--------------+-----------------------+------------+
2 rows in set (0.014 sec)

占用 CPU 最多的 SQL

CPU 时间的计算公式为 execute_time+get_plan_time-total_wait_time_micro分布式计划和远程计划的时间是分别计算的,因此在查询时需要使用 trace_id 进行聚合计算

查询本地计划 CPU 时间

SELECT svr_ip, query_sql,plan_type,
                       (execute_time+get_plan_time-total_wait_time_micro) AS cpu_time,
                       elapsed_time, queue_time, execute_time, total_wait_time_micro
                   FROM  oceanbase.gv$ob_sql_audit 
                   WHERE is_executor_rpc=0 AND is_inner_sql=0 AND plan_type=1
                   AND request_time >= time_to_usec(DATE_SUB(current_timestamp, INTERVAL 5 MINUTE) )
                   ORDER BY cpu_time DESC LIMIT 10;

输出如下:

+-----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+----------+--------------+------------+--------------+-----------------------+
| 127.0.0.1 | SELECT svr_ip, query_sql,
                      elapsed_time, queue_time, execute_time, total_wait_time_micro, event
                   FROM  oceanbase.gv$ob_sql_audit WHERE query_sql LIKE '%slow_test%' LIMIT 10                                                                                                                                                                                |         1 |    23023 |        23077 |         39 |        20323 |                     0 |
| 127.0.0.1 | SELECT svr_ip, query_sql,
                      elapsed_time, queue_time, execute_time, total_wait_time_micro, event
                   FROM oceanbase.gv$ob_sql_audit 
                   WHERE is_executor_rpc=0 AND is_inner_sql=0 
                   AND request_time >= time_to_usec(DATE_SUB(current_timestamp, INTERVAL 5 MINUTE) )
                   ORDER BY elapsed_time DESC LIMIT 10 |         1 |    12453 |        12487 |         18 |         9178 |                     0 |
+-----------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+----------+--------------+------------+--------------+-----------------------+
2 rows in set (0.014 sec)

查询分布式计划和远程计划 CPU 时间

SELECT trace_id, sql_id, substr(usec_to_time(request_time),1,19) AS request_time_, query_sql, plan_type,
                      sum(execute_time+get_plan_time-total_wait_time_micro) AS cpu_time,
                      sum(execute_time) AS execute_time, 
                      sum(get_plan_time) AS get_plan_time, 
                      sum(total_wait_time_micro) AS total_wait_time_micro
                   FROM oceanbase.gv$ob_sql_audit 
                   WHERE is_executor_rpc=0 AND is_inner_sql=0 AND plan_type<>1
                   AND request_time >= time_to_usec(DATE_SUB(current_timestamp, INTERVAL 5 MINUTE) )
                   GROUP BY trace_id
                   ORDER BY cpu_time DESC LIMIT 10;

输出结果如下:

+-----------------------------------+----------------------------------+---------------------+-------------------------------------------+-----------+----------+--------------+---------------+-----------------------+
| trace_id                          | sql_id                           | request_time_       | query_sql                                 | plan_type | cpu_time | execute_time | get_plan_time | total_wait_time_micro |
+-----------------------------------+----------------------------------+---------------------+-------------------------------------------+-----------+----------+--------------+---------------+-----------------------+
| YB427F000001-000608716A6206EB-0-0 | E137511E3EC5BF34A1F0C8134944B7AA | 2023-10-24 15:48:04 | SELECT wrong syntax FROM table_not_exists |         0 |      223 |           21 |           202 |                     0 |
+-----------------------------------+----------------------------------+---------------------+-------------------------------------------+-----------+----------+--------------+---------------+-----------------------+
1 row in set

扩展链接:OceanBase 官网中的“《OceanBase 从入门到实践》-- 第七章:OceanBase 数据库的诊断和调优


通过 OCP 白屏化监控界面分析性能问题

示例:通过 OCP工具 来分析 CPU 负载异常




「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论