排除集群环境不稳定因素
/data/log1)的顺序写性能、集群多数派节点之间的网络延时会影响 COMMIT 的延时。这个 COMMIT 的延时后期是可以观察监控的。OceanBase 集群对网络的延时容忍程度还是很大的,最大能到 50ms。不过生产环境肯定不能有这么大的延时。尽管 OceanBase 可以工作,性能也会下降的很厉害。一般建议同机房 5ms以内,同城 10ms 以内。
server_permanent_offline_time,默认值 3600s ),节点就会下线。由于这个默认值很大,通常问题发生的时候并不容易为人所察觉。
NTP或
chrony做时间同步,这点不是问题。大部分客户的测试环境都没有可靠的 NTP 服务器,导致 OceanBase 在测试环境部署时经常碰到时间同步问题要解决。一个变通的方法是选定一台服务器临时当
NTP服务(如选
OCP服务器),其他 OB 节点都跟这台同步。如果感觉 NTP 或者
chrony同步不可靠,最粗暴直接的方法就是直接在节点的
crontab任务里直接配置每分钟使用
ntpdate直接同步一次时间。实际 POC 发现这个总是有效的。如果集群节点机型不一致(CPU类型和主频不一致),这个时间误差很难缩小到1ms以内,能做到 10ms 以内也可以。
clockdiff。
[root@observer01 ~]# clockdiff observer05...................................................host=observer05 rtt=1(0)ms/1ms delta=-15ms/-16ms Sat Mar 20 11:09:42 2021[root@observer01 ~]# ntpdate -u ntp8.cloud.aliyuncs.com20 Mar 11:09:54 ntpdate[14167]: adjust time server 100.100.5.2 offset -0.010433 sec[root@observer01 ~]# ntpdate -u ntp8.cloud.aliyuncs.com20 Mar 11:10:02 ntpdate[14974]: adjust time server 100.100.5.2 offset -0.004026 sec[root@observer01 ~]# clockdiff observer05..................................................host=observer05 rtt=1(0)ms/1ms delta=1ms/0ms Sat Mar 20 11:10:12 2021[root@observer01 ~]#
观察租户整体性能
QPS(每秒 SQL 请求数,包括
SELECT、
INSERT、
UPDATE、
DELETE)以及其
RT(SQL 执行耗时)、
TPS(每秒事务数,跟 ORACLE 一致 )以及其
RT(事务提交延时)。此外,关注这些指标在 OceanBase 集群节点上的性能信息。
dooba可以观察很方便实时观察 OceanBase 租户性能。
grant select on oceanbase.* to dbamonitor identified by '123456';
dooba脚本在
/home/admin/oceanbase/bin/下,是个
python脚本。我的做了一些修改,会稍有不同(最上面一行高亮的是近一定记录数的平均值)。
python dooba.py -hobserver00 -udbamonitor@sys#obdemo -P2883 -p123456


说明:
gv$sysstat。各项指标如下。
一级 分类 | 二级分类 | 缩写 | 全称 | 含义 |
|---|---|---|---|---|
| Gallery | SQL COUNT | 租户 SQL QPS数据 | ||
| SEL. | sql select count | 平均每秒查询次数 | ||
| INS. | sql insert count | 平均每秒插入次数 | ||
| UPD. | sql update count | 平均每秒更新次数 | ||
| DEL. | sql delete count | 平均每秒删除次数 | ||
| REP. | sql replace count | 平均每秒替换次数 | ||
| CMT. | trans commit count | 平均每秒事务提交次数 | ||
| ROL. | trans rollback count | 平均每秒事务回滚次数 | ||
| SQL RT | 租户 SQL延时数据 | |||
| SEL. | sql select time | 平均每次查询耗时 | ||
| INS. | sql insert time | 平均每次插入耗时 | ||
| UPD. | sql update time | 平均每次更新耗时 | ||
| DEL. | sql delete time | 平均每次删除耗时 | ||
| REP. | sql replace time | 平均每次替换耗时 | ||
| CMT. | trans commit time | 平均每次事务提交延时 | ||
| RPC | 网络信息,不准,忽略 | |||
| MEMORY(T) | 租户内存性能数据 | |||
| ⊿ACTIVE | active memstore used | 平均每秒增量内存变化量 | ||
| TOTAL | total memstore used | 增量内存累计总量 | ||
| PCT. | total memstore used | 增量内存占比 | ||
| IOPS | 集群的IO性能数据 | |||
| SES. | active sessions | 当前活跃会话数 | ||
| IOR | io read count | 平均每秒读IO次数 | ||
| IOR-SZ | io read bytes | 平均每次读IO大小 | ||
| IOW | io write count | 平均每秒写IO次数 | ||
| IOW-SZ | io write bytes | 平均每次写IO大小 | ||
| SQL Page | observer | 每个节点的性能数据 | ||
| Active Sess | active sessions | 租户在该节点当前活跃会话数 | ||
| CPU | cpu usage | 租户在该节点的CPU利用率(相对值) | ||
| Cache-BI Hit | block index cache hit | 租户在该节点数据块的索引块的命中率 | ||
| Cache-Blk Hit | block cache hit | 租户在该节点数据块的命中率 | ||
| Cache-Loc Hit | location cache hit | 租户分区在该节点位置缓存命中率 | ||
| Cache-Row Hit | row cache hit | 租户分区在该节点行缓存命中率 | ||
| IO-R Cnt | io read count | 租户在该节点平均每秒读IO次数 | ||
| IO-R Size | io read bytes | 租户在该节点平均每次读IO大小 | ||
| IO-W Cnt | io write count | 租户在该节点平均每次写IO次数 | ||
| IO-W Size | io write bytes | 租户在该节点平均每次写IO大小 | ||
| SQL | 每个节点的SQL性能数据 | |||
| SSC | sql select count | 租户在该节点平均每秒查询次数 | ||
| SSRT | sql select time | 租户在该节点平均每次查询耗时 | ||
| SIC | sql insert count | 租户在该节点平均每秒插入次数 | ||
| SIRT | sql insert time | 租户在该节点平均每次插入耗时 | ||
| SUC | sql update count | 租户在该节点平均每秒更新次数 | ||
| SURT | sql update time | 租户在该节点平均每次更新耗时 | ||
| SDC | sql delete count | 租户在该节点平均每秒删除次数 | ||
| SDRT | sql delete time | 租户在该节点平均每次删除耗时 | ||
| SRC | sql replace count | 租户在该节点平均每秒替换次数 | ||
| SRRT | sql replace time | 租户在该节点平均每次替换耗时 | ||
| TCC | trans commit count | 租户在该节点平均每秒事务提交次数 | ||
| TCRT | trans commit time | 租户在该节点平均每次事务提交延时 | ||
| SLC | sql local count | 租户在该节点平均每秒本地SQL次数 | ||
| SRC | sql remote count | 租户在该节点平均每秒远程SQL次数 |
是不是每个节点都在提供读写服务?性能分别如何?
连接数达到多少了?每个节点是多少?
IO吞吐量有多少?每个节点是多少?
数据库SQL耗时有多少?每个节点是多少?
每个节点的跨节点访问的SQL是多少?
每个节点的数据缓存命中率多少?
内存使用率如何?是否有转储?
租户虽然是3节点,实际只有一个节点在提供读写服务。
业务SQL大头是查询和更新,其次是少量插入和删除。
查询平均延时 900us (微秒),插入平均延时 320us ,更新平均延时 230us , 删除平均延时 200us 。
租户事务平均延时 800us,事务比较小。
租户有少量的IO读,吞吐量在230MB.
节点远程SQL比例约占总SQL比例在 20% 左右
观察租户 SQL 整体性能
gv$plan_cache_plan_stat和
gv$plan_cache_plan_explain。前者查看所有SQL的执行性能信息(次数、耗时、行数等等)。分析它可以根据执行耗时、执行次数、逻辑读、物理读排序获得相应的TOP SQL。
SELECT s.tenant_id, svr_ip,plan_Id,TYPE, d.database_name , query_sql, first_load_time, avg_exe_usec, slow_count,executions, slowest_exe_usec,round( rows_processed executions ) rows_per, disk_reads ,buffer_gets FROM `gv$plan_cache_plan_stat` s LEFT JOIN `gv$database` d ON (s.tenant_id =d.tenant_id AND s.db_id =d.database_id )WHERE s.tenant_id=1001 -- 改成具体的 tenant_id AND d.database_name in ('TPCC') AND s.executions > 10 -- and query_sql like '%emp%'ORDER BY avg_exe_usec desc, slow_count desc limit 10 ;

SQL性能调优
第一条SQL平均返回行数是2708 行,那超过 1ms 是可以解释的。这个估计是业务相关。
第二条SQL看平均返回行数就意义不大,因为是
count(*)
操作。需要看看实际 SQL 和执行计划。
分析 SQL 实际执行计划和解析执行计划
SELECT *FROM `gv$plan_cache_plan_explain` WHERE tenant_id=1001 AND ip = '172.30.118.70' AND port=2882 AND plan_id=338606 ;

explain命令看看这个SQL的执行计划。
EXPLAIN extended_noaddrSELECT count(*) AS low_stock FROM ( SELECT s_w_id, s_i_id, s_quantity FROM bmsql_stock WHERE s_w_id = 409 AND s_quantity < 11 AND s_i_id IN ( SELECT ol_i_id FROM bmsql_district JOIN bmsql_order_line ON ol_w_id = d_w_id AND ol_d_id = d_id AND ol_o_id >= d_next_o_id - 20 AND ol_o_id < d_next_o_id WHERE d_w_id = 409 AND d_id = 3 ) )\G*************************** 1. row ***************************Query Plan: ===========================================================|ID|OPERATOR |NAME |EST. ROWS|COST |-----------------------------------------------------------|0 |SCALAR GROUP BY | |1 |335053||1 | NESTED-LOOP JOIN | |2078 |334656||2 | SUBPLAN SCAN |VIEW1 |8677 |19956 ||3 | HASH DISTINCT | |8677 |18758 ||4 | NESTED-LOOP JOIN| |8949 |9610 ||5 | TABLE GET |BMSQL_DISTRICT |1 |53 ||6 | TABLE SCAN |BMSQL_ORDER_LINE|26847 |11293 ||7 | TABLE GET |BMSQL_STOCK |1 |37 |===========================================================Outputs & filters:------------------------------------- 0 - output([T_FUN_COUNT(*)]), filter(nil), group(nil), agg_func([T_FUN_COUNT(*)]) 1 - output([1]), filter(nil), conds(nil), nl_params_([VIEW1.BMSQL_ORDER_LINE.OL_I_ID]), inner_get=false, self_join=false, batch_join=true 2 - output([VIEW1.BMSQL_ORDER_LINE.OL_I_ID]), filter(nil), access([VIEW1.BMSQL_ORDER_LINE.OL_I_ID]) 3 - output([BMSQL_ORDER_LINE.OL_I_ID]), filter(nil), distinct([BMSQL_ORDER_LINE.OL_I_ID]) 4 - output([BMSQL_ORDER_LINE.OL_I_ID]), filter(nil), conds(nil), nl_params_([BMSQL_DISTRICT.D_NEXT_O_ID - 20], [BMSQL_DISTRICT.D_NEXT_O_ID]), inner_get=false, self_join=false, batch_join=true 5 - output([BMSQL_DISTRICT.D_NEXT_O_ID], [BMSQL_DISTRICT.D_NEXT_O_ID - 20]), filter([BMSQL_DISTRICT.D_NEXT_O_ID > BMSQL_DISTRICT.D_NEXT_O_ID - 20]), access([BMSQL_DISTRICT.D_NEXT_O_ID]), partitions(p0), is_index_back=false, filter_before_indexback[false], range_key([BMSQL_DISTRICT.D_W_ID], [BMSQL_DISTRICT.D_ID]), range[409,3 ; 409,3], range_cond([BMSQL_DISTRICT.D_W_ID = 409], [BMSQL_DISTRICT.D_ID = 3]) 6 - output([BMSQL_ORDER_LINE.OL_I_ID]), filter(nil), access([BMSQL_ORDER_LINE.OL_I_ID]), partitions(p0), is_index_back=false, range_key([BMSQL_ORDER_LINE.OL_W_ID], [BMSQL_ORDER_LINE.OL_D_ID], [BMSQL_ORDER_LINE.OL_O_ID], [BMSQL_ORDER_LINE.OL_NUMBER]), range(MIN ; MAX), range_cond([BMSQL_ORDER_LINE.OL_W_ID = 409], [BMSQL_ORDER_LINE.OL_D_ID = 3], [BMSQL_ORDER_LINE.OL_O_ID >= ?], [BMSQL_ORDER_LINE.OL_O_ID < ?]) 7 - output([1]), filter([BMSQL_STOCK.S_QUANTITY < 11]), access([BMSQL_STOCK.S_QUANTITY]), partitions(p0), is_index_back=false, filter_before_indexback[false], range_key([BMSQL_STOCK.S_W_ID], [BMSQL_STOCK.S_I_ID]), range(MIN ; MAX), range_cond([BMSQL_STOCK.S_W_ID = 409], [BMSQL_STOCK.S_I_ID = ?])1 row in set (0.03 sec)
分析 SQL执行位置和数据副本分布
第三条SQL的Type为 2 是远程SQL。根据这个 SQLID 再找找所有类似的SQL。
SELECT s.tenant_id, svr_ip,plan_Id,TYPE, d.database_name , sql_id, query_sql, first_load_time, avg_exe_usec, slow_count,executions, slowest_exe_usec ,round( rows_processed executions,2 ) avg_rows, round(disk_reads/executions ) avg_disk_reads,round(buffer_gets executions ) avg_buffer_getsFROM `gv$plan_cache_plan_stat` s LEFT JOIN `gv$database` d ON (s.tenant_id =d.tenant_id AND s.db_id =d.database_id )WHERE s.tenant_id=1001 -- 改成具体的 tenant_id AND d.database_name in ('TPCC') AND s.executions > 10 -- and query_sql like '%emp%' and s.sql_id = 'A460265EC2F0763A15DD27CE9E4E2200'ORDER BY avg_exe_usec desc, slow_count desc limit 10 ;

SELECT t.table_name, tg.tablegroup_name , t.part_num , t2.partition_id, t2.ZONE, t2.svr_ip, t2.role -- , a.primary_zone , IF(t.locality = '' OR t.locality IS NULL, a.locality, t.locality) AS locality , t2.row_count, round(data_size/1024/1024,0) data_size_MBFROM oceanbase.__all_tenant AS a JOIN oceanbase.__all_virtual_database AS d ON ( a.tenant_id = d.tenant_id ) JOIN oceanbase.__all_virtual_table AS t ON (t.tenant_id = d.tenant_id AND t.database_id = d.database_id) JOIN oceanbase.__all_virtual_meta_table t2 ON (t.tenant_id = t2.tenant_id AND (t.table_id=t2.table_id OR t.tablegroup_id=t2.table_id) AND t2.ROLE IN (1) ) LEFT JOIN oceanbase.__all_virtual_tablegroup AS tg ON (t.tenant_id = tg.tenant_id and t.tablegroup_id = tg.tablegroup_id) WHERE a.tenant_id IN (1001 ) AND t.table_type IN (3) AND d.database_name = 'TPCC' ORDER BY t.tenant_id, tg.tablegroup_name, d.database_name, t.table_name, t2.partition_id;+-------------------+-----------------+----------+--------------+-------+----------------+------+-----------+--------------+| table_name | tablegroup_name | part_num | partition_id | ZONE | svr_ip | role | row_count | data_size_MB |+-------------------+-----------------+----------+--------------+-------+----------------+------+-----------+--------------+| BMSQL_CONFIG | NULL | 1 | 0 | zone2 | 172.30.118.70 | 1 | 4 | 2 || BMSQL_ITEM | NULL | 1 | 0 | zone3 | 172.23.152.229 | 1 | 100000 | 10 || BMSQL_CUSTOMER | TPCC_GROUP2 | 1 | 0 | zone2 | 172.30.118.70 | 1 | 30000000 | 14138 || BMSQL_DISTRICT | TPCC_GROUP2 | 1 | 0 | zone2 | 172.30.118.70 | 1 | 10000 | 4 || BMSQL_HISTORY | TPCC_GROUP2 | 1 | 0 | zone2 | 172.30.118.70 | 1 | 32103424 | 748 || BMSQL_NEW_ORDER | TPCC_GROUP2 | 1 | 0 | zone2 | 172.30.118.70 | 1 | 9218558 | 6 || BMSQL_OORDER | TPCC_GROUP2 | 1 | 0 | zone2 | 172.30.118.70 | 1 | 32181498 | 342 || BMSQL_OORDER2 | TPCC_GROUP2 | 1 | 0 | zone2 | 172.30.118.70 | 1 | 1000000 | 12 || BMSQL_OORDER3 | TPCC_GROUP2 | 1 | 0 | zone2 | 172.30.118.70 | 1 | 1000000 | 6 || BMSQL_ORDER_LINE | TPCC_GROUP2 | 1 | 0 | zone2 | 172.30.118.70 | 1 | 321795064 | 11116 || BMSQL_ORDER_LINE2 | TPCC_GROUP2 | 1 | 0 | zone2 | 172.30.118.70 | 1 | 300149409 | 8668 || BMSQL_STOCK | TPCC_GROUP2 | 1 | 0 | zone2 | 172.30.118.70 | 1 | 100000000 | 23820 || BMSQL_WAREHOUSE | TPCC_GROUP2 | 1 | 0 | zone2 | 172.30.118.70 | 1 | 1000 | 2 |+-------------------+-----------------+----------+--------------+-------+----------------+------+-----------+--------------+13 rows in set (0.04 sec)
172.30.118.70上,因为它们在同一个表分组
TPCC_GROUP2中。表
BMSQL_ITEM表在节点
172.23.152.229上。再看表
BMSQL_ITEM也不是复制表。
SHOW CREATE TABLE bmsql_item;CREATE TABLE "BMSQL_ITEM" ( "I_ID" NUMBER(38) NOT NULL, "I_NAME" VARCHAR2(24), "I_PRICE" NUMBER(5,2), "I_DATA" VARCHAR2(50), "I_IM_ID" NUMBER(38), CONSTRAINT "BMSQL_ITEM_OBPK_1615598299895820" PRIMARY KEY ("I_ID"), CONSTRAINT "BMSQL_ITEM_UK" UNIQUE ("I_NAME")) COMPRESS FOR ARCHIVE REPLICA_NUM = 3 BLOCK_SIZE = 16384 USE_BLOOM_FILTER = FALSE TABLET_SIZE = 134217728 PCTFREE = 10
BMSQL_ITEM的那些 SQL类型是 1 的 SQL 都是正常的,平均延时在 180us 。主键访问(
TABLE GET),加上频繁访问命中行缓存,这个延时水平是正常的。而远程访问的那个是不正常,增加了网络上的传输时间。
SELECT PLAN_DEPTH ,PLAN_LINE_ID ,OPERATOR ,NAME,ROWS,COST,PROPERTY FROM `gv$plan_cache_plan_explain` WHERE tenant_id=1001 AND ip = '172.30.118.70' AND port=2882 AND plan_id=340985 order by PLAN_LINE_ID ;+------------+--------------+----------------------+------------+------+------+----------------------------------------------------------------------------------------------------------------------------+| PLAN_DEPTH | PLAN_LINE_ID | OPERATOR | NAME | ROWS | COST | PROPERTY |+------------+--------------+----------------------+------------+------+------+----------------------------------------------------------------------------------------------------------------------------+| 0 | 0 | PHY_DIRECT_RECEIVE | NULL | 1 | 52 | NULL || 1 | 1 | PHY_DIRECT_TRANSMIT | NULL | 1 | 52 | NULL || 2 | 2 | PHY_TABLE_SCAN | BMSQL_ITEM | 1 | 52 | table_rows:100000, physical_range_rows:1, logical_range_rows:1, index_back_rows:0, output_rows:1, est_method:local_storage |+------------+--------------+----------------------+------------+------+------+----------------------------------------------------------------------------------------------------------------------------+3 rows in set (0.01 sec)
分析 SQL 所在会话的事务的全部 SQL
gv$sql_audit)。
SELECT /*+ read_consistency(weak) ob_querytimeout(100000000) */ substr(usec_to_time(request_time),1,19) request_time_, s.svr_ip, s.sid, TRANSACTION_HASH ,s.query_sql, s.affected_rows, s.return_rows, s.ret_code, s.event, s.elapsed_time, s.queue_time, s.execute_time, plan_typeFROM gv$sql_audit sWHERE s.tenant_id=1001 and user_name='TPCC' AND request_time > ( time_to_usec(now())-36000000000*1) and sql_id='A460265EC2F0763A15DD27CE9E4E2200' and PLAN_TYPE = 2ORDER BY REQUEST_ID DESCLIMIT 100;+---------------------+---------------+------------+----------------------+---------------------------------------------------------------------------+---------------+-------------+----------+----------+--------------+------------+--------------+-----------+| request_time_ | svr_ip | sid | TRANSACTION_HASH | query_sql | affected_rows | return_rows | ret_code | event | elapsed_time | queue_time | execute_time | plan_type |+---------------------+---------------+------------+----------------------+---------------------------------------------------------------------------+---------------+-------------+----------+----------+--------------+------------+--------------+-----------+| 2021-03-20 17:58:11 | 172.30.118.70 | 3222006301 | 17714189861425713202 | SELECT i_price, i_name, i_data FROM bmsql_item WHERE i_id = 75723 | 0 | 1 | 0 | sync rpc | 1981 | 26 | 1903 | 2 || 2021-03-20 17:58:11 | 172.30.118.70 | 3222006303 | 15077805753115918393 | SELECT i_price, i_name, i_data FROM bmsql_item WHERE i_id = 79859 | 0 | 1 | 0 | sync rpc | 1715 | 23 | 1625 | 2 || 2021-03-20 17:58:11 | 172.30.118.70 | 3222006308 | 2930088302319928265 | SELECT i_price, i_name, i_data FROM bmsql_item WHERE i_id = 65475 | 0 | 1 | 0 | sync rpc | 1957 | 16 | 1879 | 2 || 2021-03-20 17:58:11 | 172.30.118.70 | 3222006307 | 8484084185755511457 | SELECT i_price, i_name, i_data FROM bmsql_item WHERE i_id = 83934 | 0 | 1 | 0 | sync rpc | 1709 | 16 | 1636 | 2 || 2021-03-20 17:58:11 | 172.30.118.70 | 3222006305 | 10846175025601860115 | SELECT i_price, i_name, i_data FROM bmsql_item WHERE i_id = 6083 | 0 | 1 | 0 | sync rpc | 1694 | 24 | 1613 | 2 || 2021-03-20 17:58:11 | 172.30.118.70 | 3222005734 | 1898163990547624876 | SELECT i_price, i_name, i_data FROM bmsql_item WHERE i_id = 35331 | 0 | 1 | 0 | sync rpc | 1901 | 11 | 1801 | 2 || 2021-03-20 17:58:11 | 172.30.118.70 | 3222006304 | 14242749847107990387 | SELECT i_price, i_name, i_data FROM bmsql_item WHERE i_id = 90094 | 0 | 1 | 0 | sync rpc | 1715 | 17 | 1626 | 2 || 2021-03-20 17:58:11 | 172.30.118.70 | 3222006301 | 17714189861425713202 | SELECT i_price, i_name, i_data FROM bmsql_item WHERE i_id = 73073 | 0 | 1 | 0 | sync rpc | 1621 | 17 | 1553 | 2 || 2021-03-20 17:58:11 | 172.30.118.70 | 3222006307 | 8484084185755511457 | SELECT i_price, i_name, i_data FROM bmsql_item WHERE i_id = 82931 | 0 | 1 | 0 | sync rpc | 1689 | 19 | 1609 | 2 || 2021-03-20 17:58:11 | 172.30.118.70 | 3222006308 | 2930088302319928265 | SELECT i_price, i_name, i_data FROM bmsql_item WHERE i_id = 64992 | 0 | 1 | 0 | sync rpc | 1989 | 37 | 1885 | 2 |<.....>| 2021-03-20 17:58:11 | 172.30.118.70 | 3222004245 | 10292058482827229406 | SELECT i_price, i_name, i_data FROM bmsql_item WHERE i_id = 1235 | 0 | 1 | 0 | sync rpc | 1657 | 16 | 1562 | 2 || 2021-03-20 17:58:11 | 172.30.118.70 | 3222000235 | 6630926809476864374 | SELECT i_price, i_name, i_data FROM bmsql_item WHERE i_id = 94062 | 0 | 1 | 0 | sync rpc | 1753 | 18 | 1672 | 2 || 2021-03-20 17:58:11 | 172.30.118.70 | 3222006307 | 6638448895855477417 | SELECT i_price, i_name, i_data FROM bmsql_item WHERE i_id = 82801 | 0 | 1 | 0 | sync rpc | 2034 | 17 | 1962 | 2 |+---------------------+---------------+------------+----------------------+---------------------------------------------------------------------------+---------------+-------------+----------+----------+--------------+------------+--------------+-----------+100 rows in set (0.60 sec)
transaction_hash值,这是事务的标识。然后根据事务标识查看该事务的所有SQL记录。(注意:227以前的版本同一会话开启新事务的时候,事务标识不变化,给定位带来一点点麻烦。此后修复了。)
SELECT /*+ ob_querytimeout(100000000) */substr(usec_to_time(request_time),1,19) request_time_, s.svr_ip, s.query_sql, plan_type,TRANSACTION_HASH FROM gv$sql_audit sWHERE s.tenant_id=1001 and user_name='TPCC' -- and sid = 3222006308 and TRANSACTION_HASH ='6638448895855477417'ORDER BY REQUEST_ID LIMIT 100;

bmsql_item的 SQL 不是事务里的第一条 SQL,所以事务的SQL都被发送到第一个 SQL 访问的表
BMSQL_DISTRICT主副本所在的节点
172.30.118.70上。后期访问节点
172.23.152.229的 SQL 自然就是远程 SQL。
复制表减少 SQL 远程节点访问
BMSQL_ITEM表是不能跟它们在一个表分组里。此时,OceanBase的解决方案就是把表
BMSQL_ITEM表变为复制表,这样它的分区在所有节点的副本都是全同步(跟主副本严格一致),备副本数据跟主副本也严格一致,此时业务事务可以就近访问该节点上的本地备副本。
2-2-2的时候,是会触发一个副本复制的过程。这里测试环境是
1-1-1,节点上本来就有备副本,所以是立即生效)。
ALTER TABLE bmsql_item duplicate_scope='cluster';


总结
其他参考
OceanBase SQL调优入门官方文档:https://www.oceanbase.com/docs/oceanbase-database/oceanbase-database-2-1/V2.1/iyf3ru
文章转载自OceanBase技术闲谈,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




