通过临时文件定位对应的sql
现场在执行一些中间结果集较大的sql时,在tmpdata文件夹下会产生大量的中间临时文件。在生成大量临时文件过程中,会导致磁盘io繁忙,在一定程度上影响了集群整体性能。因为现场并发sql较多,通过show processlist排查sql,效率低。如何通过tmpdata下临时文件快速准确找出所对应的sql?
以下介绍了根据临时文件定位对应SQL的方法,找到一个较为简单的方法获取正在执行的SQL占用磁盘空间的情况。
在gn层,performance_schema中有一个表 session_memory_usage_info。
suse103:~ # gncli
GBase client 8.5.1.2-52064.build11.5_R12.4.2.3.67091. Copyright © 2004-2016, GBase. All Rights Reserved.
gbase> use performance_schema;
Query OK, 0 rows affected (Elapsed: 00:00:00.00)
gbase> show tables;
±-----------------------------+
| Tables_in_performance_schema |
±-----------------------------+
| DISK_USAGE_INFO |
| CACHE_USAGE_INFO |
| CACHE_CELL_STATUS_INFO |
| HEAP_USAGE_INFO |
| SESSION_MEMORY_USAGE_INFO |
| MEMORY_USAGE_INFO |
| TABLES |
±-----------------------------+
7 rows in set (Elapsed: 00:00:00.00)
gbase> select * from performance_schema.SESSION_MEMORY_USAGE_INFO;
±--------±----±--------±-----±--------------------±-----------+
| HOST | ID | CURRENT | PEAK | PEAK_TIMESTAMP | TEMP_SPACE |
±--------±----±--------±-----±--------------------±-----------+
| suse103 | 344 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 342 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 324 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 289 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 274 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 273 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 241 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 215 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 214 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 213 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 132 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 52 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 31 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 26 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 25 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
| suse103 | 17 | 0 | 0 | 1970-01-01 08:00:00 | 0 |
±--------±----±--------±-----±--------------------±-----------+
16 rows in set (Elapsed: 00:00:00.00)
gbase> select * from performance_schema.SESSION_MEMORY_USAGE_INFO where TEMP_SPACE>0;
Empty set (Elapsed: 00:00:00.00)
查询SESSION_MEMORY_USAGE_INFO这个表,其中temp_space字段就是表示占用tmpdata空间大小,ID字段表示SQL执行的ID(对应show processlist 中的第一个字段“ID”)。然后在gn层的show processlist中可以找到对应的SQL,根据node层的SQL可以查找对应的gc层的SQL。
举例说明:
以下测试在8.5.1.2_build11.5_r12.4_2_3版本上完成。
(1)执行一条lineorder表自关联的sql,此sql形成笛卡尔积,会产生大量临时文件。
gbase> select * from lineorder a,lineorder b where a.lo_linenumber=b.lo_linenumber;
(2)查看/opt/gnode/tmpdata下会产生大量临时文件,并且还在一直增加
suse103:/opt/gnode/tmpdata/cache_gbase # pwd
/opt/gnode/tmpdata/cache_gbase
suse103:/opt/gnode/tmpdata/cache_gbase # ll
total 75497472
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000001965940xa3d8500.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000001CCF3A0xa3a61c0.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000002D97070x38ae3c0.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT0000000753F980x9626440.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT0000000FB81C50xbd28000.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000013199B0x17fc7f40.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000026185130x619a6c0.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT000000268BA140x3d14440.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000030F188E0x9626300.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000035DD840x17fc7e00.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000057D15420x38aea00.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT0000005C49BAC0x3d12a00.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT0000005F38C290xbd28140.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000069942570x619bac0.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000069F92D60xa3dbe80.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:25 GB_MAT00000078F35830xa3dbd40.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT00000101965940xa3d8500.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT00000101CCF3A0xa3a61c0.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT00000102D97070x38ae3c0.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT00000130F188E0x9626300.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT00000135DD840x17fc7e00.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT00000157D15420x38aea00.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT0000015C49BAC0x3d12a00.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT0000015F38C290xbd28140.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT00000169942570x619bac0.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT00000169F92D60xa3dbe80.express_tmp
-rw-rw---- 1 gbase gbase 1979711488 Nov 8 14:26 GB_MAT00000178F35830xa3dbd40.express_tmp
-rw-rw---- 1 gbase gbase 469762048 Nov 8 14:27 GB_MAT00000201965940xa3d8500.express_tmp
-rw-rw---- 1 gbase gbase 469762048 Nov 8 14:27 GB_MAT00000201CCF3A0xa3a61c0.express_tmp
-rw-rw---- 1 gbase gbase 402653184 Nov 8 14:27 GB_MAT00000202D97070x38ae3c0.express_tmp
-rw-rw---- 1 gbase gbase 469762048 Nov 8 14:27 GB_MAT0000020753F980x9626440.express_tmp
-rw-rw---- 1 gbase gbase 436207616 Nov 8 14:27 GB_MAT0000020FB81C50xbd28000.express_tmp
-rw-rw---- 1 gbase gbase 436207616 Nov 8 14:27 GB_MAT00000213199B0x17fc7f40.express_tmp
-rw-rw---- 1 gbase gbase 436207616 Nov 8 14:27 GB_MAT00000226185130x619a6c0.express_tmp
-rw-rw---- 1 gbase gbase 469762048 Nov 8 14:27 GB_MAT00000269F92D60xa3dbe80.express_tmp
-rw-rw---- 1 gbase gbase 469762048 Nov 8 14:27 GB_MAT00000278F35830xa3dbd40.express_tmp
drwxrwx–x 2 gbase gbase 6 Nov 1 17:34 HashJoin
drwxrwx–x 2 gbase gbase 6 Nov 8 13:58 TrashCan
drwxrwx–x 3 gbase gbase 21 Nov 1 17:34 tmp_materialized
suse103:/opt/gnode/tmpdata # pwd
/opt/gnode/tmpdatasuse103:/opt/gnode/tmpdata # du -sh *
97G cache_gbase
suse103:/opt/gnode/tmpdata #
(3)下面按照上面描述方法,找出生成大量临时文件的sql语句。
查询SESSION_MEMORY_USAGE_INFO表找出sql id。
gbase> select * from performance_schema.SESSION_MEMORY_USAGE_INFO where TEMP_SPACE>0;
±--------±----±--------±-----±--------------------±------------+
| HOST | ID | CURRENT | PEAK | PEAK_TIMESTAMP | TEMP_SPACE |
±--------±----±--------±-----±--------------------±------------+
| suse103 | 342 | 0 | 0 | 1970-01-01 08:00:00 | 86872424448 |
±--------±----±--------±-----±--------------------±------------+
1 row in set (Elapsed: 00:00:00.00)
gbase>
根据sql id通过show full processlist找出gnode层sql语句。
suse103:~ # gncli -uroot -e"show full processlist" | grep 342
342 root 10.10.10.103:35972 ssbm Query 224 init SELECT /10.10.10.103_179_2_2016-11-08_14:24:21/ /*+ TID(‘78’) */ _tmp_rht_n4_179_t9_1_1477982902_s.lo_orderkey AS lo_orderkey, _tmp_rht_n4_179_t9_1_1477982902_s.lo_linenumber AS lo_linenumber, _tmp_rht_n4_179_t9_1_1477982902_s.lo_custkey AS lo_custkey, _tmp_rht_n4_179_t9_1_1477982902_s.lo_partkey AS lo_partkey, _tmp_rht_n4_179_t9_1_1477982902_s.lo_suppkey AS lo_suppkey, _tmp_rht_n4_179_t9_1_1477982902_s.lo_orderdate AS lo_orderdate, _tmp_rht_n4_179_t9_1_1477982902_s.lo_orderpriority AS lo_orderpriority, _tmp_rht_n4_179_t9_1_1477982902_s.lo_shippriority AS lo_shippriority, _tmp_rht_n4_179_t9_1_1477982902_s.lo_quantity AS lo_quantity, _tmp_rht_n4_179_t9_1_1477982902_s.lo_extendedprice AS lo_extendedprice, _tmp_rht_n4_179_t9_1_1477982902_s.lo_ordtotalprice AS lo_ordtotalprice, _tmp_rht_n4_179_t9_1_1477982902_s.lo_discount AS lo_discount, _tmp_rht_n4_179_t9_1_1477982902_s.lo_revenue AS lo_revenue, _tmp_rht_n4_179_t9_1_1477982902_s.lo_supplycost AS lo_supplycost, _tmp_rht_n4_179_t9_1_1477982902_s.lo_tax AS lo_tax, _tmp_rht_n4_179_t9_1_1477982902_s.lo_commitdate AS lo_commitdate, _tmp_rht_n4_179_t9_1_1477982902_s.lo_shipmode AS lo_shipmode, _tmp_rht_n4_179_t9_2_1477982902_s.c1 AS c1, _tmp_rht_n4_179_t9_2_1477982902_s.c2 AS c2, _tmp_rht_n4_179_t9_2_1477982902_s.c3 AS c3, _tmp_rht_n4_179_t9_2_1477982902_s.c4 AS c4, _tmp_rht_n4_179_t9_2_1477982902_s.c5 AS c5, _tmp_rht_n4_179_t9_2_1477982902_s.c6 AS c6, _tmp_rht_n4_179_t9_2_1477982902_s.c7 AS c7, _tmp_rht_n4_179_t9_2_1477982902_s.c8 AS c8, _tmp_rht_n4_179_t9_2_1477982902_s.c9 AS c9, _tmp_rht_n4_179_t9_2_1477982902_s.c10 AS c10, _tmp_rht_n4_179_t9_2_1477982902_s.c11 AS c11, _tmp_rht_n4_179_t9_2_1477982902_s.c12 AS c12, _tmp_rht_n4_179_t9_2_1477982902_s.c13 AS c13, _tmp_rht_n4_179_t9_2_1477982902_s.c14 AS c14, _tmp_rht_n4_179_t9_2_1477982902_s.c15 AS c15, _tmp_rht_n4_179_t9_2_1477982902_s.c16 AS c16, _tmp_rht_n4_179_t9_2_1477982902_s.c17 AS c17 FROM gctmpdb._tmp_rht_n4_179_t9_1_1477982902_s INNER JOIN gctmpdb._tmp_rht_n4_179_t9_2_1477982902_s ON (_tmp_rht_n4_179_t9_1_1477982902_s.lo_linenumber = _tmp_rht_n4_179_t9_2_1477982902_s.c2)
suse103:~ #
根据gnode层sql语句(/10.10.10.103_179_2_2016-11-08_14:24:21/或者临时表名_tmp_rht_n4_179_t9_1_1477982902_s)确定集群层sql session id,进而定位到具体sql。
suse103:~ # gccli -uroot -e"show full processlist" | grep -v Sleep | grep 179
179 root 127.0.0.1:63945 ssbm Query 571 Sending task to gnodes select * from lineorder a,lineorder b where a.lo_linenumber=b.lo_linenumber
suse103:~ #
另外,还有一种根据trace文件信息,获取对应SQL产生临时文件大小的方法(可用于SQL结束后查找问题SQL)。
具体信息如下:
执行计划总结(SUMMARY)
SUMMARY –- 执行计划总结
elapsed time: 00:00:07.897 – SQL执行时间
data loaded from storage: 130M, 4.184s, 419 DC.-- 总共从磁盘读了130MB (共419个DC)的数据,用时4.184s
data decompressed: 0B, 0s.-- 总共解压了0B数据,用时0s
temp space IO stats:-- 查询过程中临时表空间使用情况
CB write( 0B, 0time, 0sec), read( 0B, 0time, 0sec)-- 物化中间结果(读写大小, 读写请求次数,读写花费时间)
SRT write( 0B, 0time, 0sec), read( 0B, 0time, 0sec)-- 排序中间结果
GDC write( 0B, 0time, 0sec), read( 0B, 0time, 0sec)-- groupby distinct中间结果
MAT write( 57K, 12time, 0sec), read( 440K, 88time, 0sec)-- join中间结果
HSJ write( 0B, 0time, 0sec), read( 0B, 0time, 0sec)-- One-Pass Hash Join分片文件中间结果
第一个字段是写了多少磁盘文件,第二个字段是该文件读了多少次,第三个字段是花费多少时间。这里我们只关心write部分的第一个字段即可。可以用脚本获取每个SQL中写磁盘文件超过一个上限的(例如10G)所有SQL,然后再具体分析SQL。
还有一部分也能看到执行过程中磁盘动态占用的情况:
资源监控部分
[M:1.45G, 27M,D: 633K] [DC: 19006, 3899]
[M:1.45G, 41M,D: 633K] [DC: 19500, 3901]
[M:1.45G, 41M,D: 633K] [DC: 19500, 3901]
[M:1.50G, 54M,D: 633K] [DC: 19500, 4110]
[M:1.50G, 48M,D: 633K] [DC: 19500, 4110]
[M:1.50G, 48M,D: 633K] [DC: 19500, 4110]
[M:1.50G, 41M,D: 633K] [DC: 19500, 4111] – [内存:数据堆,large堆,磁盘:临时表空间] [访问DC数:内存命中,磁盘访问]
– (1). 数据堆:data heap的使用状况,全局状态
– (2). Large堆: 算子buffer的使用情况, 全局状态
– (3). 临时表空间: session级,语句执行过程中占用的临时磁盘空间
– (4). 内存中访问DC总数 :session级 可以看出查询中处理的数据量 语句执行过程中,从内存中访问的DC总数
– (5). 磁盘中访问DC总数 :session级 可以看到在什么步聚引起的IO 语句执行过程中,从磁盘中访问的DC总数




