案例分析总结:
遇到复杂sql执行缓慢的问题,在确认数据库层面无异常后,一般处理逻辑是对执行计划进行分析看是否有调优的空间。首先通过简化文本来分析sql的表连接关系和业务逻辑等,在充分了解sql的前提下再对执行计划进行分析并找到问题点,最后对症下药,通过语句等价改写、语句中加入hint、绑定执行计划、统计信息收集等等方式达到优化执行计划的目的。
案例分析过程:
本次分享的是个帐平台数据库生产环境中一句复杂sql的分析和优化, sql的文本如下,看到这样的sql先不能头晕眼花哦,要通过一些方法进行简化然后着手分析:
select a.msg_type,
a.PROCESSING_CODE,
a.POS_COND_CODE,
a.PAN,
a.PAN_LEN,
a.AMT_TRANS,
a.RETRIVL_REF,
a.BAT_NO,
a.CUP_SSN,
a.DATE_SETTLMT,
a.CARD_ACCP_TERM_ID,
a.CARD_ACCP_ID,
a.TIME_LOCAL_TRANS,
a.DATE_LOCAL_TRANS,
a.REVSAL_FLAG,
a.CANCEL_FLAG,
SUM(a.SUPPLY_FEE_AMT)
from (selecta.msg_type,
a.PROCESSING_CODE,
a.POS_COND_CODE,
a.PAN,
a.PAN_LEN,
a.AMT_TRANS as AMT_TRANS,
a.RETRIVL_REF,
a.BAT_NO,
a.CUP_SSN,
a.DATE_SETTLMT,
a.CARD_ACCP_TERM_ID,
a.CARD_ACCP_ID,
a.TIME_LOCAL_TRANS,
a.DATE_LOCAL_TRANS,
a.REVSAL_FLAG,
a.CANCEL_FLAG,
b.SUPPLY_FEE_AMT
from(select msg_type,
PROCESSING_CODE,
POS_COND_CODE,
PAN,
PAN_LEN,
AMT_TRANS,
RETRIVL_REF,
BAT_NO,
CUP_SSN,
DATE_SETTLMT,
CARD_ACCP_TERM_ID,
CARD_ACCP_ID,
TIME_LOCAL_TRANS,
DATE_LOCAL_TRANS,
REVSAL_FLAG,
CANCEL_FLAG,
txn_num,
resp_code,
trans_state
from xx_posp_txn_his a
where a.trans_state = '1'
and a.DATE_SETTLMT = '20151118'
and a.REVSAL_FLAG = '0'
and a.CANCEL_FLAG = '0'
and a.processing_code in ('000000', '990000')
and a.resp_code = '00'
and a.txn_num in ('1105', '1125')
and a.CARD_ACCP_ID in
(select m_org_code
from xx_UMS_MERCHANT_REL_INFO
whereF_UMS_CODE = '100600'
andm_org_code <> '111999949000000'
andm_org_code not in
(select mer_code
from T_INFO_UNIONPAY_MERCHANT
where agent_code = '100600'))) a
join(select SUPPLY_FEE_AMT, TERMINAL_COMM_SEQ_NO
from xx_log_offline_payment_his
wheresupply_org_code in
(select m_org_code
fromT_UMS_MERCHANT_REL_INFO
whereF_UMS_CODE = '100600'
andm_org_code <> '111999949000000'
and m_org_code not in
(select mer_code
from xx_INFO_UNIONPAY_MERCHANT
where agent_code = '100600'))
and settle_date = '20151118'
union all
select SUPPLY_FEE_AMT, TERMINAL_COMM_SEQ_NO
from xx_log_online_payment_his
where supply_org_code in
(select m_org_code
fromT_UMS_MERCHANT_REL_INFO
where F_UMS_CODE ='100600'
andm_org_code <> '111999949000000'
andm_org_code not in
(select mer_code
from XX_INFO_UNIONPAY_MERCHANT
where agent_code = '100600'))
and settle_date = '20151118') b
ona.RETRIVL_REF = b.TERMINAL_COMM_SEQ_NO) a
group bya.msg_type,
a.PROCESSING_CODE,
a.POS_COND_CODE,
a.PAN,
a.PAN_LEN,
a.AMT_TRANS,
a.RETRIVL_REF,
a.BAT_NO,
a.CUP_SSN,
a.DATE_SETTLMT,
a.CARD_ACCP_TERM_ID,
a.CARD_ACCP_ID,
a.TIME_LOCAL_TRANS,
a.DATE_LOCAL_TRANS,
a.REVSAL_FLAG,
a.CANCEL_FLAG
这句sql语句美化后也有120行样子了,算是比较繁杂的语句了,这该怎么入手分析呢,我一般先把无关紧要的那些处于select和from之间的输出列去掉,然后将不涉及表关联的谓词和非索引列谓词去掉,这里还可以把三处标红的相同子查询简化掉,另外,原语句中表的别名重复使用了,这会使得sql可读性变差,也顺便修改一下,经过简化后本文就变得很简单了:
select cc.XXX,sum(cc.SUPPLY_FEE_AMT)
from (selectaa.XXX,bb.XXX
from
(selecta.XXX
from t_posp_txn_his a
where a.DATE_SETTLMT = '20151118'
anda.CARD_ACCP_ID in 子查询) aa
join (select XXX
from XX_log_offline_payment_his
where supply_org_code in 子查询
and settle_date = '20151118'
union all
select XXX
from xx_log_online_payment_his
wheresupply_org_code in 子查询
and settle_date ='20151118') bb
onaa.RETRIVL_REF = bb.TERMINAL_COMM_SEQ_NO) cc
group by cc.XXXX……;
简化后,很容易就能看懂这句sql的意思,我用不同颜色将主体区分成三部分,后面对执行计划的分析中也用相同的颜色对应。
三部分都涉及到的相同子查询,单独运行一下发现数据量比较少,只有671条数据:
selectm_org_code
from XX_UMS_MERCHANT_REL_INFO
where F_UMS_CODE ='100600'
and m_org_code<> '111999949000000'
and m_org_code not in
(select mer_code
from XX_INFO_UNIONPAY_MERCHANT
whereagent_code = '100600')
在数据库中运行的这条sql是有性能问题的,平均执行一次需要748S:

我们来看看执行计划:
看下蓝色的部分,XX_LOG_ONLINE_PAYMENT_HIS表是走了SETTLEDATE列的索引,
这部分过滤出来3378128多条数据,然后和子查询做hash连接,显然这个不好
select count(*) from xx_log_online_payment_his where settle_date = '20151118'
--3378128
前面说到那个子查询结果只有671条数据,所以用子查询做驱动表,在supply_org_code列上走嵌套连接会比较好,XX_LOG_ONLINE_PAYMENT_HIS表的supply_org_code列有索引的
再看绿色部分,先通过XX_LOG_OFFLINE_PAYMENT_HIS表的SETTLEDATE列索引过滤了数据(基本没有数据),然后和XX_UMS_MERCHANT_REL_INFO 做嵌套,最后和T_INFO_UNIONPAY_MERCHANT做HASH
最后看下紫色部分,XX_posp_txn_his表是走了DATE_SETTLMT列的索引来过滤数据,过滤效率一般,CARD_ACCP_ID列上没有索引
select count(*) from XX_posp_txn_his where DATE_SETTLMT ='20151118'--256582
看来问题都有子查询展开后的表连接顺序有关
那我们可以考虑,将三个相同的子查询独立成一个临时表,一来简化了sql文本,二来结构更清晰,三来将子查询作为一个整体视图使得连接关系更清楚
改写后语句如下:
withtmp_m_org_code as (
select m_org_code
from XX_UMS_MERCHANT_REL_INFO
where F_UMS_CODE = '100600'
and m_org_code<> '111999949000000'
and m_org_code not in
(select mer_code
from XX_INFO_UNIONPAY_MERCHANT
whereagent_code = '100600'))
select a.msg_type,
a.PROCESSING_CODE,
a.POS_COND_CODE,
a.PAN,
a.PAN_LEN,
a.AMT_TRANS,
a.RETRIVL_REF,
a.BAT_NO,
a.CUP_SSN,
a.DATE_SETTLMT,
a.CARD_ACCP_TERM_ID,
a.CARD_ACCP_ID,
a.TIME_LOCAL_TRANS,
a.DATE_LOCAL_TRANS,
a.REVSAL_FLAG,
a.CANCEL_FLAG,
SUM(a.SUPPLY_FEE_AMT)
from (selecta.msg_type,
a.PROCESSING_CODE,
a.POS_COND_CODE,
a.PAN,
a.PAN_LEN,
a.AMT_TRANS as AMT_TRANS,
a.RETRIVL_REF,
a.BAT_NO,
a.CUP_SSN,
a.DATE_SETTLMT,
a.CARD_ACCP_TERM_ID,
a.CARD_ACCP_ID,
a.TIME_LOCAL_TRANS,
a.DATE_LOCAL_TRANS,
a.REVSAL_FLAG,
a.CANCEL_FLAG,
b.SUPPLY_FEE_AMT
from(select msg_type,
PROCESSING_CODE,
POS_COND_CODE,
PAN,
PAN_LEN,
AMT_TRANS,
RETRIVL_REF,
BAT_NO,
CUP_SSN,
DATE_SETTLMT,
CARD_ACCP_TERM_ID,
CARD_ACCP_ID,
TIME_LOCAL_TRANS,
DATE_LOCAL_TRANS,
REVSAL_FLAG,
CANCEL_FLAG,
txn_num,
resp_code,
trans_state
from xx_posp_txn_his a,tmp_m_org_code mm
where a.trans_state = '1'
and a.DATE_SETTLMT = '20151118'
and a.REVSAL_FLAG = '0'
and a.CANCEL_FLAG = '0'
and a.processing_code in ('000000', '990000')
and a.resp_code = '00'
and a.txn_num in ('1105', '1125')
and a.CARD_ACCP_ID=mm.m_org_code) a
join(select off.SUPPLY_FEE_AMT, off.TERMINAL_COMM_SEQ_NO
from xx_log_offline_payment_his off,tmp_m_org_codemm
where off.supply_org_code =mm.m_org_code
and off.settle_date = '20151118'
union all
select onl.SUPPLY_FEE_AMT, onl.TERMINAL_COMM_SEQ_NO
from xx_log_online_payment_his onl,tmp_m_org_codemm
where onl.supply_org_code=mm.m_org_code
and onl.settle_date = '20151118')b
ona.RETRIVL_REF = b.TERMINAL_COMM_SEQ_NO) a
group bya.msg_type,
a.PROCESSING_CODE,
a.POS_COND_CODE,
a.PAN,
a.PAN_LEN,
a.AMT_TRANS,
a.RETRIVL_REF,
a.BAT_NO,
a.CUP_SSN,
a.DATE_SETTLMT,
a.CARD_ACCP_TERM_ID,
a.CARD_ACCP_ID,
a.TIME_LOCAL_TRANS,
a.DATE_LOCAL_TRANS,
a.REVSAL_FLAG,
a.CANCEL_FLAG;
来看一下新的执行计划,执行计划更加清晰了,表的访问也更合理了,20s内可以出结果
2jhu1wkw42gjd

运行后发现执行计划会突变成走XX_LOG_OFFLINE_PAYMENT_HIS表的SETTLEDATE列索引

我们发现,走SETTLEDATE列上的索引性能就会恶化变,注意这里显示和cardinality feedback有关,看来这里这个特性帮了倒忙

接下来,我们可以绑定执行计划或者是使用hint的方法来强制走plan_hash_value为3685817487的执行计划,当然建议绑定计划,方便省事,如果要使用hint的话,那么又要改动一下语句了,我尝试用hint关闭cardinality feedback特性后运行正常,
hint关闭CF特性的语句如下:
withtmp_m_org_code as (
selectm_org_code
from XX_UMS_MERCHANT_REL_INFO
where F_UMS_CODE ='100600'
and m_org_code<> '111999949000000'
and m_org_code not in
(select mer_code
from XX_INFO_UNIONPAY_MERCHANT
whereagent_code = '100600'))
select a.msg_type,
a.PROCESSING_CODE,
a.POS_COND_CODE,
a.PAN,
a.PAN_LEN,
a.AMT_TRANS,
a.RETRIVL_REF,
a.BAT_NO,
a.CUP_SSN,
a.DATE_SETTLMT,
a.CARD_ACCP_TERM_ID,
a.CARD_ACCP_ID,
a.TIME_LOCAL_TRANS,
a.DATE_LOCAL_TRANS,
a.REVSAL_FLAG,
a.CANCEL_FLAG,
SUM(a.SUPPLY_FEE_AMT)
from (selecta.msg_type,
a.PROCESSING_CODE,
a.POS_COND_CODE,
a.PAN,
a.PAN_LEN,
a.AMT_TRANS as AMT_TRANS,
a.RETRIVL_REF,
a.BAT_NO,
a.CUP_SSN,
a.DATE_SETTLMT,
a.CARD_ACCP_TERM_ID,
a.CARD_ACCP_ID,
a.TIME_LOCAL_TRANS,
a.DATE_LOCAL_TRANS,
a.REVSAL_FLAG,
a.CANCEL_FLAG,
b.SUPPLY_FEE_AMT
from(select msg_type,
PROCESSING_CODE,
POS_COND_CODE,
PAN,
PAN_LEN,
AMT_TRANS,
RETRIVL_REF,
BAT_NO,
CUP_SSN,
DATE_SETTLMT,
CARD_ACCP_TERM_ID,
CARD_ACCP_ID,
TIME_LOCAL_TRANS,
DATE_LOCAL_TRANS,
REVSAL_FLAG,
CANCEL_FLAG,
txn_num,
resp_code,
trans_state
from x_posp_txn_his a,tmp_m_org_code mm
where a.trans_state = '1'
and a.DATE_SETTLMT = '20151118'
and a.REVSAL_FLAG = '0'
and a.CANCEL_FLAG = '0'
and a.processing_code in ('000000', '990000')
and a.resp_code = '00'
and a.txn_num in ('1105', '1125')
and a.CARD_ACCP_ID=mm.m_org_code) a
join(select off.SUPPLY_FEE_AMT, off.TERMINAL_COMM_SEQ_NO
from XX_log_offline_payment_his off,tmp_m_org_codemm
where off.supply_org_code =mm.m_org_code
and off.settle_date = '20151118'
union all
select *+opt_param('_optimizer_use_feedback' 'false')*/ onl.SUPPLY_FEE_AMT, onl.TERMINAL_COMM_SEQ_NO
from x_log_online_payment_his onl,tmp_m_org_codemm
where onl.supply_org_code=mm.m_org_code
and onl.settle_date = '20151118') b
ona.RETRIVL_REF = b.TERMINAL_COMM_SEQ_NO) a
group bya.msg_type,
a.PROCESSING_CODE,
a.POS_COND_CODE,
a.PAN,
a.PAN_LEN,
a.AMT_TRANS,
a.RETRIVL_REF,
a.BAT_NO,
a.CUP_SSN,
a.DATE_SETTLMT,
a.CARD_ACCP_TERM_ID,
a.CARD_ACCP_ID,
a.TIME_LOCAL_TRANS,
a.DATE_LOCAL_TRANS,
a.REVSAL_FLAG,
a.CANCEL_FLAG;




