直奔主题,报错如下
ERROR: cannot pushdown the subquery
DETAIL: There exist a reference table in the outer part of the outer join
其中JYDB.MF_CalmarRatio_TMP为业务中产生的临时表,最初为了考虑join的问题,将TMP表设计成为Reference Table
代码限制 外链接主表不能是参考表,改SQL代码,考虑改为CTE亲和表
将MF_CalmarRatio_TMP表改成分布式表之前,SQL改过CTE 如下
INSERT INTO JYDB.MF_CalmarRatio_TID_TMP
with x as (select * from JYDB.MF_CalmarRatio)
SELECT DISTINCT nextval('JYDB.jsid') AS ID,
3 AS Flag,
A.InnerCode,
A.IndexCode,
A.EndDate
FROM JYDB.MF_CalmarRatio_TMP A
LEFT JOIN x B ON A.InnerCode = B.InnerCode
AND A.IndexCode = B.IndexCode
AND A.EndDate = B.EndDate
WHERE B.ID IS NULL;
ERROR: the intermediate result size exceeds canopy.max_intermediate_result_size (currently 1048576 kB)
DETAIL: Canopy restricts the size of intermediate results of complex subqueries and CTEs to avoid accidentally pulling large result sets into once place.
HINT: To run the current query, set canopy.max_intermediate_result_size to a higher value or -1 to disable.
不是无脑增大max_intermediate_result_size值
如上报错的原因是因为分布式表转换为亲和表后与参考表进行join,Canopy 查询结果的大小超出了 citus.max_intermediate_result_size 的限制,当前限制为 1048576 kB。Canopy 限制复杂子查询和公共表表达式(CTE)的中间结果大小,以避免意外将大型结果集放入单个位置。
为了运行当前查询,您可以将 Canopy.max_intermediate_result_size 设置为更高的值,或将其设置为 -1 以禁用此限制。但请注意,如果中间结果集非常大,可能会影响查询的性能和可扩展性。如果您需要查询大量数据,请考虑使用分批次查询或优化查询语句以减少中间结果集的大小。
最终MF_CalmarRatio_TMP改为分布式表
jydb@jydb=> select * from canopy_tables;
table_name | canopy_table_type | distribution_column | colocation_id | table_size | shard_count | table_owner | access_method
-----------------------------+-------------------+---------------------+---------------+------------+-------------+-------------+---------------
github_events | distributed | user_id | 5 | 1920 kB | 80 | jydb | heap
github_users | distributed | user_id | 5 | 640 kB | 80 | jydb | heap
jydb_deleterec | reference | <none> | 8 | 8039 MB | 1 | jydb | heap
jydb_system | reference | <none> | 8 | 192 kB | 1 | jydb | heap
jydb_tid.mf_calmarratio_tid | reference | <none> | 8 | 568 MB | 1 | jydb_tid | heap
mf_abnormalreturn | distributed | innercode | 4 | 18 GB | 80 | jydb | heap
mf_calmarratio | distributed | innercode | 4 | 28 GB | 80 | jydb | heap
mf_calmarratio_tid_tmp | reference | <none> | 8 | 20 MB | 1 | jydb | heap
mf_calmarratio_tmp | distributed | innercode | 4 | 46 MB | 80 | jydb | heap
mf_derivativeindexinfo | reference | <none> | 8 | 360 kB | 1 | jydb | heap
mf_fundmaxdrawd | distributed | innercode | 4 | 58 GB | 80 | jydb | heap
mf_nvperformancetransh | distributed | innercode | 4 | 5172 MB | 80 | jydb | heap
mf_nvperformancetransh_tmp | reference | <none> | 8 | 47 MB | 1 | jydb | heap
tmp_mf_calmarratio | distributed | innercode | 4 | 25 MB | 80 | jydb | heap
tmp_mf_calmarratio_tid | reference | <none> | 8 | 0 bytes | 1 | jydb | heap
tmp_mf_nvperformancetransh | reference | <none> | 8 | 4872 kB | 1 | jydb | heap
再次查询查看执行计划
jydb@jydb=> explain analyze SELECT DISTINCT nextval('JYDB.jsid') AS ID,
jydb@jydb-> 3 AS Flag,
jydb@jydb-> A.InnerCode,
jydb@jydb-> A.IndexCode,
jydb@jydb-> A.EndDate
jydb@jydb-> FROM JYDB.MF_CalmarRatio B
jydb@jydb-> LEFT JOIN JYDB.MF_CalmarRatio_TMP A ON A.InnerCode = B.InnerCode
jydb@jydb-> AND A.IndexCode = B.IndexCode
jydb@jydb-> AND A.EndDate = B.EndDate
jydb@jydb-> WHERE B.ID IS NULL;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------
HashAggregate (cost=1500.00..2750.00 rows=100000 width=28) (actual time=40.556..40.564 rows=0 loops=1)
Group Key: nextval('jsid'::regclass), remote_scan.flag, remote_scan.innercode, remote_scan.indexcode, remote_scan.enddate
Batches: 1 Memory Usage: 3097kB
-> Custom Scan (Canopy Adaptive) (cost=0.00..250.00 rows=100000 width=28) (actual time=40.173..40.181 rows=0 loops=1)
Task Count: 80
Tuple data received from nodes: 0 bytes
Tasks Shown: One of 80
-> Task
Tuple data received from node: 0 bytes
Node: host=10.101.0.122 port=5432 dbname=jydb
-> Unique (cost=4.76..4.77 rows=1 width=20) (actual time=0.083..0.084 rows=0 loops=1)
-> Sort (cost=4.76..4.76 rows=1 width=20) (actual time=0.083..0.083 rows=0 loops=1)
Sort Key: a.innercode, a.indexcode, a.enddate
Sort Method: quicksort Memory: 33kB
-> Nested Loop Left Join (cost=0.70..4.75 rows=1 width=20) (actual time=0.048..0.048 rows=0 loops=1)
-> Index Scan using ix_mf_calmarratio_id_105713 on mf_calmarratio_105713 b (cost=0.43..2.44 rows=1 width=16) (actual time=0.048..0.
048 rows=0 loops=1)
Index Cond: (id IS NULL)
-> Index Only Scan using ix_mf_calmarratio_tmp_105873 on mf_calmarratio_tmp_105873 a (cost=0.28..2.30 rows=1 width=16) (never execu
ted)
Index Cond: ((innercode = b.innercode) AND (enddate = b.enddate) AND (indexcode = b.indexcode))
Heap Fetches: 0
Planning Time: 1.236 ms
Execution Time: 0.130 ms
Planning Time: 7.373 ms
Execution Time: 42.619 ms
(24 rows)
最后修改时间:2023-03-02 17:14:55
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。