错误提示如下:
INSERT INTO JYDB.tmp_MF_StkHoldingAna SELECT DISTINCT A.InnerCode, A.ReportDate
FROM JYDB.MF_InvestIndustry A
WHERE A.JSID > 732301709313
AND A.JSID <= 732301709313
AND NOT EXISTS(SELECT 1 FROM JYDB.tmp_MF_StkHoldingAna B
WHERE B.InnerCode = A.InnerCode AND B.ReportDate = A.ReportDate);
ERROR: direct joins between distributed and local tables are not supported
HINT: Use CTE's or subqueries to select from local tables and use them in joins
其中tmp_MF_StkHoldingAna位cn上的本地表,将其转换为reference表,便可正常执行
lightdb@jydb=# select * from citus_tables ;
table_name | canopy_table_type | distribution_column | colocation_id | table_size | shard_count | table_owner | access_method
---------------------------------+-------------------+---------------------+---------------+------------+-------------+-------------+---------------
jydb.mf_investindustry | distributed | innercode | 4 | 388 MB | 80 | jydb | heap
jydb.tmp_mf_stkholdingana | reference | <none> | 8 | 72 kB | 1 | jydb | heap
lightdb@jydb=# begin;
BEGIN
lightdb@jydb=*# INSERT INTO JYDB.tmp_MF_StkHoldingAna SELECT DISTINCT A.InnerCode, A.ReportDate
lightdb@jydb-*# FROM JYDB.MF_InvestIndustry A
lightdb@jydb-*# WHERE A.JSID > 732301709313
lightdb@jydb-*# AND A.JSID <= 732301709313
lightdb@jydb-*# AND NOT EXISTS(SELECT 1 FROM JYDB.tmp_MF_StkHoldingAna B
lightdb@jydb(*# WHERE B.InnerCode = A.InnerCode AND B.ReportDate = A.ReportDate);
INSERT 0 0
复现测试
创建测试表
drop table if exists products;
drop table if exists customers;
drop table if exists orders;
create table customers (customers_id bigint,product_name varchar(100));
alter table customers add primary key (customers_id);
create table orders (order_id bigint,customers_id bigint,order_detail varchar(100));
alter table orders add primary key(order_id,customers_id);
create table products (product_id bigint,customers_id bigint,product_name varchar(100));
alter table products add primary key (product_id);
select create_distributed_table('customers','customers_id');
select create_distributed_table('orders','customers_id',colocate_with=>'customers');
lightdb@test=# select * from citus_tables ;
table_name | citus_tables_type | distribution_column | colocation_id | table_size | shard_count | table_owner | access_method
------------+-------------------+---------------------+---------------+------------+-------------+-------------+---------------
customers | distributed | customers_id | 7 | 32 kB | 4 | lightdb | heap
orders | distributed | customers_id | 7 | 32 kB | 4 | lightdb | heap
(2 rows)
如下是不会报错
lightdb@test=# select p.*,o.* from products p ,orders o where o.customers_id = p.customers_id;
product_id | customers_id | product_name | order_id | customers_id | order_detail
------------+--------------+--------------+----------+--------------+--------------
(0 rows)
但是执行下面SQL便报如下错误
lightdb@test=# select c.* , o.* from customers c , orders o where o.customers_id = c.customers_id
lightdb@test-# and exists (select 1 from products p where p.customers_id = c.customers_id);
ERROR: direct joins between distributed and local tables are not supported
HINT: Use CTE's or subqueries to select from local tables and use them in joins
关于这个错误的解释:
这个错误提示是因为 citus不支持在分布式表和本地表之间进行直接连接操作。
在 citus中,分布式表是由多个节点上的数据子集组成的,而本地表只存在于单个节点上。由于分布式表和本地表之间的数据分布不同,citus不能有效地执行这种连接操作。
相反,您可以使用公共表表达式 (CTE) 或子查询来从本地表中选择数据,然后将其与分布式表连接。例如:
WITH local_data AS (
SELECT *
FROM my_local_table
)
SELECT *
FROM my_distributed_table
JOIN local_data ON my_distributed_table.key = local_data.key
上面的示例中,我们首先使用 CTE 选择本地表中的数据,然后将其与分布式表进行连接。这种方法可以避免直接连接分布式表和本地表,从而避免了错误提示。
总之,在 citus中连接分布式表和本地表时,您应该使用 CTE 或子查询来选择本地表数据,并将其与分布式表进行连接,而不是直接连接它们。这可以确保 citus在分布式集群中正确执行查询,并避免出现错误提示。
所以将原SQL改成
lightdb@test=# with products_local as (select * from products)
lightdb@test-# select c.* , o.*
lightdb@test-# from customers c , orders o
lightdb@test-# where o.customers_id = c.customers_id
lightdb@test-# and exists (select 1 from products_local p where p.customers_id = c.customers_id);
customers_id | product_name | order_id | customers_id | order_detail
--------------+--------------+----------+--------------+--------------
(0 rows)
或者考虑将products转换成分布式参考表
SELECT create_reference_table('products');
lightdb@test=# select c.* , o.*
lightdb@test-# from customers c , orders o
lightdb@test-# where o.customers_id = c.customers_id
lightdb@test-# and exists (select 1 from products p where p.customers_id = c.customers_id);
customers_id | product_name | order_id | customers_id | order_detail
--------------+--------------+----------+--------------+--------------
(0 rows)
错误遍不会存在,正常情况下极少数会使用本地表
最后修改时间:2023-03-02 13:43:47
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。