暂无图片
暂无图片
1
暂无图片
暂无图片
暂无图片

MySQL 8.0.31 集合操作INTERSECT和EXCEPT

原创 CuiHulong 2023-01-10
1859

对于聚合的功能MySQL是都是默默的发展。在最新的8.0.31版本中提供对集合操作INTERSECT和EXCEPT。这样一来,集合操作功能基本圆满了。MySQL5.7.40版本是不支持这个集合的。

In this release MySQL adds support for the SQL standard INTERSECT and EXCEPT table operators.
query_a INTERSECT query_b includes only rows appearing in both result sets.
query_a EXCEPT query_b returns any rows from the result set of query_a which are not in the result of query_b.
INTERSECT and EXCEPT both support DISTINCT and ALL, with DISTINCT the default in both cases. (This is the same as for UNION).
INTERSECT groups before EXCEPT or UNION, so TABLE r EXCEPT TABLE s INTERSECT TABLE t is evaluated as TABLE r EXCEPT (TABLE s INTERSECT TABLE t).
See INTERSECT Clause, and EXCEPT Clause, for additional information and examples. (Bug #1309, Bug #31336, Bug #11747209, Bug #11744757)

目前MySQL里提供的集合操作包含:【UNION ,UNION ALL,INTERSECT,EXCEPT】

集合类型 数据 结果
UNION ALL A={1,2,3},B={3,4,5,6} ALL={1,2,3,3,4,5,6}
UNION A={1,2,3},B={3,4,5,6} AUB={1,2,3,4,5,6}
INTERSECT A={1,2,3},B={3,4,5,6} A∩B={3}
EXCEPT A={1,2,3},B={3,4,5,6} A-B={1,2}

image.png

集合

测试集合

CREATE TABLE `DataSetA` ( `id` bigint NOT NULL , `name` varchar(10) DEFAULT NULL ); INSERT INTO DataSetA(id,name) VALUES(1,'A'),(2,'A'),(3,'C'); CREATE TABLE `DataSetB` ( `id` bigint NOT NULL , `name` varchar(10) DEFAULT NULL ); INSERT INTO DataSetB(id,name) VALUES(3,'C'),(4,'D'),(5,'E'),(6,'F'); CREATE TABLE `DataSetC` ( `id` bigint NOT NULL, `name` varchar(10) DEFAULT NULL ); INSERT INTO DataSetC(id,name) VALUES(6,'F'),(7,'G'),(7,'G'),(8,'H');

不同的结果操作集:

image.png

执行计划:

除了UNION ALL,都需要通过Using temporary 进行运算,所以性能还是存在一定影响。毕竟涉及临时表。
image.png

集合中的DISTINCT和ALL

INTERSECT和EXCEPT都支持DISTINCT和ALL与UNION相同),在这两种情况下默认为DISTINCT。
字面意义相同是否去重或全部显示。

query_block INTERSECT [ALL | DISTINCT] query_block
mysql> table DataSetC INTERSECT DISTINCT table DataSetC; +----+------+ | id | name | +----+------+ | 6 | F | | 7 | G | | 8 | H | +----+------+ 3 rows in set (0.00 sec) mysql> table DataSetC INTERSECT ALL table DataSetC; +----+------+ | id | name | +----+------+ | 6 | F | | 7 | G | | 7 | G | | 8 | H | +----+------+ 4 rows in set (0.00 sec)

备注:对于INTERSECT ALL,左手表中任何唯一行的最大支持的重复数是4294967295。

EXCEPT的问题ALL:无法去除所有匹配值

CREATE TABLE `DataSetD` ( `id` bigint NOT NULL , `name` varchar(10) DEFAULT NULL ); INSERT INTO DataSetD(id,name) VALUES(1,'A'),(1,'A'),(1,'A'),(3,'C'); mysql> SELECT * FROM DataSetD; +----+------+ | id | name | +----+------+ | 1 | A | | 1 | A | | 1 | A | | 3 | C | +----+------+ 3 rows in set (0.00 sec) CREATE TABLE `DataSetE` ( `id` bigint NOT NULL , `name` varchar(10) DEFAULT NULL ); INSERT INTO DataSetE(id,name) VALUES(1,'A'); mysql> SELECT * FROM DataSetE; +----+------+ | id | name | +----+------+ | 1 | A | +----+------+ 1 row in set (0.00 sec) mysql > table DataSetD EXCEPT ALL table DataSetE; +----+------+ | id | name | +----+------+ | 1 | A | | 1 | A | | 3 | C | +----+------+ 2 rows in set (0.00 sec)

备注:上诉中(1,A)信息,只有去掉一个,没有全部去掉。

注意事项

从底层实现和测试现象来看,集合对比方式是用object方式进行对比。只要列数量对就可以。类型无关紧要。但实际业务中肯定是满足字段,类型一致。

1.集合对比必须具有相同的列数,列数不同出现问题:

mysql> SELECT ID FROM DataSetA INTERSECT SELECT ID ,NAME FROM DataSetB; ERROR 1222 (21000): The used SELECT statements have a different number of columns

2. 不同类型对比,可以进行对比。但对于最终结果来说就没意义。

mysql> SELECT ID FROM DataSetA UNION SELECT NAME FROM DataSetA; +------+ | ID | +------+ | 1 | | 2 | | 3 | | A | | C | +------+ 5 rows in set (0.00 sec)

3.版本对比写法不同
MySQL 5.7相比,在MySQL 8.0中,对SELECT和集合的解析器规则进行了重构,使其更加一致并减少了重复.

  • LIMIT 写法
#mysql5.7.40不支持 mysql> (SELECT 1 AS result UNION SELECT 2) LIMIT 1; ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'UNION SELECT 2) LIMIT 1' at line 1 #mysql8.0.31支持 mysql> (SELECT 1 AS result UNION SELECT 2) LIMIT 1; +--------+ | result | +--------+ | 1 | +--------+ 1 row in set (0.00 sec)
  • FOR UPDATE 写法
#mysql5.7.40 mysql> SELECT 1 FOR UPDATE UNION SELECT 1 FOR UPDATE; +---+ | 1 | +---+ | 1 | +---+ 1 row in set (0.00 sec) mysql> (SELECT 1 FOR UPDATE) UNION (SELECT 1 FOR UPDATE); +---+ | 1 | +---+ | 1 | +---+ 1 row in set (0.00 sec) #mysql8.0.31 更严谨 mysql> SELECT 1 FOR UPDATE UNION SELECT 1 FOR UPDATE; ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'UNION SELECT 1 FOR UPDATE' at line 1 mysql> (SELECT 1 FOR UPDATE) UNION (SELECT 1 FOR UPDATE); +---+ | 1 | +---+ | 1 | +---+ 1 row in set (0.00 sec)

总结

集合对比操作也算聚合的一种,在MySQL里性能影响还是比较可观的。基本要通过临时表进行处理。所以环境中需要合理使用。
EXCEPT ALL多个相同值无法全部取掉。这个需要注意下。

参考:

https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-31.html

最后修改时间:2023-01-31 11:06:18
「喜欢这篇文章,您的关注和赞赏是给作者最好的鼓励」
关注作者
【版权声明】本文为墨天轮用户原创内容,转载时必须标注文章的来源(墨天轮),文章链接,文章作者等基本信息,否则作者和墨天轮有权追究责任。如果您发现墨天轮中有涉嫌抄袭或者侵权的内容,欢迎发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

评论