## ROLLUP 的语义
ROLLUP 是一种 GROUP BY modifier,在常规的 GROUP BY clause 后面加上 ROLLUP,可以对原先的 group 的 sub-group、sub-sub-group、sub-sub-sub-group 做聚合操作。
ROLLUP 常被用来实现统计报表里面“小计”“合计”等功能。这里用一张零售业报表来说明 ROLLUP 的语义。
下图中,我们可以用 “**SUM(profit) ... GROUP BY 类目, 品牌, 型号**”来统计“小米CC9e”、“魅族 16Xs”和 “HUAWEI P20”的这些具体型号的销量情况;在此基础上,在 GROUP BY 后面加上 “WITH ROLLUP”,就可以分别统计出
1. 各种“品牌”的总销量。比如小米手机的合计销量、魅族手机的合计销量,联想电脑的合计销量。相当于 SUM(profit) ... GROUP BY (类目, 品牌)
2. 各种类目的总销量。比如手机的总销量、电脑的总销量。相当于 SUM(profit) ... GROUP BY (类目)
3. 所用东西合起来的总销量。相当于 "SELECT SUM(profit) FROM ... " ,没有 GROUP BY
相当于层层递进,计算出“小计”、“小计的小计”...“总计”。

关于 ROLLUP 的语义,MySQL 的官方文档写得比较齐全:[https://dev.mysql.com/doc/refman/8.0/en/group-by-modifiers.html](https://dev.mysql.com/doc/refman/8.0/en/group-by-modifiers.html)。
因为是关系型数据库,真实的输出结果当然是表格形式的; 而那些因 ROLLUP 操作而不存在的 cell 则用 NULL 替代:
```sql
mysql> SELECT year, country, product, SUM(profit) AS profit
FROM sales
GROUP BY year, country, product WITH ROLLUP;
+------+---------+------------+--------+
| year | country | product | profit |
+------+---------+------------+--------+
| 2000 | Finland | Computer | 1500 |
| 2000 | Finland | Phone | 100 |
| 2000 | Finland | NULL | 1600 | <-- 这个 profit 相当于 (2000, Finland) 这个 group 的销量小计
| 2000 | India | Calculator | 150 |
| 2000 | India | Computer | 1200 |
| 2000 | India | NULL | 1350 | <-- 这个 profit 相当于 (2000, India) 这个 group 的销量小计
| 2000 | USA | Calculator | 75 |
| 2000 | USA | Computer | 1500 |
| 2000 | USA | NULL | 1575 | <-- 这个 profit 相当于 (2000, USA) 这个 group 的销量小计
| 2000 | NULL | NULL | 4525 | <-- 这个 profit 相当于 (2000, ) 这个 group 的销量小计
| 2001 | Finland | Phone | 10 |
| 2001 | Finland | NULL | 10 |
| 2001 | USA | Calculator | 50 |
| 2001 | USA | Computer | 2700 |
| 2001 | USA | TV | 250 |
| 2001 | USA | NULL | 3000 |
| 2001 | NULL | NULL | 3010 |
| NULL | NULL | NULL | 7535 |
+------+---------+------------+--------+
```
## GROUPING SETS & CUBE 的语义
SQL 标准里面,ROLLUP (以及 CUBE),其实是 GROUPING SETS 操作的一种特殊情况。
ROLLUP 只能按照 GROUP BY 表达式的顺序做拓展,比如上文只能按照 group by 表达式的顺序,求 (year, country, product)、(year, country)、(year) 这些 group 的 sum(profit),但是不能倒过来,求 (product, country, year),或者 (country, year) ,的 sum(profit)。而 SQL 标准定义的 GROUPING SETS 操作可以实现这种“任意维度”的 ROLLUP:
```sql
GROUP BY GROUPING SETS (
(product, country, year),
(country, year)
)
```
因此,ROLL UP,以及另外一个 CUBE 操作,其实是 GROUPING SETS 的其中一种特殊情况,可以视为语法糖;比如 SQL 标准里面是如此定义 ROLLUP 的:

ORCALE、PG 这些数据库都有对 GROUPING SETS & CUBE 的支持;MySQL 8.0 版本里面依然没有 GROUPING SETS & CUBE 的实现,因此这里不作展开讨论。具体可以参考 PG 的[文档](https://www.postgresql.org/docs/10/queries-table-expressions.html#QUERIES-GROUPING-SETS)。附录里面有一份 SQL 标准。
## 用 GROUPBY + UNION ALL 实现 ROLLUP
### 等价转换
理解了 ROLLUP 的语义之后,就非常容易想到,其实 ROLLUP 是不同层级的 sub-group 的 union。比如上面的 query,其实等同于:
```sql
SELECT *
FROM
(
SELECT YEAR, country, product, SUM(profit) AS profit
FROM sales GROUP BY YEAR, country, product
UNION ALL
SELECT YEAR, country, NULL, SUM(profit) AS profit
FROM sales GROUP BY YEAR, country
UNION ALL
SELECT YEAR, NULL, NULL, SUM(profit) AS profit
FROM sales GROUP BY YEAR
UNION ALL
SELECT NULL, NULL, NULL, SUM(profit) AS profit
FROM sales
) AS sum_table
ORDER BY YEAR, country, product;
```
换言之,可以通过简单的 rewrite,就可以将 ROLLUP 实现成普通的 GROUPBY + UNION ALL。
### 对 GROUPING 函数的处理
由于 [GROUPING() 表达式](https://dev.mysql.com/doc/refman/8.0/en/miscellaneous-functions.html#function_grouping) 的存在,仅仅是一个单纯的 GROUPBY + UNION ALL 并不足以实现 ROLLUP 的完整语义。
GROUPING() 函数是一个判断函数,用于判断查询结果中的 NULL 到底是“正常查询中产生的 NULL”,还是“由于 ROLLUP 操作产生的 NULL”。这里的正常查询中产生的 NULL,是指数据原本就是 NULL、OUTER JOIN 产生的 NULL,等情况;而 ROLLUP 操作产生的 NULL,是指:
```sql
mysql> SELECT year, country, product, SUM(profit) AS profit
FROM sales
GROUP BY year, country, product WITH ROLLUP;
+------+---------+------------+--------+
| year | country | product | profit |
+------+---------+------------+--------+
| 2000 | Finland | Computer | 1500 |
| 2000 | Finland | Phone | 100 |
| 2000 | Finland | NULL | 1600 | <-- 这个 NULL 就是 ROLLUP 产生的 NULL
| 2000 | India | Calculator | 150 |
| 2000 | India | Computer | 1200 |
....
```
只有区别了两种 NULL,才能区别出哪些 sum(profit) 是小计、总计,哪些不是。
对于前者,GROUPING() 函数返回 0,对于后者,返回 1;因此,我们可以使用 GROUPING() 函数得到小计、总计:
```sql
mysql> SELECT
IF(GROUPING(year), 'All years', year) AS year,
IF(GROUPING(country), 'All countries', country) AS country,
IF(GROUPING(product), 'All products', product) AS product,
SUM(profit) AS profit
FROM sales
GROUP BY year, country, product WITH ROLLUP;
+-----------+---------------+--------------+--------+
| year | country | product | profit |
+-----------+---------------+--------------+--------+
| 2000 | Finland | Computer | 1500 |
| 2000 | Finland | Phone | 100 |
| 2000 | Finland | All products | 1600 |
| 2000 | India | Calculator | 150 |
| 2000 | India | Computer | 1200 |
| 2000 | India | All products | 1350 |
| 2000 | USA | Calculator | 75 |
| 2000 | USA | Computer | 1500 |
| 2000 | USA | All products | 1575 |
| 2000 | All countries | All products | 4525 |
| 2001 | Finland | Phone | 10 |
| 2001 | Finland | All products | 10 |
| 2001 | USA | Calculator | 50 |
| 2001 | USA | Computer | 2700 |
| 2001 | USA | TV | 250 |
| 2001 | USA | All products | 3000 |
| 2001 | All countries | All products | 3010 |
| All years | All countries | All products | 7535 |
+-----------+---------------+--------------+--------+
```
为了区别这两种 NULL,我们需要在 GROUPBY 的 output grouping columns 增加 GROUPING(out_column) 列。
GROUPING() 函数只能够出现在 SELECT list、HAVING clause 和 ORDER BY clause 里,而且 GROUPING() 函数的直接子节点只能是 output grouping columns(i.e., 只能是出现在 grouping 列中的表达式,而且还不能是子查询和常量表达式),在转换 AST 生成 logical relation tree 的时候(Ast2Opt.cc),我们只需要为每一个 GROUPING(out_column) 列增加一个 output grouping column,然后所有后续的 GROUPING(out_column) 都引用这个 output grouping colum 即可。比如上面的 query 可以这么等价地表示:
```sql
--- 原始 query
SELECT
IF(GROUPING(year), 'All years', year) AS year,
IF(GROUPING(country), 'All countries', country) AS country,
IF(GROUPING(product), 'All products', product) AS product,
SUM(profit) AS profit
FROM sales
GROUP BY year, country, product WITH ROLLUP;
--- 原始 plan,所有的 GROUPING(...) 都被替换成对 output grouping column field 的引用
Project | Exprs: IF(temptable1.field0, 'ALL years', temptable1.field1), IF(temptable1.field2, 'All countries', temptable1.field2) ...
GROUPBY | temptable1 | ROLLUP: true | OutputGroupExprs: GROUPING(year), year, GROUPING(country), country, GROUPING(product), product | ...
TABLESCAN | sales
---
然后把上面的
GROUPBY | temptable1 | ROLLUP: true ...
转换成多个 GROUPBY 的 UNION ALL
```
换个角度来看,上面 query 可被视作做如下等价转换:
```sql
--- 原始 query
SELECT
IF(GROUPING(year), 'All years', year) AS year,
IF(GROUPING(country), 'All countries', country) AS country,
IF(GROUPING(product), 'All products', product) AS product,
SUM(profit) AS profit
FROM sales
GROUP BY year, country, product WITH ROLLUP;
-- 等价 query (严格来说不是合法的 SQL,这里用来表意)
SELECT year, country, product, profit FROM
(
SELECT
GROUPING(year) as g_year,
GROUPING(country) as g_country,
GROUPING(product) as g_product,
IF(g_year, 'All years', year) AS year,
IF(g_country, 'All countries', country) AS country,
IF(g_product, 'All products', product) AS product,
SUM(profit) AS profit
FROM sales
GROUP BY year, country, product WITH ROLLUP;
) as derived0
-- 转换成 UNION 的等价 query (严格来说不是合法的 SQL,这里用来表意)
SELECT year, country, product, profit FROM
(
SELECT
GROUPING(year) as g_year, -- year 在 GROUP BY 中,所以 g_year = 0
GROUPING(country) as g_country, -- country 在 GROUP BY 中,所以 g_country = 0
GROUPING(product) as g_product, -- product 在 GROUP BY 中,所以 g_product = 0
IF(g_year, 'All years', year) AS year,
IF(g_country, 'All countries', country) AS country,
IF(g_product, 'All products', product) AS product,
SUM(profit) AS profit
FROM sales
GROUP BY YEAR, country, product
UNION ALL
SELECT
GROUPING(year) as g_year, -- year 在 GROUP BY 中,所以 g_year = 0
GROUPING(country) as g_country, -- country 在 GROUP BY 中,所以 g_country = 0
GROUPING(product) as g_product, -- product 不在 GROUP BY 中,所以 g_product = 1
IF(g_year, 'All years', year) AS year,
IF(g_country, 'All countries', country) AS country,
IF(g_product, 'All products', product) AS product,
SUM(profit) AS profit
FROM sales
GROUP BY YEAR, country
UNION ALL
SELECT
GROUPING(year) as g_year, -- year 在 GROUP BY 中,所以 g_year = 0
GROUPING(country) as g_country, -- country 不在 GROUP BY 中,所以 g_country = 1
GROUPING(product) as g_product, -- product 不在 GROUP BY 中,所以 g_product = 1
IF(g_year, 'All years', year) AS year,
IF(g_country, 'All countries', country) AS country,
IF(g_product, 'All products', product) AS product,
SUM(profit) AS profit
FROM sales
GROUP BY YEAR
UNION ALL
SELECT
GROUPING(year) as g_year, -- year 不在 GROUP BY 中,所以 g_year = 1
GROUPING(country) as g_country, -- country 不在 GROUP BY 中,所以 g_country = 1
GROUPING(product) as g_product, -- product 不在 GROUP BY 中,所以 g_product = 1
IF(g_year, 'All years', year) AS year,
IF(g_country, 'All countries', country) AS country,
IF(g_product, 'All products', product) AS product,
SUM(profit) AS profit
FROM sales
) AS sum_table
ORDER BY YEAR, country, product;
```
换言之,只需要在 GROUPBY 算子输出 year, country, product 等列的时候,一起把 GRUOPING(year), GROUPING(country), GROUPING(product) 也输出了,然后所有上层对 GRUOPING(year), GROUPING(country), GROUPING(product) 的引用都直接引用这些输出列即可。
### Shared sub-plan
由于上文中说的“等价转换”会将一个 query block 转换成多个 query block 然后再使用 UNION ALL 合并,因此会有一个 shared sub-plan 的问题,这是因为,这种等价转换,会产生 sales 表的多次扫描。
换句话说,如果上文中的例子不是
```sql
SELECT ... FROM sales ... WITH ROLLUP;
```
而是
```sql
SELECT ... FROM ( a_very_compilicated_subquery ) ... WITH ROLLUP;
```
则可能会导致这个 a_very_compilicated_subquery 的多次执行,带来比较大的开销。解决这种 shared sub-plan 的方法可以使用 Table Spool ([https://yuque.antfin-inc.com/nituizi/oncxfu/uegeiy#WddGh](https://yuque.antfin-inc.com/nituizi/oncxfu/uegeiy#WddGh))(其实类似于一个 view 在多个地方被使用,要避免这个 view 被执行多次一样)。
## 使用 EXPAND 算子实现 ROLLUP
ROLLUP 除了可以使用 UNION ALL + GROUP BY 算子实现之外,还可以使用 EXPAND 算子。
### 一个例子
这里先不谈 EXPAND 算子的定义,而是直接展示使用 EXPAND 实现 ROLLUP 的一个例子。
首先假设原始数据:
```sql
+------+---------+------------+--------+
| year | country | product | profit |
+------+---------+------------+--------+
row 1 | 2000 | Finland | Computer | 300 |
row 2 | 2000 | Finland | Computer | 1100 |
row 3 | 2000 | Finland | Phone | 10 |
row 4 | 2000 | USA | Computer | 1500 |
row 5 | 2001 | Finland | TV | 1500 |
row 6 | 2001 | Finland | TV | 2000 |
row 7 | 2001 | USA | TV | 1000 |
+------+---------+------------+--------+
```
对于这样一个带有 ROLLUP 的 SQL:
```sql
SELECT year, country, product, SUM(profit) AS profit
FROM sales
GROUP BY year, country, product WITH ROLLUP;
```
如前文所述,逻辑上我们需要
1. GROUP BY year, country, product
2. GROUP BY year, country
3. GROUP BY year
4. SCALAR AGGREGATION
四者的 UNION ALL。
为了达到这个目的,我们对 sales 表的每一行做一个“扩张”操作,使之变成:
```sql
+------+---------+------------+--------+
| year | country | product | profit |
+------+---------+------------+--------+
row 1 | 2000 | Finland | Computer | 300 |
| 2000 | Finland | NULL | 300 | --+
| 2000 | NULL | NULL | 300 | +-- These 3 rows are "EXPAND" of row 1
| NULL | NULL | NULL | 300 | --+
row 2 | 2000 | Finland | Computer | 1100 |
| 2000 | Finland | NULL | 1100 | --+
| 2000 | NULL | NULL | 1100 | +-- These 3 rows are "EXPAND" of row 2
| NULL | NULL | NULL | 1100 | --+
row 3 | 2000 | Finland | Phone | 10 |
| 2000 | Finland | NULL | 10 | --+
| 2000 | NULL | NULL | 10 | +-- These 3 rows are "EXPAND" of row3
| NULL | NULL | NULL | 10 | --+
row 4 | 2000 | USA | Computer | 1500 |
| 2000 | USA | NULL | 1500 | --+
| 2000 | NULL | NULL | 1500 | +-- These 3 rows are "EXPAND" of row4
| NULL | NULL | NULL | 1500 | --+
row 5 | 2001 | Finland | TV | 1500 |
| 2001 | Finland | NULL | 1500 | --+
| 2001 | NULL | NULL | 1500 | +-- These 3 rows are "EXPAND" of row5
| NULL | NULL | NULL | 1500 | --+
row 6 | 2001 | Finland | TV | 2000 |
| 2001 | Finland | NULL | 2000 | --+
| 2001 | NULL | NULL | 2000 | +-- These 3 rows are "EXPAND" of row6
| NULL | NULL | NULL | 2000 | --+
row 7 | 2001 | USA | TV | 1000 |
| 2001 | USA | NULL | 1000 | --+
| 2001 | NULL | NULL | 1000 | +-- These 3 rows are "EXPAND" of row7
| NULL | NULL | NULL | 1000 | --+
+------+---------+------------+--------+
```
在这样一张扩张表上,使用原来的 **不带 ROLLUP** 的 query,即可达到与 ROLLUP 的同样效果
```sql
SELECT year, country, product, SUM(profit) AS profit
FROM sales
GROUP BY year, country, product;
```
这是因为,在使用 EXPAND 操作填充了 NULL 之后,等于**新增了这些 GROUP 的聚合 sum(profit)**
1. (year, country, NULL)
2. (year, NULL, NULL)
3. (NULL, NULL, NULL)
而这些新增的 GROUP 的结果,其实就等同于使用 ROLLUP 得到的结果。
### EXPAND 对 GROUPING() 函数的处理
上文提到,SQL 标准里面定义了 GROUPING() 函数,用于区分结果集中的 NULL 到底是由 ROLLUP 操作产生的,还是由其他操作产生的;而且 GROUPING() 函数的子函数只能是 grouping expression(i.e., 在 GROUP BY clause 里的那些表达式)。因此,使用 EXPAND 算子实现 ROLLUP,也必须实现这个功能。
在 EXPAND 算子中处理 GROUPING() 函数,其思想跟上文中**“使用GROUPBY + UNION ALL实现 ROLLUP”**里对 GROUPING() 函数的处理是一样的,就是在 EXPAND 的结果集中,增加 parent relation 需要用到的 GROUPING(output_group_column) 函数的结果。
比如,对于 query
```sql
SELECT
IF(GROUPING(year), 'All years', year) AS year,
IF(GROUPING(country), 'All countries', country) AS country,
IF(GROUPING(product), 'All products', product) AS product,
SUM(profit) AS profit
FROM sales
GROUP BY year, country, product WITH ROLLUP;
```
由于 SELECT list 里面用到 GROUPING(year)、GROUPING(country) 和 GROUPING(product),所以 EXPAND 的结果集里面会包含这三列:
```sql
+------+---------+------------+--------+----------------+-------------------+-------------------+
| year | country | product | profit | GROUPING(year) | GROUPING(country) | GROUPING(product) |
+------+---------+------------+--------+----------------+-------------------+-------------------+
row 1 | 2000 | Finland | Computer | 300 | 0 | 0 | 0 |
| 2000 | Finland | NULL | 300 | 0 | 0 | 1 |
| 2000 | NULL | NULL | 300 | 0 | 1 | 1 |
| NULL | NULL | NULL | 300 | 1 | 1 | 1 |
row 2 | 2000 | Finland | Computer | 1100 | 0 | 0 | 0 |
| 2000 | Finland | NULL | 1100 | 0 | 0 | 1 |
| 2000 | NULL | NULL | 1100 | 0 | 1 | 1 |
| NULL | NULL | NULL | 1100 | 1 | 1 | 1 |
....
```
后续 SELECT list 里的 GROUPING(...) 改成对这些输出列的引用即可:
```sql
SELECT
IF(field_grouping_year, 'All years', year) AS year,
IF(field_grouping_country, 'All countries', country) AS country,
IF(field_grouping_product, 'All products', product) AS product,
SUM(profit) AS profit
FROM sales
...
```
## GROUPBY + UNION ALL v.s. EXPAND
这两种方式实现起来都比较简单:
1. 在不考虑 shared sub-plan 的情况下,GROUP BY + UNION ALL 的方式,只需要在优化器里面实现一个 rewrite 规则即可;
2. EXPAND 没有 shared sub-plan 的问题,只需要实现一个 EXPAND 算子,而这个算子干的事仅仅是对输入的每一行,吐出多行,同时往不同的地方填 NULL 或者 GROUPING()
从实现功能的角度来讲,前者比较简单;从性能的角度来讲,因为填充操作本身非常轻量(只是内存压力大点),而且只需要做一次 GROUP BY,因此后者占优。即使有 shared sub-plan,也是后者占优。
从扩展性来讲,使用 EXPAND 能非常容易地实现出 CUBE 操作(按照排列组合的方式补 NULL),而且对 GROUP BY 的一些操作,比如 AGGR(DISTINCT),也是有用的(使用 EXPAND + 2 * GROUPBY 可以等价实现 GROUPBY + AGGR(DISTINCT) ),因此这个算子后续的扩展性比较强。
- SQL 标准 - 1999 - PART 2 - Foundation
- [ansi-iso-9075-2-1999.pdf](https://yuque.antfin.com/attachments/lark/0/2022/pdf/136668/1641294620352-43cc3ce0-edfa-4273-8079-24c564e026d8.pdf?_lake_card=%7B%22src%22%3A%22https%3A%2F%2Fyuque.antfin.com%2Fattachments%2Flark%2F0%2F2022%2Fpdf%2F136668%2F1641294620352-43cc3ce0-edfa-4273-8079-24c564e026d8.pdf%22%2C%22name%22%3A%22ansi-iso-9075-2-1999.pdf%22%2C%22size%22%3A2651411%2C%22type%22%3A%22application%2Fpdf%22%2C%22ext%22%3A%22pdf%22%2C%22status%22%3A%22done%22%2C%22taskId%22%3A%22u06e6aea1-9575-48d8-936f-9aa1ae224a8%22%2C%22taskType%22%3A%22upload%22%2C%22id%22%3A%22u68191197%22%2C%22card%22%3A%22file%22%7D)




