SQL里的group by分组统计功能很常见,可它的扩展用法多数人并不完全熟悉。本文将介绍group by的扩展用法:rollup、cube、grouping sets。
扩展用法不仅写法简洁,语句运行时不需要多次扫表,在数据库服务端执行也避免了应用层计算,效率极高。
大饼卷一切:rollup
test表有A、B、C、D、E共5列,使用group by rollup(A,B,C),首先对(A,B,C)进行group by,然后对(A,B)进行group by,接着对(A)进行group by,最后对全表进行group by。
参数列从右到左依次减少一列,再进行group by,对于n个参数,有n+1次group by。
select A,B,C,sum(E) from test group by rollup(A,B,C);
rollup写法可展开为如下的union all写法:
select A,B,C,sum(E) from test group by A,B,C
union all
select A,B,null,sum(E) from test group by A,B
union all
select A,null,null,sum(E) from test group by A
union all
select null,null,null,sum(E) from test;
光与暗的交织:cube
cube的每个参数可理解为取值参与分组和不参与分组中的一种形态,所有形态的集合就是cube分组的集合。
使用group by cube(A,B,C),首先会对(A,B,C)进行group by,然后依次是(A,B)、(A,C)、(A)、(B,C)、(B)、(C ),最后对全表进行group by。对于n个参数的cube,有2^n次group by。
select A,B,C,sum(E) from test group by cube(A,B,C);
cube写法可展开为如下的union all写法:
select A,B,C,sum(E) from test group by A,B,C
union all
select A,B,null,sum(E) from test group by A,B
union all
select A,null,C,sum(E) from test group by A,C
union all
select A,null,null,sum(E) from test group by A
union all
select null,B,C,sum(E) from test group by B,C
union all
select null,B,null,sum(E) from test group by B
union all
select null,null,C,sum(E) from test group by C
union all
select null,null,null,sum(E) from test;
专精组合:grouping sets
不如rollup和cube那么花里胡哨,grouping sets对每个参数进行group by,可自定义专属组合:
- group by grouping sets(A,B,C):对(A)、(B)、(C )进行group by
- group by grouping sets((A,B),C):则对(A,B)、(C )进行group by
- group by grouping sets((A,B),()):则对(A,B)、()集合进行group by,()集合是对全表做group by
第三种group by grouping sets((A,B),())专精组合应用较广。
案例实践
创建测试表如下:
create table test(A varchar,B varchar,C varchar,D varchar,E int);
insert into test values('a1','b1','c1','d1',10);
insert into test values('a1','b1','c1','d2',20);
insert into test values('a2','b2','c2','d3',30);
insert into test values('a2','b2','c2','d4',50);
使用grouping sets分组
select A,
(case when B is null then '合计:'
else B || '[' || A || ']' end) sub,
sum(E) total
from test
group by grouping sets((A, B, C),())
order by A;
执行结果如下:
a | sub | total
----+--------+-------
a1 | b1[a1] | 30
a2 | b2[a2] | 80
| 合计: | 110
(3 rows)
当表数据为空时,PG与Oracle在rollup、cube、grouping sets的执行结果上有一点差异:Oracle会返回空行,PG会返回一行。
下面先清空表数据:
truncate table test;
再次执行上面的grouping sets语句,截图如下:





