扩展GROUP BY概述
Oracle扩展GROUP BY允许使用SQL语句对数据汇总结果进行多维展现,从而生成复杂的报表,为决策者提供有效的数据支持。主要表现在:
ROLLUP、CUBE、GROUPING SETS扩展GROUP BY子句提供了不同多维分组统计功能。 3个扩展分组函数:GROUPING、GROUPING_ID、GROUP_ID提供扩展GROUP BY的辅助功能:提供区别结果行属于哪个分组级别,区分NULL值,建立有意义的报表,对汇总结果排序,过滤结果行等功能。 对扩展GROUP BY允许按重复列分组、组合列分组、部分分组、连接分组等复杂功能,另外GROUPING SETS可以接受CUBE、ROLLUP操作作为参数,这些功能使扩展GROUP BY更加强大。
通过SQL语句对上述功能的组合使用,就可以实现制作复杂的多维分析报表的功能。针对不同维度的报表统计,使用扩展GROUP BY的强大功能很容易实现,而且SQL编写更简单,性能也比同等的UNION ALL更好,在后面的内容中,我们会见识到强大的扩展GROUP BY功能。
CUBE
C0n + C1n + C2n + … + Cnn = 2n
2.1 CUBE分组
SELECT … GROUP BY CUBE(grouping_column_reference_list)
SELECT a.dname,b.job, SUM(b.sal) sum_sal
FROM dept a,emp b
WHERE a.deptno = b.deptno
GROUP BY CUBE(a.dname,b.job);
DNAME JOB SUM_SAL
---------------------------- ------------------ ----------
24925
CLERK 3050
ANALYST 3000
MANAGER 8275
SALESMAN 5600
PRESIDENT 5000
SALES 9400
SALES CLERK 950
SALES MANAGER 2850
SALES SALESMAN 5600
RESEARCH 6775
RESEARCH CLERK 800
RESEARCH ANALYST 3000
RESEARCH MANAGER 2975
ACCOUNTING 8750
ACCOUNTING CLERK 1300
ACCOUNTING MANAGER 2450
ACCOUNTING PRESIDENT 5000
已选择18行。
执行计划
----------------------------------------------------------
Plan hash value: 2432972551
-------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 12 | 336 | 5 (20)| 00:00:01 |
| 1 | SORT GROUP BY | | 12 | 336 | 5 (20)| 00:00:01 |
| 2 | GENERATE CUBE | | 12 | 336 | 5 (20)| 00:00:01 |
| 3 | SORT GROUP BY | | 12 | 336 | 5 (20)| 00:00:01 |
| 4 | NESTED LOOPS | | 12 | 336 | 4 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL | EMP | 12 | 180 | 3 (0)| 00:00:01 |
| 6 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 13 | 1 (0)| 00:00:01 |
|* 7 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
7 - access("A"."DEPTNO"="B"."DEPTNO")
CUBE对两列操作,对应4个分组级别,最终对各种可能性分组进行统计,获得多维度更加精细的数据统计结果。
2.2 部分CUBE分组
SELECT a.dname,b.job, SUM(b.sal) sum_sal
FROM dept a,emp b
WHERE a.deptno = b.deptno
GROUP BY a.dname,CUBE(b.job);
DNAME JOB SUM_SAL
---------------------------- ------------------ ----------
SALES 9400
SALES CLERK 950
SALES MANAGER 2850
SALES SALESMAN 5600
RESEARCH 6775
RESEARCH CLERK 800
RESEARCH ANALYST 3000
RESEARCH MANAGER 2975
ACCOUNTING 8750
ACCOUNTING CLERK 1300
ACCOUNTING MANAGER 2450
ACCOUNTING PRESIDENT 5000
已选择12行。
CUBE像ROLLUP一样,可以实现多维数组分析统计工作,而且CUBE是对所有可能性的组合情况进行统计,从而生成交叉报表,CUBE分组级别更多,结果更精细,从而为决策者提供强大的数据支撑,为实现灵活的报表提供保障。
GROUPING SETS实现小计
3.1 GROUPING SETS分组
SELECT … GROUP BY GROUPING SETS(grouping_column_reference_list)
SELECT to_char(b.hiredate,'yyyy') hire_year,a.dname,b.job, SUM(sal) sum_sal
FROM dept a,emp b
WHERE a.deptno = b.deptno
GROUP BY GROUPING SETS(to_char(b.hiredate,'yyyy'),a.dname,b.job);
HIRE_YEA DNAME JOB SUM_SAL
-------- ---------------------------- ------------------ ----------
CLERK 3050
SALESMAN 5600
PRESIDENT 5000
MANAGER 8275
ANALYST 3000
ACCOUNTING 8750
RESEARCH 6775
SALES 9400
1980 800
1982 1300
1981 22825
已选择11行。
执行计划
----------------------------------------------------------------------------------------------
Plan hash value: 18386332
-------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 12 | 384 | 15 (20)| 00:00:01 |
| 1 | TEMP TABLE TRANSFORMATION | | | | | |
| 2 | LOAD AS SELECT | | | | | |
| 3 | NESTED LOOPS | | 12 | 432 | 4 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL | EMP | 12 | 276 | 3 (0)| 00:00:01 |
| 5 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 13 | 1 (0)| 00:00:01 |
|* 6 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| 00:00:01 |
| 7 | LOAD AS SELECT | | | | | |
| 8 | HASH GROUP BY | | 1 | 19 | 3 (34)| 00:00:01 |
| 9 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6612_4AC9B4F | 1 | 19 | 2 (0)| 00:00:01 |
| 10 | LOAD AS SELECT | | | | | |
| 11 | HASH GROUP BY | | 1 | 22 | 3 (34)| 00:00:01 |
| 12 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6612_4AC9B4F | 1 | 22 | 2 (0)| 00:00:01 |
| 13 | LOAD AS SELECT | | | | | |
| 14 | HASH GROUP BY | | 1 | 17 | 3 (34)| 00:00:01 |
| 15 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6612_4AC9B4F | 1 | 17 | 2 (0)| 00:00:01 |
| 16 | VIEW | | 1 | 32 | 2 (0)| 00:00:01 |
| 17 | TABLE ACCESS FULL | SYS_TEMP_0FD9D6613_4AC9B4F | 1 | 32 | 2 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("SYS_TBL_$2$"."DEPTNO"="SYS_TBL_$1$"."DEPTNO")
3.2 部分GROUPING SETS分组
SELECT a.dname,to_char(b.hiredate,'yyyy') hire_year,b.job, SUM(sal) sum_sal
FROM dept a,emp b
WHERE a.deptno = b.deptno
GROUP BY a.dname, GROUPING SETS(to_char(b.hiredate,'yyyy'),b.job);
DNAME HIRE_YEA JOB SUM_SAL
---------------------------- -------- ------------------ ----------
SALES MANAGER 2850
SALES CLERK 950
ACCOUNTING MANAGER 2450
ACCOUNTING PRESIDENT 5000
ACCOUNTING CLERK 1300
SALES SALESMAN 5600
RESEARCH MANAGER 2975
RESEARCH ANALYST 3000
RESEARCH CLERK 800
RESEARCH 1981 5975
SALES 1981 9400
ACCOUNTING 1981 7450
ACCOUNTING 1982 1300
RESEARCH 1980 800
已选择14行。
3.3 CUBE、ROLLUP作为GROUPING SETS的参数
SELECT a.dname,b.job, SUM(sal) sum_sal
FROM dept a,emp b
WHERE a.deptno = b.deptno
GROUP BY GROUPING SETS(ROLLUP(a.dname),ROLLUP(b.job));
DNAME JOB SUM_SAL
---------------------------- ------------------ ----------
CLERK 3050
SALESMAN 5600
PRESIDENT 5000
MANAGER 8275
ANALYST 3000
ACCOUNTING 8750
RESEARCH 6775
SALES 9400
24925
24925
SELECT a.dname,NULL job, SUM(sal) sum_sal
FROM dept a,emp b
WHERE a.deptno = b.deptno
GROUP BY ROLLUP(a.dname)
UNION ALL
SELECT NULL dname,b.job, SUM(sal) sum_sal
FROM dept a,emp b
WHERE a.deptno = b.deptno
GROUP BY ROLLUP(b.job);
GROUPING SETS总结

本文作者:丁 俊(上海新炬中北团队)
本文来源:“IT那活儿”公众号

文章转载自IT那活儿,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




