为什么使用Oracle Exadata Smart Scan?因为智能扫描是非常棒的特性,也是所有数据库中独一无二的特性。
什么是Smart Scan以及为什么用Smart Scan?
简而言之,Smart Scan是Oracle将SQL处理任务卸载到Exadata存储服务器上,存储服务器处理大量数据后将一小部分返回给数据库,那么数据库本身要处理的数据就少了,这对于数据仓库/ ODS等OLAP数据库带来的性能提升是非常大的。
为了使Smart Scan正常运行,SQL必须具有以下要求:
您要查询的段必须存储在Exadata中,其中磁盘组cell.smart_scan_capable参数设置为true。
必须进行全表扫描或索引快速全扫描操作。
该段必须足够大以触发直接路径读取操作。
让我们来看一个例子:
这是我的表SALES,它具有1.9GB的数据和2000万行:
SQL> select segment_name, segment_type, bytes/1024/1024 mb from user_segments where segment_name='SALES' SEGMENT_NAME SEGMENT_TYPE MB -------------------- ------------------ ---------- SALES TABLE 1984 SQL> select count(*) from sales COUNT(*) ---------- 20000000 Elapsed: 00:00:03.97
为了使用smart scan功能,首先确认cell_offload_processing参数设置为true:
SQL> show parameter cell_offload_processing NAME TYPE VALUE ------------------------------------ ----------- ------------------------ cell_offload_processing boolean TRUE --也可以指定hint不让SQL走smart scan SQL> select /*+ OPT_PARAM('cell_offload_processing' 'false') */ max(ORDER_DATE) from SALES; MAX(ORDER --------- 28-SEP-19 Elapsed: 00:00:16.52
上面我们可以看到,没有使用smart scan的执行时间为16.52秒。
当我查询未使用smart scan功能的会话的统计信息时,从单元(存储服务器)到数据库服务器返回数据1901MB,物理读1901MB:
SQL> select s.name, m.value/1024/1024 mb from v$mystat m, v$sysstat s where m.statistic#=s.statistic# and (s.name like '%physical IO%' or s.name like '%optimized%' or s.name like 'physical%total bytes'); NAME MB ---------------------------------------------------------------- ---------- physical read requests optimized 7.6294E-06 physical read total bytes optimized .921875 physical read total bytes 1901.14063 physical write requests optimized 0 physical write total bytes optimized 0 physical write total bytes 0 cell physical IO interconnect bytes 1901.14063 cell physical IO bytes saved during optimized file creation 0 cell physical IO bytes saved during optimized RMAN file restore 0 cell physical IO bytes eligible for predicate offload 0 cell physical IO bytes saved by storage index 0 cell physical IO bytes sent directly to DB node to balance CPU 0 cell physical IO interconnect bytes returned by smart scan 0 cell simulated physical IO bytes eligible for predicate offload 0 cell simulated physical IO bytes returned by predicate offload 0 15 rows selected. Elapsed: 00:00:00.01
现在,我断开连接并重新连接以重置会话统计信息,并且去掉hint,确保SQL能使用smart scan:
SQL> select max(ORDER_DATE) from SALES; MAX(ORDER --------- 28-SEP-19 Elapsed: 00:00:04.36
我们可以看到时间减少了四倍。
那是因为使用了smart scan功能,可以通过再次查询会话统计信息来确认:
SQL> select s.name, m.value/1024/1024 mb from v$mystat m, v$sysstat s where m.statistic#=s.statistic# and (s.name like '%physical IO%' or s.name like '%optimized%' or s.name like 'physical%total bytes'); NAME MB ---------------------------------------------------------------- ---------- physical read requests optimized 0 physical read total bytes optimized 0 physical read total bytes 1901.14063 physical write requests optimized 0 physical write total bytes optimized 0 physical write total bytes 0 cell physical IO interconnect bytes 275.122597 cell physical IO bytes saved during optimized file creation 0 cell physical IO bytes saved during optimized RMAN file restore 0 cell physical IO bytes eligible for predicate offload 1901.14063 cell physical IO bytes saved by storage index 0 cell physical IO bytes sent directly to DB node to balance CPU 0 cell physical IO interconnect bytes returned by smart scan 275.122597 cell simulated physical IO bytes eligible for predicate offload 0 cell simulated physical IO bytes returned by predicate offload 0 15 rows selected. Elapsed: 00:00:00.00
我们看到物理读的总数是相同的,但是存储单元返回的数据只有275MB,这与smart scan返回的大小相同。这就是查询运行更快的原因;存储服务器处理数据并仅返回对我的查询重要的内容。smart scan功能具有“列过滤”和“谓词推入”的优势,只返回实际查询的行和列中的数据。
另外,前面提到必须打开直接路径读,数据库才能使用smart scan功能。让我们看一个示例,当我通过将隐藏参数_serial_direct_read更改为never来禁用Direct Path Read时:
SQL> alter session set "_serial_direct_read" = never; Session altered. Elapsed: 00:00:00.00 SQL> select max(ORDER_DATE) from SALES; MAX(ORDER --------- 28-SEP-19 Elapsed: 00:00:25.52
我们可以看到,它比第一次执行花费更多的时间,而且所有本应直接传递到会话PGA的数据都传递给了Buffer Cache:
SQL> select s.name, m.value/1024/1024 mb from v$mystat m, v$sysstat s where m.statistic#=s.statistic# and (s.name like '%physical IO%' or s.name like '%optimized%' or s.name like 'physical%total bytes'); NAME MB ---------------------------------------------------------------- ---------- physical read requests optimized .000020027 physical read total bytes optimized .9609375 physical read total bytes 1901.23438 physical write requests optimized 0 physical write total bytes optimized 0 physical write total bytes 0 cell physical IO interconnect bytes 1901.23438 cell physical IO bytes saved during optimized file creation 0 cell physical IO bytes saved during optimized RMAN file restore 0 cell physical IO bytes eligible for predicate offload 0 cell physical IO bytes saved by storage index 0 cell physical IO bytes sent directly to DB node to balance CPU 0 cell physical IO interconnect bytes returned by smart scan 0 cell simulated physical IO bytes eligible for predicate offload 0 cell simulated physical IO bytes returned by predicate offload 0 15 rows selected. Elapsed: 00:00:00.00
所有,smart scan是非常棒的特性和技术,能够非常有效的提升大数据量的查询SQL。