
How to Make Good Use of Parallelism in Oracle Database

甲骨文云技术 2021-01-21

Taking Oracle Database as the example, this article shows that in the world of computing, parallelism is a powerful weapon: use it well and performance improves dramatically; use it badly and you may fall short of the expected performance, or worse, drag down the availability of the entire system. Hence the sentiment behind this topic: parallelism, it is not easy to say I love you.
Below we walk through some real test scenarios. Demo environment: Oracle RAC 11.2.0.4 (3 nodes).


1. A parallel insert that has no effect

Test case:

    create table Z_OBJ tablespace TBS_1 as select * from dba_objects;
    insert /*+ append parallel(t0,16) */ into Z_OBJ t0 select /*+ parallel(t1,16) */ * from Z_OBJ t1;
    commit;
    -- run the insert several times, then check the segment size
    select owner,segment_name,bytes/1024/1024 from dba_segments where segment_name='Z_OBJ';

    Running this test case, we find that the parallel degree is not really being used and throughput is poor (monitoring shows write I/O of only a few hundred MB/s, where GB/s levels would normally be expected).
    Check the execution plan:

      SQL> explain plan for insert /*+ append parallel(t0,16) */ into Z_OBJ t0 select /*+ parallel(t1,16) */ * from Z_OBJ t1;


      Explained.


      SQL> set lines 1000 pages 200
      SQL> select * from table(dbms_xplan.display());


      PLAN_TABLE_OUTPUT
      ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
      Plan hash value: 1886916412


      ---------------------------------------------------------------------------------------------------------------
      | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
      ---------------------------------------------------------------------------------------------------------------
      | 0 | INSERT STATEMENT | | 91M| 17G| 23842 (1)| 00:00:01 | | | |
      | 1 | LOAD AS SELECT | Z_OBJ | | | | | | | |
      | 2 | PX COORDINATOR | | | | | | | | |
      | 3 | PX SEND QC (RANDOM)| :TQ10000 | 91M| 17G| 23842 (1)| 00:00:01 | Q1,00 | P->S | QC (RAND) |
      | 4 | PX BLOCK ITERATOR | | 91M| 17G| 23842 (1)| 00:00:01 | Q1,00 | PCWC | |
      | 5 | TABLE ACCESS FULL| Z_OBJ | 91M| 17G| 23842 (1)| 00:00:01 | Q1,00 | PCWP | |
      ---------------------------------------------------------------------------------------------------------------


      Note
      -----
      - dynamic sampling used for this statement (level=2)


      16 rows selected.

      As you can see, only the query side of the statement runs in parallel; the insert side does not, even though we specified parallel hints for both.
      At this point, parallel DML must be explicitly enabled:

        alter session enable parallel dml;

        Checking the execution plan again, the insert side can now use parallelism:

          SQL> explain plan for insert /*+ append parallel(t0,16) */ into Z_OBJ t0 select /*+ parallel(t1,16) */ * from Z_OBJ t1;


          Explained.


          SQL> select * from table(dbms_xplan.display());


          PLAN_TABLE_OUTPUT
          ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
          Plan hash value: 2135351304


          ---------------------------------------------------------------------------------------------------------------
          | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | TQ |IN-OUT| PQ Distrib |
          ---------------------------------------------------------------------------------------------------------------
          | 0 | INSERT STATEMENT | | 91M| 17G| 23842 (1)| 00:00:01 | | | |
          | 1 | PX COORDINATOR | | | | | | | | |
          | 2 | PX SEND QC (RANDOM) | :TQ10000 | 91M| 17G| 23842 (1)| 00:00:01 | Q1,00 | P->S | QC (RAND) |
          | 3 | LOAD AS SELECT | Z_OBJ | | | | | Q1,00 | PCWP | |
          | 4 | PX BLOCK ITERATOR | | 91M| 17G| 23842 (1)| 00:00:01 | Q1,00 | PCWC | |
          | 5 | TABLE ACCESS FULL| Z_OBJ | 91M| 17G| 23842 (1)| 00:00:01 | Q1,00 | PCWP | |
          ---------------------------------------------------------------------------------------------------------------


          Note
          -----
          - dynamic sampling used for this statement (level=2)


          16 rows selected.

          Takeaway 1: Not just insert; any parallel DML operation requires parallel DML to be explicitly enabled: alter session enable parallel dml;
          Note that although parallel DML brings a dramatic performance gain in this test, in production you should think carefully before using it, because of the impact of TM locks. I once saw a customer enable parallel DML while the application was also making a large number of concurrent calls, which caused severe TM lock waits; in the end, disabling parallel DML removed the TM lock waits and actually improved overall performance.
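          As a quick sanity check, here is a minimal sketch (assuming you have access to V$SESSION) for confirming in the current session whether parallel query/DML/DDL are actually enabled before running the load:

            -- ENABLED / DISABLED / FORCED for the current session
            select pq_status, pdml_status, pddl_status
              from v$session
             where sid = sys_context('userenv', 'sid');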


          2. Parallelism on the local node only

          By default, parallel operations are distributed across all nodes of the RAC cluster, but in many production databases we do not want parallel execution to cross nodes. In that case, set the following parameter:

            alter system set parallel_force_local=true sid='*';

            Run the insert again and watch dstat on every node: only the local node shows writes of a few hundred MB/s, which confirms that parallel_force_local=true has taken effect dynamically:

              ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
              usr sys idl wai hiq siq| read writ| recv send| in out | int csw
              1 0 98 0 0 0| 163M 326M| 74k 61k| 0 0 | 17k 51k
              2 0 98 0 0 0| 164M 325M| 479k 29k| 0 0 | 18k 51k
              2 0 98 0 0 0| 165M 330M| 833k 1347k| 0 0 | 21k 54k
              1 0 98 0 0 0| 167M 336M| 47k 58k| 0 0 | 18k 52k
              1 0 98 0 0 0| 173M 340M| 507k 31k| 0 0 | 18k 53k
              1 0 98 0 0 0| 176M 354M| 77k 546k| 0 0 | 18k 54k
              1 0 98 0 0 0| 168M 341M| 43k 44k| 0 0 | 18k 53k
              2 0 98 0 0 0| 177M 353M| 32k 42k| 0 0 | 18k 54k
              2 0 98 0 0 0| 183M 362M| 65k 67k| 0 0 | 17k 54k
              1 0 98 0 0 0| 163M 329M| 44k 44k| 0 0 | 16k 49k
              1 0 98 0 0 0| 165M 328M| 39k 33k| 0 0 | 18k 51k
              1 0 98 0 0 0| 161M 323M| 43k 56k| 0 0 | 17k 50k
              2 0 98 0 0 0| 182M 360M| 44k 49k| 0 0 | 18k 55k
              1 0 98 0 0 0| 166M 331M| 34k 52k| 0 0 | 18k 51k
                2   0  98   0   0   0| 162M  327M|  25k   25k|   0     0 |  18k   51k

              Now, combining this with the lesson from section 1 and enabling parallel DML, throughput improves dramatically and the local node writes several GB per second:

                ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
                usr sys idl wai hiq siq| read writ| recv send| in out | int csw
                8 1 90 1 0 0|2927M 5882M| 771k 140k| 0 0 | 107k 157k
                9 1 90 1 0 0|3134M 6266M| 759k 1484k| 0 0 | 108k 161k
                8 1 90 1 0 0|3021M 6042M| 154k 178k| 0 0 | 104k 155k
                9 1 90 0 0 0|3000M 6004M| 259k 266k| 0 0 | 106k 156k
                9 1 90 0 0 0|2875M 5754M| 129k 142k| 0 0 | 102k 150k
                9 1 90 0 0 0|3082M 6160M| 127k 135k| 0 0 | 108k 158k
                9 1 90 0 0 0|3044M 6095M| 655k 642k| 0 0 | 107k 158k
                9 1 89 0 0 0|2961M 5923M| 125k 134k| 0 0 | 105k 153k
                9 1 90 0 0 0|2875M 5747M| 137k 168k| 0 0 | 102k 150k
                9 1 90 0 0 0|3156M 6312M| 127k 135k| 0 0 | 109k 163k
                9 1 90 1 0 0|3144M 6291M| 130k 138k| 0 0 | 109k 162k
                9 1 90 1 0 0|3058M 6117M| 125k 143k| 0 0 | 106k 157k
                9 1 90 0 0 0|3138M 6279M| 132k 139k| 0 0 | 108k 161k
                9 1 90 0 0 0|3039M 6074M| 141k 143k| 0 0 | 106k 156k
                  4   1  95   0   0   0|1237M 2615M| 986k   61k|   0     0 |  68k   90k

                Takeaway 2: Set parallel_force_local=true to force parallel operations to execute on the local node only; this is a dynamic parameter.
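                To double-check that the setting is in effect on every instance of the cluster, a minimal sketch (assuming access to GV$SYSTEM_PARAMETER):

                  -- parallel_force_local as seen by each RAC instance
                  select inst_id, name, value, issys_modifiable
                    from gv$system_parameter
                   where name = 'parallel_force_local';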


                3. The effect of increasing the degree of parallelism

                Create a large table Z_OBJ_3 and insert data with a degree of parallelism of 32:

                  create table Z_OBJ_3 tablespace TBS_3 as select * from dba_objects ;


                  insert /*+ append parallel(t0,32) */ into Z_OBJ_3 t0 select /*+ parallel(t1,32) */ * from Z_OBJ t1;
                  commit;

                  The insert completes in 25 seconds; the higher degree of parallelism brings a further performance gain:

                    SQL> insert /*+ append parallel(t0,32) */ into Z_OBJ_3 t0 select /*+ parallel(t1,32) */ * from Z_OBJ t1;


                    867092478 rows created.


                    Elapsed: 00:00:25.52

                      dstat now shows write throughput of more than 8000 MB/s:

                      ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
                      usr sys idl wai hiq siq| read writ| recv send| in out | int csw
                      0 0 100 0 0 0|2489k 1036k| 0 0 | 0 0 | 10k 9766
                      13 1 83 2 0 0|3755M 7542M| 699k 1055k| 0 0 | 143k 210k
                      12 2 84 2 0 0|3634M 7407M| 447k 453k| 0 0 | 147k 209k
                      13 1 83 2 0 0|4202M 8402M| 535k 553k| 0 0 | 141k 215k
                      14 1 82 2 0 0|4168M 8339M| 539k 556k| 0 0 | 144k 214k
                      13 1 82 2 0 1|4109M 8224M| 546k 552k| 0 0 | 142k 210k
                      13 1 83 3 0 0|4209M 8419M| 311k 327k| 0 0 | 138k 213k
                      13 1 83 3 0 0|4237M 8483M| 114k 114k| 0 0 | 136k 210k
                      9 1 88 1 0 1|2709M 5703M| 64k 65k| 0 0 | 156k 203k
                      14 1 82 2 0 0|4189M 8383M| 91k 87k| 0 0 | 136k 205k
                      13 1 82 3 0 0|4237M 8478M| 95k 101k| 0 0 | 136k 208k
                      14 1 82 2 0 0|4242M 8485M| 95k 109k| 0 0 | 139k 208k
                      14 1 82 3 0 0|4202M 8412M| 835k 103k| 0 0 | 137k 208k
                      14 1 82 2 0 0|4288M 8563M|1143k 1930k| 0 0 | 139k 211k
                       14   1  82   2   0   0|4229M 8477M| 101k   97k|   0     0 | 138k  209k

                      Next, create another large table Z_OBJ_4 and insert data with a degree of parallelism of 64:

                        create table Z_OBJ_4 tablespace TBS_4 as select * from dba_objects ;


                        insert /*+ append parallel(t0,64) */ into Z_OBJ_4 t0 select /*+ parallel(t1,64) */ * from Z_OBJ t1;
                        commit;

                        The insert completes in 28 seconds. Even though CPU is plentiful, the higher degree of parallelism brings no further gain, which shows that I/O has become the bottleneck:

                          SQL> insert /*+ append parallel(t0,64) */ into Z_OBJ_4 t0 select /*+ parallel(t1,64) */ * from Z_OBJ t1;


                          867092478 rows created.


                          Elapsed: 00:00:28.61

                          dstat shows write throughput close to 8000 MB/s:

                            ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
                            usr sys idl wai hiq siq| read writ| recv send| in out | int csw
                            14 2 81 4 0 1|3844M 7711M|3571k 2567k| 0 0 | 130k 197k
                            12 1 83 3 0 0|3810M 7602M| 535k 1885k| 0 0 | 115k 175k
                            13 1 82 3 0 0|3799M 7607M| 603k 654k| 0 0 | 116k 174k
                            14 1 82 3 0 0|3810M 7638M| 550k 602k| 0 0 | 119k 176k
                            13 1 83 3 0 0|3766M 7531M| 630k 651k| 0 0 | 114k 171k
                            13 1 81 4 0 0|3804M 7608M| 620k 669k| 0 0 | 117k 175k
                            13 1 82 3 0 0|3792M 7585M| 581k 616k| 0 0 | 117k 176k
                            13 1 82 3 0 0|3767M 7522M| 561k 612k| 0 0 | 116k 173k
                            12 1 82 3 0 0|3659M 7343M| 553k 601k| 0 0 | 115k 170k
                            13 1 82 3 0 0|3659M 7340M| 609k 668k| 0 0 | 121k 179k
                            13 1 82 3 0 0|3746M 7502M| 609k 644k| 0 0 | 117k 174k
                            13 1 82 3 0 0|3822M 7648M| 675k 773k| 0 0 | 118k 178k
                            13 1 83 3 0 0|3769M 7541M|1191k 632k| 0 0 | 115k 173k
                            13 1 83 3 0 0|3864M 7725M|1749k 2533k| 0 0 | 117k 177k
                             13   1  82   3   0   0|3741M 7481M| 613k  655k|   0     0 | 116k  172k

                            Takeaway 3: Increasing the degree of parallelism usually makes an operation finish sooner, but the gain is ultimately bounded by the overall I/O capacity of the system.
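                            When the gain flattens out like this, it is also worth confirming that the statement really received the degree of parallelism it asked for rather than being downgraded. A minimal sketch (assuming access to GV$PX_SESSION; run it from another session while the insert is executing):

                              -- requested vs. granted DOP for active parallel statements
                              select qcinst_id, qcsid, req_degree, degree, count(*) as px_servers
                                from gv$px_session
                               where req_degree is not null
                               group by qcinst_id, qcsid, req_degree, degree;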


                            4. Parallel test on all nodes

                            Now test all 3 RAC nodes at the same time:

                              --node 1
                              set time on
                              set timing on
                              drop table Z_OBJ_2 purge;
                              create table Z_OBJ_2 tablespace TBS_2 as select * from dba_objects where 1=2;
                              alter session enable parallel dml;
                              --INSERT Z_OBJ_2
                              insert /*+ append parallel(t0,32) */ into Z_OBJ_2 t0 select /*+ parallel(t1,32) */ * from Z_OBJ t1;
                              commit;


                              --node 2
                              set time on
                              set timing on
                              drop table Z_OBJ_3 purge;
                              create table Z_OBJ_3 tablespace TBS_3 as select * from dba_objects where 1=2;
                              alter session enable parallel dml;
                              --INSERT Z_OBJ_3
                              insert /*+ append parallel(t0,32) */ into Z_OBJ_3 t0 select /*+ parallel(t1,32) */ * from Z_OBJ t1;
                              commit;


                              --node 3
                              set time on
                              set timing on
                              drop table Z_OBJ_4 purge;
                              create table Z_OBJ_4 tablespace TBS_4 as select * from dba_objects where 1=2;
                              alter session enable parallel dml;
                              --INSERT Z_OBJ_4
                              insert /*+ append parallel(t0,32) */ into Z_OBJ_4 t0 select /*+ parallel(t1,32) */ * from Z_OBJ t1;
                              commit;

                              Elapsed times observed on each node at the same time (each individual insert takes longer, a consequence of the overall I/O bottleneck):

                                15:26:06 SQL> insert /*+ append parallel(t0,32) */ into Z_OBJ_2 t0 select /*+ parallel(t1,32) */ * from Z_OBJ t1;


                                867092478 rows created.


                                Elapsed: 00:00:48.53


                                15:25:23 SQL> insert /*+ append parallel(t0,32) */ into Z_OBJ_3 t0 select /*+ parallel(t1,32) */ * from Z_OBJ t1;


                                867092478 rows created.


                                Elapsed: 00:00:45.84


                                15:25:21 SQL> insert /*+ append parallel(t0,32) */ into Z_OBJ_4 t0 select /*+ parallel(t1,32) */ * from Z_OBJ t1;


                                867092478 rows created.


                                Elapsed: 00:00:47.63

                                dstat observed on each node at the same time:

                                  --node1:
                                  ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
                                  usr sys idl wai hiq siq| read writ| recv send| in out | int csw
                                  7 1 82 9 0 0|2110M 4223M| 169k 230k| 0 0 | 78k 122k
                                  7 1 82 9 0 0|2107M 4209M| 176k 178k| 0 0 | 79k 123k
                                  9 1 81 9 0 0|2614M 5237M| 190k 195k| 0 0 | 96k 148k
                                  8 1 81 10 0 0|2171M 4339M| 195k 232k| 0 0 | 84k 127k
                                  7 1 83 9 0 0|1975M 3947M| 220k 184k| 0 0 | 76k 117k
                                  7 1 82 9 0 0|2051M 4099M| 166k 169k| 0 0 | 78k 121k
                                  7 1 82 10 0 0|2059M 4121M|1193k 170k| 0 0 | 79k 121k
                                  7 1 83 9 0 0|2001M 4011M| 384k 1463k| 0 0 | 76k 118k
                                  3 0 93 4 0 0| 802M 1570M| 148k 144k| 0 0 | 36k 53k
                                  2 0 96 2 0 0| 355M 886M| 113k 137k| 0 0 | 47k 61k
                                  8 1 82 9 0 0|2122M 4255M| 189k 202k| 0 0 | 79k 123k
                                  7 1 83 9 0 0|2040M 4069M| 162k 164k| 0 0 | 76k 119k
                                  8 1 82 9 0 0|2208M 4436M| 839k 843k| 0 0 | 83k 130k
                                  9 1 83 7 0 0|2506M 5037M| 305k 307k| 0 0 | 94k 145k
                                  4 0 93 2 0 0|1098M 2273M| 218k 233k| 0 0 | 49k 72k


                                  --node2:
                                  ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
                                  usr sys idl wai hiq siq| read writ| recv send| in out | int csw
                                  6 1 82 11 0 0|2152M 4312M| 221k 224k| 0 0 | 79k 130k
                                  7 1 82 10 0 0|2226M 4447M| 216k 218k| 0 0 | 81k 133k
                                  10 1 81 8 0 0|2775M 5559M| 244k 214k| 0 0 | 100k 159k
                                  7 1 83 9 0 0|2110M 4205M| 220k 221k| 0 0 | 77k 126k
                                  7 1 83 10 0 0|2104M 4219M| 231k 266k| 0 0 | 76k 126k
                                  7 1 83 10 0 0|2158M 4311M| 207k 207k| 0 0 | 78k 129k
                                  7 1 83 10 0 0|2103M 4214M| 877k 849k| 0 0 | 76k 126k
                                  7 1 82 10 0 0|2109M 4214M| 207k 209k| 0 0 | 76k 124k
                                  10 1 81 8 0 0|2934M 5866M| 212k 216k| 0 0 | 102k 165k
                                  7 1 82 10 0 0|2281M 4551M| 207k 227k| 0 0 | 82k 133k
                                  7 1 83 10 0 0|2136M 4281M| 206k 205k| 0 0 | 79k 128k
                                  6 1 84 10 0 0|1951M 3940M| 313k 341k| 0 0 | 73k 120k
                                  4 0 92 4 0 0|1044M 2250M| 672k 642k| 0 0 | 56k 88k
                                  0 0 99 0 0 0| 50M 116M| 258k 276k| 0 0 | 11k 14k
                                  0 0 100 0 0 0| 323k 58k| 208k 202k| 0 0 |8385 10k


                                  --node3:
                                  ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system--
                                  usr sys idl wai hiq siq| read writ| recv send| in out | int csw
                                  6 1 83 10 0 0|2144M 4274M| 149k 156k| 0 0 | 77k 129k
                                  6 1 82 11 0 0|2223M 4452M| 165k 189k| 0 0 | 80k 133k
                                  6 1 82 11 0 0|2203M 4404M| 189k 198k| 0 0 | 79k 131k
                                  7 0 83 10 0 0|2119M 4233M| 140k 211k| 0 0 | 75k 125k
                                  7 1 83 10 0 0|2156M 4311M| 870k 731k| 0 0 | 78k 128k
                                  7 1 82 10 0 0|2157M 4318M| 143k 149k| 0 0 | 79k 129k
                                  7 1 83 9 0 0|2172M 4344M| 165k 170k| 0 0 | 79k 131k
                                  7 1 83 10 0 0|2139M 4283M| 140k 141k| 0 0 | 78k 125k
                                  7 1 83 10 0 0|2145M 4303M| 143k 151k| 0 0 | 78k 129k
                                  7 1 83 10 0 0|2121M 4226M| 146k 450k| 0 0 | 76k 126k
                                  7 1 82 10 0 0|2442M 4884M| 460k 155k| 0 0 | 87k 144k
                                  6 0 83 10 0 0|2083M 4177M| 217k 156k| 0 0 | 76k 126k
                                  4 0 88 7 0 0|1445M 2863M| 130k 126k| 0 0 | 54k 89k
                                  2 0 94 3 0 0| 577M 1341M| 121k 124k| 0 0 | 53k 73k
                                    7   1  82  10   0   0|2219M 4437M| 157k  193k|   0     0 |  81k  133k

                                  One question remains at this point: why not just use create table as select? Trying the create according to the test case is disappointing: the write rate is only around 300 MB/s and the table takes nearly 10 minutes to build, whereas the parallel insert above writes at over 8000 MB/s and finishes in a little over 20 seconds:

                                    drop table Z_OBJ_2 purge;
                                    create table Z_OBJ_2 tablespace TBS_2 as select /*+ parallel(t1,32) */ * from Z_OBJ t1;
                                    Elapsed: 00:09:19.52


                                    15:49:58 SQL> insert /*+ append parallel(t0,32) */ into Z_OBJ_2 t0 select /*+ parallel(t1,32) */ * from Z_OBJ t1;
                                    867092478 rows created.
                                    Elapsed: 00:00:25.24

                                    Clearly the create effectively did not use parallelism. How do we make the create use parallelism as well? The SQL has to be rewritten as follows:

                                      -- with parallelism, the ~100 GB table is created in just 26s:
                                      drop table Z_OBJ_2 purge;
                                      create table Z_OBJ_2 tablespace TBS_2 parallel(degree 32) as select /*+ parallel(t1,32) */ * from Z_OBJ t1;
                                      Elapsed: 00:00:26.76


                                      -- with parallelism + nologging, not much difference: the ~100 GB table is created in 25s:
                                      drop table Z_OBJ_2 purge;
                                      create table Z_OBJ_2 tablespace TBS_2 parallel(degree 32) nologging as select /*+ parallel(t1,32) */ * from Z_OBJ t1;
                                      Elapsed: 00:00:25.77

                                      Takeaway 4: When using parallelism, pay particular attention to whether every part of the operation actually runs in parallel. The combined throughput of parallel operations running on all nodes at once is likewise bounded by the overall I/O capacity of the system.
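                                      One easy way to check "every part" up front is to explain the statement before running it and look for the PX operators above the load step, using the same dbms_xplan approach as earlier. A minimal sketch of that check:

                                        -- EXPLAIN PLAN also accepts CREATE TABLE ... AS SELECT
                                        -- (if Z_OBJ_2 already exists, drop it first);
                                        -- look for LOAD AS SELECT running under the PX servers
                                        explain plan for
                                        create table Z_OBJ_2 tablespace TBS_2 parallel(degree 32)
                                        as select /*+ parallel(t1,32) */ * from Z_OBJ t1;

                                        select * from table(dbms_xplan.display());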


                                      5. Parallelism with multiple RMAN channels

                                      Symptom: RMAN allocates multiple channels, but the backup does not actually run in parallel.
                                      Build a test case:

                                        create tablespace dbs_d_test;
                                        alter tablespace dbs_d_test add datafile; -- becomes file# 11 here
                                        alter tablespace dbs_d_test add datafile; -- becomes file# 12 here
                                        alter tablespace dbs_d_test add datafile; -- becomes file# 13 here


                                        alter database datafile 11,12,13 resize 1G;


                                        5.1 Multiple RMAN channels without parallelism

                                        Back up with RMAN:

                                          run {
                                          allocate channel c1 device type disk;
                                          allocate channel c2 device type disk;
                                          allocate channel c3 device type disk;


                                          backup as copy datafile 11 format '/tmp/incr/copy11.bak';
                                          backup as copy datafile 12 format '/tmp/incr/copy12.bak';
                                          backup as copy datafile 13 format '/tmp/incr/copy13.bak';


                                          release channel c1;
                                          release channel c2;
                                          release channel c3;
                                          }

                                          Use the following SQL to watch long-running operations:

                                            select inst_id, sid, username, opname, target, sofar, totalwork, sofar * 100 / totalwork from gv$session_longops where sofar < totalwork;

                                            With the backup written this way, even though multiple channels are allocated, no parallelism is actually used: the three datafiles are backed up one after another. This can be seen in the long-running operations above, and equally in the RMAN output log:

                                              RMAN> run {
                                              2> allocate channel c1 device type disk;
                                              3> allocate channel c2 device type disk;
                                              4> allocate channel c3 device type disk;
                                              5>
                                              6> backup as copy datafile 11 format '/tmp/incr/copy11.bak';
                                              7> backup as copy datafile 12 format '/tmp/incr/copy12.bak';
                                              8> backup as copy datafile 13 format '/tmp/incr/copy13.bak';
                                              9>
                                              10> release channel c1;
                                              11> release channel c2;
                                              12> release channel c3;
                                              13> }


                                              using target database control file instead of recovery catalog
                                              allocated channel: c1
                                              channel c1: sid=128 instance=jy1 devtype=DISK


                                              allocated channel: c2
                                              channel c2: sid=117 instance=jy1 devtype=DISK


                                              allocated channel: c3
                                              channel c3: sid=129 instance=jy1 devtype=DISK


                                              Starting backup at 29-AUG-18
                                              channel c1: starting datafile copy
                                              input datafile fno=00011 name=+ZHAOJINGYU/jy/datafile/dbs_d_test.615.985387387
                                              output filename=/tmp/incr/copy11.bak tag=TAG20180829T002101 recid=13 stamp=985393279
                                              channel c1: datafile copy complete, elapsed time: 00:00:25
                                              Finished backup at 29-AUG-18


                                              Starting backup at 29-AUG-18
                                              channel c1: starting datafile copy
                                              input datafile fno=00012 name=+ZHAOJINGYU/jy/datafile/dbs_d_test.613.985387391
                                              output filename=/tmp/incr/copy12.bak tag=TAG20180829T002127 recid=14 stamp=985393305
                                              channel c1: datafile copy complete, elapsed time: 00:00:25
                                              Finished backup at 29-AUG-18


                                              Starting backup at 29-AUG-18
                                              channel c1: starting datafile copy
                                              input datafile fno=00013 name=+ZHAOJINGYU/jy/datafile/dbs_d_test.611.985387395
                                              output filename=/tmp/incr/copy13.bak tag=TAG20180829T002153 recid=15 stamp=985393330
                                              channel c1: datafile copy complete, elapsed time: 00:00:25
                                              Finished backup at 29-AUG-18


                                              released channel: c1


                                              released channel: c2


                                              released channel: c3

                                              In reality the work ran serially and always on channel c1, so the image-copy backup of these 3 datafiles took 3 x 25s = 75s.
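                                              Besides the rewrite shown in the next section, RMAN also lets you pin each backup spec to a specific channel inside a single BACKUP command. A sketch of that variant, reusing the same files and paths (hedged: treat it as one possible form, not the method used in this test):

                                                run {
                                                allocate channel c1 device type disk;
                                                allocate channel c2 device type disk;
                                                allocate channel c3 device type disk;
                                                -- one BACKUP command, each datafile assigned to its own channel
                                                backup as copy
                                                  (datafile 11 channel c1 format '/tmp/incr/copy11.bak')
                                                  (datafile 12 channel c2 format '/tmp/incr/copy12.bak')
                                                  (datafile 13 channel c3 format '/tmp/incr/copy13.bak');
                                                release channel c1;
                                                release channel c2;
                                                release channel c3;
                                                }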


                                              5.2 Rewriting the backup command to use parallelism

                                              The improved version, which does use parallelism:

                                                run {
                                                allocate channel c1 device type disk;
                                                allocate channel c2 device type disk;
                                                allocate channel c3 device type disk;


                                                backup as copy datafile 11,12,13 format '/tmp/incr/copy_%u.bak';


                                                release channel c1;
                                                release channel c2;
                                                release channel c3;
                                                }

                                                The log shows:

                                                  RMAN> run {
                                                  2> allocate channel c1 device type disk;
                                                  3> allocate channel c2 device type disk;
                                                  4> allocate channel c3 device type disk;
                                                  5>
                                                  6> backup as copy datafile 11,12,13 format '/tmp/incr/copy_%u.bak';
                                                  7>
                                                  8> release channel c1;
                                                  9> release channel c2;
                                                  10> release channel c3;
                                                  11> }


                                                  using target database control file instead of recovery catalog
                                                  allocated channel: c1
                                                  channel c1: sid=129 instance=jy1 devtype=DISK


                                                  allocated channel: c2
                                                  channel c2: sid=127 instance=jy1 devtype=DISK


                                                  allocated channel: c3
                                                  channel c3: sid=119 instance=jy1 devtype=DISK


                                                  Starting backup at 29-AUG-18
                                                  channel c1: starting datafile copy
                                                  input datafile fno=00011 name=+ZHAOJINGYU/jy/datafile/dbs_d_test.615.985387387
                                                  channel c2: starting datafile copy
                                                  input datafile fno=00012 name=+ZHAOJINGYU/jy/datafile/dbs_d_test.613.985387391
                                                  channel c3: starting datafile copy
                                                  input datafile fno=00013 name=+ZHAOJINGYU/jy/datafile/dbs_d_test.611.985387395
                                                  output filename=/tmp/incr/copy_14tbnq76.bak tag=TAG20180829T002302 recid=16 stamp=985393432
                                                  channel c1: datafile copy complete, elapsed time: 00:00:55
                                                  output filename=/tmp/incr/copy_15tbnq76.bak tag=TAG20180829T002302 recid=17 stamp=985393432
                                                  channel c2: datafile copy complete, elapsed time: 00:00:55
                                                  output filename=/tmp/incr/copy_16tbnq76.bak tag=TAG20180829T002302 recid=18 stamp=985393435
                                                  channel c3: datafile copy complete, elapsed time: 00:00:55
                                                  Finished backup at 29-AUG-18


                                                  released channel: c1


                                                  released channel: c2


                                                  released channel: c3

                                                  This time the work really ran in parallel, on channels c1, c2 and c3, and the image-copy backup of the 3 datafiles took a single 55s.
                                                  Then why didn't parallelism multiply the throughput? As noted earlier, the overall I/O capacity of the system has hit its limit, so simply piling on more parallelism is not always worthwhile.
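                                                  To see where the time goes, one option is to watch per-file backup I/O while the copy is running. A minimal sketch (assuming asynchronous disk I/O, so the statistics appear in V$BACKUP_ASYNC_IO; otherwise look at V$BACKUP_SYNC_IO):

                                                    -- throughput of RMAN input/output files currently being processed
                                                    select type, status, filename, effective_bytes_per_second
                                                      from v$backup_async_io
                                                     where status = 'IN PROGRESS';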


                                                  5.3 Changing the backup method to improve efficiency

                                                  If the datafiles are large but only sparsely used, consider taking the backup as a backup set instead: this reduces the space the backup occupies and usually also speeds up the backup:

                                                    run {
                                                    allocate channel c1 device type disk;
                                                    allocate channel c2 device type disk;
                                                    allocate channel c3 device type disk;


                                                    backup as compressed backupset datafile 11,12,13 format '/tmp/incr/datafile_%u.bak';


                                                    release channel c1;
                                                    release channel c2;
                                                    release channel c3;
                                                    }

                                                    The log shows:

                                                      RMAN> run {
                                                      2> allocate channel c1 device type disk;
                                                      3> allocate channel c2 device type disk;
                                                      4> allocate channel c3 device type disk;
                                                      5>
                                                      6> backup as compressed backupset datafile 11,12,13 format '/tmp/incr/datafile_%u.bak';
                                                      7>
                                                      8> release channel c1;
                                                      9> release channel c2;
                                                      10> release channel c3;
                                                      11> }


                                                      using target database control file instead of recovery catalog
                                                      allocated channel: c1
                                                      channel c1: sid=128 instance=jy1 devtype=DISK


                                                      allocated channel: c2
                                                      channel c2: sid=134 instance=jy1 devtype=DISK


                                                      allocated channel: c3
                                                      channel c3: sid=116 instance=jy1 devtype=DISK


                                                      Starting backup at 29-AUG-18
                                                      channel c1: starting compressed full datafile backupset
                                                      channel c1: specifying datafile(s) in backupset
                                                      input datafile fno=00011 name=+ZHAOJINGYU/jy/datafile/dbs_d_test.615.985387387
                                                      channel c1: starting piece 1 at 29-AUG-18
                                                      channel c2: starting compressed full datafile backupset
                                                      channel c2: specifying datafile(s) in backupset
                                                      input datafile fno=00012 name=+ZHAOJINGYU/jy/datafile/dbs_d_test.613.985387391
                                                      channel c2: starting piece 1 at 29-AUG-18
                                                      channel c3: starting compressed full datafile backupset
                                                      channel c3: specifying datafile(s) in backupset
                                                      input datafile fno=00013 name=+ZHAOJINGYU/jy/datafile/dbs_d_test.611.985387395
                                                      channel c3: starting piece 1 at 29-AUG-18
                                                      channel c1: finished piece 1 at 29-AUG-18
                                                      piece handle=/tmp/incr/datafile_17tbnqi9.bak tag=TAG20180829T002857 comment=NONE
                                                      channel c1: backup set complete, elapsed time: 00:00:02
                                                      channel c3: finished piece 1 at 29-AUG-18
                                                      piece handle=/tmp/incr/datafile_19tbnqia.bak tag=TAG20180829T002857 comment=NONE
                                                      channel c3: backup set complete, elapsed time: 00:00:01
                                                      channel c2: finished piece 1 at 29-AUG-18
                                                      piece handle=/tmp/incr/datafile_18tbnqi9.bak tag=TAG20180829T002857 comment=NONE
                                                      channel c2: backup set complete, elapsed time: 00:00:05
                                                      Finished backup at 29-AUG-18


                                                      released channel: c1


                                                      released channel: c2


                                                      released channel: c3

                                                      Because these datafiles contain essentially no business data, the improvement here is particularly dramatic: the backup completes in just 5 seconds.
                                                      Takeaway 5: Beyond using parallelism sensibly, always ask whether there is a way to do less work in the first place, so that parallel resources are not wasted on useless work.
                                                      There are other interesting parallelism stories as well. For instance, a developer once mistyped the parallel-degree hint in a SQL statement, so Oracle fell back to automatic DOP, i.e. it ran the statement at the maximum computed degree of parallelism; system resources were almost completely consumed, other operations could no longer run efficiently, and the result was a performance incident.
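                                                      A hedged illustration of how easily that happens (the table and degree below are just examples, not the statement from that incident): a statement-level /*+ parallel */ hint with no object and no degree asks the database to compute the DOP itself, whereas naming the object and the degree keeps resource usage predictable.

                                                        -- degree left for the database to compute (can end up very high):
                                                        select /*+ parallel */ count(*) from Z_OBJ;

                                                        -- degree pinned explicitly to a known, bounded value:
                                                        select /*+ parallel(t1,8) */ count(*) from Z_OBJ t1;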
                                                      By this point it is clear that parallelism does hide quite a few traps, but that is no reason to give up on it altogether. Master the relevant details and you can absolutely wield this weapon well, let it shine in the right scenarios, and proudly say: "Parallelism, loving you can be easy after all."


                                                      About the author

                                                      Zhao Jingyu (Alfred Zhao) focuses on database technologies and currently provides database consulting and solution design for customers in the finance, securities, and insurance industries.

                                                      Editor: Alfred Zhao
