暂无图片
暂无图片
暂无图片
暂无图片
暂无图片

「YashanDB个人版体验」重测崖山Hash Join的性能

563

前面一篇随意的测试,引来了崖山原厂的一些质疑;之前的文章确实不太严谨,当时只是随手一测,没有做其他想法。那么既然要公平的测试,这里为重新来造一下数据。

##崖山数据库

    SQL> begin 
    2 for i in 1 .. 1000000
    loop
    3 4 insert into t19 values('ENMO','TABLE',i);
    5 end loop;
    6 commit;
    7 end;
    8


    PL/SQL Succeed.


    SQL> update t19 set owner='ENMOTECH' where OBJECT_ID > 10000 and OBJECT_ID < 100000;


    89999 rows affected.


    SQL> update t19 set owner='yashan' where OBJECT_ID > 100000;


    900000 rows affected.


    SQL> commit;


    Succeed.


    SQL> begin
    2 for i in 1 .. 500000
    loop
    3 4 insert into t19_1 values('ENMO','TABLE',i);
    end loop;
    commit;
    end;
    5 6 7 8


    PL/SQL Succeed.


    SQL> update t19_1 set owner='ENMOTECH' where OBJECT_ID > 10000 and OBJECT_ID < 50000;


    39999 rows affected.


    SQL> update t19_1 set owner='yashan' where OBJECT_ID > 50000;


    450000 rows affected.


    SQL> commit;


    Succeed.


    SQL> select * from v$version;


    BANNER VERSION_NUMBER
    ---------------------------------------------------------------- -----------------
    Personal Edition Release 23.1.1.100 x86_64 23.1.1.100


    1 row fetched.


    SQL>
    SQL> exec DBMS_STATS.GATHER_TABLE_STATS('ENMO', 'T19', '',1, FALSE, 'FOR ALL COLUMNS SIZE AUTO', 4, 'AUTO', TRUE);


    PL/SQL Succeed.


    SQL> exec DBMS_STATS.GATHER_TABLE_STATS('ENMO', 'T19_1', '',1, FALSE, 'FOR ALL COLUMNS SIZE AUTO', 4, 'AUTO', TRUE);


    PL/SQL Succeed.


    SQL>

    接下来我们来造一下Oracle的数据,这里为了确保测试相对准确,创建一致的表结构,唯一不同就是崖山的测试表一个字断是bigint,oracle中为number。

    ### Oracle 11.2.0.4

      oracle@Ora11g-ogg-Target:/home/oracle $sqlplus roger/roger


      SQL*Plus: Release 11.2.0.4.0 Production on Sun Nov 19 19:05:19 2023


      Copyright (c) 1982, 2013, Oracle. All rights reserved.




      Connected to:
      Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
      With the Partitioning, OLAP, Data Mining and Real Application Testing options


      SQL> desc t19
      Name Null? Type
      ----------------------------------------- -------- ----------------------------
      OWNER VARCHAR2(30)
      OBJECT_TYPE VARCHAR2(19)
      OBJECT_ID NUMBER


      SQL> truncate table t19;


      Table truncated.


      SQL> begin
      2 for i in 1 .. 1000000
      loop
      3 4 insert into t19 values('ENMO','TABLE',i);
      5 end loop;
      commit;
      6 7 end;

      8


      PL/SQL procedure successfully completed.


      SQL> SQL>
      SQL>
      SQL> update t19 set owner='ENMOTECH' where OBJECT_ID > 10000 and OBJECT_ID < 100000;


      89999 rows updated.


      SQL> update t19 set owner='yashan' where OBJECT_ID > 100000;


      900000 rows updated.


      SQL> commit;


      Commit complete.


      SQL> truncate table t19_1;


      Table truncated.


      SQL> begin
      2 for i in 1 .. 500000
      loop
      3 4 insert into t19_1 values('ENMO','TABLE',i);
      5 end loop;
      6 commit;
      7 end;
      8


      PL/SQL procedure successfully completed.


      SQL> update t19_1 set owner='ENMOTECH' where OBJECT_ID > 10000 and OBJECT_ID < 50000;


      39999 rows updated.


      SQL> update t19_1 set owner='yashan' where OBJECT_ID > 50000;


      450000 rows updated.


      SQL> commit;


      Commit complete.


      SQL>
      SQL> analyze table t19 compute statistics;


      Table analyzed.


      SQL> c/t19/t19_1
      1* analyze table t19_1 compute statistics
      SQL>


      Table analyzed.


      SQL>


      万事具备,只欠东风。来对比一下两种数据库的hash join能力:

      ## 崖山

        SQL> set timing on
        SQL> set autot on
        SQL>
        SQL>
        SQL>
        SQL> select b.owner, count(*)
        2 from t19 a, t19_1 b
        3 where a.object_id = b.object_id
        group by b.owner; 4


        OWNER COUNT(*)
        ---------------------------------------------------------------- ---------------------
        ENMO 10001
        ENMOTECH 39999
        yashan 450000








        Execution Plan
        ----------------------------------------------------------------
        SQL hash value: 2421436707
        Optimizer: ADOPT_C

        +----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
        | Id | Operation type | Name | Owner | E - Rows | A - Rows | Cost(%CPU) | A - Time | Loops | Memory | Disk | Partition info |
        +----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+
        | 0 | SELECT STATEMENT | | | | | | | | | | |
        | 1 | SORT GROUP | | | 3| | 1128( 0)| | | | | |
        |* 2 | HASH JOIN INNER | | | 500000| | 1097( 0)| | | | | |
        | 3 | TABLE ACCESS FULL | T19_1 | ENMO | 500000| | 291( 0)| | | | | |
        | 4 | TABLE ACCESS FULL | T19 | ENMO | 1000000| | 437( 0)| | | | | |
        +----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+

        Operation Information (identified by operation id):
        ---------------------------------------------------

        1 - Group Expression: ("B"."OWNER")
        2 - Predicate : access("B"."OBJECT_ID" = "A"."OBJECT_ID")








        Statistics
        ----------------------------------------------------------------------------------------------------


        21 rows fetched.


        Elapsed: 00:00:00.848
        SQL>

        ## Oracle 11204

          SQL> alter system flush buffer_cache;


          System altered.


          Elapsed: 00:00:00.69
          SQL> select b.owner, count(*)
          2 from t19 a, t19_1 b
          3 where a.object_id = b.object_id
          4 group by b.owner;


          OWNER COUNT(*)
          ------------------------------------------------------------ ----------
          yashan 450000
          ENMO 10001
          ENMOTECH 39999


          Elapsed: 00:00:00.47


          Execution Plan
          ----------------------------------------------------------
          Plan hash value: 3112831341


          -------------------------------------------------------------------------------------
          | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
          -------------------------------------------------------------------------------------
          | 0 | SELECT STATEMENT | | 3 | 45 | | 2557 (2)| 00:00:31 |
          | 1 | HASH GROUP BY | | 3 | 45 | | 2557 (2)| 00:00:31 |
          |* 2 | HASH JOIN | | 500K| 7324K| 10M| 2544 (1)| 00:00:31 |
          | 3 | TABLE ACCESS FULL| T19_1 | 500K| 5371K| | 412 (1)| 00:00:05 |
          | 4 | TABLE ACCESS FULL| T19 | 1000K| 3906K| | 824 (1)| 00:00:10 |
          -------------------------------------------------------------------------------------


          Predicate Information (identified by operation id):
          ---------------------------------------------------


          2 - access("A"."OBJECT_ID"="B"."OBJECT_ID")




          Statistics
          ----------------------------------------------------------
          0 recursive calls
          0 db block gets
          4517 consistent gets
          4509 physical reads
          0 redo size
          680 bytes sent via SQL*Net to client
          524 bytes received via SQL*Net from client
          2 SQL*Net roundtrips to/from client
          0 sorts (memory)
          0 sorts (disk)
          3 rows processed


          可以看到结果集完全相同的情况之下,崖山是0.848s,Oracle是0.47s;也就是说崖山数据库SQL运行时间是Oracle的1.8倍。如果数据量更大呢?

          经测当:

          t19数据量变为400w时,崖山是1.976s;Oracle 是1.29s;

          t19数据量变为1600w时,崖山是5.357s;Oracle 是4.3s;(我这里崖山虚拟机是32g内存,oracle虚拟机是8g内存,Oracle sga是3g)

          如果再稍微复杂一点的SQL呢?

          ##崖山

            SQL> SELECT b.owner,
            2 sum(a.object_id)
            FROM t19 a,
            t19_1 b
            3 4 5 WHERE a.object_id = b.object_id
            GROUP BY b.owner; 6


            OWNER SUM(A.OBJECT_ID)
            ---------------------------------------------------------------- ----------------
            ENMO 800880000
            ENMOTECH 1.9200E+10
            yashan 1.9800E+12


            3 rows fetched.


            Elapsed: 00:00:08.485
            SQL>

            ##Oracle 11204

              SQL> SELECT b.owner,
              2 sum(a.object_id)
              3 FROM t19 a,
              t19_1 b
              WHERE a.object_id = b.object_id
              GROUP BY b.owner; 4 5 6


              OWNER SUM(A.OBJECT_ID)
              ------------------------------------------------------------ ----------------
              yashan 1.9800E+12
              ENMO 800880000
              ENMOTECH 1.9200E+10


              Elapsed: 00:00:05.27
              SQL>


              看上去性能还不错,崖山只比Oracle慢了60%左右。


              最后修改时间:2023-11-23 15:23:37
              文章转载自Roger的数据库专栏,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。

              评论