前面一篇随意的测试,引来了崖山原厂的一些质疑;之前的文章确实不太严谨,当时只是随手一测,没有做其他想法。那么既然要公平的测试,这里为重新来造一下数据。
##崖山数据库
SQL> begin2 for i in 1 .. 1000000loop3 4 insert into t19 values('ENMO','TABLE',i);5 end loop;6 commit;7 end;8PL/SQL Succeed.SQL> update t19 set owner='ENMOTECH' where OBJECT_ID > 10000 and OBJECT_ID < 100000;89999 rows affected.SQL> update t19 set owner='yashan' where OBJECT_ID > 100000;900000 rows affected.SQL> commit;Succeed.SQL> begin2 for i in 1 .. 500000loop3 4 insert into t19_1 values('ENMO','TABLE',i);end loop;commit;end;5 6 7 8PL/SQL Succeed.SQL> update t19_1 set owner='ENMOTECH' where OBJECT_ID > 10000 and OBJECT_ID < 50000;39999 rows affected.SQL> update t19_1 set owner='yashan' where OBJECT_ID > 50000;450000 rows affected.SQL> commit;Succeed.SQL> select * from v$version;BANNER VERSION_NUMBER---------------------------------------------------------------- -----------------Personal Edition Release 23.1.1.100 x86_64 23.1.1.1001 row fetched.SQL>SQL> exec DBMS_STATS.GATHER_TABLE_STATS('ENMO', 'T19', '',1, FALSE, 'FOR ALL COLUMNS SIZE AUTO', 4, 'AUTO', TRUE);PL/SQL Succeed.SQL> exec DBMS_STATS.GATHER_TABLE_STATS('ENMO', 'T19_1', '',1, FALSE, 'FOR ALL COLUMNS SIZE AUTO', 4, 'AUTO', TRUE);PL/SQL Succeed.SQL>
接下来我们来造一下Oracle的数据,这里为了确保测试相对准确,创建一致的表结构,唯一不同就是崖山的测试表一个字断是bigint,oracle中为number。
### Oracle 11.2.0.4
oracle@Ora11g-ogg-Target:/home/oracle $sqlplus roger/rogerSQL*Plus: Release 11.2.0.4.0 Production on Sun Nov 19 19:05:19 2023Copyright (c) 1982, 2013, Oracle. All rights reserved.Connected to:Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit ProductionWith the Partitioning, OLAP, Data Mining and Real Application Testing optionsSQL> desc t19Name Null? Type----------------------------------------- -------- ----------------------------OWNER VARCHAR2(30)OBJECT_TYPE VARCHAR2(19)OBJECT_ID NUMBERSQL> truncate table t19;Table truncated.SQL> begin2 for i in 1 .. 1000000loop3 4 insert into t19 values('ENMO','TABLE',i);5 end loop;commit;6 7 end;8PL/SQL procedure successfully completed.SQL> SQL>SQL>SQL> update t19 set owner='ENMOTECH' where OBJECT_ID > 10000 and OBJECT_ID < 100000;89999 rows updated.SQL> update t19 set owner='yashan' where OBJECT_ID > 100000;900000 rows updated.SQL> commit;Commit complete.SQL> truncate table t19_1;Table truncated.SQL> begin2 for i in 1 .. 500000loop3 4 insert into t19_1 values('ENMO','TABLE',i);5 end loop;6 commit;7 end;8PL/SQL procedure successfully completed.SQL> update t19_1 set owner='ENMOTECH' where OBJECT_ID > 10000 and OBJECT_ID < 50000;39999 rows updated.SQL> update t19_1 set owner='yashan' where OBJECT_ID > 50000;450000 rows updated.SQL> commit;Commit complete.SQL>SQL> analyze table t19 compute statistics;Table analyzed.SQL> c/t19/t19_11* analyze table t19_1 compute statisticsSQL>Table analyzed.SQL>
万事具备,只欠东风。来对比一下两种数据库的hash join能力:
## 崖山
SQL> set timing onSQL> set autot onSQL>SQL>SQL>SQL> select b.owner, count(*)2 from t19 a, t19_1 b3 where a.object_id = b.object_idgroup by b.owner; 4OWNER COUNT(*)---------------------------------------------------------------- ---------------------ENMO 10001ENMOTECH 39999yashan 450000Execution Plan----------------------------------------------------------------SQL hash value: 2421436707Optimizer: ADOPT_C+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+| Id | Operation type | Name | Owner | E - Rows | A - Rows | Cost(%CPU) | A - Time | Loops | Memory | Disk | Partition info |+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+| 0 | SELECT STATEMENT | | | | | | | | | | || 1 | SORT GROUP | | | 3| | 1128( 0)| | | | | ||* 2 | HASH JOIN INNER | | | 500000| | 1097( 0)| | | | | || 3 | TABLE ACCESS FULL | T19_1 | ENMO | 500000| | 291( 0)| | | | | || 4 | TABLE ACCESS FULL | T19 | ENMO | 1000000| | 437( 0)| | | | | |+----+--------------------------------+----------------------+------------+----------+----------+-------------+----------+----------+----------+----------+--------------------------------+Operation Information (identified by operation id):---------------------------------------------------1 - Group Expression: ("B"."OWNER")2 - Predicate : access("B"."OBJECT_ID" = "A"."OBJECT_ID")Statistics----------------------------------------------------------------------------------------------------21 rows fetched.Elapsed: 00:00:00.848SQL>
## Oracle 11204
SQL> alter system flush buffer_cache;System altered.Elapsed: 00:00:00.69SQL> select b.owner, count(*)2 from t19 a, t19_1 b3 where a.object_id = b.object_id4 group by b.owner;OWNER COUNT(*)------------------------------------------------------------ ----------yashan 450000ENMO 10001ENMOTECH 39999Elapsed: 00:00:00.47Execution Plan----------------------------------------------------------Plan hash value: 3112831341-------------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |-------------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 3 | 45 | | 2557 (2)| 00:00:31 || 1 | HASH GROUP BY | | 3 | 45 | | 2557 (2)| 00:00:31 ||* 2 | HASH JOIN | | 500K| 7324K| 10M| 2544 (1)| 00:00:31 || 3 | TABLE ACCESS FULL| T19_1 | 500K| 5371K| | 412 (1)| 00:00:05 || 4 | TABLE ACCESS FULL| T19 | 1000K| 3906K| | 824 (1)| 00:00:10 |-------------------------------------------------------------------------------------Predicate Information (identified by operation id):---------------------------------------------------2 - access("A"."OBJECT_ID"="B"."OBJECT_ID")Statistics----------------------------------------------------------0 recursive calls0 db block gets4517 consistent gets4509 physical reads0 redo size680 bytes sent via SQL*Net to client524 bytes received via SQL*Net from client2 SQL*Net roundtrips to/from client0 sorts (memory)0 sorts (disk)3 rows processed
可以看到结果集完全相同的情况之下,崖山是0.848s,Oracle是0.47s;也就是说崖山数据库SQL运行时间是Oracle的1.8倍。如果数据量更大呢?
经测当:
t19数据量变为400w时,崖山是1.976s;Oracle 是1.29s;
t19数据量变为1600w时,崖山是5.357s;Oracle 是4.3s;(我这里崖山虚拟机是32g内存,oracle虚拟机是8g内存,Oracle sga是3g)
如果再稍微复杂一点的SQL呢?
##崖山
SQL> SELECT b.owner,2 sum(a.object_id)FROM t19 a,t19_1 b3 4 5 WHERE a.object_id = b.object_idGROUP BY b.owner; 6OWNER SUM(A.OBJECT_ID)---------------------------------------------------------------- ----------------ENMO 800880000ENMOTECH 1.9200E+10yashan 1.9800E+123 rows fetched.Elapsed: 00:00:08.485SQL>
##Oracle 11204
SQL> SELECT b.owner,2 sum(a.object_id)3 FROM t19 a,t19_1 bWHERE a.object_id = b.object_idGROUP BY b.owner; 4 5 6OWNER SUM(A.OBJECT_ID)------------------------------------------------------------ ----------------yashan 1.9800E+12ENMO 800880000ENMOTECH 1.9200E+10Elapsed: 00:00:05.27SQL>
看上去性能还不错,崖山只比Oracle慢了60%左右。
最后修改时间:2023-11-23 15:23:37
文章转载自Roger的数据库专栏,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




