问题描述
你好,
我试图找出在sql中包含条件的最佳方法,以检查用户在特定日期之前发生的最新事件。
以下是我的表格设置和简要说明
USER_NP => 列出所有用户的父表 (在原始情况下,它将被连接到多个表以构建结果集)
USER_LOGONS_NP => 记录每个用户登录时间的表; 用于存储有限时间的数据
USER_LOGON_HIST_NP => 存储存档登录详细信息的表 (来自USER_LOGONS_NP)
testcase sql背后的要求是生成上次登录时间早于 (或早于) 特定日期的用户列表。
请你指教好吗?
下面是我的测试案例
我试图找出在sql中包含条件的最佳方法,以检查用户在特定日期之前发生的最新事件。
以下是我的表格设置和简要说明
USER_NP => 列出所有用户的父表 (在原始情况下,它将被连接到多个表以构建结果集)
USER_LOGONS_NP => 记录每个用户登录时间的表; 用于存储有限时间的数据
USER_LOGON_HIST_NP => 存储存档登录详细信息的表 (来自USER_LOGONS_NP)
testcase sql背后的要求是生成上次登录时间早于 (或早于) 特定日期的用户列表。
请你指教好吗?
下面是我的测试案例
drop table user_np purge ;
drop table user_logons_np purge ;
drop table user_logon_hist_np purge ;
create table user_np as select * from all_users ;
create table user_logons_np as select user_id, logon_date, dbms_random.string('a', 50) pad_data from all_users, (select sysdate - level as logon_date from dual connect by level <= 50) ;
create table user_logon_hist_np as select user_id, logon_date, dbms_random.string('a', 50) pad_data from all_users, (select sysdate - (level - 51) as logon_date from dual connect by level <= 500) ;
drop index uhn_idx1 ;
drop index uln_idx1 ;
create index uhn_idx1 on user_logon_hist_np(user_id, logon_date) ;
create index uln_idx1 on user_logons_np(user_id, logon_date) ;
exec dbms_stats.gather_table_stats(user, 'USER_NP', cascade => true) ;
exec dbms_stats.gather_table_stats(user, 'USER_LOGONS_NP', cascade => true) ;
exec dbms_stats.gather_table_stats(user, 'USER_LOGON_HIST_NP', cascade => true) ;
explain plan for SELECT
*
FROM
user_np
WHERE
EXISTS (
SELECT
user_id
FROM
(
SELECT
user_id,
MAX(logon_date)
FROM
user_logons_np
GROUP BY user_id
HAVING
MAX(logon_date) <= SYSDATE
UNION
SELECT
user_id,
MAX(logon_date)
FROM
user_logon_hist_np
GROUP BY user_id
HAVING
MAX(logon_date) <= SYSDATE
) a
WHERE
a.user_id = user_np.user_id
);
Plan hash value: 1460566721
----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 34 | 68 (8)| 00:00:01 |
|* 1 | HASH JOIN SEMI | | 1 | 34 | 68 (8)| 00:00:01 |
| 2 | TABLE ACCESS FULL | USER_NP | 32 | 672 | 6 (0)| 00:00:01 |
| 3 | VIEW | | 4 | 52 | 62 (9)| 00:00:01 |
| 4 | SORT UNIQUE | | 4 | 48 | 62 (9)| 00:00:01 |
| 5 | UNION-ALL | | | | | |
|* 6 | FILTER | | | | | |
| 7 | HASH GROUP BY | | 2 | 24 | 7 (15)| 00:00:01 |
| 8 | INDEX FULL SCAN| ULN_IDX1 | 1600 | 19200 | 6 (0)| 00:00:01 |
|* 9 | FILTER | | | | | |
| 10 | HASH GROUP BY | | 2 | 24 | 55 (8)| 00:00:01 |
| 11 | INDEX FULL SCAN| UHN_IDX1 | 16000 | 187K| 52 (2)| 00:00:01 |
----------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("A"."USER_ID"="USER_NP"."USER_ID")
6 - filter(MAX("LOGON_DATE")<=SYSDATE@!)
9 - filter(MAX("LOGON_DATE")<=SYSDATE@!)
explain plan for SELECT
*
FROM
user_np usn
WHERE
EXISTS (
SELECT
user_id,
MAX(logon_date)
FROM
user_logons_np uln
WHERE uln.user_id = usn.user_id
GROUP BY user_id
HAVING
MAX(logon_date) <= SYSDATE
UNION ALL
SELECT
user_id,
MAX(logon_date)
FROM
user_logon_hist_np uhp
WHERE uhp.user_id = usn.user_id
GROUP BY user_id
HAVING
MAX(logon_date) <= SYSDATE
);
Plan hash value: 1665332333
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 21 | 86 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL | USER_NP | 32 | 672 | 6 (0)| 00:00:01 |
| 3 | UNION-ALL | | | | | |
|* 4 | FILTER | | | | | |
| 5 | SORT GROUP BY NOSORT| | 1 | 12 | 2 (0)| 00:00:01 |
|* 6 | INDEX RANGE SCAN | ULN_IDX1 | 50 | 600 | 2 (0)| 00:00:01 |
|* 7 | FILTER | | | | | |
| 8 | SORT GROUP BY NOSORT| | 1 | 12 | 3 (0)| 00:00:01 |
|* 9 | INDEX RANGE SCAN | UHN_IDX1 | 500 | 6000 | 3 (0)| 00:00:01 |
------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter( EXISTS ( (SELECT "USER_ID",MAX("LOGON_DATE") FROM
"USER_LOGONS_NP" "ULN" WHERE "ULN"."USER_ID"=:B1 GROUP BY "USER_ID" HAVING
MAX("LOGON_DATE")<=SYSDATE@!) UNION ALL (SELECT
"USER_ID",MAX("LOGON_DATE") FROM "USER_LOGON_HIST_NP" "UHP" WHERE
"UHP"."USER_ID"=:B2 GROUP BY "USER_ID" HAVING
MAX("LOGON_DATE")<=SYSDATE@!)))
4 - filter(MAX("LOGON_DATE")<=SYSDATE@!)
6 - access("ULN"."USER_ID"=:B1)
7 - filter(MAX("LOGON_DATE")<=SYSDATE@!)
9 - access("UHP"."USER_ID"=:B1)
explain plan for SELECT
*
FROM
user_np usn
WHERE
CASE WHEN EXISTS (
SELECT
null
FROM
user_logons_np uln
WHERE uln.user_id = usn.user_id
HAVING
MAX(logon_date) <= SYSDATE ) THEN 1
ELSE
CASE WHEN EXISTS (
SELECT
null
FROM
user_logon_hist_np uhp
WHERE uhp.user_id = usn.user_id
HAVING
MAX(logon_date) <= SYSDATE ) THEN 1
ELSE 0 END
END = 1 ;
Plan hash value: 339077023
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 32 | 672 | 38 (0)| 00:00:01 |
|* 1 | FILTER | | | | | |
| 2 | TABLE ACCESS FULL | USER_NP | 32 | 672 | 6 (0)| 00:00:01 |
|* 3 | FILTER | | | | | |
| 4 | SORT AGGREGATE | | 1 | 12 | | |
|* 5 | INDEX RANGE SCAN| ULN_IDX1 | 50 | 600 | 2 (0)| 00:00:01 |
|* 6 | FILTER | | | | | |
| 7 | SORT AGGREGATE | | 1 | 12 | | |
|* 8 | INDEX RANGE SCAN| UHN_IDX1 | 500 | 6000 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(CASE WHEN EXISTS (SELECT MAX("LOGON_DATE") FROM
"USER_LOGONS_NP" "ULN" WHERE "ULN"."USER_ID"=:B1 HAVING
MAX("LOGON_DATE")<=SYSDATE@!) THEN 1 ELSE CASE WHEN EXISTS (SELECT
MAX("LOGON_DATE") FROM "USER_LOGON_HIST_NP" "UHP" WHERE
"UHP"."USER_ID"=:B2 HAVING MAX("LOGON_DATE")<=SYSDATE@!) THEN 1 ELSE 0
END END =1)
3 - filter(MAX("LOGON_DATE")<=SYSDATE@!)
5 - access("ULN"."USER_ID"=:B1)
6 - filter(MAX("LOGON_DATE")<=SYSDATE@!)
8 - access("UHP"."USER_ID"=:B1) 专家解答
以上都不是;)
有了有条款,你就
1分组
2过滤
真的你想在where子句中首先过滤。然后分组。
或者在这种情况下,根本不是团体!
对于您的测试数据,以下最适合我:
为了衡量这一点,你应该比较执行计划,而不是解释计划。这包括关键度量,例如每一步处理的行数、缓冲区 (I/O) 完成、花费的时间等。
通过运行:
这样做,我发现上面最好的替代方案 (通过执行最少数量的缓冲区的操作来衡量) 是最后一个。有这个计划:
注意97个缓冲区 (I/O)
而where子句中的过滤给出:
将近四分之一的工作!只有15个缓冲区,而不是 ~ 100。
也...
因为我希望每个用户都有很多行,所以我还会压缩这些索引的前导列:
这应该会让它们更小 => 阅读效率更高。
有了有条款,你就
1分组
2过滤
真的你想在where子句中首先过滤。然后分组。
或者在这种情况下,根本不是团体!
对于您的测试数据,以下最适合我:
SELECT * FROM user_np usn WHERE EXISTS ( SELECT null FROM user_logons_np uln WHERE uln.user_id = usn.user_id AND logon_date <= SYSDATE UNION ALL SELECT null FROM user_logon_hist_np uhp WHERE uhp.user_id = usn.user_id AND logon_date <= SYSDATE );
为了衡量这一点,你应该比较执行计划,而不是解释计划。这包括关键度量,例如每一步处理的行数、缓冲区 (I/O) 完成、花费的时间等。
通过运行:
set serveroutput off alter session set statistics_level = all; selectselect * from table(dbms_xplan.display_cursor(null, null, 'IOSTATS LAST'));
这样做,我发现上面最好的替代方案 (通过执行最少数量的缓冲区的操作来衡量) 是最后一个。有这个计划:
------------------------------------------------------------------------------------------ | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | | 46 |00:00:00.01 | 97 | |* 1 | FILTER | | 1 | | 46 |00:00:00.01 | 97 | | 2 | TABLE ACCESS FULL | USER_NP | 1 | 46 | 46 |00:00:00.01 | 3 | |* 3 | FILTER | | 46 | | 46 |00:00:00.01 | 94 | | 4 | SORT AGGREGATE | | 46 | 1 | 46 |00:00:00.01 | 94 | |* 5 | INDEX RANGE SCAN| ULN_IDX1 | 46 | 50 | 2300 |00:00:00.01 | 94 | |* 6 | FILTER | | 0 | | 0 |00:00:00.01 | 0 | | 7 | SORT AGGREGATE | | 0 | 1 | 0 |00:00:00.01 | 0 | |* 8 | INDEX RANGE SCAN| UHN_IDX1 | 0 | 500 | 0 |00:00:00.01 | 0 | ------------------------------------------------------------------------------------------
注意97个缓冲区 (I/O)
而where子句中的过滤给出:
---------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | ---------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 46 |00:00:00.02 | 15 | |* 1 | HASH JOIN SEMI | | 1 | 46 | 46 |00:00:00.02 | 15 | | 2 | TABLE ACCESS FULL | USER_NP | 1 | 46 | 46 |00:00:00.01 | 2 | | 3 | VIEW | VW_SQ_1 | 1 | 23042 | 2251 |00:00:00.01 | 13 | | 4 | UNION-ALL | | 1 | | 2251 |00:00:00.01 | 13 | |* 5 | INDEX FAST FULL SCAN| ULN_IDX1 | 1 | 2300 | 2251 |00:00:00.01 | 13 | |* 6 | INDEX FAST FULL SCAN| UHN_IDX1 | 0 | 20742 | 0 |00:00:00.01 | 0 | ----------------------------------------------------------------------------------------------
将近四分之一的工作!只有15个缓冲区,而不是 ~ 100。
也...
因为我希望每个用户都有很多行,所以我还会压缩这些索引的前导列:
alter index uhn_idx1 rebuild compress 1; alter index uln_idx1 rebuild compress 1;
这应该会让它们更小 => 阅读效率更高。
文章转载自ASKTOM,如果涉嫌侵权,请发送邮件至:contact@modb.pro进行举报,并提供相关证据,一经查实,墨天轮将立刻删除相关内容。




