首先我给出两个等价的SQL:
select id,content from player_message where hasread=5 and end_date>sysdate and id not in (select msg_id from messagemapping t where t.playerid =166);
select * from (select id,content from player_message where hasread=5 and end_date>sysdate) pm where not exists (select msg_id from messagemapping t where pm.id=t.msg_id and t.playerid =166);
一般开发人员可能会用SQL的执行时间长短来衡量语句的优劣,这种想法一般来说是没错的,但在表的数据量不是很大时,执行时间就的衡量就不准确了。下面让我们来看这个两SQL的执行计划和统计信息。
l SQL> select id,content from player_message where hasread=5 and end_date>sysdate and id not in (select msg_id from messagemapping t where t.playerid =166);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT ptimizer=CHOOSE (Cost=7 Card=1 Bytes=29)
1 0 FILTER
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'PLAYER_MESSAGE' (Cost=2 Card=1 Bytes=29)
3 2 INDEX (RANGE SCAN) OF 'MESSAGE_END_IDX' (NON-UNIQUE) (Cost=1 Card=1)
4 1 TABLE ACCESS (BY INDEX ROWID) OF 'MESSAGEMAPPING' (Cost=5 Card=1 Bytes=9)
5 4 INDEX (RANGE SCAN) OF 'MSGMAPING_PID_IDX' (NON-UNIQUE) (Cost=3 Card=4)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
49 consistent gets
0 physical reads
0 redo size
531 bytes sent via SQL*Net to client
504 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
l SQL> select * from (select id,content from player_message where hasread=5 and end_date>sysdate) pm where not exists (select msg_id from messagemapping t where pm.id=t.msg_id and t.playerid =166);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT ptimizer=CHOOSE (Cost=5 Card=1 Bytes=29)
1 0 FILTER
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'PLAYER_MESSAGE' (Cost=2 Card=1 Bytes=29)
3 2 INDEX (RANGE SCAN) OF 'MESSAGE_END_IDX' (NON-UNIQUE) (Cost=1 Card=1)
4 1 FILTER
5 4 INDEX (RANGE SCAN) OF 'MSGMAPING_PID_IDX' (NON-UNIQUE) (Cost=3 Card=4 Bytes=20)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2 consistent gets
0 physical reads
0 redo size
531 bytes sent via SQL*Net to client
504 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
由些可看出,带有in的SQL,cost是7,逻辑读是49;带有exists的SQL,cost是5,逻辑读是2。执行次数很少时,两个SQL对系统的开销相差也不是太大,但执行频率增大后,它们之间的开销就很可观了。
下面再通过10046跟踪信息对比:
SQL> alter session set events '10046 trace name context forever,level 12';
SQL> select id,content from player_message where hasread=5 and end_date>sysdate and id not in (select msg_id from messagemapping t where t.playerid =166);
SQL> select * from (select id,content from player_message where hasread=5 and end_date>sysdate) pm where not exists (select msg_id from messagemapping t where pm.id=t.msg_id and t.playerid =166);
SQL> alter session set events '10046 trace name context off';
执行以上SQL,得到trace文件,且通过tkprof格式化后,得到如下片断:
select id,content from
player_message where hasread=5 and end_date>sysdate and id not in (select
msg_id from messagemapping t where t.playerid =166)
Rows Row Source Operation
------- ---------------------------------------------------
1 FILTER
1 TABLE ACCESS BY INDEX ROWID PLAYER_MESSAGE
1 INDEX RANGE SCAN MESSAGE_END_IDX (object id 34556)
0 TABLE ACCESS BY INDEX ROWID MESSAGEMAPPING
57 INDEX RANGE SCAN MSGMAPING_PID_IDX (object id 33172)
这里可以看到有57行数据回表,大部分的逻辑读就是在这里消耗的。
select * from
(select id,content from player_message where hasread=5 and end_date>sysdate)
pm where not exists (select msg_id from messagemapping t where pm.id=
t.msg_id and t.playerid =166)
Rows Row Source Operation
------- ---------------------------------------------------
1 FILTER
1 TABLE ACCESS BY INDEX ROWID PLAYER_MESSAGE
1 INDEX RANGE SCAN MESSAGE_END_IDX (object id 34556)
0 FILTER
0 INDEX RANGE SCAN MSGMAPING_PID_IDX (object id 33172)
对比这两个trace信息,进一步说明了两个SQL的差异。