本文共 15818 字,大约阅读时间需要 52 分钟。
[20150709]慎用标量子查询.txt
--花了几天的时间调试sql语句,最终发现是标量子查询在作怪,原始的语句太复杂,我拿scott做一个例子来说明问题。
1.建立测试环境:
SCOTT@test> @ver1
PORT_STRING VERSION BANNER ------------------------------ -------------- -------------------------------------------------------------------------------- x86_64/Linux 2.4.xx 11.2.0.3.0 Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Productioncreate view v_deptemp1 as select emp.*,dname,loc from emp,dept where dept.deptno=emp.deptno ;
create view v_deptemp2 as select emp.*,(select dname from dept where dept.deptno=emp.deptno) dname ,(select loc from dept where dept.deptno=emp.deptno )loc from emp ;--第2种写法就是标量子查询。
2.测试1:
create index i_dept_dname on scott.dept (dname); create index i_emp_deptno on scott.emp (deptno)SCOTT@test> select * from v_deptemp1 where dname='ACCOUNTING' ;
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO DNAME LOC ---------- ---------- --------- ---------- ------------------- ---------- ---------- ---------- -------------- ------------- 7782 CLARK MANAGER 7839 1981-06-09 00:00:00 2450 10 ACCOUNTING NEW YORK 7839 KING PRESIDENT 1981-11-17 00:00:00 5000 10 ACCOUNTING NEW YORK 7934 MILLER CLERK 7782 1982-01-23 00:00:00 1300 10 ACCOUNTING NEW YORKSCOTT@test> @dpc '' ''
PLAN_TABLE_OUTPUT ------------------------------------- SQL_ID 09tbznuqjnxa4, child number 0 ------------------------------------- select * from v_deptemp1 where dname='ACCOUNTING' Plan hash value: 2863776355 ----------------------------------------------------------------------------------------------- | Id | Operation | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time | ----------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 3 (100)| | | 1 | NESTED LOOPS | | | | | | | 2 | NESTED LOOPS | | 5 | 295 | 3 (0)| 00:00:01 | | 3 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 20 | 2 (0)| 00:00:01 | |* 4 | INDEX RANGE SCAN | I_DEPT_DNAME | 1 | | 1 (0)| 00:00:01 | |* 5 | INDEX RANGE SCAN | I_EMP_DEPTNO | 5 | | 0 (0)| | | 6 | TABLE ACCESS BY INDEX ROWID | EMP | 5 | 195 | 1 (0)| 00:00:01 | ----------------------------------------------------------------------------------------------- Query Block Name / Object Alias (identified by operation id): ------------------------------------------------------------- 1 - SEL$F5BB74E1 3 - SEL$F5BB74E1 / DEPT@SEL$2 4 - SEL$F5BB74E1 / DEPT@SEL$2 5 - SEL$F5BB74E1 / EMP@SEL$2 6 - SEL$F5BB74E1 / EMP@SEL$2 Predicate Information (identified by operation id): --------------------------------------------------- 4 - access("DNAME"='ACCOUNTING') 5 - access("DEPT"."DEPTNO"="EMP"."DEPTNO")--oracle 很好的分析确定索引,并且对emp表也可以使用i_emp_deptno。
--看看第2种情况:
SCOTT@test> select * from v_deptemp2 where dname='ACCOUNTING' ; EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO DNAME LOC ---------- ---------- --------- ---------- ------------------- ---------- ---------- ---------- -------------- ------------- 7782 CLARK MANAGER 7839 1981-06-09 00:00:00 2450 10 ACCOUNTING NEW YORK 7839 KING PRESIDENT 1981-11-17 00:00:00 5000 10 ACCOUNTING NEW YORK 7934 MILLER CLERK 7782 1982-01-23 00:00:00 1300 10 ACCOUNTING NEW YORKSCOTT@test> @dpc '' ''
PLAN_TABLE_OUTPUT ------------------------------------- SQL_ID 45acjyf0f8x54, child number 0 ------------------------------------- select * from v_deptemp2 where dname='ACCOUNTING' Plan hash value: 2698003519 ------------------------------------------------------------------------------------------- | Id | Operation | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time | ------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 3 (100)| | | 1 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 13 | 1 (0)| 00:00:01 | |* 2 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| | | 3 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 11 | 1 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| | |* 5 | VIEW | V_DEPTEMP2 | 14 | 1456 | 3 (0)| 00:00:01 | | 6 | TABLE ACCESS FULL | EMP | 14 | 546 | 3 (0)| 00:00:01 | ------------------------------------------------------------------------------------------- Query Block Name / Object Alias (identified by operation id): ------------------------------------------------------------- 1 - SEL$3 / DEPT@SEL$3 2 - SEL$3 / DEPT@SEL$3 3 - SEL$4 / DEPT@SEL$4 4 - SEL$4 / DEPT@SEL$4 5 - SEL$2 / V_DEPTEMP2@SEL$1 6 - SEL$2 / EMP@SEL$2 Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("DEPT"."DEPTNO"=:B1) 4 - access("DEPT"."DEPTNO"=:B1) 5 - filter("DNAME"='ACCOUNTING')--这样无论如何都不会使用emp的i_emp_deptno。当然我实际上的情况实际上开发想避免dname 为null的情况。
--因为标量子查询如果查询dname为null不是他所需要的。3.测试2:
--再建立一个表tx create table tx ( id number,name varchar2(20)); insert into tx values (7654,'MARTIN'); commit;SCOTT@test> select * from v_deptemp1 v ,tx where tx.id=v.empno and dname is not null;
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO DNAME LOC ID NAME ---------- ---------- --------- ---------- ------------------- ---------- ---------- ---------- -------------- ------------- ---------- ------------ 7654 MARTIN SALESMAN 7698 1981-09-28 00:00:00 1250 1400 30 SALES CHICAGO 7654 MARTINSCOTT@test> @dpc '' ''
PLAN_TABLE_OUTPUT ------------------------------------- SQL_ID gskw0uw49rw0k, child number 0 ------------------------------------- select * from v_deptemp1 v ,tx where tx.id=v.empno and dname is not null Plan hash value: 2230440165 ------------------------------------------------------------------------------------------------ | Id | Operation | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time | ------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | | 5 (100)| | | 1 | NESTED LOOPS | | | | | | | 2 | NESTED LOOPS | | 1 | 70 | 5 (0)| 00:00:01 | | 3 | NESTED LOOPS | | 1 | 50 | 4 (0)| 00:00:01 | | 4 | TABLE ACCESS FULL | TX | 1 | 11 | 3 (0)| 00:00:01 | | 5 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 39 | 1 (0)| 00:00:01 | |* 6 | INDEX UNIQUE SCAN | PK_EMP_EMPNO | 1 | | 0 (0)| | |* 7 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| | |* 8 | TABLE ACCESS BY INDEX ROWID | DEPT | 1 | 20 | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------ Query Block Name / Object Alias (identified by operation id): ------------------------------------------------------------- 1 - SEL$F5BB74E1 4 - SEL$F5BB74E1 / TX@SEL$1 5 - SEL$F5BB74E1 / EMP@SEL$2 6 - SEL$F5BB74E1 / EMP@SEL$2 7 - SEL$F5BB74E1 / DEPT@SEL$2 8 - SEL$F5BB74E1 / DEPT@SEL$2 Predicate Information (identified by operation id): --------------------------------------------------- 6 - access("TX"."ID"="EMP"."EMPNO") 7 - access("DEPT"."DEPTNO"="EMP"."DEPTNO") 8 - filter("DNAME" IS NOT NULL)--而如果换成v_deptemp2
SCOTT@test> select * from v_deptemp2 v ,tx where tx.id=v.empno and dname is not null; EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO DNAME LOC ID NAME ---------- ---------- --------- ---------- ------------------- ---------- ---------- ---------- -------------- ------------- ---------- -------- 7654 MARTIN SALESMAN 7698 1981-09-28 00:00:00 1250 1400 30 SALES CHICAGO 7654 MARTINSCOTT@test> @dpc '' ''
PLAN_TABLE_OUTPUT ------------------------------------- SQL_ID 7b8950at9688x, child number 0 ------------------------------------- select * from v_deptemp2 v ,tx where tx.id=v.empno and dname is not null Plan hash value: 2887512270 ---------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time | OMem | 1Mem | Used-Mem | ---------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 7 (100)| | | | | | 1 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 13 | 1 (0)| 00:00:01 | | | | |* 2 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| | | | | | 3 | TABLE ACCESS BY INDEX ROWID| DEPT | 1 | 11 | 1 (0)| 00:00:01 | | | | |* 4 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| | | | | |* 5 | HASH JOIN | | 1 | 106 | 7 (15)| 00:00:01 | 1156K| 1156K| 383K (0)| | 6 | TABLE ACCESS FULL | TX | 1 | 11 | 3 (0)| 00:00:01 | | | | |* 7 | VIEW | V_DEPTEMP2 | 14 | 1330 | 3 (0)| 00:00:01 | | | | | 8 | TABLE ACCESS FULL | EMP | 14 | 546 | 3 (0)| 00:00:01 | | | | ----------------------------------------------------------------------------------------------------------------------Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------1 - SEL$3 / DEPT@SEL$3
2 - SEL$3 / DEPT@SEL$3 3 - SEL$4 / DEPT@SEL$4 4 - SEL$4 / DEPT@SEL$4 5 - SEL$1 6 - SEL$1 / TX@SEL$1 7 - SEL$2 / V@SEL$1 8 - SEL$2 / EMP@SEL$2Predicate Information (identified by operation id):
---------------------------------------------------2 - access("DEPT"."DEPTNO"=:B1)
4 - access("DEPT"."DEPTNO"=:B1) 5 - access("TX"."ID"="V"."EMPNO") 7 - filter("DNAME" IS NOT NULL)--emp始终选择的是全表扫描。
4.继续测试:
--当把dname is not null删除后,结果呢?SCOTT@test> select * from v_deptemp2 v ,tx where tx.id=v.empno ;
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO DNAME LOC ID NAME ---------- ---------- --------- ---------- ------------------- ---------- ---------- ---------- -------------- ------------- ---------- --------- 7654 MARTIN SALESMAN 7698 1981-09-28 00:00:00 1250 1400 30 SALES CHICAGO 7654 MARTINSCOTT@test> @dpc '' ''
PLAN_TABLE_OUTPUT ------------------------------------- SQL_ID 1sd5umnwjt823, child number 0 ------------------------------------- select * from v_deptemp2 v ,tx where tx.id=v.empno Plan hash value: 3698375752 ---------------------------------------------------------------------------------------------- | Id | Operation | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time | ---------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 4 (100)| | | 1 | TABLE ACCESS BY INDEX ROWID | DEPT | 1 | 13 | 1 (0)| 00:00:01 | |* 2 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| | | 3 | TABLE ACCESS BY INDEX ROWID | DEPT | 1 | 11 | 1 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| | | 5 | NESTED LOOPS | | | | | | | 6 | NESTED LOOPS | | 1 | 50 | 4 (0)| 00:00:01 | | 7 | TABLE ACCESS FULL | TX | 1 | 11 | 3 (0)| 00:00:01 | |* 8 | INDEX UNIQUE SCAN | PK_EMP_EMPNO | 1 | | 0 (0)| | | 9 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 39 | 1 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------- Query Block Name / Object Alias (identified by operation id): ------------------------------------------------------------- 1 - SEL$3 / DEPT@SEL$3 2 - SEL$3 / DEPT@SEL$3 3 - SEL$4 / DEPT@SEL$4 4 - SEL$4 / DEPT@SEL$4 5 - SEL$F5BB74E1 7 - SEL$F5BB74E1 / TX@SEL$1 8 - SEL$F5BB74E1 / EMP@SEL$2 9 - SEL$F5BB74E1 / EMP@SEL$2 Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("DEPT"."DEPTNO"=:B1) 4 - access("DEPT"."DEPTNO"=:B1) 8 - access("TX"."ID"="EMP"."EMPNO")--可以发现没有这个条件(dname is not null),emp可以很好的选择索引。
--实际我们的视图定义有多处使用标量子查询,当然我只要把在where出现的改写成不使用标量子查询。 --如果不愿意修改还有1个简单的改写方法,就是:SCOTT@test> select * from (select * from v_deptemp2 v ,tx where tx.id=v.empno ) where dname is not null ;
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO DNAME LOC ID NAME ---------- ---------- --------- ---------- ------------------- ---------- ---------- ---------- -------------- ------------- ---------- -------- 7654 MARTIN SALESMAN 7698 1981-09-28 00:00:00 1250 1400 30 SALES CHICAGO 7654 MARTINSCOTT@test> @dpc '' ''
PLAN_TABLE_OUTPUT ------------------------------------- SQL_ID 8yy4n9t1dqhmm, child number 0 ------------------------------------- select * from (select * from v_deptemp2 v ,tx where tx.id=v.empno ) where dname is not null Plan hash value: 3563298867 ----------------------------------------------------------------------------------------------- | Id | Operation | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time | ----------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 4 (100)| | | 1 | TABLE ACCESS BY INDEX ROWID | DEPT | 1 | 13 | 1 (0)| 00:00:01 | |* 2 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| | | 3 | TABLE ACCESS BY INDEX ROWID | DEPT | 1 | 11 | 1 (0)| 00:00:01 | |* 4 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| | |* 5 | VIEW | | 1 | 129 | 4 (0)| 00:00:01 | | 6 | NESTED LOOPS | | | | | | | 7 | NESTED LOOPS | | 1 | 50 | 4 (0)| 00:00:01 | | 8 | TABLE ACCESS FULL | TX | 1 | 11 | 3 (0)| 00:00:01 | |* 9 | INDEX UNIQUE SCAN | PK_EMP_EMPNO | 1 | | 0 (0)| | | 10 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 39 | 1 (0)| 00:00:01 | ----------------------------------------------------------------------------------------------- Query Block Name / Object Alias (identified by operation id): ------------------------------------------------------------- 1 - SEL$4 / DEPT@SEL$4 2 - SEL$4 / DEPT@SEL$4 3 - SEL$5 / DEPT@SEL$5 4 - SEL$5 / DEPT@SEL$5 5 - SEL$335DD26A / from$_subquery$_001@SEL$1 6 - SEL$335DD26A 8 - SEL$335DD26A / TX@SEL$2 9 - SEL$335DD26A / EMP@SEL$3 10 - SEL$335DD26A / EMP@SEL$3 Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("DEPT"."DEPTNO"=:B1) 4 - access("DEPT"."DEPTNO"=:B1) 5 - filter("DNAME" IS NOT NULL) 9 - access("TX"."ID"="EMP"."EMPNO")--这样改写后emp可以使用索引!看来oracle cbo 还是不够成熟,无法识别这种情况。
5.也许有人问v_deptemp1 与 v_deptemp2是不等价的。继续测试:
SCOTT@test> create or replace view v_deptemp3 as select emp.*,dname,loc from emp,dept where dept.deptno(+)=emp.deptno ;
View created.alter table emp modify constraint fk_deptno disable novalidate;
--取消主外键约束。SCOTT@test> update emp set deptno=90 where empno=7654;
1 row updated.SCOTT@test> commit ;
Commit complete.SCOTT@test> select * from v_deptemp3 v ,tx where tx.id=v.empno and dname is not null;
no rows selectedSCOTT@test> @dpc '' ''
PLAN_TABLE_OUTPUT ------------------------------------- SQL_ID bwytb3gt4vtx1, child number 0 ------------------------------------- select * from v_deptemp3 v ,tx where tx.id=v.empno and dname is not null Plan hash value: 2230440165 ------------------------------------------------------------------------------------------------ | Id | Operation | Name | E-Rows |E-Bytes| Cost (%CPU)| E-Time | ------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | | | 5 (100)| | | 1 | NESTED LOOPS | | | | | | | 2 | NESTED LOOPS | | 1 | 70 | 5 (0)| 00:00:01 | | 3 | NESTED LOOPS | | 1 | 50 | 4 (0)| 00:00:01 | | 4 | TABLE ACCESS FULL | TX | 1 | 11 | 3 (0)| 00:00:01 | | 5 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 39 | 1 (0)| 00:00:01 | |* 6 | INDEX UNIQUE SCAN | PK_EMP_EMPNO | 1 | | 0 (0)| | |* 7 | INDEX UNIQUE SCAN | PK_DEPT | 1 | | 0 (0)| | |* 8 | TABLE ACCESS BY INDEX ROWID | DEPT | 1 | 20 | 1 (0)| 00:00:01 | ------------------------------------------------------------------------------------------------ Query Block Name / Object Alias (identified by operation id): ------------------------------------------------------------- 1 - SEL$EB0D93E2 4 - SEL$EB0D93E2 / TX@SEL$1 5 - SEL$EB0D93E2 / EMP@SEL$2 6 - SEL$EB0D93E2 / EMP@SEL$2 7 - SEL$EB0D93E2 / DEPT@SEL$2 8 - SEL$EB0D93E2 / DEPT@SEL$2 Predicate Information (identified by operation id): --------------------------------------------------- 6 - access("TX"."ID"="EMP"."EMPNO") 7 - access("DEPT"."DEPTNO"="EMP"."DEPTNO") 8 - filter("DNAME" IS NOT NULL)--就是这样定义,oracle 一样选择合理的执行方式。
总结:
1.不要大量不顾各种情况的使用标量子查询,实际上标量子查询存在许多缺陷,仅仅能查询1个字段,仅仅返回1行。 2.并且导致内部嵌套连接,而不能采用hash join等连接方式。假如emp表1万行,这样查询dept表要有1万次(我上面执行2次标量子查询,也就是2万次)。 即使dept有索引,逻辑读的数量不可小看. 3.另外一个原因我觉得可能是数据结构问题,这种情况如果太遵守范式,数据结构设计耦合太紧,开发很容易选择标量子查询的写法。 例如:1个表有6个字段有科室代码字段,这样要显示科室名称,这样from中写6次科室表,做连接6次查询,这样许多开发更愿意选择标量子查询写sql语句。 4.总之要开发合理选择这种写法,不能到处滥用。转载地址:http://nzpfa.baihongyu.com/