调星娱乐
您的当前位置:首页oralcedistinct去除重复记录的方法

oralcedistinct去除重复记录的方法

来源:调星娱乐



看过滤重复的sql语句

SQL> select * from Employee
2 /

ID FIRST_NAME LAST_NAME START_DAT END_DATE SALARY CITY DESCRIPTION
---- ---------- ---------- --------- --------- ---------- ---------- ---------------
01 Jason Martin 25-JUL-96 25-JUL-06 1234.56 Toronto Programmer
02 Alison Mathews 21-MAR-76 21-FEB-86 6661.78 Vancouver Tester
03 James Smith 12-DEC-78 15-MAR-90 6544.78 Vancouver Tester
04 Celia Rice 24-OCT-82 21-APR-99 2344.78 Vancouver Manager
05 Robert Black 15-JAN-84 08-AUG-98 2334.78 Vancouver Tester
06 Linda Green 30-JUL-87 04-JAN-96 4322.78 New York Tester
07 David Larry 31-DEC-90 12-FEB-98 77.78 New York Manager
08 James Cat 17-SEP-96 15-APR-02 1232.78 Vancouver Tester

8 rows selected.

SQL>
SQL>
SQL> SELECT Description FROM employee
2 /

DESCRIPTION
---------------
Programmer
Tester
Tester
Manager
Tester
Tester
Manager
Tester

8 rows selected.

SQL> SELECT DISTINCT Description FROM employee
2 /

DESCRIPTION
---------------
Programmer
Manager
Tester

如果要处理多列表的话,字段以“,"分开就要以了,如下

SQL> SELECT DISTINCT city, state
2 FROM employees


关于增加了distinct后查询的效率测试

只有增加DISTINCT关键字,Oracle必然需要对后面的所有字段进行排序。以前也经常发现由于开发人员对SQL不是很理解,在SELECT列表的20多个字段前面添加了DISTINCT,造成查询基本上不可能执行完成,甚至产生ORA-7445错误。所以一直向开发人员强调DISTINCT给性能带来的影响。

  没想到开发人员在测试一条大的SQL的时候,告诉我如果加上了DISTINCT,则查询大概需要4分钟左右可以执行完,如果不加DISTINCT,则查询执行了10多分钟,仍然得不到结果。

  首先想到的是可能DISTINCT是在子查询中,由于加上了DISTINCT,将第一步结果集缩小了,导致查询性能提高,结果一看SQL,发现DISTINCT居然是在查询的最外层。

  由于原始SQL太长,而且牵扯的表太多,很难说清楚,这里模拟了一个例子,这个例子由于数据量和SQL的复杂程度,无法看出二者执行时间上的明显差别。这里从两种情况的逻辑读对比来说明问题。

  首先建立模拟环境:

  SQL> CREATE TABLE T1 AS SELECT * FROM DBA_OBJECTS
  2 WHERE OWNER = 'SYS'
  3 AND OBJECT_TYPE NOT LIKE '%BODY'
  4 AND OBJECT_TYPE NOT LIKE 'JAVA%';
  Table created.
  SQL> CREATE TABLE T2 AS SELECT * FROM DBA_SEGMENTS WHERE OWNER = 'SYS';
  Table created.
  SQL> CREATE TABLE T3 AS SELECT * FROM DBA_INDEXES WHERE OWNER = 'SYS';
  Table created.
  SQL> ALTER TABLE T1 ADD CONSTRAINT PK_T1 PRIMARY KEY (OBJECT_NAME);
  Table altered.
  SQL> CREATE INDEX IND_T2_SEGNAME ON T2(SEGMENT_NAME);
  Index created.
  SQL> CREATE INDEX IND_T3_TABNAME ON T3(TABLE_NAME);
  Index created.
  SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T1', METHOD_OPT => 'FOR ALL INDEXED COLUMNS SIZE 100', CASCADE => TRUE)
  PL/SQL procedure successfully completed.
  SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T2', METHOD_OPT => 'FOR ALL INDEXED COLUMNS SIZE 100', CASCADE => TRUE)
  PL/SQL procedure successfully completed.
  SQL> EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'T3', METHOD_OPT => 'FOR ALL INDEXED COLUMNS SIZE 100', CASCADE => TRUE)
  PL/SQL procedure successfully completed.
显示全文