本文共 22845 字,大约阅读时间需要 76 分钟。
在Oracle世界里,索引主要有B树索引,位图索引和全文索引等,其中最常见于OLTP系统的是B树索引,位图索引和全文索引则多用于OLAP系统中;索引就类似于一本书的目录,若只想了解某个章节,则在不需要通读全书,可以通过索引迅速的找到需要的内容;若需要通读全书,恐怕没有人会看一页文章内容,再回头看下目录这样阅读吧,这2次I/O,必然浪费时间和资源;B树索引比较适合值分布比较均匀的场合,因而普遍用于OLTP系统中;位图索引则适用于取值范围相对较小,且更改频率低的场合,相比B树索引,位图索引将占用更少的存储空间;全文索引则类似分词系统,一般用在CLOB字段或者TEXT字段上,全文索引会占用很大的储存空间,有可能索引的大小要远大于基表本身;索引会随着基表的更改而被oracle自动维护!下面就来简要的谈谈B树索引!
一:下列几种情况将不使用索引
1:使用不等于操作符 SQL> create table t1 as select * from dba_source; Table created. SQL> create index ind_t1 on t1(name); Index created. SQL> set autot trace exp stat SQL> select * from t1 where name <> 'STANDARD'; 634915 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 3617692013 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 565K| 1108M| 2948 (1)| 00:00:36 | |* 1 | TABLE ACCESS FULL| T1 | 565K| 1108M| 2948 (1)| 00:00:36 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("NAME"<>'STANDARD') Note ----- - dynamic sampling used for this statement (level=2) Statistics ---------------------------------------------------------- 10 recursive calls 0 db block gets 52377 consistent gets 10977 physical reads 0 redo size 52132322 bytes sent via SQL*Net to client 466016 bytes received via SQL*Net from client 42329 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 634915 rows processed |
2: 使用IS NULL或者IS NOT NULL
SQL> exec dbms_stats.gather_table_stats('HR','T1',cascade=>true); PL/SQL procedure successfully completed. SQL> select * from t1 where name is null; no rows selected Execution Plan ---------------------------------------------------------- Plan hash value: 3617692013 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 100 | 2947 (1)| 00:00:36 | |* 1 | TABLE ACCESS FULL| T1 | 1 | 100 | 2947 (1)| 00:00:36 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("NAME" IS NULL) Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 10709 consistent gets 10706 physical reads 0 redo size 517 bytes sent via SQL*Net to client 408 bytes received via SQL*Net from client 1 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 0 rows processed |
3:不使用基于函数的索引条件下,使用函数
SQL> select * from t1 where lower(name)='standard'; 3966 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 3617692013 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 6389 | 623K| 2950 (1)| 00:00:36 | |* 1 | TABLE ACCESS FULL| T1 | 6389 | 623K| 2950 (1)| 00:00:36 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(LOWER("NAME")='standard') Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 10968 consistent gets 10706 physical reads 0 redo size 237461 bytes sent via SQL*Net to client 3323 bytes received via SQL*Net from client 266 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 3966 rows processed |
4:比较不匹配的数据类型
SQL> create table t2 (id varchar2(10),name varchar2(10)); Table created. SQL> begin 2 for i in 1..100000 3 loop 4 insert into t2 values (i,'t2'); 5 end loop; 6 commit; 7 end; 8 / PL/SQL procedure successfully completed. SQL> create index ind_t2 on t2(id); Index created. SQL> exec dbms_stats.gather_table_stats('HR','T2',cascade=>true); PL/SQL procedure successfully completed. SQL> select * from t2 where id=100; Execution Plan ---------------------------------------------------------- Plan hash value: 1513984157 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 9 | 69 (2)| 00:00:01 | |* 1 | TABLE ACCESS FULL| T2 | 1 | 9 | 69 (2)| 00:00:01 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter(TO_NUMBER("ID")=100) Statistics ---------------------------------------------------------- 1 recursive calls 1 db block gets 255 consistent gets 0 physical reads 0 redo size 478 bytes sent via SQL*Net to client 419 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed |
二:使用B树索引
1:新建2张表,并分别对name字段创建索引,收集表统计信息;t2表的name字段为倾斜型,t1表则相对比较均匀;2:对t1表进行查询,可以看到,查询1使用索引ind_t1,查询2根据hint提示使用全表扫描,对比两次查询的CPU代价和一致性读,可以看出索引的优势!
SQL> set autot trace exp stat SQL> select * from t1 where name='STANDARD'; 3933 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 634656657 -------------------------------------------------------------------------------- ------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- ------ | 0 | SELECT STATEMENT | | 476 | 55692 | 18 (0)| 00:0 0:01 | | 1 | TABLE ACCESS BY INDEX ROWID| T1 | 476 | 55692 | 18 (0)| 00:0 0:01 | |* 2 | INDEX RANGE SCAN | IND_T1 | 476 | | 4 (0)| 00:0 0:01 | -------------------------------------------------------------------------------- ------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("NAME"='STANDARD') Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 575 consistent gets 0 physical reads 0 redo size 336844 bytes sent via SQL*Net to client 3351 bytes received via SQL*Net from client 264 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 3933 rows processed SQL> select /*+full(t1)*/ * from t1 where name='STANDARD'; 3933 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 3617692013 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 476 | 55692 | 1389 (1)| 00:00:17 | |* 1 | TABLE ACCESS FULL| T1 | 476 | 55692 | 1389 (1)| 00:00:17 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("NAME"='STANDARD') Statistics ---------------------------------------------------------- 802 recursive calls 0 db block gets 6580 consistent gets 6134 physical reads 0 redo size 250150 bytes sent via SQL*Net to client 3351 bytes received via SQL*Net from client 264 SQL*Net roundtrips to/from client 20 sorts (memory) 0 sorts (disk) 3933 rows processed |
3:对t2表进行查询,可以看到,查询1使用索引ind_t2,查询2则使用全表扫描,因为name字段基于所以的值均为SALE,所以这个时候使用全表扫描反而更快,查询3使用hint提示使用ind_t2索引,将产生更大的开销,由此可见,B树索引适合于值分布比较均匀的场合,否则2次I/O将导致查询更慢!
SQL> set autot trace exp stat SQL> select * from t2 where name='T2'; Execution Plan ---------------------------------------------------------- Plan hash value: 591771271 -------------------------------------------------------------------------------- ------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- ------ | 0 | SELECT STATEMENT | | 1 | 107 | 4 (0)| 00:0 0:01 | | 1 | TABLE ACCESS BY INDEX ROWID| T2 | 1 | 107 | 4 (0)| 00:0 0:01 | |* 2 | INDEX RANGE SCAN | IND_T2 | 1 | | 3 (0)| 00:0 0:01 | -------------------------------------------------------------------------------- ------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("NAME"='T2') Statistics ---------------------------------------------------------- 183 recursive calls 0 db block gets 28 consistent gets 4 physical reads 0 redo size 875 bytes sent via SQL*Net to client 469 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 5 sorts (memory) 0 sorts (disk) 1 rows processed SQL> select * from t2 where name='SALE'; 292435 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1513984157 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 291K| 29M| 1391 (2)| 00:00:17 | |* 1 | TABLE ACCESS FULL| T2 | 291K| 29M| 1391 (2)| 00:00:17 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("NAME"='SALE') Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 25293 consistent gets 605 physical reads 0 redo size 30733266 bytes sent via SQL*Net to client 214914 bytes received via SQL*Net from client 19497 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 292435 rows processed SQL> select /*+index(t2 ind_t2)*/ * from t2 where name='SALE'; 292435 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 591771271 -------------------------------------------------------------------------------- ------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- ------ | 0 | SELECT STATEMENT | | 291K| 29M| 6850 (1)| 00:0 1:23 | | 1 | TABLE ACCESS BY INDEX ROWID| T2 | 291K| 29M| 6850 (1)| 00:0 1:23 | |* 2 | INDEX RANGE SCAN | IND_T2 | 292K| | 662 (2)| 00:0 0:08 | -------------------------------------------------------------------------------- ------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("NAME"='SALE') Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 45391 consistent gets 2454 physical reads 0 redo size 36955099 bytes sent via SQL*Net to client 214914 bytes received via SQL*Net from client 19497 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 292435 rows processed |
三:表和索引的压缩,在大数据量条件下,除了使用分区技术来裁剪I/O外,压缩也是个不错的选择,但压缩比较适合于更新不频繁的场景
1:查看T1表和IND_T1索引的DDL创建语句和存储占用情况2:新建表t3和索引ind_t3,带压缩属性,对比表t1所占的存储,表为50M:37M,索引为9M:4M
3:分区表的本地索引压缩,需要先在对象级别上启用压缩
四:对象分析与执行计划,oracle 10g以后关于sql语句的执行,默认都是使用cbo,即基于代价的优化器,而不是基于规则的,这个代价则是有对象分析而来,也就是我们平常说的统计信息,下面来看下当统计信息未及时更新的情况下,sql执行计划的选择!
1:创建基表和索引,可以看到,当没有统计信息的时候,oracle将采取动态采样方式,也可以获得正确的执行计划 SQL> create table t4 as select 100 object_id,object_name from dba_objects; Table created. SQL> update t4 set object_id=1 where rownum=1; 1 row updated. SQL> commit; Commit complete SQL> create index ind_t4 on t4(object_id); Index created. SQL> set autot trace exp stat SQL> select * from t4 where object_id=100; 50414 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 2560505625 -------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 54273 | 4187K| 56 (4)| 00:00:01 | |* 1 | TABLE ACCESS FULL| T4 | 54273 | 4187K| 56 (4)| 00:00:01 | -------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("OBJECT_ID"=100) Note ----- - dynamic sampling used for this statement Statistics ---------------------------------------------------------- 52 recursive calls 0 db block gets 3658 consistent gets 5 physical reads 0 redo size 1691795 bytes sent via SQL*Net to client 37429 bytes received via SQL*Net from client 3362 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 50414 rows processed SQL> select * from t4 where object_id=1; Execution Plan ---------------------------------------------------------- Plan hash value: 4096627024 -------------------------------------------------------------------------------- ------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- ------ | 0 | SELECT STATEMENT | | 1 | 79 | 2 (0)| 00:0 0:01 | | 1 | TABLE ACCESS BY INDEX ROWID| T4 | 1 | 79 | 2 (0)| 00:0 0:01 | |* 2 | INDEX RANGE SCAN | IND_T4 | 1 | | 1 (0)| 00:0 0:01 | -------------------------------------------------------------------------------- ------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("OBJECT_ID"=1) Note ----- - dynamic sampling used for this statement Statistics ---------------------------------------------------------- 7 recursive calls 0 db block gets 76 consistent gets 0 physical reads 0 redo size 595 bytes sent via SQL*Net to client 469 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed |
2:删除表中所有的数据后,收集统计信息,然后再插入原先一样的数据
3:这个时候,统计信息未及时更新,认为下面的查询语句选择走索引会更优,则会产生错误的执行计划,导致查询语句缓慢!在生产环境,大数据量情况下尤为明显!
SQL> set autot trace exp stat SQL> select * from t4 where object_id=100; 50415 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 4096627024 -------------------------------------------------------------------------------- ------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------- ------ | 0 | SELECT STATEMENT | | 1 | 79 | 1 (0)| 00:0 0:01 | | 1 | TABLE ACCESS BY INDEX ROWID| T4 | 1 | 79 | 1 (0)| 00:0 0:01 | |* 2 | INDEX RANGE SCAN | IND_T4 | 1 | | 1 (0)| 00:0 0:01 | -------------------------------------------------------------------------------- ------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("OBJECT_ID"=100) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 3581 consistent gets 0 physical reads 0 redo size 1692701 bytes sent via SQL*Net to client 37429 bytes received via SQL*Net from client 3362 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 50415 rows processed |
转载地址:http://jpivo.baihongyu.com/