SQL优化最佳实践:构建高效率Oracle数据库的方法与技巧
上QQ阅读APP看书,第一时间看更新

 

案例5 COUNT(*)到底能有多快

1.案例说明

一个大表的COUNT,究竟能有多快?除类似物化视图的做法,我们所能做到的极限能有多快?这不是一个真实的案例,而是根据笔者在网上发的一篇帖子整理而来。通过对一条SQL,采用多种方式持续优化过程,表明SQL优化的手段随着优化者掌握的技能增多,其可能存在的手段也在不断增多。

(1)数据准备

数据准备的代码如下:

create table t2 select * from dba_objects
insert into t2 select * from t2
...
select count* from t2 =>102400000      --数据量有1亿多条
select bytes/1024/1024 from user_segments where segment_name='T2'  => 10972    --数据对象大小有10GB

(2)全表扫描

全表扫描的代码如下(共用124秒,好慢呀):

select count* from t2
Elapsed 000204.09
-------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Cost %CPU| Time     |
-------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     1 |   381K 1 | 011619 |
|   1 |  SORT AGGREGATE    |      |     1 |            |          |
|   2 |   TABLE ACCESS FULL| T2   |   102M|   381K  1| 011619 |
-------------------------------------------------------------------
Statistics
----------------------------------------------------------
1  recursive calls
0  db block gets
1400379  consistent gets
1068862  physical reads

由上可知,全表扫描耗时较长。

(3)主键索引

主键索引的代码如下:

alter table t2 add constraint pk_t2 primary keyid);
execdbms_stats.gather_index_stats'hf' 'pk_t2' estimate_percent =>10);
select count* from t2
Elapsed 000033.18
-----------------------------------------------------------------------
| Id  | Operation             | Name  | Rows | Cost %CPU | Time     |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT      |       |    1 | 64271   2 | 001252 |
|   1 |  SORT AGGREGATE       |       |    1 |             |          |
|   2 |   INDEX FAST FULL SCAN| PK_T2 |  102M| 64271   2 | 001252 |
-----------------------------------------------------------------------
Statistics
----------------------------------------------------------
1  recursive calls
0  db block gets
228654  consistent gets
205137  physical reads

通过引入索引,执行计划变成索引快速全扫描,因扫描块数较少,因此耗时也大大减少,共用33秒,快多了。

(4)常数索引

常数索引的代码如下:

create index idx_0 on t20);
execdbms_stats.gather_index_stats'hf' 'idx_0' estimate_percent =>10);
select count* from t2
Elapsed 000028.92
-----------------------------------------------------------------------
| Id  | Operation             | Name  | Rows | Cost %CPU | Time     |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT      |       |    1 | 49601   2 | 000956 |
|   1 |  SORT AGGREGATE       |       |    1 |             |          |
|   2 |   INDEX FAST FULL SCAN| IDX_0 |  102M| 49601   2 | 000956 |
-----------------------------------------------------------------------
Statistics
----------------------------------------------------------
1  recursive calls
0  db block gets
185899  consistent gets
167726  physical reads

常数索引在存储密度上要高于普通字段索引,因此扫描块数更少,耗时也更少,共耗时29秒。

(5)常数压缩索引

常数压缩索引的代码如下:

create index idx_0 on t20 compress
execdbms_stats.gather_index_stats'hf' 'idx_0' estimate_percent =>10);
select count* from t2
Elapsed 000027.85
-----------------------------------------------------------------------
| Id  | Operation             | Name  | Rows  | Cost %CPU| Time     |
-----------------------------------------------------------------------
|   0 | SELECT STATEMENT      |       |     1 | 43812   3| 000846 |
|   1 |  SORT AGGREGATE       |       |     1 |            |          |
|   2 |   INDEX FAST FULL SCAN| IDX_0 |   102M| 43812   3| 000846 |
-----------------------------------------------------------------------
Statistics
----------------------------------------------------------
1  recursive calls
0  db block gets
     157636  consistent gets     //压缩后,减少了
141651  physical reads

索引压缩进一步减少了扫描规模,耗时缩减到27秒。

(6)位图索引

位图索引的代码如下:

create bitmap index idx_status2 on t2status);
execdbms_stats.gather_index_stats'hf' 'idx_status2' estimate_percent=> 10);
select count* from t2
Elapsed 000000.9
--------------------------------------------------------------------------------
| Id  | Operation                | Name        |  Rows  | Cost%CPU| Time     |
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT         |             |     1  | 2262  1 | 000028 |
|   1 |  SORT AGGREGATE          |             |     1  |           |          |
|   2 |   BITMAP CONVERSION COUNT
                                 |             |   102M |  2262  1| 000028 |
|   3 |    BITMAP INDEX FAST FULL SCAN
                                 | IDX_STATUS2 |        |           |          |
--------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
1  recursive calls
0  db block gets
       2502  consistent gets     //大大减少
351  physical reads

位图索引不同于B树索引,其存储密度更高。这里是采用status字段,如果使用常数索引,其规模将更小。这种手段用时0.9秒,这是质的飞跃。

(7)位图索引+并行

alter index idx_status2 parallel 8
select count* from t2
Elapsed 000000.03
--------------------------------------------------------------------------------
| Id  | Operation         | Name       |Rows  |Time    |  TQ |IN-OUT|PQ Distrib|
--------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |            |    1 |000027|     |      |          |
|   1 |  SORT AGGREGATE   |            |    1 |        |     |      |          |
|   2 |   PX COORDINATOR  |            |      |        |     |      |          |
|   3 |    PX SEND QC RANDOM
                          | TQ10000   |    1 |        |Q100| P->S | QC RAND|
|   4 |     SORT AGGREGATE
                          |            |    1 |        |Q100| PCWP |          |
|   5 |      PX BLOCK ITERATOR 
                          |            |  102M|000027|Q100| PCWC |          |
|   6 |       BITMAP CONVERSION COUNT 
                          |            |  102M|000027|Q100| PCWP |          |
|   7 |        BITMAP INDEX FAST FULL SCAN
                          |IDX_STATUS2 |      |        |Q100| PCWP |          |
--------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
265  recursive calls
3  db block gets
3059  consistent gets
0  physical reads

并行技术可以较快执行速度。一致性读有所增加,但并行还是能加快整体运行速度,这种手段耗时0.03秒,竟然又快了不少。

(8)分析结论

  • 位图索引可以按很高密度存储数据,因此往往比B树索引小很多。前提是在基数比较小的情况下。
  • 位图索引是保存空值的,因此可以在COUNT中利用。
  • 众所周知,位图索引不太适合OLTP类型数据库。该实例仅为了测试展示。
2.给我们的启示

优化没有止境,对数据库了解越多,你能想到的方法就越多。