ORACLE统计信息收集不及时优化案例一则
某业务系统有表T1,id是主键,用户id和下单日期等三个主要栏位.分别在userid和order_date上存在索引.
create table t1(id number,userid number,order_date date, primary key (id));
create index ix_order_date on t1(order_date);
create index ix_userid on t1(userid);
前端的主查询方式为根据用户id查当前时间的24小时以内的数据.
select *
from t1
where userid=:1
and order_date >=trunc(sysdate)
and order_date <=trunc(sysdate)+1
数据的特征是,单个用户平均在1000条记录左右,而每天产生的数据量大概在2-3万条,每个用户24小时内产生数据大概1-10条
.统计信息使用提oracle默认的收集策略,每天凌晨4点钟收集;
优化器的主要参数如下;
DONGDONGTANG> show parameter optimizer;
NAME TYPE VALUE
------------------------------------ ----------- ------------------------------
optimizer_capture_sql_plan_baselines boolean FALSE
optimizer_dynamic_sampling integer 2
optimizer_features_enable string 11.1.0.7
optimizer_index_caching integer 0
optimizer_index_cost_adj integer 100
optimizer_mode string ALL_ROWS
optimizer_secure_view_merging boolean TRUE
optimizer_use_invisible_indexes boolean FALSE
optimizer_use_pending_statistics boolean FALSE
optimizer_use_sql_plan_baselines boolean TRUE
从数据特征描术来看,业务高峰时大部分可能走userid的索引性能更加稳定一些.
但是当系统第一次收集完成统计信息以后,走的是ix_order_date的索引
DONGDONGTANG>select count(id)
from t1
where 1=1
and userid=6775
and order_date >=trunc(sysdate)
and order_date <=trunc(sysdate)+1 2 3 4 5 6 ;
Execution Plan
----------------------------------------------------------
Plan hash value: 2961821412
-----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 12 | 18 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 12 | | |
|* 2 | FILTER | | | | | |
|* 3 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 12 | 18 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | IX_ORDER_DATE | 57 | | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
由于在06:00收集后,这一段时间产生的数据比较少,对比order_date评估是57行,而userid为6775的值有2418行数据
所以优化器选择了orader_date列的索引.
随着到业务高峰期的到来,在06:00-12:00这段时间内新增了1万条数据左右.而且因为绑定变量(没有启用ACS))的原因,
这个时候再走order_date上索引变得很不合适了.
DONGDONGTANG>select /*+ GATHER_PLAN_STATISTICS */ count(id)
from t1
where 1=1
and userid=6775
and order_date >=trunc(sysdate)
and order_date <=trunc(sysdate)+1
COUNT(ID)
----------
1
DONGDONGTANG>select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS'));
---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.02 | 105 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.02 | 105 |
|* 2 | FILTER | | 1 | | 1 |00:00:00.02 | 105 |
|* 3 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 1 | 1 |00:00:00.02 | 105 |
|* 4 | INDEX RANGE SCAN | IX_ORDER_DATE | 1 | 57 | 10300 |00:00:00.01 | 59 |
---------------------------------------------------------------------------------------------------------
通过对比E_ROWS和A_ROWS的栏位可以发现,优化器的评估信息已经和现实的中数据已经差别非常大了.
强制优化器使用ix_userid索引
select /*+ GATHER_PLAN_STATISTICS INDEX(T1 IX_USERID) */ count(id)
from t1
where 1=1
and userid=6775
and order_date >=trunc(sysdate)
and order_date <=trunc(sysdate)+1
-----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | | 2 |00:00:00.01 | 36 |
| 1 | SORT AGGREGATE | | 2 | 1 | 2 |00:00:00.01 | 36 |
|* 2 | FILTER | | 2 | | 2 |00:00:00.01 | 36 |
|* 3 | TABLE ACCESS BY INDEX ROWID| T1 | 2 | 1 | 2 |00:00:00.01 | 36 |
|* 4 | INDEX RANGE SCAN | IX_USERID | 2 | 377 | 2840 |00:00:00.01 | 22 |
-----------------------------------------------------------------------------------------------------
而使用userid索引栏位,评估相对接近,所以优化器在后续的过程中继续使用ix_order_date索引不合适
(即使order_date的cluster factor相比userid已经比较低)
由于对userid的order_date两个栏位比较少更新,我们决定使用复合索引的方式(userid+order_date)或是(order_date+userid).
创建复合索引根据选择性最高的栏位排在最前的原则,我们认为userid的选择性相对高,所以先把userid放在前面
DONGDONGTANG>create index ix_u_o on t1(userid,order_date);
Index created.
Plan hash value: 1669135283
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 5 |
|* 1 | FILTER | | 1 | | 1 |00:00:00.01 | 5 |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 1 | 1 |00:00:00.01 | 5 |
|* 3 | INDEX RANGE SCAN | IX_U_O | 1 | 1 | 1 |00:00:00.01 | 4 |
-------------------------------------------------------------------------------------------------
这个时候优化器对评估的行接近于等实际的行,所以这个计划比较合理.
如果我们选择把order_date放在前面呢,来看看执行计划是怎么样的
DONGDONGTANG>drop index ix_u_o;
Index dropped.
DONGDONGTANG>create index ix_o_u on t1(order_date,userid);
Index created.
Plan hash value: 445006054
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-------------------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 38 |
|* 1 | FILTER | | 1 | | 1 |00:00:00.01 | 38 |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 1 | 1 | 1 |00:00:00.01 | 38 |
|* 3 | INDEX RANGE SCAN | IX_O_U | 1 | 1 | 1 |00:00:00.01 | 37 |
-------------------------------------------------------------------------------------------------
但是在这里我们发现使用order_date+userid的复合索引的buffers读取数要高,达到了36,相比(userid,order_date)
只有5,高出了将近7倍.所以采用userid+order_date的复合索引相对来说资源开销要少.
最后我们认为在这其中有以下的总结要点:1,如果频繁手动收集该表的统计信息可能会有好处,但是考虑到该表是
主要业务表之一,重新收集再重新对SQL进行硬解析,可能会带来较大的稳定性风险;2,在原来的userid索引后面
增加一列order_date列变成复合索引,仅会带来很小的空间开销,原有的order_date索引继续保持不变;3,使用多个
栏位可以让oracle评估出更加精确的行信息;4,不同时间段好的执行计划是不同的(ACS根本解决?)