• 一个实际工作中的sql的模拟


    create table t as select * from dba_objects;

    update t set LAST_DDL_TIME=to_date('20040101','yyyymmdd')+trunc(dbms_random.value(1,360))

    SQL> desc x
     Name                          Null?    Type
     ----------------------------- -------- --------------------
     OBJECT_ID                     NOT NULL NUMBER
     TYPE                                   CHAR(2)
     NAME                                   VARCHAR2(128)

    alter table t add constraint ind_t_pk primary key(object_id);
    alter table x add constraint ind_x_pk primary key(object_id);
    create index ind_t on t(LAST_DDL_TIME);
     
    insert into x  select * from  (select object_id,'11',object_name from t order by dbms_random.value)  where rownum<100;
    update x set type='22' where rownum<40;
    commit;
    execute dbms_stats.gather_schema_stats('QDL');
    到此就利用dba_objects构造了模拟环境。
    t在现实中有4900万纪录,x现实中有500万纪录。
    程序要求找出x里面类型为'22'的并且在t里面时间在某个范围的所有x的记录
    其sql为:
    select x.object_id, x.name, x.type
      from x, t
     where x.object_id = t.object_id
       and x.type = '22'
       and t.LAST_DDL_TIME > to_date('20041101', 'yyyymmdd');
    不加任何hint,计划为下面,应该是由于t内的数据量不够,优化器自动走了全表
    -----------------------------------------------------------------------------------------
    | Id  | Operation                    | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
    -----------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT             |          |    49 |  1862 |    34   (3)| 00:00:01 |
    |   1 |  NESTED LOOPS                |          |    49 |  1862 |    34   (3)| 00:00:01 |
    |*  2 |   TABLE ACCESS FULL          | T        |  1635 | 21255 |    33   (4)| 00:00:01 |
    |*  3 |   TABLE ACCESS BY INDEX ROWID| X        |     1 |    25 |     1   (0)| 00:00:01 |
    |*  4 |    INDEX UNIQUE SCAN         | IND_X_PK |     1 |       |     0   (0)| 00:00:01 |
    -----------------------------------------------------------------------------------------


    首先想到的就是object_id为2个表各自的主健,当然要用
      
    select /*+ INDEX(T,IND_T_PK) */
    x.object_id,x.name,x.type
    from x,t
    where x.object_id =t.object_id
    and t.LAST_DDL_TIME >to_date('20040801','yyyymmdd');

    -----------------------------------------------------------------------------------------
    | Id  | Operation                    | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
    -----------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT             |          |    99 |  3762 |   101   (0)| 00:00:02 |
    |   1 |  NESTED LOOPS                |          |    99 |  3762 |   101   (0)| 00:00:02 |
    |   2 |   TABLE ACCESS FULL          | X        |    99 |  2475 |     2   (0)| 00:00:01 |
    |*  3 |   TABLE ACCESS BY INDEX ROWID| T        |     1 |    13 |     1   (0)| 00:00:01 |
    |*  4 |    INDEX UNIQUE SCAN         | IND_T_PK |     1 |       |     0   (0)| 00:00:01 |
    -----------------------------------------------------------------------------------------
    Statistics
    ----------------------------------------------------------
              0  recursive calls
              0  db block gets
            208  consistent gets
              0  physical reads
              0  redo size
           1877  bytes sent via SQL*Net to client
            407  bytes received via SQL*Net from client
              4  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
             38  rows processed
    随后反复调整时间范围,consistent gets没有任何变化,由于现时中表记录很多,响应时间在20多秒,无法接受

    调整SQL使用基于时间的索引
     select /*+ INDEX(T,IND_T) */
      x.object_id, x.name, x.type
       from x, t
      where x.object_id = t.object_id
        and x.type = '22'
        and t.LAST_DDL_TIME > to_date('20040101', 'yyyymmdd');

    --------------------------------------------------------------------------------------
    | Id  | Operation                    | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT             |       |    50 |  1900 |  9777   (1)| 00:01:58 |
    |*  1 |  HASH JOIN                   |       |    50 |  1900 |  9777   (1)| 00:01:58 |
    |*  2 |   TABLE ACCESS FULL          | X     |    50 |  1250 |     2   (0)| 00:00:01 |
    |   3 |   TABLE ACCESS BY INDEX ROWID| T     | 10838 |   137K|  9774   (1)| 00:01:58 |
    |*  4 |    INDEX RANGE SCAN          | IND_T | 10838 |       |    30   (0)| 00:00:01 |
    --------------------------------------------------------------------------------------
    Statistics
    ----------------------------------------------------------
              1  recursive calls
              0  db block gets
           9767  consistent gets
             14  physical reads
              0  redo size
           1439  bytes sent via SQL*Net to client
            396  bytes received via SQL*Net from client
              3  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
             29  rows processed
    当时间条件为20040101时(包含全部记录),得到了一个恐怖的consistent gets
    但是现实中的时间窗口跨度不大

     select /*+ INDEX(T,IND_T) */
      x.object_id, x.name, x.type
       from x, t
      where x.object_id = t.object_id
        and x.type = '22'
        and t.LAST_DDL_TIME > to_date('20041220', 'yyyymmdd');
    Statistics
    ----------------------------------------------------------
              1  recursive calls
              0  db block gets
            149  consistent gets
              0  physical reads
              0  redo size
            386  bytes sent via SQL*Net to client
            374  bytes received via SQL*Net from client
              1  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
              0  rows processed
    在取出的记录数少的时候,用时间的索引可以控制consistent gets达到更少的值

    由此想到建立一个基于时间和object_id的联合唯一索引,可能会获得更好的性能
    create unique index ind_t_unique on t(LAST_DDL_TIME,object_id);

     select /*+ INDEX(T,IND_T_UNIQUE) */
      x.object_id, x.name, x.type
       from x, t
      where x.object_id = t.object_id
        and x.type = '22'
    -----------------------------------------------------------------------------------
    | Id  | Operation          | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
    -----------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT   |              |    50 |  1500 |    38   (3)| 00:00:01 |
    |*  1 |  HASH JOIN         |              |    50 |  1500 |    38   (3)| 00:00:01 |
    |*  2 |   TABLE ACCESS FULL| X            |    50 |  1250 |     2   (0)| 00:00:01 |
    |   3 |   INDEX FULL SCAN  | IND_T_UNIQUE | 10838 | 54190 |    35   (0)| 00:00:01 |
    -----------------------------------------------------------------------------------
    Statistics
    ----------------------------------------------------------
              1  recursive calls
              0  db block gets
             40  consistent gets
             13  physical reads
              0  redo size
           1439  bytes sent via SQL*Net to client
            396  bytes received via SQL*Net from client
              3  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
             29  rows processed
    不加任何时间条件,索引使用了INDEX FULL SCAN,而这时的consistent gets也很小。加上时间条件就会变成ind_t_unique的INDEX UNIQUE SCAN,结果有望更加惊人。
     select /*+ INDEX(T,IND_T_UNIQUE) */
      x.object_id, x.name, x.type
       from x, t
      where x.object_id = t.object_id
        and x.type = '22'
        and t.LAST_DDL_TIME > to_date('20041220', 'yyyymmdd');
    ---------------------------------------------------------------------------------------------
    | Id  | Operation                    | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
    ---------------------------------------------------------------------------------------------
    |   0 | SELECT STATEMENT             |              |    49 |  1862 |     3   (0)| 00:00:01 |
    |   1 |  NESTED LOOPS                |              |    49 |  1862 |     3   (0)| 00:00:01 |
    |*  2 |   INDEX RANGE SCAN           | IND_T_UNIQUE |   151 |  1963 |     2   (0)| 00:00:01 |
    |*  3 |   TABLE ACCESS BY INDEX ROWID| X            |     1 |    25 |     1   (0)| 00:00:01 |
    |*  4 |    INDEX UNIQUE SCAN         | IND_X_PK     |     1 |       |     0   (0)| 00:00:01 |
    ---------------------------------------------------------------------------------------------   
    Statistics
    ----------------------------------------------------------
              0  recursive calls
              0  db block gets
              6  consistent gets
              0  physical reads
              0  redo size
            386  bytes sent via SQL*Net to client
            374  bytes received via SQL*Net from client
              1  SQL*Net roundtrips to/from client
              0  sorts (memory)
              0  sorts (disk)
              0  rows processed

    经过这样的测试,就可以在运行环境建立对应索引测试性能了,现实环境比这个测试复杂得多,表的个数8个,而且大量的嵌套。而且也武斗正在执行,很难随便建立索引进行各种测试。
    所以根据现实的sql搭建一个模拟的环境进行测试是sql调整的一个有效方法。

  • 相关阅读:
    【Linux】gcc/g++ && gdb 使用
    Mysql 数据库基础(重点)
    TypeScript系列之类型 void
    【云原生】原来2020.0.X版本开始的OpenFeign底层不再使用Ribbon了
    Ubuntu系统中tree的用法
    自定义注解使用
    Java BigDecimal计算及转换
    Android 10 如何自定义屏保功能
    mac帧 arp
    java基础10题
  • 原文地址:https://blog.csdn.net/eeeeety6208/article/details/126831774