• MySQL 中的索引


    在这里插入图片描述


    一、索引的创建

    创建索引的方式包括两种:

    • 隐式创建:数据库一般会在创建 PRIMARY KEYUNIQUE 约束列时自动创建索引。
    • 显示创建:使用 CREAT INDEX 语句创建,建立之后由数据库负责使用和维护索引。

    下面展示了 MySQL 中隐式创建的索引和显示创建的索引,不过需要注意的是索引查询时显示的索引类型为 BTREE,但实际上 MySQL 默认使用的是 B+ 树。

    DROP TABLE IF EXISTS student;
    
    CREATE TABLE student
    (
        student_id   INT PRIMARY KEY,
        student_name VARCHAR(50) UNIQUE,
        gender       ENUM ('Male', 'Female') DEFAULT 'Male',
        major        VARCHAR(50) NOT NULL,
        birthdate    DATE        NOT NULL
    );
    
    INSERT INTO student (student_id, student_name, gender, major, birthdate)
    VALUES (1, 'Alice', 'Female', 'Computer Science', '2000-05-15'),
           (2, 'Andrew', 'Male', 'Engineering', '1999-09-20'),
           (3, 'Maria', 'Female', 'Mathematics', '2001-03-10'),
           (4, 'Samantha', 'Female', 'Physics', '2002-01-25'),
           (5, 'Michael', 'Male', 'Biology', '1998-11-05'),
           (6, 'Jessica', 'Female', 'History', '1997-06-30'),
           (7, 'William', 'Male', 'Geology', '1996-04-15'),
           (8, 'John', 'Male', 'Computer Science', '1995-08-08');
    
    SHOW INDEXES FROM student;
    # +-------+----------+------------+------------+------------+---------+-----------+--------+------+----+----------+-------+-------------+-------+----------+
    # |Table  |Non_unique|Key_name    |Seq_in_index|Column_name |Collation|Cardinality|Sub_part|Packed|Null|Index_type|Comment|Index_comment|Visible|Expression|
    # +-------+----------+------------+------------+------------+---------+-----------+--------+------+----+----------+-------+-------------+-------+----------+
    # |student|0         |PRIMARY     |1           |student_id  |A        |4          |null    |null  |    |BTREE     |       |             |YES    |null      |
    # |student|0         |student_name|1           |student_name|A        |4          |null    |null  |YES |BTREE     |       |             |YES    |null      |
    # +-------+----------+------------+------------+------------+---------+-----------+--------+------+----+----------+-------+-------------+-------+----------+
    
    # 在 birthdate 字段上创建唯一索引
    CREATE UNIQUE INDEX idx_major ON student (birthdate);
    
    # 在 birthdate 字段上创建普通索引
    CREATE INDEX idx_birthdate ON student (birthdate);
    # +-------+----------+-------------+------------+------------+---------+-----------+--------+------+----+----------+-------+-------------+-------+----------+
    # |Table  |Non_unique|Key_name     |Seq_in_index|Column_name |Collation|Cardinality|Sub_part|Packed|Null|Index_type|Comment|Index_comment|Visible|Expression|
    # +-------+----------+-------------+------------+------------+---------+-----------+--------+------+----+----------+-------+-------------+-------+----------+
    # |student|0         |PRIMARY      |1           |student_id  |A        |4          |null    |null  |    |BTREE     |       |             |YES    |null      |
    # |student|0         |idx_major    |1           |birthdate   |A        |8          |null    |null  |    |BTREE     |       |             |YES    |null      |
    # |student|0         |student_name |1           |student_name|A        |4          |null    |null  |YES |BTREE     |       |             |YES    |null      |
    # |student|1         |idx_birthdate|1           |birthdate   |A        |8          |null    |null  |    |BTREE     |       |             |YES    |null      |
    # +-------+----------+-------------+------------+------------+---------+-----------+--------+------+----+----------+-------+-------------+-------+----------+
    
    SHOW TABLE STATUS LIKE 'student';
    # +-------+------+-------+----------+----+--------------+-----------+---------------+------------+---------+--------------+-------------------+-------------------+----------+------------------+--------+--------------+-------+
    # |Name   |Engine|Version|Row_format|Rows|Avg_row_length|Data_length|Max_data_length|Index_length|Data_free|Auto_increment|Create_time        |Update_time        |Check_time|Collation         |Checksum|Create_options|Comment|
    # +-------+------+-------+----------+----+--------------+-----------+---------------+------------+---------+--------------+-------------------+-------------------+----------+------------------+--------+--------------+-------+
    # |student|InnoDB|10     |Dynamic   |8   |2048          |16384      |0              |16384       |0        |null          |2023-09-24 21:09:40|2023-09-24 21:09:04|null      |utf8mb4_0900_ai_ci|null    |              |       |
    # +-------+------+-------+----------+----+--------------+-----------+---------------+------------+---------+--------------+-------------------+-------------------+----------+------------------+--------+--------------+-------+
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27
    • 28
    • 29
    • 30
    • 31
    • 32
    • 33
    • 34
    • 35
    • 36
    • 37
    • 38
    • 39
    • 40
    • 41
    • 42
    • 43
    • 44
    • 45
    • 46
    • 47
    • 48
    • 49

    二、聚簇索引与非聚簇索引

    聚簇索引:聚簇索引(clustered index)也叫聚集索引、聚类索引,一张表有且仅有一个聚簇索引。

    默认情况下,MySQL 在建表时会根据不同的场景选择不同的列建立聚簇索引:

    • 聚簇索引默认是主键索引,会在定义主键时自动创建聚簇索引。
    • 如果表中没定义主键,那么就选取首个非空且唯一的索引列来建立聚簇索引。
    • 如果前两条都不满足,那么数据库就创建一个隐藏的 row-id 作为聚簇索引。

    对于一个主键聚簇索引来说,叶子结点中按照主键顺序依次存放着整张表的全部记录。这个特性决定了索引组织表中的数据本身就是索引的一部分,每张表也只能拥有一个聚簇索引。


    非聚簇索引:非聚簇索引(secondary index)也叫辅助索引、二级索引,一张表可以有多个非聚簇索引。

    辅助索引不是建立在记录上的,而是建立在上述的聚簇索引上的。对于一个非聚簇索引,叶子结点中只存放当前关键字以及对应的聚簇索引关键字(见下例),当我们要查找整条记录时,需要在访问非聚簇索引的叶结点后继续访问聚簇索引查找完整信息,这个过程也叫做回表,总计需要查找两次索引。

    不过,如果非聚簇索引存在索引覆盖也能避免回表。索引覆盖指的是一个索引覆盖或者说包含了所有需要查询的字段,即需要查询的字段正好是索引的字段,那么直接根据该索引就可以查到数据,无需回表查询。


    基于上面示例中的 student 表,如果 student_id 聚簇索引采用 B+ 树来实现,那么这个聚簇索引的四个叶子结点中会按 student_id 递增依次存放对应记录。当我们通过 student_id 聚簇索引查询住址时,会直接在叶子结点中查找成功。如果我们再在 studnet_name 字段上建立一个非聚簇索引,叶子结点只会存放 studnet_namestudent_id ,不会存储完整记录。当我们通过 studnet_name 非聚簇索引查询住址时,在非聚簇索引中根据 studnet_name 查找到 student_id 后,要继续根据 student_id 回表查找地址。

    s t u d e n t _ i d (主键聚簇索引): 8 { 8 { [ ( 7 , W i l l i a m , M a l e , . . . ) , ( 8 , J o h n , M a l e , . . . ) ] [ ( 5 , M i c h a e l , M a l e , . . . ) , ( 6 , J e s s i c a , F e m a l e , . . . ) ] 4 { [ ( 3 , M a r i a , F e m a l e , . . . ) , ( 4 , S a m a n t h a , F e m a l e , . . . ) ] [ ( 1 , A l i c e , F e m a l e , . . . ) , ( 2 , A n d r e w , M a l e , . . . ) ] s t u d e n t _ n a m e (非聚簇索引): W i l l i a m { W i l l i a m { [ ( S a m a n t h a , 4 ) , ( W i l l i a m , 7 ) ] [ ( M a r i a , 3 ) , ( M i c h a e l , 5 ) ] , J o h n { [ ( J e s s i c a , 6 ) , ( J o h n , 8 ) ] [ ( A l i c e , 1 ) , ( A n d r e w , 2 ) ] student_id8{8{[(7,William,Male,...),(8,John,Male,...)][(5,Michael,Male,...),(6,Jessica,Female,...)]4{[(3,Maria,Female,...),(4,Samantha,Female,...)][(1,Alice,Female,...),(2,Andrew,Male,...)]

    \\\\ student_nameWilliam{William{[(Samantha,4),(William,7)][(Maria,3),(Michael,5)],John{[(Jessica,6),(John,8)][(Alice,1),(Andrew,2)]
    student_id(主键聚簇索引):8 8 [(7,William,Male,...),(8,John,Male,...)][(5,Michael,Male,...),(6,Jessica,Female,...)]4 [(3,Maria,Female,...),(4,Samantha,Female,...)][(1,Alice,Female,...),(2,Andrew,Male,...)]student_name(非聚簇索引):William William [(Samantha,4),(William,7)][(Maria,3),(Michael,5)],John [(Jessica,6),(John,8)][(Alice,1),(Andrew,2)]


    三、联合索引和索引下推

    联合索引即由多个列组成的索引,而索引下推是一种基于联合索引的查询优化策略。

    在下面的例子中,需要查询计算机专业所有女生,如果不采用索引下推,因为最左前缀匹配原则,首先会根据辅助索引 idx_major_gender 匹配到四条 major = 'Computer Science' 的记录,然后根据这四条记录中的主键值依次回表获取对应的四条完整记录,最后通过 gender = 'Female' 对这四条记录进行后过滤,累计需要四次回表。

    如果采用索引下推,gender 字段的过滤操作也会在联合索引中完成,即在辅助索引中完全匹配所有 major = 'Computer Science AND gender = 'Female' 的记录,这样只需要回表两次。可以说索引下推是对联合索引更充分的利用。

    此外,我们可以使用 EXPLAIN 命令来查看查询执行计划,Using index condition 表明索引下推发挥了作用,Using where 则表明采用的是上述的后过滤。

    DROP TABLE IF EXISTS student;
    
    CREATE TABLE student
    (
        student_id   INT PRIMARY KEY,
        student_name VARCHAR(50) UNIQUE,
        gender       ENUM ('Male', 'Female') DEFAULT 'Male',
        major        VARCHAR(50) NOT NULL,
        birthdate    DATE        NOT NULL
    );
    
    INSERT INTO student (student_id, student_name, gender, major, birthdate)
    VALUES (1, 'Alice', 'Female', 'Computer Science', '2000-05-15'),
           (2, 'Andrew', 'Male', 'Engineering', '1999-09-20'),
           (3, 'Maria', 'Female', 'Mathematics', '2001-03-10'),
           (4, 'Samantha', 'Female', 'Physics', '2002-01-25'),
           (5, 'Michael', 'Male', 'Biology', '1998-11-05'),
           (6, 'Jessica', 'Female', 'Computer Science', '1997-06-30'),
           (7, 'William', 'Male', 'Computer Science', '1996-04-15'),
           (8, 'John', 'Male', 'Computer Science', '1995-08-08');
    
    EXPLAIN
    SELECT student_id, student_name
    FROM student
    WHERE major = 'Computer Science'
      AND gender = 'Female';
    # +--+-----------+-------+----------+----+-------------+----+-------+----+----+--------+-----------+
    # |id|select_type|table  |partitions|type|possible_keys|key |key_len|ref |rows|filtered|Extra      |
    # +--+-----------+-------+----------+----+-------------+----+-------+----+----+--------+-----------+
    # |1 |SIMPLE     |student|null      |ALL |null         |null|null   |null|8   |12.5    |Using where|
    # +--+-----------+-------+----------+----+-------------+----+-------+----+----+--------+-----------+
    
    # 建立在 major 上的普通索引
    CREATE INDEX idx_major ON student (major);
    
    EXPLAIN
    SELECT student_id, student_name
    FROM student
    WHERE major = 'Computer Science'
      AND gender = 'Female';
    # +--+-----------+-------+----------+----+-------------+---------+-------+-----+----+--------+-----------+
    # |id|select_type|table  |partitions|type|possible_keys|key      |key_len|ref  |rows|filtered|Extra      |
    # +--+-----------+-------+----------+----+-------------+---------+-------+-----+----+--------+-----------+
    # |1 |SIMPLE     |student|null      |ref |idx_major    |idx_major|202    |const|4   |50      |Using where|
    # +--+-----------+-------+----------+----+-------------+---------+-------+-----+----+--------+-----------+
    
    # 建立在 major 和 gender 上的联合索引
    CREATE INDEX idx_major_gender ON student (major, gender);
    
    EXPLAIN
    SELECT student_id, student_name
    FROM student
    WHERE major = 'Computer Science'
      AND gender = 'Female';
    # +--+-----------+-------+----------+----+--------------------------+----------------+-------+-----------+----+--------+---------------------+
    # |id|select_type|table  |partitions|type|possible_keys             |key             |key_len|ref        |rows|filtered|Extra                |
    # +--+-----------+-------+----------+----+--------------------------+----------------+-------+-----------+----+--------+---------------------+
    # |1 |SIMPLE     |student|null      |ref |idx_major,idx_major_gender|idx_major_gender|204    |const,const|2   |100     |Using index condition|
    # +--+-----------+-------+----------+----+--------------------------+----------------+-------+-----------+----+--------+---------------------+
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27
    • 28
    • 29
    • 30
    • 31
    • 32
    • 33
    • 34
    • 35
    • 36
    • 37
    • 38
    • 39
    • 40
    • 41
    • 42
    • 43
    • 44
    • 45
    • 46
    • 47
    • 48
    • 49
    • 50
    • 51
    • 52
    • 53
    • 54
    • 55
    • 56
    • 57
    • 58
    • 59

    四、B+ 树索引

    下面是对 m m m 阶 B+ 树的一些说明,不同资料上的定义不尽相同,这里参照维基百科对于 B+ 树的定义:

    • 每个结点至多有 m m m 棵子树, m − 1 m - 1 m1 个关键字。
    • 除根结点外所有内部结点(非叶节点)至少有 ⌊ m / 2 ⌋ \lfloor m/2 \rfloor m/2 棵子树, ⌊ m / 2 ⌋ − 1 \lfloor m/2 \rfloor - 1 m/21个关键字。
    • 所有叶节点都在相同的高度上(绝对平衡),且结构为 [ P 0 k 0 P 1 k 1 … P n − 1 K n − 1 P n ] [P_0 k_0 P_1 k_1 \dots P_{n-1} K_{n-1} P_n] [P0k0P1k1Pn1Kn1Pn](键值交错)。
    • 叶结点本身按关键字大小从小到大链接。

    B+ 树相比于 B 树的优势:

    • B+ 树只有叶子节点同时存放键和数据,内部节点只存放键,所以同样大小的磁盘页上可以容纳更多键值对。在相同的数据量下,B+ 树通常具有更扁平的树结构,这意味着在查找操作中需要更少的磁盘 I/O 操作,因此查找效率更高
    • B+ 树的叶子节点之间以双向链表的形式顺序连接,便于范围查找和遍历

    五、索引失效

    索引失效主要以以下几种情况:

    • 使用联合索引时未遵循最左前缀法则。
    • 在索引字段上进行了计算或者调用了函数。如:SELECT * FROM tb_user WHERE SUBSTRING(address, 1, 2) = '上海';
    • 字符串字段没有加引号,发生隐式转换。如:SELECT * FROM tb_user WHERE phone = 12345678910;
    • 采用模糊查询,且为头部模糊匹配。如:SELECT * FROM tb_user WHERE address like '%海';
    • 查询条件中使用 OR,且 OR 前后任意一个条件字段没有索引。
    • 由于数据分布等原因,使用索引后仍需要回表查询大量数据,此时全表扫描效率高于使用索引。

    最左前缀匹配法则:在使用联合索引时,数据库会根据联合索引中的字段顺序,从左到右依次到查询条件中去匹配,索引字段顺序与条件字段顺序必须完全一致。如果在匹配过程中发生了跳跃就会使后面的字段索引失效。但如果查询条件由 AND 连接,优化器会自动调整顺序以避免索引失效,此时的索引失效主要考虑字段缺失

    最左前缀示例:

    create index index_a_b_c on wide_table (a, b, c);
    
    -- 索引生效
    explain select * from wide_table where a = '1' and b = '1';
    # +--+-----------+----------+----------+----+-------------+-----------+-------+-----------+----+--------+-----+
    # |id|select_type|table     |partitions|type|possible_keys|key        |key_len|ref        |rows|filtered|Extra|
    # +--+-----------+----------+----------+----+-------------+-----------+-------+-----------+----+--------+-----+
    # |1 |SIMPLE     |wide_table|null      |ref |index_a_b_c  |index_a_b_c|2046   |const,const|1   |100     |null |
    # +--+-----------+----------+----------+----+-------------+-----------+-------+-----------+----+--------+-----+
    
    -- 索引生效
    explain select * from wide_table where b = '1' and a = '1';
    # +--+-----------+----------+----------+----+-------------+-----------+-------+-----------+----+--------+-----+
    # |id|select_type|table     |partitions|type|possible_keys|key        |key_len|ref        |rows|filtered|Extra|
    # +--+-----------+----------+----------+----+-------------+-----------+-------+-----------+----+--------+-----+
    # |1 |SIMPLE     |wide_table|null      |ref |index_a_b_c  |index_a_b_c|2046   |const,const|1   |100     |null |
    # +--+-----------+----------+----------+----+-------------+-----------+-------+-----------+----+--------+-----+
    
    -- 索引生效
    explain select * from wide_table where b > '1' and a = '1';
    # +--+-----------+----------+----------+-----+-------------+-----------+-------+----+----+--------+---------------------+
    # |id|select_type|table     |partitions|type |possible_keys|key        |key_len|ref |rows|filtered|Extra                |
    # +--+-----------+----------+----------+-----+-------------+-----------+-------+----+----+--------+---------------------+
    # |1 |SIMPLE     |wide_table|null      |range|index_a_b_c  |index_a_b_c|2046   |null|1   |100     |Using index condition|
    # +--+-----------+----------+----------+-----+-------------+-----------+-------+----+----+--------+---------------------+
    
    -- 未走索引
    explain select * from wide_table where b = '1';
    # +--+-----------+----------+----------+----+-------------+----+-------+----+-----+--------+-----------+
    # |id|select_type|table     |partitions|type|possible_keys|key |key_len|ref |rows |filtered|Extra      |
    # +--+-----------+----------+----------+----+-------------+----+-------+----+-----+--------+-----------+
    # |1 |SIMPLE     |wide_table|null      |ALL |null         |null|null   |null|99551|10      |Using where|
    # +--+-----------+----------+----------+----+-------------+----+-------+----+-----+--------+-----------+
    
    -- 索引部分失效
    explain select * from wide_table where a = '1' and c = '1';
    # +--+-----------+----------+----------+----+-------------+-----------+-------+-----+----+--------+---------------------+
    # |id|select_type|table     |partitions|type|possible_keys|key        |key_len|ref  |rows|filtered|Extra                |
    # +--+-----------+----------+----------+----+-------------+-----------+-------+-----+----+--------+---------------------+
    # |1 |SIMPLE     |wide_table|null      |ref |index_a_b_c  |index_a_b_c|1023   |const|1   |10      |Using index condition|
    # +--+-----------+----------+----------+----+-------------+-----------+-------+-----+----+--------+---------------------
    
    -- 索引部分失效
    explain select * from wide_table where c = '1' and a = '1';
    # +--+-----------+----------+----------+----+-------------+-----------+-------+-----+----+--------+---------------------+
    # |id|select_type|table     |partitions|type|possible_keys|key        |key_len|ref  |rows|filtered|Extra                |
    # +--+-----------+----------+----------+----+-------------+-----------+-------+-----+----+--------+---------------------+
    # |1 |SIMPLE     |wide_table|null      |ref |index_a_b_c  |index_a_b_c|1023   |const|1   |10      |Using index condition|
    # +--+-----------+----------+----------+----+-------------+-----------+-------+-----+----+--------+---------------------+
    
    • 1
    • 2
    • 3
    • 4
    • 5
    • 6
    • 7
    • 8
    • 9
    • 10
    • 11
    • 12
    • 13
    • 14
    • 15
    • 16
    • 17
    • 18
    • 19
    • 20
    • 21
    • 22
    • 23
    • 24
    • 25
    • 26
    • 27
    • 28
    • 29
    • 30
    • 31
    • 32
    • 33
    • 34
    • 35
    • 36
    • 37
    • 38
    • 39
    • 40
    • 41
    • 42
    • 43
    • 44
    • 45
    • 46
    • 47
    • 48
    • 49

    在这里插入图片描述

  • 相关阅读:
    springboot+vue校园篮球比赛预约报名平台java maven
    【华为OD机试真题 JS】找城市
    2023最新SSM计算机毕业设计选题大全(附源码+LW)之java学生出国境学习交流管理87153
    强连通分量+缩点
    Python技能树评测
    【如何让图片自适应盒子大小】
    前后端交互常见的几种数据传输格式 form表单+get请求 form表单+post请求 json键值对格式
    双软企业认定需要什么条件
    论文理解与代码解析 :VoxelNet pytorch版本
    条码工具 Dynamic Web TWAIN HTML5 版本的工作原理
  • 原文地址:https://blog.csdn.net/qq_43686863/article/details/133242907