
创建索引的方式包括两种:
PRIMARY KEY 和 UNIQUE 约束列时自动创建索引。CREAT INDEX 语句创建,建立之后由数据库负责使用和维护索引。下面展示了 MySQL 中隐式创建的索引和显示创建的索引,不过需要注意的是索引查询时显示的索引类型为 BTREE,但实际上 MySQL 默认使用的是 B+ 树。
DROP TABLE IF EXISTS student;
CREATE TABLE student
(
student_id INT PRIMARY KEY,
student_name VARCHAR(50) UNIQUE,
gender ENUM ('Male', 'Female') DEFAULT 'Male',
major VARCHAR(50) NOT NULL,
birthdate DATE NOT NULL
);
INSERT INTO student (student_id, student_name, gender, major, birthdate)
VALUES (1, 'Alice', 'Female', 'Computer Science', '2000-05-15'),
(2, 'Andrew', 'Male', 'Engineering', '1999-09-20'),
(3, 'Maria', 'Female', 'Mathematics', '2001-03-10'),
(4, 'Samantha', 'Female', 'Physics', '2002-01-25'),
(5, 'Michael', 'Male', 'Biology', '1998-11-05'),
(6, 'Jessica', 'Female', 'History', '1997-06-30'),
(7, 'William', 'Male', 'Geology', '1996-04-15'),
(8, 'John', 'Male', 'Computer Science', '1995-08-08');
SHOW INDEXES FROM student;
# +-------+----------+------------+------------+------------+---------+-----------+--------+------+----+----------+-------+-------------+-------+----------+
# |Table |Non_unique|Key_name |Seq_in_index|Column_name |Collation|Cardinality|Sub_part|Packed|Null|Index_type|Comment|Index_comment|Visible|Expression|
# +-------+----------+------------+------------+------------+---------+-----------+--------+------+----+----------+-------+-------------+-------+----------+
# |student|0 |PRIMARY |1 |student_id |A |4 |null |null | |BTREE | | |YES |null |
# |student|0 |student_name|1 |student_name|A |4 |null |null |YES |BTREE | | |YES |null |
# +-------+----------+------------+------------+------------+---------+-----------+--------+------+----+----------+-------+-------------+-------+----------+
# 在 birthdate 字段上创建唯一索引
CREATE UNIQUE INDEX idx_major ON student (birthdate);
# 在 birthdate 字段上创建普通索引
CREATE INDEX idx_birthdate ON student (birthdate);
# +-------+----------+-------------+------------+------------+---------+-----------+--------+------+----+----------+-------+-------------+-------+----------+
# |Table |Non_unique|Key_name |Seq_in_index|Column_name |Collation|Cardinality|Sub_part|Packed|Null|Index_type|Comment|Index_comment|Visible|Expression|
# +-------+----------+-------------+------------+------------+---------+-----------+--------+------+----+----------+-------+-------------+-------+----------+
# |student|0 |PRIMARY |1 |student_id |A |4 |null |null | |BTREE | | |YES |null |
# |student|0 |idx_major |1 |birthdate |A |8 |null |null | |BTREE | | |YES |null |
# |student|0 |student_name |1 |student_name|A |4 |null |null |YES |BTREE | | |YES |null |
# |student|1 |idx_birthdate|1 |birthdate |A |8 |null |null | |BTREE | | |YES |null |
# +-------+----------+-------------+------------+------------+---------+-----------+--------+------+----+----------+-------+-------------+-------+----------+
SHOW TABLE STATUS LIKE 'student';
# +-------+------+-------+----------+----+--------------+-----------+---------------+------------+---------+--------------+-------------------+-------------------+----------+------------------+--------+--------------+-------+
# |Name |Engine|Version|Row_format|Rows|Avg_row_length|Data_length|Max_data_length|Index_length|Data_free|Auto_increment|Create_time |Update_time |Check_time|Collation |Checksum|Create_options|Comment|
# +-------+------+-------+----------+----+--------------+-----------+---------------+------------+---------+--------------+-------------------+-------------------+----------+------------------+--------+--------------+-------+
# |student|InnoDB|10 |Dynamic |8 |2048 |16384 |0 |16384 |0 |null |2023-09-24 21:09:40|2023-09-24 21:09:04|null |utf8mb4_0900_ai_ci|null | | |
# +-------+------+-------+----------+----+--------------+-----------+---------------+------------+---------+--------------+-------------------+-------------------+----------+------------------+--------+--------------+-------+
聚簇索引:聚簇索引(clustered index)也叫聚集索引、聚类索引,一张表有且仅有一个聚簇索引。
默认情况下,MySQL 在建表时会根据不同的场景选择不同的列建立聚簇索引:
对于一个主键聚簇索引来说,叶子结点中按照主键顺序依次存放着整张表的全部记录。这个特性决定了索引组织表中的数据本身就是索引的一部分,每张表也只能拥有一个聚簇索引。
非聚簇索引:非聚簇索引(secondary index)也叫辅助索引、二级索引,一张表可以有多个非聚簇索引。
辅助索引不是建立在记录上的,而是建立在上述的聚簇索引上的。对于一个非聚簇索引,叶子结点中只存放当前关键字以及对应的聚簇索引关键字(见下例),当我们要查找整条记录时,需要在访问非聚簇索引的叶结点后继续访问聚簇索引查找完整信息,这个过程也叫做回表,总计需要查找两次索引。
不过,如果非聚簇索引存在索引覆盖也能避免回表。索引覆盖指的是一个索引覆盖或者说包含了所有需要查询的字段,即需要查询的字段正好是索引的字段,那么直接根据该索引就可以查到数据,无需回表查询。
基于上面示例中的 student 表,如果 student_id 聚簇索引采用 B+ 树来实现,那么这个聚簇索引的四个叶子结点中会按 student_id 递增依次存放对应记录。当我们通过 student_id 聚簇索引查询住址时,会直接在叶子结点中查找成功。如果我们再在 studnet_name 字段上建立一个非聚簇索引,叶子结点只会存放 studnet_name 与 student_id ,不会存储完整记录。当我们通过 studnet_name 非聚簇索引查询住址时,在非聚簇索引中根据 studnet_name 查找到 student_id 后,要继续根据 student_id 回表查找地址。
s
t
u
d
e
n
t
_
i
d
(主键聚簇索引):
8
{
8
{
[
(
7
,
W
i
l
l
i
a
m
,
M
a
l
e
,
.
.
.
)
,
(
8
,
J
o
h
n
,
M
a
l
e
,
.
.
.
)
]
[
(
5
,
M
i
c
h
a
e
l
,
M
a
l
e
,
.
.
.
)
,
(
6
,
J
e
s
s
i
c
a
,
F
e
m
a
l
e
,
.
.
.
)
]
4
{
[
(
3
,
M
a
r
i
a
,
F
e
m
a
l
e
,
.
.
.
)
,
(
4
,
S
a
m
a
n
t
h
a
,
F
e
m
a
l
e
,
.
.
.
)
]
[
(
1
,
A
l
i
c
e
,
F
e
m
a
l
e
,
.
.
.
)
,
(
2
,
A
n
d
r
e
w
,
M
a
l
e
,
.
.
.
)
]
s
t
u
d
e
n
t
_
n
a
m
e
(非聚簇索引):
W
i
l
l
i
a
m
{
W
i
l
l
i
a
m
{
[
(
S
a
m
a
n
t
h
a
,
4
)
,
(
W
i
l
l
i
a
m
,
7
)
]
[
(
M
a
r
i
a
,
3
)
,
(
M
i
c
h
a
e
l
,
5
)
]
,
J
o
h
n
{
[
(
J
e
s
s
i
c
a
,
6
)
,
(
J
o
h
n
,
8
)
]
[
(
A
l
i
c
e
,
1
)
,
(
A
n
d
r
e
w
,
2
)
]
student_id(主键聚簇索引):8{8{[(7,William,Male,...),(8,John,Male,...)][(5,Michael,Male,...),(6,Jessica,Female,...)]4{[(3,Maria,Female,...),(4,Samantha,Female,...)][(1,Alice,Female,...),(2,Andrew,Male,...)]
联合索引即由多个列组成的索引,而索引下推是一种基于联合索引的查询优化策略。
在下面的例子中,需要查询计算机专业所有女生,如果不采用索引下推,因为最左前缀匹配原则,首先会根据辅助索引 idx_major_gender 匹配到四条 major = 'Computer Science' 的记录,然后根据这四条记录中的主键值依次回表获取对应的四条完整记录,最后通过 gender = 'Female' 对这四条记录进行后过滤,累计需要四次回表。
如果采用索引下推,gender 字段的过滤操作也会在联合索引中完成,即在辅助索引中完全匹配所有 major = 'Computer Science AND gender = 'Female' 的记录,这样只需要回表两次。可以说索引下推是对联合索引更充分的利用。
此外,我们可以使用 EXPLAIN 命令来查看查询执行计划,Using index condition 表明索引下推发挥了作用,Using where 则表明采用的是上述的后过滤。
DROP TABLE IF EXISTS student;
CREATE TABLE student
(
student_id INT PRIMARY KEY,
student_name VARCHAR(50) UNIQUE,
gender ENUM ('Male', 'Female') DEFAULT 'Male',
major VARCHAR(50) NOT NULL,
birthdate DATE NOT NULL
);
INSERT INTO student (student_id, student_name, gender, major, birthdate)
VALUES (1, 'Alice', 'Female', 'Computer Science', '2000-05-15'),
(2, 'Andrew', 'Male', 'Engineering', '1999-09-20'),
(3, 'Maria', 'Female', 'Mathematics', '2001-03-10'),
(4, 'Samantha', 'Female', 'Physics', '2002-01-25'),
(5, 'Michael', 'Male', 'Biology', '1998-11-05'),
(6, 'Jessica', 'Female', 'Computer Science', '1997-06-30'),
(7, 'William', 'Male', 'Computer Science', '1996-04-15'),
(8, 'John', 'Male', 'Computer Science', '1995-08-08');
EXPLAIN
SELECT student_id, student_name
FROM student
WHERE major = 'Computer Science'
AND gender = 'Female';
# +--+-----------+-------+----------+----+-------------+----+-------+----+----+--------+-----------+
# |id|select_type|table |partitions|type|possible_keys|key |key_len|ref |rows|filtered|Extra |
# +--+-----------+-------+----------+----+-------------+----+-------+----+----+--------+-----------+
# |1 |SIMPLE |student|null |ALL |null |null|null |null|8 |12.5 |Using where|
# +--+-----------+-------+----------+----+-------------+----+-------+----+----+--------+-----------+
# 建立在 major 上的普通索引
CREATE INDEX idx_major ON student (major);
EXPLAIN
SELECT student_id, student_name
FROM student
WHERE major = 'Computer Science'
AND gender = 'Female';
# +--+-----------+-------+----------+----+-------------+---------+-------+-----+----+--------+-----------+
# |id|select_type|table |partitions|type|possible_keys|key |key_len|ref |rows|filtered|Extra |
# +--+-----------+-------+----------+----+-------------+---------+-------+-----+----+--------+-----------+
# |1 |SIMPLE |student|null |ref |idx_major |idx_major|202 |const|4 |50 |Using where|
# +--+-----------+-------+----------+----+-------------+---------+-------+-----+----+--------+-----------+
# 建立在 major 和 gender 上的联合索引
CREATE INDEX idx_major_gender ON student (major, gender);
EXPLAIN
SELECT student_id, student_name
FROM student
WHERE major = 'Computer Science'
AND gender = 'Female';
# +--+-----------+-------+----------+----+--------------------------+----------------+-------+-----------+----+--------+---------------------+
# |id|select_type|table |partitions|type|possible_keys |key |key_len|ref |rows|filtered|Extra |
# +--+-----------+-------+----------+----+--------------------------+----------------+-------+-----------+----+--------+---------------------+
# |1 |SIMPLE |student|null |ref |idx_major,idx_major_gender|idx_major_gender|204 |const,const|2 |100 |Using index condition|
# +--+-----------+-------+----------+----+--------------------------+----------------+-------+-----------+----+--------+---------------------+
下面是对 m m m 阶 B+ 树的一些说明,不同资料上的定义不尽相同,这里参照维基百科对于 B+ 树的定义:
B+ 树相比于 B 树的优势:
索引失效主要以以下几种情况:
SELECT * FROM tb_user WHERE SUBSTRING(address, 1, 2) = '上海';SELECT * FROM tb_user WHERE phone = 12345678910;。SELECT * FROM tb_user WHERE address like '%海';OR,且 OR 前后任意一个条件字段没有索引。最左前缀匹配法则:在使用联合索引时,数据库会根据联合索引中的字段顺序,从左到右依次到查询条件中去匹配,索引字段顺序与条件字段顺序必须完全一致。如果在匹配过程中发生了跳跃就会使后面的字段索引失效。但如果查询条件由 AND 连接,优化器会自动调整顺序以避免索引失效,此时的索引失效主要考虑字段缺失。
最左前缀示例:
create index index_a_b_c on wide_table (a, b, c);
-- 索引生效
explain select * from wide_table where a = '1' and b = '1';
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----------+----+--------+-----+
# |id|select_type|table |partitions|type|possible_keys|key |key_len|ref |rows|filtered|Extra|
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----------+----+--------+-----+
# |1 |SIMPLE |wide_table|null |ref |index_a_b_c |index_a_b_c|2046 |const,const|1 |100 |null |
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----------+----+--------+-----+
-- 索引生效
explain select * from wide_table where b = '1' and a = '1';
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----------+----+--------+-----+
# |id|select_type|table |partitions|type|possible_keys|key |key_len|ref |rows|filtered|Extra|
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----------+----+--------+-----+
# |1 |SIMPLE |wide_table|null |ref |index_a_b_c |index_a_b_c|2046 |const,const|1 |100 |null |
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----------+----+--------+-----+
-- 索引生效
explain select * from wide_table where b > '1' and a = '1';
# +--+-----------+----------+----------+-----+-------------+-----------+-------+----+----+--------+---------------------+
# |id|select_type|table |partitions|type |possible_keys|key |key_len|ref |rows|filtered|Extra |
# +--+-----------+----------+----------+-----+-------------+-----------+-------+----+----+--------+---------------------+
# |1 |SIMPLE |wide_table|null |range|index_a_b_c |index_a_b_c|2046 |null|1 |100 |Using index condition|
# +--+-----------+----------+----------+-----+-------------+-----------+-------+----+----+--------+---------------------+
-- 未走索引
explain select * from wide_table where b = '1';
# +--+-----------+----------+----------+----+-------------+----+-------+----+-----+--------+-----------+
# |id|select_type|table |partitions|type|possible_keys|key |key_len|ref |rows |filtered|Extra |
# +--+-----------+----------+----------+----+-------------+----+-------+----+-----+--------+-----------+
# |1 |SIMPLE |wide_table|null |ALL |null |null|null |null|99551|10 |Using where|
# +--+-----------+----------+----------+----+-------------+----+-------+----+-----+--------+-----------+
-- 索引部分失效
explain select * from wide_table where a = '1' and c = '1';
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----+----+--------+---------------------+
# |id|select_type|table |partitions|type|possible_keys|key |key_len|ref |rows|filtered|Extra |
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----+----+--------+---------------------+
# |1 |SIMPLE |wide_table|null |ref |index_a_b_c |index_a_b_c|1023 |const|1 |10 |Using index condition|
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----+----+--------+---------------------
-- 索引部分失效
explain select * from wide_table where c = '1' and a = '1';
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----+----+--------+---------------------+
# |id|select_type|table |partitions|type|possible_keys|key |key_len|ref |rows|filtered|Extra |
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----+----+--------+---------------------+
# |1 |SIMPLE |wide_table|null |ref |index_a_b_c |index_a_b_c|1023 |const|1 |10 |Using index condition|
# +--+-----------+----------+----------+----+-------------+-----------+-------+-----+----+--------+---------------------+
