开发工程师必备————【Day16】数据库知识补充

今日内容概要

视图
触发器
存储过程
事务（重要）
内置函数
流程控制，循环结构
索引与慢查询优化

视图

1.视图的概念：
SQL语句的执行结果是一张虚拟表，我们可以基于该表做其它操作；
如果这张虚拟表需要频繁使用，那么为了方便可以将虚拟表保存起来，保存起来之后就成为“视图”（本质就是一张虚拟表）

2.代码展示：

create view 视图名 as SQL语句;

create view teacher2course as select * from teacher inner join course on teacher.tid = course.teacher_id;
1
2
3

3.视图作用：
（1）再硬盘中，视图只有表结构文件，没有表数据文件；
（2）视图通常是用于查询，尽量不要修改视图中的数据。
（3）对于视图，我们要做到尽量少用就尽量少用。

触发器

1.触发器概念：
针对数据的增，改，删自动触发的功能（增前，增后，改前，改后，删前，删后）

2.语法结构：
create trigger 触发器的名字 before/after insert/update/delete on 表名 for each row
begin
	SQL语句
end
1
2
3
4
注意触发器内部的SQL语句需要用到分号，但是分号又是SQL语句默认的结束符；
所以为了能够完整的写出触发器的代码，需要临时修改SQL语句默认的结束符：
delimiter $$
编写需要用到分号的各种语句；
delimiter ;
1
2
3

3.举个栗子：

CREATE TABLE cmd (
   id INT PRIMARY KEY auto_increment,
   USER CHAR (32),
  priv CHAR (10),
   cmd CHAR (64),
   sub_time datetime, #提交时间
   success enum ('yes', 'no') #0代表执行失败
);

CREATE TABLE errlog (
   id INT PRIMARY KEY auto_increment,
   err_cmd CHAR (64),
   err_time datetime
);
delimiter $$ # 将mysql默认的结束符由;换成$$
create tigger tri_after_insert_cmd after insert on cmd for each row begin
	if NEW.success = 'no' then    # 新记录都会被MySQL封装成NEW对象
	insert into errlog(err_cmd,err_time)
values(NEW.cmd,NEW.sub_time);
	end if;
end $$
delimiter;    # 结束之后记得再改回来，不然后面结束符就都是$$了
# 往表cmd中插入记录，触发触发器，根据IF的条件决定是否插入错误日志
INSERT INTO cmd(
	USER,
	priv,
	cmd,
	sub_time,
	success
)
VALUES
	('kevin','0755','ls -l /etc',NOW(),'yes'),
	('kevin','0755','cat /etc/passwd',NOW(),'no'),
	('kevin','0755','useradd xxx',NOW(),'no'),
	('kevin','0755','ps aux',NOW(),'yes');

# 查询errlog表记录
select * from errlog;
# 删除触发器
drop trigger tri_after_insert_cmd;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40

事务（重要）

1.事务的四大特征（ACID）：
A：原子性
C：一致性
I：隔离性
D：持久性

原子性

整个事务中的所有操作，要么全部完成，要么全部不完成，不可能停滞在中间某个环节。
事务在执行过程中发生错误，会被回滚（Rollback）到事务开始前的状态，就像这个事务从来没有执行过一样。
1
2

一致性

在事务开始之前和事务结束以后，数据库的完整性约束没有被破坏。
1

隔离性

隔离状态执行事务，使他们好像是系统在给定时间内执行的唯一操作。
如果两个事务，运行在相同的时间内，执行相同的功能，事务的隔离性将确保每一事务在系统中认为只有该事务在使用系统。
这种属性有时称为串行化，为了防止事务操作间的混淆，必须串行化或序列化请求，使得在同一时间仅有一个请求在用一数据。
1
2
3

持久性

在事务完成以后，该事务所对数据库所做的更改便持久的保存在数据库之中，并不会被回滚。
1

2.代码展示：

create table user(
id int primary key auto_increment,
name char(32),
balance int
);

insert into user(name,balance)
values
('jason',1000),
('kevin',1000),
('tank',1000);

# 修改数据之前先开启事务操作
start transaction;

# 修改操作
update user set balance=900 where name = 'jason';    # 买支付100元
update user set balance=1010 where name='kevin';     #中介拿走10元
update user set balance=1090 where name='tank';      #卖家拿到90元

# 回滚到上一个状态
rollback;

# 开启事务之后，只要没有执行commit操作，数据其实都没有真正刷新到硬盘
commit;
"""开启事务检测操作是否完整，不完整主动回滚到上一个状态，如果完整就应该执行commit操作"""

# 站在python代码的角度，应该实现的伪代码逻辑，
try:
   update user set balance=900 where name='jason'; #买支付100元
   update user set balance=1010 where name='kevin'; #中介拿走10元
   update user set balance=1090 where name='tank'; #卖家拿到90元
except 异常:
   rollback;
else:
   commit;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36

3.扩展知识点：
MySQL提供两种事务型存储引擎InnoDB和NDB cluster及第三方XtraDB，PBXT
事务处理中有几个关键词汇会反复出现：
事务（transaction）
回退（rollback）
提交（commit）
保留点（savepoint）
为了支持回退部分事务处理，必须能在事务处理块中合适的位置放置占位符，这样如果需要回退可以回退到某个占位符（保留点）
创建占位符可以使用savepoint
savepoint sp01；
回退到占位符地址：
rollback to sp01；

保留点在执行rollback或者commit之后自动释放

4.四种隔离级别：
在SQL标准中定义了四种隔离级别，每一种级别都规定了一个事务中所做的修改
InnoDB支持所有隔离级别：
set transaction isolation level 级别
1
read uncommitted(未提交读)
事务中的修改即使没有提交，对其他事务也都是可见的，事务可以读取未提交的数据，这一现象也称之为"脏读"
read committed(提交读)
大多数数据库系统默认的隔离级别；
一个事物从开始直到提交之前所做的任何修改对其他事务都是不可见的，这种级别也叫作“不可重复读”
repeatable read(可重复读) # MySQL默认隔离级别
能够解决“脏读”问题，但是无法解决“幻读”；
所谓幻读指的是当某个事务在读取某个范围内的记录时另外一个事务又在该范围内插入了新的记录，当之前的事务再次读取该范围的记录会产生幻行，InnoDB和XtraDB通过多版本并发控制(MVCC)及间隙锁策略解决该问题。
serializable(可串行读)
强制事务串行执行，很少使用该级别

5.事务日志可以帮助提高事务的效率：
（1）存储引擎在修改表的数据时只需要修改其内存拷贝再把该修改记录到持久在硬盘上的事务日志中，而不用每次都将修改的数据本身持久到磁盘；
（2）事务日志采用的是追加方式因此写日志操作是磁盘上一小块区域内的顺序IO而不像随机IO需要次哦按的多个地方移动磁头所以采用事务日志的方式相对来说要快的多；
（3）事务日志持久之后内存中被修改的数据再后台可以慢慢刷回磁盘，目前大多数存储引擎都是这样实现的，通常称之为"预写式日志"修改数据需要写两次磁盘。

6.MVCC多版本并发控制
MVCC只能在read committed(提交读)、repeatable read(可重复读)两种隔离级别下工作，其他两个不兼容(read uncommitted:总是读取最新 serializable:所有的行都加锁)
InnoDB的MVCC通过在每行记录后面保存两个隐藏的列来实现MVCC：
（1）一个列保存了行的创建时间
（2）一个列保存了行的过期时间(或删除时间) # 本质是系统版本号
每开始一个新的事务版本号都会自动递增，事务开始时刻的系统版本号会作为事务的版本号用来和查询到的每行记录版本号进行比较。

例如
刚插入第一条数据的时候，我们默认事务id为1，实际是这样存储的
username create_version delete_version
jason 1
可以看到，我们在content列插入了kobe这条数据，在create_version这列存储了1，1是这次插入操作的事务id。
然后我们将jason修改为jason01，实际存储是这样的
username create_version delete_version
jason 1 2
jason01 2
可以看到，update的时候，会先将之前的数据delete_version标记为当前新的事务id，也就是2，然后将新数据写入，将新数据的create_version标记为新的事务id
当我们删除数据的时候，实际存储是这样的
username create_version delete_version
jason01 2 3
“”"
由此当我们查询一条记录的时候，只有满足以下两个条件的记录才会被显示出来：
1.当前事务id要大于或者等于当前行的create_version值，这表示在事务开始前这行数据已经存在了。
2.当前事务id要小于delete_version值，这表示在事务开始之后这行记录才被删除。
“”"

存储过程

类似于python中的自定义函数

delimiter 临时结束符
create procedure 名字(参数,参数)
begin
	sql语句;
end 临时结束符
delimiter ;

delimiter $$
create procedure p1(
   in m int,  # in表示这个参数必须只能是传入不能被返回出去
   in n int,  
   out res int  # out表示这个参数可以被返回出去，还有一个inout表示即可以传入也可以被返回出去
)
begin
   select tname from teacher where tid > m and tid < n;
   set res=0;  # 用来标志存储过程是否执行
end $$
delimiter ;


# 针对res需要先提前定义
set @res=10;  定义
select @res;  查看
call p1(1,5,@res)  调用
select @res  查看

"""
查看存储过程具体信息
	show create procedure pro1;
查看所有存储过程
	show procedure status;
删除存储过程
	drop procedure pro1;
"""
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34

内置函数

“ps:可以通过help 函数名查看帮助信息!”

1.移除指定字符
Trim、LTrim、RTrim

# 2.大小写转换
Lower、Upper

# 3.获取左右起始指定个数字符
Left、Right

# 4.返回读音相似值(对英文效果)
Soundex

5.举个栗子：
eg:客户表中有一个顾客登记的用户名为J.Lee
但如果这是输入错误真名其实叫J.Lie,可以使用soundex匹配发音类似的
where Soundex(name)=Soundex(‘J.Lie’)

# 6.日期格式:date_format

'''在MySQL中表示时间格式尽量采用2022-11-11形式'''
CREATE TABLE blog (
   id INT PRIMARY KEY auto_increment,
   NAME CHAR (32),
   sub_time datetime
);
INSERT INTO blog (NAME, sub_time)
VALUES
   ('第1篇','2015-03-01 11:31:21'),
   ('第2篇','2015-03-11 16:31:21'),
   ('第3篇','2016-07-01 10:21:31'),
   ('第4篇','2016-07-22 09:23:21'),
   ('第5篇','2016-07-23 10:11:11'),
   ('第6篇','2016-07-25 11:21:31'),
   ('第7篇','2017-03-01 15:33:21'),
   ('第8篇','2017-03-01 17:32:21'),
   ('第9篇','2017-03-01 18:31:21');
select date_format(sub_time,'%Y-%m'),count(id) from blog group by >date_format(sub_time,'%Y-%m');

1.where Date(sub_time) = '2015-03-01'
2.where Year(sub_time)=2016 AND Month(sub_time)=07;
# 更多日期处理相关函数 
	adddate	增加一个日期 
	addtime	增加一个时间
	datediff	计算两个日期差值
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25

流程控制

# if条件语句
delimiter //
CREATE PROCEDURE proc_if ()
BEGIN
   
   declare i int default 0;
   if i = 1 THEN
       SELECT 1;
   ELSEIF i = 2 THEN
       SELECT 2;
   ELSE
       SELECT 7;
   END IF;

END //
delimiter ;


# while循环
delimiter //
CREATE PROCEDURE proc_while ()
BEGIN

   DECLARE num INT ;
   SET num = 0 ;
   WHILE num < 10 DO
       SELECT
           num ;
       SET num = num + 1 ;
   END WHILE ;

END //
delimiter ;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33

索引

1）索引就好比一本书的目录，它能让你更快的找到自己想要的内容。
2）让获取的数据更有目的性，从而提高数据库检索数据的性能

索引在MySQL中也叫做“键”，是存储引擎用于快速找到记录的一种数据结构
primary key
unique key
index key
上述的三种键在数据查询的时候使用都可以加快查询的速度
primary key、unique key除了可以加快数据查询还有额外的限制
index key只能加快数据查询本身没有任何的额外限制
真正理解索引加快数据查询的含义
索引的存在可以加快数据的查询但是会减慢数据的增删
索引相关概念

2.索引底层原理
树:是一种数据结构主要用于优化数据查询的操作

二叉树:两个分支
B树(B-树)、B+树、B树
B树:
除了叶子节点可以有多个分支其他节点最多只能两个分支
所有的节点都可以直接存放完整数据(每一个数据块是有固定大小的)
B+树:
只有叶子节点存放真正的数据其他节点只存主键值(辅助索引值)

B树
在树节点添加了通往其他节点的通道减少查询次数

3.慢查询优化
explain SQL语句
从上到下，性能从最差到最好，我们认为至少要达到range级别

DBMS自动完成维护和自动选择是否使用索引以及使用哪些索引

4.测试索引
准备

#1. 准备表
create table s1(
id int,
name varchar(20),
gender char(6),
email varchar(50)
);

#2. 创建存储过程，实现批量插入记录
delimiter $$ #声明存储过程的结束符号为$$
create procedure auto_insert1()
BEGIN
   declare i int default 1;
   while(i<3000000)do
       insert into s1 values(i,'jason','male',concat('jason',i,'@oldboy'));
       set i=i+1;
   end while;
END$$ #$$结束
delimiter ; #重新声明分号为结束符号

#3. 查看存储过程
show create procedure auto_insert1\G 

#4. 调用存储过程
call auto_insert1();
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25

# 表没有任何索引的情况下
select * from s1 where id=30000;
# 避免打印带来的时间损耗
select count(id) from s1 where id = 30000;
select count(id) from s1 where id = 1;

# 给id做一个主键
alter table s1 add primary key(id);  # 速度很慢

select count(id) from s1 where id = 1;  # 速度相较于未建索引之前两者差着数量级
select count(id) from s1 where name = 'jason'  # 速度仍然很慢


"""
范围问题
"""
# 并不是加了索引，以后查询的时候按照这个字段速度就一定快   
select count(id) from s1 where id > 1;  # 速度相较于id = 1慢了很多
select count(id) from s1 where id >1 and id < 3;
select count(id) from s1 where id > 1 and id < 10000;
select count(id) from s1 where id != 3;

alter table s1 drop primary key;  # 删除主键 单独再来研究name字段
select count(id) from s1 where name = 'jason';  # 又慢了

create index idx_name on s1(name);  # 给s1表的name字段创建索引
select count(id) from s1 where name = 'jason'  # 仍然很慢！！！
"""
再来看b+树的原理，数据需要区分度比较高，而我们这张表全是jason，根本无法区分
那这个树其实就建成了“一根棍子”
"""
select count(id) from s1 where name = 'xxx';  
# 这个会很快，我就是一根棍，第一个不匹配直接不需要再往下走了
select count(id) from s1 where name like 'xxx';
select count(id) from s1 where name like 'xxx%';
select count(id) from s1 where name like '%xxx';  # 慢 最左匹配特性

# 区分度低的字段不能建索引
drop index idx_name on s1;

# 给id字段建普通的索引
create index idx_id on s1(id);
select count(id) from s1 where id = 3;  # 快了
select count(id) from s1 where id*12 = 3;  # 慢了  索引的字段一定不要参与计算

drop index idx_id on s1;
select count(id) from s1 where name='jason' and gender = 'male' and id = 3 and email = 'xxx';
# 针对上面这种连续多个and的操作，mysql会从左到右先找区分度比较高的索引字段，先将整体范围降下来再去比较其他条件
create index idx_name on s1(name);
select count(id) from s1 where name='jason' and gender = 'male' and id = 3 and email = 'xxx';  # 并没有加速

drop index idx_name on s1;
# 给name，gender这种区分度不高的字段加上索引并不难加快查询速度

create index idx_id on s1(id);
select count(id) from s1 where name='jason' and gender = 'male' and id = 3 and email = 'xxx';  # 快了  先通过id已经讲数据快速锁定成了一条了
select count(id) from s1 where name='jason' and gender = 'male' and id > 3 and email = 'xxx';  # 慢了  基于id查出来的数据仍然很多，然后还要去比较其他字段

drop index idx_id on s1

create index idx_email on s1(email);
select count(id) from s1 where name='jason' and gender = 'male' and id > 3 and email = 'xxx';  # 快 通过email字段一剑封喉 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62

5.联合索引

select count(id) from s1 where name='jason' and gender = 'male' and id > 3 and email = 'xxx';  
# 如果上述四个字段区分度都很高，那给谁建都能加速查询
# 给email加然而不用email字段
select count(id) from s1 where name='jason' and gender = 'male' and id > 3; 
# 给name加然而不用name字段
select count(id) from s1 where gender = 'male' and id > 3; 
# 给gender加然而不用gender字段
select count(id) from s1 where id > 3; 

# 带来的问题是所有的字段都建了索引然而都没有用到，还需要花费四次建立的时间
create index idx_all on s1(email,name,gender,id);  # 最左匹配原则，区分度高的往左放
select count(id) from s1 where name='jason' and gender = 'male' and id > 3 and email = 'xxx';  # 速度变快
1
2
3
4
5
6
7
8
9
10
11
12

相关阅读:
05.JavaScript（防抖节流、视频播放定位上次位置）
C++面向对象 _ 成绩单系统
 浅学了波ZooKeeper，我来做个总结
 【活动报名】11月19日 | AI GC 的技术与应用构建
 华为数通方向HCIP-DataCom H12-831题库(多选题：121-140）
微服务: xxl-job的安装(docker),使用及springboot整合[完整版详解]
Spring Boot中异步消息JMS的讲解与通信实例
 浏览器插件有什么作用，怎么安装浏览器扩展插件
 二、Git本地仓库基本操作——创建Git仓库、提交更新或删除文件
 【Flink实战】玩转Flink里面核心的Source Operator实战
原文地址：https://blog.csdn.net/DiligentGG/article/details/126425273