Sigmoid function:
a = \frac{1}{1 + e^{-Z}}, with values in the range (0, 1)

import numpy as np
import matplotlib.pyplot as plt
Z = np.linspace(-10, 10, 100)
a = 1 / (1 + np.exp(-Z))
fig = plt.figure()
plt.plot(Z, a, color="blue", linewidth=1, linestyle="-", label="sigmoid")
plt.legend(loc="upper left")
plt.xlabel("Z", x=1)
plt.ylabel("a", y=1)
plt.xticks([-10, 0, 10])
plt.yticks([0.5, 1])
ax = plt.gca()
ax.spines['right'].set_color('none')
ax.spines['top'].set_color('none')
ax.spines['left'].set_position(('data', 0))
ax.spines['bottom'].set_position(('data', 0))
plt.show()
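A useful property not shown in the plot above (added here as an illustration, not part of the original): the sigmoid's derivative can be written in terms of its own output, \sigma'(Z) = a(1 - a), which is what makes it cheap to use in backpropagation. A quick numerical check:

```python
import numpy as np

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))

Z = np.linspace(-10, 10, 100)
a = sigmoid(Z)

# Analytic derivative expressed through the output: sigma'(Z) = a * (1 - a)
analytic = a * (1 - a)

# Numerical derivative via central differences for comparison
numeric = np.gradient(a, Z)

print(np.max(np.abs(analytic - numeric)))  # small: the two agree
```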
Hyperbolic tangent (tanh):
a = \tanh(Z) = \frac{\sinh Z}{\cosh Z} = \frac{e^Z - e^{-Z}}{e^Z + e^{-Z}}, with values in the range (-1, 1); its derivative lies in (0, 1], and it pushes the mean of the data closer to 0 rather than 0.5

import numpy as np
import matplotlib.pyplot as plt
Z = np.linspace(-5, 5, 100)
a = np.tanh(Z)
fig = plt.figure()
plt.plot(Z, a, color="blue", linewidth=1, linestyle="-", label="tanh")
plt.legend(loc="upper left")
plt.xlabel("Z", x=1)
plt.ylabel("a", y=1)
plt.xticks([-5, 5])
plt.yticks([-1, 1])
ax = plt.gca()
ax.spines['right'].set_color('none')
ax.spines['top'].set_color('none')
ax.spines['left'].set_position(('data', 0))
ax.spines['bottom'].set_position(('data', 0))
plt.show()
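The two claims above can be verified numerically (a sketch added for illustration, not from the original): the derivative satisfies \tanh'(Z) = 1 - \tanh^2(Z), which lies in (0, 1], and tanh outputs of centered inputs average near 0 while sigmoid outputs average near 0.5.

```python
import numpy as np

Z = np.linspace(-5, 5, 100)
a = np.tanh(Z)

# Analytic derivative: tanh'(Z) = 1 - tanh(Z)**2, in (0, 1]
analytic = 1 - a**2
numeric = np.gradient(a, Z)  # central-difference check
print(np.max(np.abs(analytic - numeric)))

# Mean-centering: tanh outputs cluster around 0, sigmoid outputs around 0.5
rng = np.random.default_rng(0)
x = rng.normal(size=10000)
print(np.tanh(x).mean())           # near 0
print((1 / (1 + np.exp(-x))).mean())  # near 0.5
```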
Rectified Linear Unit (ReLU):
a = \max(0, Z); whenever Z is positive the derivative is identically 1, and when Z is negative the derivative is identically 0

import numpy as np
import matplotlib.pyplot as plt
Z = np.linspace(-5, 5, 11)
a = np.maximum(0, Z)
fig = plt.figure()
plt.plot(Z, a, color="blue", linewidth=1, linestyle="-", label="ReLU")
plt.legend(loc="upper left")
plt.xlabel("Z", x=1)
plt.ylabel("a", y=1)
plt.xticks([-5, 5])
plt.yticks([5])
ax = plt.gca()
ax.spines['right'].set_color('none')
ax.spines['top'].set_color('none')
ax.spines['left'].set_position(('data', 0))
ax.spines['bottom'].set_position(('data', 0))
plt.show()
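The piecewise derivative described above is often implemented with a boolean mask. A minimal sketch (the helper names `relu` and `relu_grad` are illustrative, not from the original):

```python
import numpy as np

def relu(Z):
    return np.maximum(0, Z)

def relu_grad(Z):
    # Derivative is 1 for Z > 0 and 0 for Z < 0; it is undefined at
    # Z == 0, where implementations commonly just use 0.
    return (Z > 0).astype(float)

Z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(Z))       # [0.  0.  0.  0.5 2. ]
print(relu_grad(Z))  # [0. 0. 0. 1. 1.]
```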
Softmax function:
softmax(Z_{i}) = \frac{e^{Z_{i}}}{\sum_{c=1}^{C} e^{Z_{c}}}, also known as the normalized exponential function
It generalizes the binary-classification sigmoid to multiple classes, presenting the multi-class result as a probability distribution.
Suppose a model's predictions for a three-class problem are a, b, and c.
Softmax pulls values that are already far apart even further apart.
In deep learning, gradients are usually computed via backpropagation and parameters updated via gradient descent, and the exponential function is convenient to differentiate.

import numpy as np
import matplotlib.pyplot as plt
def hardmax(Z):
    return Z / np.sum(Z)
def softmax(Z):
    return np.exp(Z) / np.sum(np.exp(Z))
pred = np.array([0.5, 1.5, 4])
a = np.array([hardmax(pred)[0], softmax(pred)[0]])
b = np.array([hardmax(pred)[1], softmax(pred)[1]])
c = np.array([hardmax(pred)[2], softmax(pred)[2]])
x = np.arange(2)
x_labels = ["hardmax", "softmax"]
plt.xticks(x, x_labels)
total_width, n = 0.8, 3
width = total_width / n
x = x - (total_width - width) / 2
plt.bar(x, a, width=width, label="a")
plt.bar(x + width, b, width=width, label="b")
plt.bar(x + 2*width, c, width=width, label="c")
plt.legend()
plt.show()
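One practical caveat worth adding (not covered by the comparison above): np.exp overflows for large logits, so softmax is usually computed after subtracting the maximum. This leaves the result unchanged, since numerator and denominator are both scaled by the same factor e^{-\max(Z)}. A sketch, with `softmax_stable` as an illustrative name:

```python
import numpy as np

def softmax_stable(Z):
    # Subtracting max(Z) does not change the result (both numerator and
    # denominator are multiplied by e^{-max(Z)}), but it keeps every
    # exponent <= 0, so np.exp cannot overflow.
    shifted = Z - np.max(Z)
    e = np.exp(shifted)
    return e / e.sum()

Z = np.array([1000.0, 1001.0, 1002.0])  # naive np.exp(Z) would overflow here
p = softmax_stable(Z)
print(p)        # finite probabilities
print(p.sum())  # sums to 1
```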
Thanks for reading.