This blog series aims to provide the mathematical foundations of machine learning (deep learning). The content is therefore kept concise, suited to readers revisiting the material who want to study or look things up quickly.
Suppose the matrix $A'_{m,n-1}$ is the independent-variable data matrix and $b_{m}$ is the dependent-variable vector, and let $A_{m,n}=[A',1]$ (that is, append a column of ones to $A'$ to absorb the intercept). The goal is to find a line $f(x)=Ax$ that minimizes the error of the data points to the line, so we use gradient descent to find the value of $x$ minimizing:
$$f(x)=\frac{1}{2}\|Ax-b\|_{2}^{2}$$

This time we walk the reader through one complete derivation; afterwards, results of this kind will simply be stated. First, expand the function:
$$f(x)=\frac{1}{2}\left[((a_{11}x_{1}+\dots+a_{1n}x_{n})-b_{1})^{2}+\dots+((a_{m1}x_{1}+\dots+a_{mn}x_{n})-b_{m})^{2}\right]$$

By vector calculus,
$$
\nabla_{x}f(x)=\frac{1}{2}
\begin{bmatrix}
\dfrac{\partial\left[((a_{11}x_{1}+\dots+a_{1n}x_{n})-b_{1})^{2}+\dots+((a_{m1}x_{1}+\dots+a_{mn}x_{n})-b_{m})^{2}\right]}{\partial x_{1}}\\
\vdots\\
\dfrac{\partial\left[((a_{11}x_{1}+\dots+a_{1n}x_{n})-b_{1})^{2}+\dots+((a_{m1}x_{1}+\dots+a_{mn}x_{n})-b_{m})^{2}\right]}{\partial x_{n}}
\end{bmatrix}
$$

$$
=\frac{1}{2}
\begin{bmatrix}
2a_{11}((a_{11}x_{1}+\dots+a_{1n}x_{n})-b_{1})+\dots+2a_{m1}((a_{m1}x_{1}+\dots+a_{mn}x_{n})-b_{m})\\
\vdots\\
2a_{1n}((a_{11}x_{1}+\dots+a_{1n}x_{n})-b_{1})+\dots+2a_{mn}((a_{m1}x_{1}+\dots+a_{mn}x_{n})-b_{m})
\end{bmatrix}
$$

$$
=\frac{1}{2}\cdot 2
\begin{bmatrix}
a_{11}&\dots&a_{m1}\\
\vdots& &\vdots\\
a_{1n}&\dots&a_{mn}
\end{bmatrix}
\begin{bmatrix}
(a_{11}x_{1}+\dots+a_{1n}x_{n})-b_{1}\\
\vdots\\
(a_{m1}x_{1}+\dots+a_{mn}x_{n})-b_{m}
\end{bmatrix}
=A^{T}(Ax-b)
$$

The gradient-descent update is therefore $x \leftarrow x-\eta\,A^{T}(Ax-b)$, where $\eta$ is the learning rate. This is exactly the update implemented below.
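As a sanity check on the derivation, the analytic gradient $A^{T}(Ax-b)$ can be compared against central finite differences of $f(x)=\frac{1}{2}\|Ax-b\|_2^2$ on random data. This is a minimal sketch, separate from the model code below; the matrix sizes are arbitrary:

```python
import numpy as np

# Random problem instance: 5 equations, 3 unknowns.
rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))
b = rng.normal(size=(5, 1))
x = rng.normal(size=(3, 1))

def f(x):
    # f(x) = 1/2 * ||Ax - b||^2
    r = A @ x - b
    return 0.5 * float(r.T @ r)

# Analytic gradient from the derivation above.
analytic = A.T @ (A @ x - b)

# Central finite differences, one coordinate at a time.
eps = 1e-6
numeric = np.zeros_like(x)
for i in range(x.shape[0]):
    d = np.zeros_like(x)
    d[i] = eps
    numeric[i] = (f(x + d) - f(x - d)) / (2 * eps)

print(np.max(np.abs(analytic - numeric)))  # close to 0
```

For a quadratic $f$, central differences agree with the exact gradient up to floating-point rounding, so any visible discrepancy would indicate an error in the derivation.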
```python
import numpy as np
from random import randint
from functools import reduce
from sklearn.metrics import mean_squared_error


class GLSModel:
    def __init__(self):
        self.x = None

    def fit(self, x, y, e, epochs):
        """
        Fit the training data via gradient-descent updates.
        :param x: independent-variable data
        :param y: dependent-variable data
        :param e: learning rate
        :param epochs: number of iterations
        """
        if x.shape[0] != y.shape[0]:
            raise ValueError("quantity of x must be the same as y")
        # Append a column of ones for the intercept term.
        A = np.concatenate((x, np.ones((x.shape[0], 1))), axis=1)
        if self.x is None:
            self.x = np.random.random((A.shape[1], 1))
        for _ in range(epochs):
            # Gradient step: x <- x - e * (A^T A x - A^T b) = x - e * A^T(Ax - b)
            self.x -= e * (reduce(np.matmul, (A.T, A, self.x)) - np.matmul(A.T, y))

    def predict(self, x):
        """
        Predict on new data.
        :param x: data to predict on
        :return: predictions
        """
        return np.matmul(np.concatenate((x, np.ones((x.shape[0], 1))), axis=1), self.x)

    def evaluate(self, x, y):
        y_ = self.predict(x)
        return mean_squared_error(y, y_)


train_x, train_y = [], []
for _ in range(1000):
    # 2 independent variables
    train_xi = [randint(-100, 100) for _ in range(2)]
    # dependent variable
    train_yi = 3 * train_xi[0] + 5 * train_xi[1] + 4
    train_x.append(train_xi)
    train_y.append([train_yi])
train_x, train_y = np.array(train_x), np.array(train_y)

model = GLSModel()
model.fit(train_x, train_y, 2e-7, 100)
print(f'mse: {model.evaluate(train_x, train_y)}')
# mse: 9.080036995109756

test_x = [randint(-100, 100) for _ in range(2)]
test_y = 3 * test_x[0] + 5 * test_x[1] + 4
print(f'real: {test_y}, pred: {model.predict(np.array([test_x]))[0][0]}')
# real: 13, pred: 10.01059089304455
```
Readers can tune the hyperparameters themselves to obtain a better solution.
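For reference, because the training data follow an exact linear relation, the least-squares solution that the iterative method approaches can also be computed in closed form. A minimal sketch using `np.linalg.lstsq` on freshly generated data (not the `GLSModel` class above):

```python
import numpy as np

# Regenerate data from the same exact relation y = 3*x1 + 5*x2 + 4.
rng = np.random.default_rng(0)
train_x = rng.integers(-100, 101, size=(1000, 2)).astype(float)
train_y = 3 * train_x[:, :1] + 5 * train_x[:, 1:] + 4

# Same design matrix as in fit(): append a ones column for the intercept.
A = np.concatenate((train_x, np.ones((1000, 1))), axis=1)

# Closed-form least-squares solution of min ||Ax - b||^2.
coef, *_ = np.linalg.lstsq(A, train_y, rcond=None)
print(coef.ravel())  # recovers [3, 5, 4] since the data are noise-free
```

Since the data are noise-free and the design matrix is well-conditioned, the closed form recovers the true coefficients; the gradient-descent fit above approaches the same solution as the learning rate and iteration count are tuned.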