活动地址:CSDN21天学习挑战赛
对偶问题可以降低问题的求解难度,可由线性分类问题推广到非线性分类问题.
这是最大间隔算法的原始形式

构造拉格朗日函数将目标函数和约束条件联系起来 ,原始形式约束条件式子中有n个不等式,所以这里我们对每一个不等式约束引入一个拉格朗日乘子
α
i
≥
0
,
i
=
1
,
2
,
.
.
.
,
N
\alpha_i \ge0,i=1,2,...,N
αi≥0,i=1,2,...,N.
该问题的拉格朗日函数如下:
L
(
w
,
b
,
α
)
=
1
2
∣
∣
w
∣
∣
2
−
∑
i
N
α
i
y
i
(
w
⋅
x
i
+
b
)
+
∑
i
N
α
i
L(w,b,\alpha)=\frac{1}{2}||w||^2-\sum_i^N \alpha_i y_i(w \cdot x_i+b)+\sum_i^N \alpha_i
L(w,b,α)=21∣∣w∣∣2−i∑Nαiyi(w⋅xi+b)+i∑Nαi
原始问题的对偶问题为该拉格朗日函数的极大极小问题,即先求
L
(
w
,
b
,
α
)
L(w,b,\alpha)
L(w,b,α)对w,b的极小值,再对
α
\alpha
α求极大值.
即:
max
α
min
w
,
b
L
(
w
,
b
,
α
)
\mathop{\max}\limits_{\alpha}\min\limits_{w,b}L(w,b,\alpha)
αmaxw,bminL(w,b,α)
(1)求
min
w
,
b
L
(
w
,
b
,
α
)
\min\limits_{w,b}L(w,b,\alpha)
w,bminL(w,b,α)
令
θ
0
(
α
)
=
min
w
,
b
L
(
w
,
b
,
α
)
\theta_0(\alpha)=\min\limits_{w,b}L(w,b,\alpha)
θ0(α)=w,bminL(w,b,α)
求极值就分别对w,b求偏导等于0.
∇
x
L
(
w
,
b
,
α
)
=
w
−
∑
i
N
α
i
y
i
x
i
=
0
∇
b
L
(
w
,
b
,
α
)
=
−
∑
i
N
α
i
y
i
=
0
\nabla_xL(w,b,\alpha)=w-\sum_i^N \alpha_i y_ix_i=0\\\nabla_bL(w,b,\alpha)=-\sum_i^N \alpha_i y_i=0
∇xL(w,b,α)=w−i∑Nαiyixi=0∇bL(w,b,α)=−i∑Nαiyi=0
解得w,b.
{
w
=
∑
i
N
α
i
y
i
x
i
∑
i
N
α
i
y
i
=
0
\left\{
则
θ
0
(
α
)
=
1
2
∑
i
=
1
N
∑
j
=
1
N
α
i
α
j
y
i
y
j
(
x
i
⋅
x
j
)
−
∑
i
=
1
N
α
i
y
i
(
∑
j
=
1
N
α
i
y
j
∗
(
(
x
i
⋅
x
j
)
)
+
∑
j
=
1
N
α
i
=
∑
j
=
1
N
α
i
−
1
2
∑
i
=
1
N
∑
j
=
1
N
α
i
α
j
y
i
y
j
(
x
i
⋅
x
j
)
\theta_0(\alpha)=\frac{1}{2}\sum_{i=1}^N\sum_{j=1}^N\alpha_i\alpha_j y_i y_j(x_i \cdot x_j)-\sum_{i=1}^N\alpha_iy_i(\sum_{j=1}^N\alpha_iy_j*((x_i \cdot x_j))+\sum_{j=1}^N\alpha_i \\ =\sum_{j=1}^N\alpha_i-\frac{1}{2}\sum_{i=1}^N\sum_{j=1}^N\alpha_i\alpha_j y_i y_j(x_i \cdot x_j)
θ0(α)=21∑i=1N∑j=1Nαiαjyiyj(xi⋅xj)−∑i=1Nαiyi(∑j=1Nαiyj∗((xi⋅xj))+∑j=1Nαi=∑j=1Nαi−21∑i=1N∑j=1Nαiαjyiyj(xi⋅xj)
(2 )求
max
α
θ
0
(
α
)
\max\limits_{\alpha}\theta_0(\alpha)
αmaxθ0(α)
{
max
α
θ
0
(
α
)
s
.
t
.
∑
i
=
1
N
α
i
y
i
=
0
,
α
i
≥
0
,
i
=
1
,
2
,
.
.
.
,
N
\left\{
上式约束条件中,拉格朗日乘子大于等于0
KKT条件
则该问题中KKT条件第一条为:
∇
w
L
=
w
∗
−
∑
i
N
α
i
∗
y
i
x
i
=
0
(
1
)
\nabla_wL=w^*-\sum_i^N \alpha_i^* y_ix_i=0 \quad(1)
∇wL=w∗−i∑Nαi∗yixi=0(1)
对于互补松弛条件有
α
i
(
1
−
y
i
(
w
∗
⋅
x
j
+
b
∗
)
)
=
0
,
α
i
≥
0
\alpha_i(1-y_i(w^*\cdot x_j+b^*))=0,\alpha_i \ge0
αi(1−yi(w∗⋅xj+b∗))=0,αi≥0
则
1
−
y
i
(
w
∗
⋅
x
j
+
b
∗
)
=
0
(
2
)
1-y_i(w^*\cdot x_j+b^*)=0\quad(2)
1−yi(w∗⋅xj+b∗)=0(2)
由(1)(2)解的原始问题中w,b
{
w
∗
=
∑
i
=
1
N
α
i
∗
y
i
x
i
b
∗
=
y
j
−
∑
i
=
1
N
α
i
∗
y
i
(
x
i
⋅
x
j
)
\left\{
此时就得到分类超平面:
∑
i
=
1
N
α
i
∗
y
i
(
x
i
⋅
x
)
−
∑
i
=
1
N
α
i
∗
y
i
(
x
i
⋅
x
j
)
+
y
i
=
0
\sum_{i=1}^N \alpha_i^* y_i(x_i \cdot x)-\sum_{i=1}^N \alpha_i^* y_i(x_i\cdot x_j)+y_i=0
i=1∑Nαi∗yi(xi⋅x)−i=1∑Nαi∗yi(xi⋅xj)+yi=0
分类决策函数为:
f
(
x
)
=
s
i
g
n
(
∑
i
=
1
N
α
i
∗
y
i
(
x
i
⋅
x
)
−
∑
i
=
1
N
α
i
∗
y
i
(
x
i
⋅
x
j
)
+
y
i
)
f(x)=sign(\sum_{i=1}^N \alpha_i^* y_i(x_i \cdot x)-\sum_{i=1}^N \alpha_i^* y_i(x_i\cdot x_j)+y_i)
f(x)=sign(i=1∑Nαi∗yi(xi⋅x)−i=1∑Nαi∗yi(xi⋅xj)+yi)
输入:数据集
T
=
{
(
x
1
,
y
1
)
,
(
x
2
,
y
2
)
,
.
.
.
,
(
x
N
,
y
N
)
}
,
x
i
∈
R
n
,
y
i
∈
{
−
1
,
1
}
,
i
=
1
,
2
,
.
.
.
,
N
.
T=\{(x_1,y_1),(x_2,y_2),...,(x_N,y_N)\},x_i \in R^n,y_i \in\{-1,1\},i=1,2,...,N.
T={(x1,y1),(x2,y2),...,(xN,yN)},xi∈Rn,yi∈{−1,1},i=1,2,...,N.
输出:分离超平面、分类决策函数


https://mp.weixin.qq.com/s/886_EdhRtRFCeof0xaPDhw
https://www.bilibili.com/video/BV1HP4y1Y79e?spm_id_from=333.337.search-card.all.click&vd_source=893fb409f9a0bd0a8c04972fb40b53b3