The essence of matrix differentiation: $\frac{dA}{dB}$ means differentiating every element of matrix $A$ with respect to every element of matrix $B$.
Suppose matrix $A$ is $1\times 1$ and matrix $B$ is $1\times n$; then $\frac{dA}{dB}$ is $1\times n$.
Suppose matrix $A$ is $q\times p$ and matrix $B$ is $m\times n$; then $\frac{dA}{dB}$ contains $q\times p\times m\times n$ partial derivatives.
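The element counting above can be checked with a few lines of Python. This is only a bookkeeping sketch: `derivative_shape` is a hypothetical helper name, and the concrete sizes (5, 2, 3, 4) are arbitrary values chosen for illustration.

```python
from math import prod

def derivative_shape(a_shape, b_shape):
    """Index shape of the array holding d A[i][j] / d B[k][l] for all i, j, k, l."""
    return tuple(a_shape) + tuple(b_shape)

# A is 1x1 and B is 1x5: one partial derivative per element of B.
scalar_case = derivative_shape((1, 1), (1, 5))
print(prod(scalar_case))  # 5

# A is 2x3 and B is 4x5: 2*3*4*5 partial derivatives in total.
general_case = derivative_shape((2, 3), (4, 5))
print(general_case, prod(general_case))  # (2, 3, 4, 5) 120
```

The total number of entries is always the product of the element counts of $A$ and $B$, whichever way the entries are later arranged.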
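Rule of thumb: scalars stay unchanged, vectors are stretched.
The front term (the function $f(x)$ in the numerator) is stretched horizontally; the back term (the variable $x$ in the denominator) is stretched vertically.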
Example 1. Find $\frac{df(x)}{dx}$, where $f(x)$ is a scalar function, $f(x)=f(x_1, x_2, x_3, \dots, x_n)$, and $x$ is a vector.
Solution:

$$
\frac{df(x)}{dx}=
\begin{bmatrix}
\frac{\partial f(x)}{\partial x_1}\\
\frac{\partial f(x)}{\partial x_2}\\
\vdots\\
\frac{\partial f(x)}{\partial x_n}
\end{bmatrix}
$$
The scalar $f(x)$ stays unchanged, and the vector $x$ is stretched vertically.
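As a numerical sanity check of this layout, the sketch below estimates each $\frac{\partial f(x)}{\partial x_i}$ by a central difference and collects the results in the order of the column vector above. The helper `gradient`, the step size `h`, and the test function are assumptions made for illustration, not part of the original derivation.

```python
def gradient(f, x, h=1e-6):
    """Central-difference estimate of [df/dx_1, ..., df/dx_n], read as a column."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad

def f(x):
    # Hypothetical scalar function of a 3-vector, chosen only for illustration.
    return x[0] ** 2 + 3 * x[1] + x[0] * x[2]

grad = gradient(f, [1.0, 2.0, 3.0])
# Analytic gradient at (1, 2, 3): [2*x1 + x3, 3, x1] = [5, 3, 1]
print([round(g, 4) for g in grad])
```

The result has one entry per component of $x$, which is the "vector $x$ stretched vertically" rule in numeric form.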
Example 2. Find $\frac{df(x)}{dx}$, where $f(x)$ is a vector function and $x$ is a scalar:

$$
f(x)=
\begin{bmatrix}
f_1(x)\\
f_2(x)\\
\vdots\\
f_n(x)
\end{bmatrix}
$$
Solution:

$$
\frac{df(x)}{dx}=
\begin{bmatrix}
\frac{\partial f_1(x)}{\partial x} & \frac{\partial f_2(x)}{\partial x} & \cdots & \frac{\partial f_n(x)}{\partial x}
\end{bmatrix}
$$
The vector function is stretched horizontally, and the scalar $x$ stays unchanged.
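The same kind of check works for a vector function of a scalar. In this sketch, `derivative_row` and the three component functions are hypothetical choices for illustration; the output is a list read as a row, one entry per component $f_j$.

```python
import math

def derivative_row(f, x, h=1e-6):
    """Central-difference estimate of [df_1/dx, ..., df_n/dx], read as a row."""
    f_plus = f(x + h)
    f_minus = f(x - h)
    return [(p - m) / (2 * h) for p, m in zip(f_plus, f_minus)]

def f(x):
    # Hypothetical vector function of a scalar x, chosen only for illustration.
    return [x ** 2, math.sin(x), 3 * x]

row = derivative_row(f, 2.0)
# Analytic derivatives at x = 2: [2*x, cos(x), 3] = [4, cos(2), 3]
print(row)
```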
Example 3. Find $\frac{df(x)}{dx}$, where $f(x)$ is a vector function and $x$ is a vector with components $x_1, x_2, \dots, x_n$:

$$
f(x)=
\begin{bmatrix}
f_1(x)\\
f_2(x)\\
\vdots\\
f_n(x)
\end{bmatrix}
$$
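Solution: first stretch the vector $x$ vertically, then stretch the vector function $f(x)$ horizontally in each row.

$$
\frac{df(x)}{dx}=
\begin{bmatrix}
\frac{\partial f(x)}{\partial x_1}\\
\frac{\partial f(x)}{\partial x_2}\\
\vdots\\
\frac{\partial f(x)}{\partial x_n}
\end{bmatrix}
=
\begin{bmatrix}
\frac{\partial f_1(x)}{\partial x_1} & \frac{\partial f_2(x)}{\partial x_1} & \cdots & \frac{\partial f_n(x)}{\partial x_1}\\
\frac{\partial f_1(x)}{\partial x_2} & \frac{\partial f_2(x)}{\partial x_2} & \cdots & \frac{\partial f_n(x)}{\partial x_2}\\
\vdots & \vdots & \ddots & \vdots\\
\frac{\partial f_1(x)}{\partial x_n} & \frac{\partial f_2(x)}{\partial x_n} & \cdots & \frac{\partial f_n(x)}{\partial x_n}
\end{bmatrix}
$$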
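To make the index order concrete, here is a small finite-difference sketch that builds the matrix with row $i$ tied to $x_i$ and column $j$ tied to $f_j$, as in the solution above. The name `layout_jacobian` and the two-component test function are assumptions for illustration. Note that this arrangement is the transpose of the Jacobian convention in which rows follow the components of $f$.

```python
def layout_jacobian(f, x, h=1e-6):
    """Matrix J with J[i][j] = d f_j / d x_i (row i follows x_i, column j follows f_j)."""
    rows = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        f_plus = f(x_plus)
        f_minus = f(x_minus)
        rows.append([(p - m) / (2 * h) for p, m in zip(f_plus, f_minus)])
    return rows

def f(x):
    # Hypothetical vector function of a 2-vector, chosen only for illustration.
    return [x[0] * x[1], x[0] + x[1] ** 2]

J = layout_jacobian(f, [2.0, 3.0])
# Analytic values at (2, 3):
#   row for x1: [x2, 1]      = [3, 1]
#   row for x2: [x1, 2 * x2] = [2, 6]
for row in J:
    print([round(v, 4) for v in row])
```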
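Example 1: Find $\frac{df(x)}{dx}$, where $f(x)=A^TX$ and

$$
A=
\begin{bmatrix}
a_1\\
a_2\\
\vdots\\
a_n
\end{bmatrix}_{n\times 1}
$$

Here $X$ denotes the $n\times 1$ column vector $x$ with components $x_1, x_2, \dots, x_n$.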
Solution:

$$
f(x)=A^TX=\sum^n_{i=1}a_ix_i
$$

$$
\frac{df(x)}{dx}=
\begin{bmatrix}
\frac{\partial f(x)}{\partial x_1}\\
\frac{\partial f(x)}{\partial x_2}\\
\vdots\\
\frac{\partial f(x)}{\partial x_n}
\end{bmatrix}
=
\begin{bmatrix}
a_1\\
a_2\\
\vdots\\
a_n
\end{bmatrix}
=A
$$
Because the transpose of a scalar is the scalar itself, $f(x)=A^TX=X^TA$, and therefore $\frac{dA^TX}{dx}=\frac{dX^TA}{dx}=A$.
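The identity $\frac{dA^TX}{dx}=A$ can be spot-checked numerically. In the sketch below, the coefficient vector `a`, the evaluation point, and the helper `gradient` are all arbitrary assumptions for illustration; the finite-difference gradient should match `a` up to rounding error.

```python
def gradient(f, x, h=1e-6):
    """Central-difference estimate of [df/dx_1, ..., df/dx_n]."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad

# Hypothetical coefficient vector A (n = 3), chosen only for illustration.
a = [2.0, -1.0, 0.5]

def f(x):
    # f(x) = A^T X = sum_i a_i * x_i
    return sum(a_i * x_i for a_i, x_i in zip(a, x))

grad = gradient(f, [0.3, 1.7, -2.0])
print([round(g, 6) for g in grad])  # should agree with a
```

Because $f$ is linear, the gradient does not depend on the point at which it is evaluated.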
Example 2: Find $\frac{df(x)}{dx}$, where $f(x)=X^TAX$ and $X$ denotes the column vector

$$
x=
\begin{bmatrix}
x_1\\
x_2\\
\vdots\\
x_n
\end{bmatrix}_{n\times 1}
$$
Solution: $f(x)=X^T_{1\times n}A_{n\times n}X_{n\times 1}$ is a scalar.
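$$
f(x)=
\begin{bmatrix}
x_1 & x_2 & \cdots & x_n
\end{bmatrix}
\begin{bmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{bmatrix}
\begin{bmatrix}
x_1\\
x_2\\
\vdots\\
x_n
\end{bmatrix}
=\sum_{i=1}^n\sum_{j=1}^n a_{ij}x_ix_j
$$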
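$$
\frac{df(x)}{dx}=
\begin{bmatrix}
\frac{\partial f(x)}{\partial x_1}\\
\frac{\partial f(x)}{\partial x_2}\\
\vdots\\
\frac{\partial f(x)}{\partial x_n}
\end{bmatrix}
=
\begin{bmatrix}
\sum_{j=1}^n a_{1j}x_j+\sum_{i=1}^n a_{i1}x_i\\
\sum_{j=1}^n a_{2j}x_j+\sum_{i=1}^n a_{i2}x_i\\
\vdots\\
\sum_{j=1}^n a_{nj}x_j+\sum_{i=1}^n a_{in}x_i
\end{bmatrix}
=
\begin{bmatrix}
\sum_{j=1}^n a_{1j}x_j\\
\sum_{j=1}^n a_{2j}x_j\\
\vdots\\
\sum_{j=1}^n a_{nj}x_j
\end{bmatrix}
+
\begin{bmatrix}
\sum_{i=1}^n a_{i1}x_i\\
\sum_{i=1}^n a_{i2}x_i\\
\vdots\\
\sum_{i=1}^n a_{in}x_i
\end{bmatrix}
$$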
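Written out element by element, the first column vector in the last line is $AX$ and the second is $A^TX$. The sketch below compares that sum with a finite-difference gradient of $\sum_i\sum_j a_{ij}x_ix_j$. The $2\times 2$ matrix, the evaluation point, and the helper names `gradient` and `ax_plus_atx` are assumptions chosen for illustration.

```python
def gradient(f, x, h=1e-6):
    """Central-difference estimate of [df/dx_1, ..., df/dx_n]."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2 * h))
    return grad

# Hypothetical non-symmetric 2x2 matrix, chosen only for illustration.
A = [[1.0, 2.0],
     [3.0, 4.0]]

def f(x):
    # f(x) = X^T A X = sum_i sum_j a_ij * x_i * x_j
    n = len(x)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

def ax_plus_atx(x):
    # k-th entry: sum_j a_kj * x_j  +  sum_i a_ik * x_i
    n = len(x)
    return [
        sum(A[k][j] * x[j] for j in range(n)) + sum(A[i][k] * x[i] for i in range(n))
        for k in range(n)
    ]

x0 = [1.0, -1.0]
numeric = gradient(f, x0)
closed_form = ax_plus_atx(x0)
print(closed_form)                       # [-3.0, -3.0]
print([round(g, 4) for g in numeric])
```

A non-symmetric $A$ is used on purpose: for a symmetric matrix the two sums coincide, which would hide the separate $A^TX$ term.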