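The maximum likelihood estimator (MLE) has many good properties, including consistency, equivariance, and asymptotic normality. This post focuses on asymptotic normality, which states that the limiting distribution of the estimator is normal. The variance of that normal distribution is closely tied to the Fisher information.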
(Definition) Score function:
$$
s(X;\theta)=\frac{\partial \log f(X;\theta)}{\partial \theta}.
$$
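(Definition) Fisher information:
$$
I_n(\theta)=\mathbb{V}_\theta\left(\sum_{i=1}^{n}s(X_i;\theta)\right)=\sum_{i=1}^{n}\mathbb{V}_\theta\left(s(X_i;\theta)\right)
$$
The second equality uses the independence of $X_1,\cdots,X_n$.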
(Theorem) $\mathbb{E}_\theta[s(X;\theta)]=0$
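Proof (assuming differentiation and integration can be interchanged):
$$
\begin{aligned}
\mathbb{E}_\theta[s(X;\theta)]
&=\int_x\frac{\partial \log f(x;\theta)}{\partial\theta}f(x;\theta)\,dx\\
&=\int_x\frac{1}{f(x;\theta)}\frac{\partial f(x;\theta)}{\partial\theta}f(x;\theta)\,dx\\
&=\int_x\frac{\partial f(x;\theta)}{\partial\theta}\,dx\\
&=\frac{\partial}{\partial\theta}\int_x f(x;\theta)\,dx\\
&=\frac{\partial}{\partial\theta}1=0
\end{aligned}
$$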
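As a quick sanity check of the zero-mean property, the sketch below averages the score over simulated draws. The Bernoulli($p$) model, the function names, and the sample size are assumptions chosen for illustration only; for that model the score of one observation is $x/p-(1-x)/(1-p)$.

```python
import random


def bernoulli_score(x: int, p: float) -> float:
    """Score of one Bernoulli(p) observation: d/dp log f(x; p)."""
    return x / p - (1 - x) / (1 - p)


def mean_score(p: float, n_draws: int, seed: int = 0) -> float:
    """Average the score over n_draws simulated Bernoulli(p) observations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        x = 1 if rng.random() < p else 0
        total += bernoulli_score(x, p)
    return total / n_draws


if __name__ == "__main__":
    # The sample average should be close to 0, up to Monte Carlo noise.
    print(mean_score(p=0.3, n_draws=200_000))
```

The average is not exactly zero because it is a Monte Carlo estimate; the exact expectation $p\cdot\frac{1}{p}-(1-p)\cdot\frac{1}{1-p}$ is zero.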
(Theorem) If $f(X;\theta)$ is twice differentiable in $\theta$, the Fisher information of $n$ i.i.d. observations can be written in the following form, where $I(\theta)=I_1(\theta)$ denotes the information of a single observation:
$$
I_n(\theta)=nI(\theta)=-n\int_x\frac{\partial^2\log f(x;\theta)}{\partial\theta^2}f(x;\theta)\,dx
$$
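Proof: Since the score has mean zero,
$$
\mathbb{V}_\theta[s(X;\theta)]
=\mathbb{E}_\theta[s(X;\theta)^2]-\mathbb{E}_\theta[s(X;\theta)]^2
=\mathbb{E}_\theta[s(X;\theta)^2]
=\int_x\left(\frac{\partial\log f(x;\theta)}{\partial\theta}\right)^2 f(x;\theta)\,dx
$$
On the other hand,
$$
\begin{aligned}
\int_x\frac{\partial^2\log f(x;\theta)}{\partial\theta^2}f(x;\theta)\,dx
&=\int_x\frac{\partial}{\partial\theta}\left(\frac{1}{f(x;\theta)}\frac{\partial f(x;\theta)}{\partial\theta}\right)f(x;\theta)\,dx\\
&=\int_x\frac{-\left(\frac{\partial f(x;\theta)}{\partial\theta}\right)^2+\frac{\partial^2 f(x;\theta)}{\partial\theta^2}f(x;\theta)}{f(x;\theta)^2}f(x;\theta)\,dx\\
&=-\int_x\frac{\left(\frac{\partial f(x;\theta)}{\partial\theta}\right)^2}{f(x;\theta)^2}f(x;\theta)\,dx+\int_x\frac{\partial^2 f(x;\theta)}{\partial\theta^2}\,dx\\
&=-\int_x\left(\frac{\partial\log f(x;\theta)}{\partial\theta}\right)^2 f(x;\theta)\,dx
\end{aligned}
$$
The last step uses $\int_x\frac{\partial^2 f(x;\theta)}{\partial\theta^2}\,dx=\frac{\partial^2}{\partial\theta^2}\int_x f(x;\theta)\,dx=0$. Combining the two displays gives $I(\theta)=\mathbb{V}_\theta[s(X;\theta)]=-\int_x\frac{\partial^2\log f(x;\theta)}{\partial\theta^2}f(x;\theta)\,dx$, and summing over $n$ i.i.d. observations gives $I_n(\theta)=nI(\theta)$.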
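The identity between the variance of the score and the negative expected second derivative can be checked numerically. The sketch below is only an illustration under assumptions: it picks a Bernoulli($p$) model, approximates both derivatives of $\log f$ by central finite differences (so it does not rely on hand-derived formulas), and takes exact expectations over $x\in\{0,1\}$. All function names are hypothetical.

```python
import math


def log_f(x: int, p: float) -> float:
    """Log-density of a Bernoulli(p) observation x in {0, 1}."""
    return x * math.log(p) + (1 - x) * math.log(1 - p)


def d1(x: int, p: float, h: float = 1e-5) -> float:
    """Central-difference approximation of d/dp log f(x; p)."""
    return (log_f(x, p + h) - log_f(x, p - h)) / (2 * h)


def d2(x: int, p: float, h: float = 1e-4) -> float:
    """Central-difference approximation of d^2/dp^2 log f(x; p)."""
    return (log_f(x, p + h) - 2 * log_f(x, p) + log_f(x, p - h)) / (h * h)


def expect(g, p: float) -> float:
    """Exact expectation of g(X, p) for X ~ Bernoulli(p)."""
    return (1 - p) * g(0, p) + p * g(1, p)


def score_variance(p: float) -> float:
    """V[s] = E[s^2] - E[s]^2, with s approximated by d1."""
    second_moment = expect(lambda x, q: d1(x, q) ** 2, p)
    return second_moment - expect(d1, p) ** 2


def neg_expected_second_derivative(p: float) -> float:
    """-E[d^2/dp^2 log f(X; p)], with the derivative approximated by d2."""
    return -expect(d2, p)


if __name__ == "__main__":
    p = 0.3
    # The two numbers should agree up to finite-difference error.
    print(score_variance(p), neg_expected_second_derivative(p))
```

Both quantities are two routes to the same $I(p)$; the step sizes `h` are ad hoc choices that trade truncation error against rounding error.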
The maximum likelihood estimator is asymptotically normal; that is, the following convergence holds in distribution:
$$
\frac{\hat{\theta}_n-\theta}{se}\rightarrow N(0,1)
$$
where
$$
se\approx\sqrt{\frac{1}{I_n(\theta)}}\approx\sqrt{\frac{1}{I_n(\hat{\theta})}}
$$
The proof is omitted; it is covered in many references.
This result can be used to construct confidence intervals for the estimate.
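One common way to turn asymptotic normality into an interval is the Wald interval $\hat{\theta}_n\pm z\cdot se$ with $se=\sqrt{1/I_n(\hat{\theta})}$. The helper below is a minimal sketch of that construction; the function name and the example numbers are hypothetical, and $z=1.96$ is the usual approximate 97.5% standard normal quantile for a 95% interval.

```python
import math


def wald_interval(theta_hat: float, info_n: float, z: float = 1.96):
    """Approximate interval theta_hat +/- z * se, with se = sqrt(1 / I_n)."""
    if info_n <= 0:
        raise ValueError("Fisher information must be positive")
    se = math.sqrt(1.0 / info_n)
    return theta_hat - z * se, theta_hat + z * se


if __name__ == "__main__":
    # Hypothetical numbers: estimate 2.0 with I_n(theta_hat) = 25, so se = 0.2.
    print(wald_interval(2.0, 25.0))
```

The interval is only as good as the normal approximation, so it can be unreliable for small samples or for estimates near the boundary of the parameter space.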
Let $X_1, \cdots, X_n \sim \mathrm{Bernoulli}(p)$. The likelihood function is
$$
L(p)=\prod_{i=1}^{n} p^{X_i}(1-p)^{1-X_i}
$$
$$
\log L(p)=\sum_{i=1}^{n}\left[X_i\log p+(1-X_i)\log(1-p)\right]
$$
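Maximizing the log-likelihood gives:
$$
\begin{aligned}
\frac{d}{dp}\log L(p)&=0\\
\sum_{i=1}^{n}\left[\frac{X_i}{p}-\frac{1-X_i}{1-p}\right]&=0\\
\hat{p}&=\frac{1}{n}\sum_{i=1}^{n}X_i
\end{aligned}
$$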
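The closed-form maximizer can be cross-checked by brute force. The sketch below evaluates the Bernoulli log-likelihood on a grid and compares the grid maximizer with the sample mean; the data, the grid resolution, and the function names are hypothetical choices for illustration.

```python
import math
from typing import Sequence


def log_likelihood(p: float, xs: Sequence[int]) -> float:
    """Bernoulli log-likelihood: sum of X_i log p + (1 - X_i) log(1 - p)."""
    return sum(x * math.log(p) + (1 - x) * math.log(1 - p) for x in xs)


def grid_mle(xs: Sequence[int], steps: int = 1000) -> float:
    """Maximize the log-likelihood over the grid 1/steps, ..., (steps-1)/steps."""
    grid = [k / steps for k in range(1, steps)]
    return max(grid, key=lambda p: log_likelihood(p, xs))


if __name__ == "__main__":
    xs = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0]  # hypothetical data: 4 successes in 10 trials
    # The grid maximizer should match the sample mean up to the grid spacing.
    print(grid_mle(xs), sum(xs) / len(xs))
```

A grid search is used here only because it is transparent; it excludes the endpoints 0 and 1, where the logarithm is undefined.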
The score function of a single observation $X$ is:
$$
s(X;p)=\frac{\partial \log f(X;p)}{\partial p}=\frac{X}{p}-\frac{1-X}{1-p}
$$
$$
I(p)=-\mathbb{E}_p\left[\frac{d}{dp}\left(\frac{X}{p}-\frac{1-X}{1-p}\right)\right]=\mathbb{E}_p\left[\frac{X}{p^2}+\frac{1-X}{(1-p)^2}\right]=\frac{1}{p}+\frac{1}{1-p}=\frac{1}{p(1-p)}
$$
Since $I_n(p)=nI(p)$, the variance of the estimator is
$$
\mathbb{V}(\hat{p})=\frac{1}{I_n(p)}=\frac{p(1-p)}{n} \approx \frac{\hat{p}(1-\hat{p})}{n}
$$
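To see the asymptotic approximation at work for Bernoulli data, the sketch below simulates repeated samples, builds the interval $\hat{p}\pm 1.96\sqrt{\hat{p}(1-\hat{p})/n}$, which is $\hat{p}\pm 1.96\sqrt{1/I_n(\hat{p})}$, and reports how often it contains the true $p$. The sample size, number of repetitions, and function names are assumptions made for illustration.

```python
import math
import random
from typing import Sequence


def bernoulli_wald_interval(xs: Sequence[int], z: float = 1.96):
    """Interval p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    n = len(xs)
    p_hat = sum(xs) / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * se, p_hat + z * se


def coverage(p: float, n: int, reps: int, seed: int = 0) -> float:
    """Fraction of simulated samples whose interval contains the true p."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xs = [1 if rng.random() < p else 0 for _ in range(n)]
        lo, hi = bernoulli_wald_interval(xs)
        if lo <= p <= hi:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    # With a moderately large n the coverage should be roughly 95%.
    print(coverage(p=0.3, n=400, reps=2000))
```

The observed coverage is itself a Monte Carlo estimate, so it fluctuates around the nominal level; for small $n$ or $p$ close to 0 or 1 this interval is known to undercover.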