torch.nn.Conv2d(
in_channels,
out_channels,
kernel_size,
stride=1,
padding=0,
dilation=1,
groups=1,
bias=True,
padding_mode='zeros',
device=None,
dtype=None
)
In this function, the input has shape $(N, C_{in}, H, W)$ and the output has shape $(N, C_{out}, H_{out}, W_{out})$; they are related by:
$$\operatorname{out}\left(N_i, C_{out_j}\right)=\operatorname{bias}\left(C_{out_j}\right)+\sum_{k=0}^{C_{in}-1} \operatorname{weight}\left(C_{out_j}, k\right) \star \operatorname{input}\left(N_i, k\right)$$
where $N$ is the batch size, $C$ is the number of input channels, $H$ is the image height, $W$ is the image width, and $\star$ denotes the 2D cross-correlation operator.
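As a sketch, the per-channel sum in the formula above can be checked numerically against `torch.nn.functional.conv2d` by convolving each input channel separately and accumulating (the tensor shapes here are arbitrary choices for the demo):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
x = torch.randn(1, 3, 8, 8)   # input:  (N, C_in, H, W)
w = torch.randn(2, 3, 3, 3)   # weight: (C_out, C_in, kH, kW)
b = torch.randn(2)            # bias:   (C_out,)

ref = F.conv2d(x, w, b)       # shape (1, 2, 6, 6)

# Reproduce out(N_i, C_out_j) = bias(C_out_j) + sum_k weight(C_out_j, k) * input(N_i, k)
manual = torch.zeros_like(ref)
for j in range(w.shape[0]):               # loop over output channels
    acc = torch.zeros_like(ref[:, 0])
    for k in range(w.shape[1]):           # sum over input channels k = 0 .. C_in - 1
        acc += F.conv2d(x[:, k:k + 1], w[j:j + 1, k:k + 1]).squeeze(1)
    manual[:, j] = acc + b[j]

print(torch.allclose(ref, manual, atol=1e-5))  # True
```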
The input may have shape $(N, C_{in}, H_{in}, W_{in})$ or $(C_{in}, H_{in}, W_{in})$.
The output may have shape $(N, C_{out}, H_{out}, W_{out})$ or $(C_{out}, H_{out}, W_{out})$.
The spatial dimensions are related by:
$$H_{out}=\left\lfloor\frac{H_{in}+2 \times padding[0]-dilation[0] \times(kernel\_size[0]-1)-1}{stride[0]}+1\right\rfloor$$
$$W_{out}=\left\lfloor\frac{W_{in}+2 \times padding[1]-dilation[1] \times(kernel\_size[1]-1)-1}{stride[1]}+1\right\rfloor$$
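The floor formulas above can be wrapped in a small helper for one spatial dimension (`conv2d_out_size` is a name chosen here for illustration, not a PyTorch API):

```python
import math

def conv2d_out_size(size, kernel_size, stride=1, padding=0, dilation=1):
    # One spatial dimension of the floor formula above:
    # floor((size + 2*padding - dilation*(kernel_size - 1) - 1) / stride + 1)
    return math.floor(
        (size + 2 * padding - dilation * (kernel_size - 1) - 1) / stride + 1
    )
```

For example, with the height of 50 and the dilated layer parameters from the code below, `conv2d_out_size(50, 3, stride=2, padding=4, dilation=3)` gives 26, and `conv2d_out_size(100, 5, stride=1, padding=2, dilation=1)` gives 100.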
import torch
import torch.nn as nn

# With square kernels and equal stride
m = nn.Conv2d(16, 33, 3, stride=2)
# Non-square kernels and unequal stride, with padding
m = nn.Conv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2))
# Non-square kernels and unequal stride, with padding and dilation
m = nn.Conv2d(16, 33, (3, 5), stride=(2, 1), padding=(4, 2), dilation=(3, 1))
input = torch.randn(20, 16, 50, 100)
output = m(input)  # shape (20, 33, 26, 100), per the formulas above
⭐ Difference
Both torch.nn.Conv2d and torch.nn.functional.conv2d can introduce a 2D convolution when building a model in PyTorch, but the former is a class (module) while the latter is a function, so they are used differently.
⭐ Usage
torch.nn.functional.conv2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1)
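A minimal sketch of the difference: the module owns its `weight` and `bias` as learnable parameters, while the functional form takes them as explicit arguments. Passing the module's own parameters to the functional form reproduces the same result:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

m = nn.Conv2d(16, 33, 3, stride=2)       # module: creates and owns weight/bias
x = torch.randn(20, 16, 50, 100)

y_mod = m(x)                             # module call
y_fn = F.conv2d(x, m.weight, m.bias, stride=2)  # functional call, parameters passed in

print(torch.allclose(y_mod, y_fn))  # True
```

The module form fits naturally inside `nn.Sequential` or a custom `nn.Module`, where parameters are registered and tracked automatically; the functional form is convenient when the weights come from elsewhere (e.g., shared or externally computed filters).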