Traceback (most recent call last):
File "./codes/ign_train.py", line 235, in <module>
run_a_train_epoch(DTIModel, loss_fn, train_dataloader, optimizer, device)
File "./codes/ign_train.py", line 46, in run_a_train_epoch
outputs = model(bg, bg3)
File ".../miniconda3/envs/IGN/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File ".codes/model.py", line 380, in forward
return self.FC(readouts)
File ".../miniconda3/envs/IGN/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "./codes/model.py", line 40, in forward
h = layer(h)
File ".../miniconda3/envs/IGN/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
result = self.forward(*input, **kwargs)
File "/home/haida_liuhao/software/miniconda3/envs/IGN/lib/python3.6/site-packages/torch/nn/modules/batchnorm.py", line 107, in forward
exponential_average_factor, self.eps)
File ".../miniconda3/envs/IGN/lib/python3.6/site-packages/torch/nn/functional.py", line 1666, in batch_norm
raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 200])
This error actually stems from how Batch Normalization itself works:

When a BatchNorm layer is in training mode (i.e. it estimates the mean and variance from the current training samples), it rejects any batch of size 1. The reason, in short, is that BN normalizes using the mean and variance of the current mini-batch. If the batch is too small, those statistics cannot represent the spread between samples, the normalization results of different mini-batches will vary wildly, and the normalization becomes meaningless. As a side note, with a batch size of 1, BN's result degenerates into something close to Instance Normalization (IN).
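A minimal sketch that reproduces the error (the channel count 200 is taken from the torch.Size([1, 200]) in the traceback; everything else is illustrative):

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm1d(200)  # 200 channels, matching torch.Size([1, 200]) in the traceback

# In eval mode BN uses its running statistics, so a single sample is fine.
bn.eval()
out = bn(torch.randn(1, 200))  # works

# In training mode BN needs per-batch statistics, and one sample is not enough.
bn.train()
bn(torch.randn(1, 200))  # ValueError: Expected more than 1 value per channel when training
```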
To sum it up in one sentence: batch_size == 1 must never occur in any form, in any iteration. Hardly anyone sets batch_size = 1 directly these days; the more common trap is the final batch of an epoch ending up with a single sample.

If that is hard to picture, here is an example:

Say my dataset has 17 samples and I set batch_size = 16. The first batch takes 16 samples, which leaves exactly 1 sample for the last batch, and that is the batch_size == 1 case.
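The same thing is easy to see from a DataLoader directly (a minimal sketch; the feature dimension is illustrative):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.randn(17, 200))  # 17 samples, as in the example above
loader = DataLoader(dataset, batch_size=16)
print([b[0].shape[0] for b in loader])         # [16, 1]  <- the last batch has a single sample
```

Note that DataLoader also takes drop_last=True, which simply discards that trailing short batch; that is the usual way to avoid the error without modifying the library, as opposed to the patch described next.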
The fix: in the nn.functional file (its path appears in the error message: .../miniconda3/envs/IGN/lib/python3.6/site-packages/torch/nn/functional.py, line 1666), comment out the two lines that raise the error when the batch size is 1.
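For reference, the check being silenced looks roughly like this (paraphrased; the wrapper function name is mine for illustration, and the exact code and line number vary between PyTorch releases):

```python
import torch

def _batch_norm_size_check(size):  # hypothetical wrapper, for illustration only
    # Count how many values each channel sees: the batch dimension times
    # every dimension after the channel dimension. For (1, 200) this is 1.
    size_prods = size[0]
    for i in range(len(size) - 2):
        size_prods *= size[i + 2]
    if size_prods == 1:
        # These are the two lines to comment out in the installed functional.py:
        raise ValueError('Expected more than 1 value per channel when training, '
                         'got input size {}'.format(size))

_batch_norm_size_check(torch.Size([1, 200]))  # raises, matching the traceback above
```

Keep the trade-offs in mind: with a single sample the batch mean equals the sample itself, so the normalized output collapses to (x - mean) / sqrt(0 + eps) = 0 before BN's affine transform; and editing site-packages patches the library for every project in that environment and is undone by a reinstall. Dropping the trailing short batch with drop_last=True, or picking a batch_size that does not leave a remainder of 1, avoids the error without touching PyTorch.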