• resnet_v1.resnet_v1()



    def resnet_v1(inputs,
                  blocks,
                  num_classes=None,
                  is_training=True,
                  global_pool=True,
                  output_stride=None,
                  include_root_block=True,
                  spatial_squeeze=True,
                  store_non_strided_activations=False,
                  reuse=None,
                  scope=None):
      """Generator for v1 ResNet models.

      This function generates a family of ResNet v1 models. See the resnet_v1_*()
      methods for specific model instantiations, obtained by selecting different
      block instantiations that produce ResNets of various depths.

      Training for image classification on Imagenet is usually done with [224, 224]
      inputs, resulting in [7, 7] feature maps at the output of the last ResNet
      block for the ResNets defined in [1] that have nominal stride equal to 32.
      However, for dense prediction tasks we advise that one uses inputs with
      spatial dimensions that are multiples of 32 plus 1, e.g., [321, 321]. In
      this case the feature maps at the ResNet output will have spatial shape
      [(height - 1) / output_stride + 1, (width - 1) / output_stride + 1]
      and corners exactly aligned with the input image corners, which greatly
      facilitates alignment of the features to the image. Using as input [225, 225]
      images results in [8, 8] feature maps at the output of the last ResNet block.

      For dense prediction tasks, the ResNet needs to run in fully-convolutional
      (FCN) mode and global_pool needs to be set to False. The ResNets in [1, 2] all
      have nominal stride equal to 32 and a good choice in FCN mode is to use
      output_stride=16 in order to increase the density of the computed features at
      small computational and memory overhead, cf. http://arxiv.org/abs/1606.00915.

      Args:
        inputs: A tensor of size [batch, height_in, width_in, channels].
        blocks: A list of length equal to the number of ResNet blocks. Each element
          is a resnet_utils.Block object describing the units in the block.
        num_classes: Number of predicted classes for classification tasks.
          If 0 or None, we return the features before the logit layer.
        is_training: whether batch_norm layers are in training mode. If this is set
          to None, the callers can specify slim.batch_norm's is_training parameter
          from an outer slim.arg_scope.
        global_pool: If True, we perform global average pooling before computing the
          logits. Set to True for image classification, False for dense prediction.
        output_stride: If None, then the output will be computed at the nominal
          network stride. If output_stride is not None, it specifies the requested
          ratio of input to output spatial resolution.
        include_root_block: If True, include the initial convolution followed by
          max-pooling, if False excludes it.
        spatial_squeeze: if True, logits is of shape [B, C], if false logits is
          of shape [B, 1, 1, C], where B is batch_size and C is number of classes.
          To use this parameter, the input images must be smaller than 300x300
          pixels, in which case the output logit layer does not contain spatial
          information and can be removed.
        store_non_strided_activations: If True, we compute non-strided (undecimated)
          activations at the last unit of each block and store them in the
          `outputs_collections` before subsampling them. This gives us access to
          higher resolution intermediate activations which are useful in some
          dense prediction problems but increases 4x the computation and memory cost
          at the last unit of each block.
        reuse: whether or not the network and its variables should be reused. To be
          able to reuse 'scope' must be given.
        scope: Optional variable_scope.

      Returns:
        net: A rank-4 tensor of size [batch, height_out, width_out, channels_out].
          If global_pool is False, then height_out and width_out are reduced by a
          factor of output_stride compared to the respective height_in and width_in,
          else both height_out and width_out equal one. If num_classes is 0 or None,
          then net is the output of the last ResNet block, potentially after global
          average pooling. If num_classes is a non-zero integer, net contains the
          pre-softmax activations.
        end_points: A dictionary from components of the network to the corresponding
          activation.

      Raises:
        ValueError: If the target output_stride is not valid.
      """
      with tf.variable_scope(scope, 'resnet_v1', [inputs], reuse=reuse) as sc:
        end_points_collection = sc.original_name_scope + '_end_points'
        with slim.arg_scope([slim.conv2d, bottleneck,
                             resnet_utils.stack_blocks_dense],
                            outputs_collections=end_points_collection):
          with (slim.arg_scope([slim.batch_norm], is_training=is_training)
                if is_training is not None else NoOpScope()):
            net = inputs
            if include_root_block:
              if output_stride is not None:
                if output_stride % 4 != 0:
                  raise ValueError('The output_stride needs to be a multiple of 4.')
                output_stride /= 4
              net = resnet_utils.conv2d_same(net, 64, 7, stride=2, scope='conv1')
              net = slim.max_pool2d(net, [3, 3], stride=2, scope='pool1')
            net = resnet_utils.stack_blocks_dense(net, blocks, output_stride,
                                                  store_non_strided_activations)
            # Convert end_points_collection into a dictionary of end_points.
            end_points = slim.utils.convert_collection_to_dict(
                end_points_collection)
            if global_pool:
              # Global average pooling.
              net = tf.reduce_mean(net, [1, 2], name='pool5', keep_dims=True)
              end_points['global_pool'] = net
            if num_classes:
              net = slim.conv2d(net, num_classes, [1, 1], activation_fn=None,
                                normalizer_fn=None, scope='logits')
              end_points[sc.name + '/logits'] = net
              if spatial_squeeze:
                net = tf.squeeze(net, [1, 2], name='SpatialSqueeze')
                end_points[sc.name + '/spatial_squeeze'] = net
              end_points['predictions'] = slim.softmax(net, scope='predictions')
            return net, end_points

    resnet_v1.default_image_size = 224

    Generator for v1 ResNet models. This function generates a family of ResNet v1 models. See the resnet_v1_*() methods for specific model instantiations, obtained by selecting different block instantiations that produce ResNets of various depths. Training for image classification on Imagenet is usually done with [224, 224] inputs, resulting in [7, 7] feature maps at the output of the last ResNet block for the ResNets defined in [1] that have nominal stride equal to 32. However, for dense prediction tasks, it is advisable to use inputs with spatial dimensions that are multiples of 32 plus 1, e.g., [321, 321]. In this case, the feature maps at the ResNet output will have spatial shape [(height - 1) / output_stride + 1, (width - 1) / output_stride + 1] and corners exactly aligned with the input image corners, which greatly facilitates alignment of the features to the image. For dense prediction tasks, the ResNet needs to run in fully-convolutional (FCN) mode and global_pool needs to be set to False. The ResNets in [1, 2] all have nominal stride equal to 32, and a good choice in FCN mode is to use output_stride=16 in order to increase the density of the computed features at small computational and memory overhead, cf. http://arxiv.org/abs/1606.00915.
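    The output spatial shape described above can be checked with a small helper. This is an illustrative sketch only; the `feature_map_size` function is not part of the library:

    ```python
    def feature_map_size(input_size, output_stride):
        """Spatial size of the ResNet output for an input whose size is a
        multiple of output_stride plus 1, per the docstring formula
        (input - 1) / output_stride + 1."""
        assert (input_size - 1) % output_stride == 0, \
            "input size should be a multiple of output_stride plus 1"
        return (input_size - 1) // output_stride + 1

    # A [321, 321] input at output_stride=16 yields 21x21 feature maps;
    # at the nominal stride of 32 it yields 11x11, and a [225, 225]
    # input at stride 32 yields the 8x8 maps mentioned in the docstring.
    print(feature_map_size(321, 16))  # 21
    print(feature_map_size(321, 32))  # 11
    print(feature_map_size(225, 32))  # 8
    ```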

    Args:

    • inputs: A tensor of size [batch, height_in, width_in, channels].
    • blocks: A list of length equal to the number of ResNet blocks. Each element is a resnet_utils.Block object describing the units in the block.
    • num_classes: Number of predicted classes for classification tasks. If 0 or None, the features before the logit layer are returned.
    • is_training: Whether batch_norm layers are in training mode. If set to None, callers can specify slim.batch_norm's is_training parameter from an outer slim.arg_scope.
    • global_pool: If True, global average pooling is performed before computing the logits. Set to True for image classification, False for dense prediction.
    • output_stride: If None, the output is computed at the nominal network stride. If output_stride is not None, it specifies the requested ratio of input to output spatial resolution.
    • include_root_block: If True, includes the initial convolution followed by max-pooling; if False, excludes it.
    • spatial_squeeze: If True, logits has shape [B, C]; if False, it has shape [B, 1, 1, C], where B is the batch size and C the number of classes.
    • store_non_strided_activations: If True, non-strided (undecimated) activations are computed at the last unit of each block and stored in the outputs collection before subsampling; this increases the computation and memory cost at the last unit of each block by 4x.
    • reuse: Whether the network and its variables should be reused. To be able to reuse, 'scope' must be given.
    • scope: Optional variable_scope.
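    The effect of spatial_squeeze can be illustrated with NumPy. This sketch mirrors what tf.squeeze(net, [1, 2]) does to the logits tensor; the shapes are hypothetical:

    ```python
    import numpy as np

    # After global pooling and the 1x1 logits convolution,
    # net has shape [B, 1, 1, C].
    logits = np.zeros((4, 1, 1, 1000), dtype=np.float32)

    # spatial_squeeze=True removes the two singleton spatial dimensions,
    # mimicking tf.squeeze(net, [1, 2], name='SpatialSqueeze').
    squeezed = np.squeeze(logits, axis=(1, 2))
    print(squeezed.shape)  # (4, 1000)
    ```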

    Returns:

    • net: A rank-4 tensor of size [batch, height_out, width_out, channels_out]. If global_pool is False, then height_out and width_out are reduced by a factor of output_stride compared to the respective height_in and width_in; otherwise both height_out and width_out equal one. If num_classes is 0 or None, then net is the output of the last ResNet block, potentially after global average pooling. If num_classes is a non-zero integer, net contains the pre-softmax activations.
    • end_points: A dictionary from components of the network to the corresponding activation.
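    The global_pool branch corresponds to a mean over the spatial axes with the dimensions kept, as in tf.reduce_mean(net, [1, 2], keep_dims=True). A NumPy sketch with hypothetical shapes:

    ```python
    import numpy as np

    # Final block output for a [224, 224] input: [batch, 7, 7, channels].
    net = np.random.rand(2, 7, 7, 2048).astype(np.float32)

    # Global average pooling collapses height and width to 1 each,
    # so height_out and width_out both equal one, as described above.
    pooled = net.mean(axis=(1, 2), keepdims=True)
    print(pooled.shape)  # (2, 1, 1, 2048)
    ```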

    Raises:

    • ValueError: If the target output_stride is not valid.

    [1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. Deep Residual Learning for Image Recognition. arXiv:1512.03385
    [2] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun. Identity Mappings in Deep Residual Networks. arXiv:1603.05027

  • Original article: https://blog.csdn.net/weixin_36670529/article/details/100112660