• torch.nn.init


    Reference: torch.nn.init - Cloud+ Community - Tencent Cloud

    torch.nn.init.calculate_gain(nonlinearity, param=None)[source]

    Return the recommended gain value for the given nonlinearity function. The values are as follows:

    nonlinearity        gain
    Linear / Identity   1
    Conv{1,2,3}D        1
    Sigmoid             1
    Tanh                \frac{5}{3}
    ReLU                \sqrt{2}
    Leaky ReLU          \sqrt{\frac{2}{1 + \text{negative\_slope}^2}}

    Parameters

    • nonlinearity – the non-linear function (nn.functional name)

    • param – optional parameter for the non-linear function

    Examples

    >>> gain = nn.init.calculate_gain('leaky_relu', 0.2)  # leaky_relu with negative_slope=0.2
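
    As a quick sanity check (a minimal sketch, assuming the usual `torch` and `nn` imports), the returned value can be compared against the Leaky ReLU row of the table above:

    >>> import math
    >>> gain = nn.init.calculate_gain('leaky_relu', 0.2)
    >>> expected = math.sqrt(2.0 / (1 + 0.2 ** 2))  # Leaky ReLU formula from the table
    >>> abs(gain - expected) < 1e-6
    True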

    torch.nn.init.uniform_(tensor, a=0.0, b=1.0)[source]

    Fills the input Tensor with values drawn from the uniform distribution \mathcal{U}(a, b).

    Parameters

    • tensor – an n-dimensional torch.Tensor

    • a – the lower bound of the uniform distribution

    • b – the upper bound of the uniform distribution

    Examples

    >>> w = torch.empty(3, 5)
    >>> nn.init.uniform_(w)
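
    Because the init functions work in place, they can also be applied directly to an existing module's parameters; a minimal sketch (the layer size and bounds are purely illustrative):

    >>> layer = nn.Linear(5, 3)
    >>> nn.init.uniform_(layer.weight, a=-0.1, b=0.1)
    >>> nn.init.uniform_(layer.bias, a=-0.1, b=0.1)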

    torch.nn.init.normal_(tensor, mean=0.0, std=1.0)[source]

    Fills the input Tensor with values drawn from the normal distribution \mathcal{N}(\text{mean}, \text{std}^2).

    Parameters

    • tensor – an n-dimensional torch.Tensor

    • mean – the mean of the normal distribution

    • std – the standard deviation of the normal distribution

    Examples

    >>> w = torch.empty(3, 5)
    >>> nn.init.normal_(w)

    torch.nn.init.constant_(tensor, val)[source]

    Fills the input Tensor with the value \text{val}.

    Parameters

    • tensor – an n-dimensional torch.Tensor

    • val – the value to fill the tensor with

    Examples

    >>> w = torch.empty(3, 5)
    >>> nn.init.constant_(w, 0.3)

    torch.nn.init.ones_(tensor)[source]

    Fills the input Tensor with the scalar value 1.

    Parameters

    tensor – an n-dimensional torch.Tensor

    Examples

    >>> w = torch.empty(3, 5)
    >>> nn.init.ones_(w)

    torch.nn.init.zeros_(tensor)[source]

    Fills the input Tensor with the scalar value 0.

    Parameters

    tensor – an n-dimensional torch.Tensor

    Examples

    >>> w = torch.empty(3, 5)
    >>> nn.init.zeros_(w)

    torch.nn.init.eye_(tensor)[source]

    Fills the 2-dimensional input Tensor with the identity matrix. Preserves the identity of the inputs in Linear layers, where as many inputs are preserved as possible.

    Parameters

    tensor – a 2-dimensional torch.Tensor

    Examples

    >>> w = torch.empty(3, 5)
    >>> nn.init.eye_(w)
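
    A minimal sketch of the identity behaviour (shapes are illustrative): with an eye_-initialized weight and no bias, a Linear layer passes through as many input features as fit into the output:

    >>> lin = nn.Linear(5, 3, bias=False)
    >>> nn.init.eye_(lin.weight)  # 3x5 weight: an identity block padded with zeros
    >>> x = torch.randn(1, 5)
    >>> torch.allclose(lin(x), x[:, :3])
    True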

    torch.nn.init.dirac_(tensor)[source]

    Fills the {3, 4, 5}-dimensional input Tensor with the Dirac delta function. Preserves the identity of the inputs in Convolutional layers, where as many input channels are preserved as possible.

    Parameters

    tensor – a {3, 4, 5}-dimensional torch.Tensor

    Examples

    >>> w = torch.empty(3, 16, 5, 5)
    >>> nn.init.dirac_(w)
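
    A minimal sketch of the channel-preserving behaviour (shapes are illustrative): with equal input and output channels and "same" padding, a dirac_-initialized convolution acts as an identity map on its input:

    >>> conv = nn.Conv2d(16, 16, kernel_size=3, padding=1, bias=False)
    >>> nn.init.dirac_(conv.weight)
    >>> x = torch.randn(1, 16, 8, 8)
    >>> torch.allclose(conv(x), x)
    True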

    torch.nn.init.xavier_uniform_(tensor, gain=1.0)[source]

    Fills the input Tensor with values according to the method described in Understanding the difficulty of training deep feedforward neural networks - Glorot, X. & Bengio, Y. (2010), using a uniform distribution. The resulting tensor will have values sampled from \mathcal{U}(-a, a) where

    a = \text{gain} \times \sqrt{\frac{6}{\text{fan\_in} + \text{fan\_out}}}

    Also known as Glorot initialization.

    Parameters

    • tensor – an n-dimensional torch.Tensor

    • gain – an optional scaling factor

    Examples

    >>> w = torch.empty(3, 5)
    >>> nn.init.xavier_uniform_(w, gain=nn.init.calculate_gain('relu'))
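
    One common way to use this in practice (a sketch only; the model structure and gain are illustrative) is to combine it with Module.apply so that every Linear layer in a model is initialized consistently:

    >>> def init_weights(m):
    ...     if isinstance(m, nn.Linear):
    ...         nn.init.xavier_uniform_(m.weight, gain=nn.init.calculate_gain('tanh'))
    ...         if m.bias is not None:
    ...             nn.init.zeros_(m.bias)
    >>> model = nn.Sequential(nn.Linear(10, 20), nn.Tanh(), nn.Linear(20, 5))
    >>> model.apply(init_weights)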

    torch.nn.init.xavier_normal_(tensor, gain=1.0)[source]

    Fills the input Tensor with values according to the method described in Understanding the difficulty of training deep feedforward neural networks - Glorot, X. & Bengio, Y. (2010), using a normal distribution. The resulting tensor will have values sampled from \mathcal{N}(0, \text{std}^2) where

    \text{std} = \text{gain} \times \sqrt{\frac{2}{\text{fan\_in} + \text{fan\_out}}}

    Also known as Glorot initialization.

    Parameters

    • tensor – an n-dimensional torch.Tensor

    • gain – an optional scaling factor

    Examples

    >>> w = torch.empty(3, 5)
    >>> nn.init.xavier_normal_(w)
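
    As a rough sanity check (sizes are illustrative), the empirical standard deviation of a large xavier_normal_-initialized matrix should be close to gain × sqrt(2 / (fan_in + fan_out)):

    >>> w = torch.empty(500, 300)   # fan_out = 500, fan_in = 300
    >>> nn.init.xavier_normal_(w)
    >>> w.std().item()              # close to sqrt(2 / 800) = 0.05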

    torch.nn.init.kaiming_uniform_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')[source]

    Fills the input Tensor with values according to the method described in Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification - He, K. et al. (2015), using a uniform distribution. The resulting tensor will have values sampled from \mathcal{U}(-\text{bound}, \text{bound}) where

    \text{bound} = \sqrt{\frac{6}{(1 + a^2) \times \text{fan\_in}}}

    Also known as He initialization.

    Parameters

    • tensor – an n-dimensional torch.Tensor

    • a – the negative slope of the rectifier used after this layer (0 for ReLU by default)

    • mode – either 'fan_in' (default) or 'fan_out'. Choosing 'fan_in' preserves the magnitude of the variance of the weights in the forward pass. Choosing 'fan_out' preserves the magnitudes in the backwards pass.

    • nonlinearity – the non-linear function (nn.functional name), recommended to use only with 'relu' or 'leaky_relu' (default).

    Examples

    >>> w = torch.empty(3, 5)
    >>> nn.init.kaiming_uniform_(w, mode='fan_in', nonlinearity='relu')

    torch.nn.init.kaiming_normal_(tensor, a=0, mode='fan_in', nonlinearity='leaky_relu')[source]

    Fills the input Tensor with values according to the method described in Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification - He, K. et al. (2015), using a normal distribution. The resulting tensor will have values sampled from \mathcal{N}(0, \text{std}^2) where

    \text{std} = \sqrt{\frac{2}{(1 + a^2) \times \text{fan\_in}}}

    Also known as He initialization.

    Parameters

    • tensor – an n-dimensional torch.Tensor

    • a – the negative slope of the rectifier used after this layer (0 for ReLU by default)

    • mode – either 'fan_in' (default) or 'fan_out'. Choosing 'fan_in' preserves the magnitude of the variance of the weights in the forward pass. Choosing 'fan_out' preserves the magnitudes in the backwards pass.

    • nonlinearity – the non-linear function (nn.functional name), recommended to use only with 'relu' or 'leaky_relu' (default).

    Examples

    >>> w = torch.empty(3, 5)
    >>> nn.init.kaiming_normal_(w, mode='fan_out', nonlinearity='relu')
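
    A common convnet recipe (a sketch only; the layer sizes are illustrative) applies He initialization with mode='fan_out' to every convolution that feeds a ReLU:

    >>> def init_conv(m):
    ...     if isinstance(m, nn.Conv2d):
    ...         nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
    ...         if m.bias is not None:
    ...             nn.init.zeros_(m.bias)
    >>> net = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Conv2d(16, 32, 3))
    >>> net.apply(init_conv)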

    torch.nn.init.orthogonal_(tensor, gain=1)[source]

    Fills the input Tensor with a (semi) orthogonal matrix, as described in Exact solutions to the nonlinear dynamics of learning in deep linear neural networks - Saxe, A. et al. (2013). The input tensor must have at least 2 dimensions, and for tensors with more than 2 dimensions the trailing dimensions are flattened.

    Parameters

    • tensor – an n-dimensional torch.Tensor, where n \geq 2

    • gain – optional scaling factor

    Examples

    >>> w = torch.empty(3, 5)
    >>> nn.init.orthogonal_(w)
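
    A minimal check of the (semi-)orthogonality (sizes are illustrative): for a 3x5 weight the rows come out orthonormal, so w @ w.T is approximately the 3x3 identity:

    >>> w = torch.empty(3, 5)
    >>> nn.init.orthogonal_(w)
    >>> torch.allclose(w @ w.t(), torch.eye(3), atol=1e-5)
    True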

    torch.nn.init.sparse_(tensor, sparsity, std=0.01)[source]

    Fills the 2D input Tensor as a sparse matrix, where the non-zero elements will be drawn from the normal distribution \mathcal{N}(0, 0.01), as described in Deep learning via Hessian-free optimization - Martens, J. (2010).

    Parameters

    • tensor – an n-dimensional torch.Tensor

    • sparsity – The fraction of elements in each column to be set to zero

    • std – the standard deviation of the normal distribution used to generate the non-zero values

    Examples

    >>> w = torch.empty(3, 5)
    >>> nn.init.sparse_(w, sparsity=0.1)
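
    A small sketch of what the sparsity fraction means (sizes are illustrative): with sparsity=0.1 and 10 rows, roughly 10% of the entries in each column end up exactly zero, while the remaining entries are drawn from \mathcal{N}(0, 0.01):

    >>> w = torch.empty(10, 5)
    >>> nn.init.sparse_(w, sparsity=0.1, std=0.01)
    >>> (w == 0).sum(dim=0)   # roughly one zero per column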

  • Original article: https://blog.csdn.net/weixin_36670529/article/details/101194024