【MindSpore易点通】Network Construction Experience Notes (Part 2)


    A Simple GAN Example Implemented in MindSpore

    Background

    Generating MNIST handwritten digits from random noise is a simple GAN task that every mainstream deep learning framework can handle with ease; here it serves as a simple GAN example on the MindSpore platform.

    Experience Summary

    Key detail 1: define two optimizers, one to update the generator and one to update the discriminator

    from mindspore import Tensor
    from mindspore.nn import Momentum

    # Two optimizers: one for the generator, one for the discriminator.
    # lr_schedule, args and default_wd_filter come from the training script.
    gen_opt = Momentum(params=gen_network.trainable_params(),
                       learning_rate=Tensor(lr_schedule),
                       momentum=args.momentum,
                       weight_decay=0,
                       loss_scale=args.loss_scale,
                       decay_filter=default_wd_filter)
    dis_opt = Momentum(params=dis_network.trainable_params(),
                       learning_rate=Tensor(lr_schedule),
                       momentum=args.momentum,
                       weight_decay=0,
                       loss_scale=args.loss_scale,
                       decay_filter=default_wd_filter)
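    The snippet above assumes a few script-level objects that the article does not show. A minimal sketch of plausible definitions (lr_schedule, args.momentum, args.loss_scale and default_wd_filter are all assumptions here, not part of the original):

    import numpy as np

    # Hypothetical stand-ins for the script-level objects used above.
    total_steps = 10000
    lr_schedule = np.full(total_steps, 2e-4, dtype=np.float32)  # one LR per training step

    class Args:                      # hypothetical CLI arguments
        momentum = 0.9
        loss_scale = 1.0
    args = Args()

    # Common convention: keep weight decay off BatchNorm affine params and biases.
    def default_wd_filter(param):
        name = param.name.lower()
        return 'beta' not in name and 'gamma' not in name and 'bias' not in name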

    Key detail 2: define two TrainOneStepCells to compute the gradients

    The discriminator's TrainOneStepCell is straightforward: at training time, simply feed in the pairs (generated image, 0) and (real image, 1). A usage sketch follows the cell definition below.

    import mindspore.ops.composite as C
    import mindspore.ops.functional as F
    import mindspore.ops.operations as P
    from mindspore import ParameterTuple
    from mindspore.nn import Cell, DistributedGradReducer
    # Parallel-mode helpers; the exact module paths vary across MindSpore versions.
    from mindspore.parallel._utils import (_get_device_num, _get_mirror_mean,
                                           _get_parallel_mode)
    from mindspore.train.parallel_utils import ParallelMode

    class TrainOneStepCellDIS(Cell):
        def __init__(self, network, optimizer, sens=1.0):
            super(TrainOneStepCellDIS, self).__init__(auto_prefix=False)
            self.network = network
            self.weights = ParameterTuple(network.trainable_params())
            self.optimizer = optimizer
            # The positional 'grad' name argument is required by older MindSpore
            # releases; newer ones drop it.
            self.grad = C.GradOperation('grad', get_by_list=True, sens_param=True)
            self.sens = sens
            self.reducer_flag = False
            self.grad_reducer = None
            parallel_mode = _get_parallel_mode()
            if parallel_mode in (ParallelMode.DATA_PARALLEL, ParallelMode.HYBRID_PARALLEL):
                self.reducer_flag = True
            if self.reducer_flag:
                mean = _get_mirror_mean()
                degree = _get_device_num()
                self.grad_reducer = DistributedGradReducer(optimizer.parameters, mean, degree)

        def construct(self, loss, img, label):
            # Build the gradient sensitivity with the same dtype/shape as the loss.
            sens = P.Fill()(P.DType()(loss), P.Shape()(loss), self.sens)
            # Gradients of the loss-wrapped discriminator w.r.t. its own weights.
            grads = self.grad(self.network, self.weights)(img, label, sens)
            if self.reducer_flag:
                # Apply the gradient reducer for data/hybrid parallel training.
                grads = self.grad_reducer(grads)
            # F.depend ties the optimizer update to the returned loss value.
            return F.depend(loss, self.optimizer(grads))
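    A minimal sketch of how this cell might be driven (the loop, sample_noise, zeros_label and ones_label are assumptions; the article only shows the cell itself). dis_network is assumed to be the discriminator wrapped with its loss, taking (img, label) and returning a scalar loss:

    # Hypothetical training-loop fragment for the discriminator.
    dis_train_step = TrainOneStepCellDIS(dis_network, dis_opt)
    dis_train_step.set_train()

    for real_img in dataset:                   # assumed iterable of real MNIST batches
        z = sample_noise(batch_size)           # assumed noise-sampling helper
        fake_img = gen_network(z)
        # (generated image, 0): push the discriminator to reject fakes.
        loss_fake = dis_network(fake_img, zeros_label)
        dis_train_step(loss_fake, fake_img, zeros_label)
        # (real image, 1): push the discriminator to accept real samples.
        loss_real = dis_network(real_img, ones_label)
        dis_train_step(loss_real, real_img, ones_label)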

    Training the generator is a little trickier, because the generator cannot produce a loss on its own: its output must be passed through the discriminator to produce a loss, yet only the generator's parameters are updated. When constructing the generator's TrainOneStepCell, the discriminator network is therefore passed in as well, and the gradient of the loss with respect to the discriminator's input is computed. That input is exactly the generator's output, so this gradient is then propagated back through the generator to update its parameters. Note that at training time the pair (generated image, 1) is fed in; a usage sketch follows the code.

    class TrainOneStepCellGEN(Cell):
        def __init__(self, network, optimizer, postnetwork, sens=3.0):
            super(TrainOneStepCellGEN, self).__init__(auto_prefix=False)
            self.network = network            # generator
            self.postnetwork = postnetwork    # loss-wrapped discriminator
            self.weights = ParameterTuple(network.trainable_params())
            self.postweights = ParameterTuple(postnetwork.trainable_params())
            self.optimizer = optimizer
            self.grad = C.GradOperation('grad', get_by_list=True, sens_param=True)
            # get_all=True also returns gradients w.r.t. the inputs, which is how
            # we obtain the gradient at the discriminator's input (= generator output).
            self.postgrad = C.GradOperation('grad', get_all=True, get_by_list=True, sens_param=True)
            self.sens = sens
            self.reducer_flag = False
            self.grad_reducer = None
            parallel_mode = _get_parallel_mode()
            if parallel_mode in (ParallelMode.DATA_PARALLEL, ParallelMode.HYBRID_PARALLEL):
                self.reducer_flag = True
            if self.reducer_flag:
                mean = _get_mirror_mean()
                degree = _get_device_num()
                self.grad_reducer = DistributedGradReducer(optimizer.parameters, mean, degree)
            self.cast = P.Cast()
            self.print = P.Print()

        def construct(self, loss, z, fake_img, inverse_fake_label):
            sens_d = P.Fill()(P.DType()(loss), P.Shape()(loss), self.sens)
            # Backprop through the discriminator only, to get d(loss)/d(fake_img).
            grads_d = self.postgrad(self.postnetwork, self.postweights)(fake_img, inverse_fake_label, sens_d)
            sens_g = grads_d[0][0]   # gradient w.r.t. the first input, i.e. fake_img
            # Use it as the sensitivity to backprop through the generator.
            grads_g = self.grad(self.network, self.weights)(z, sens_g)
            if self.reducer_flag:
                # Apply the gradient reducer for data/hybrid parallel training.
                grads_g = self.grad_reducer(grads_g)
            # Only the generator's parameters are updated here.
            return F.depend(loss, self.optimizer(grads_g))
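    And a matching sketch for the generator step (again, the helper names are assumptions). Note the inverted label: the generator is rewarded when the discriminator calls its output real:

    # Hypothetical training-loop fragment for the generator.
    gen_train_step = TrainOneStepCellGEN(gen_network, gen_opt, dis_network)
    gen_train_step.set_train()

    z = sample_noise(batch_size)
    fake_img = gen_network(z)
    # (generated image, 1): gradients flow through the discriminator back to
    # fake_img, then through the generator's own weights.
    loss_g = dis_network(fake_img, ones_label)
    gen_train_step(loss_g, z, fake_img, ones_label)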

    Note that in a static graph, variables with the same name are the same variable. To guarantee that the discriminator parameters inside TrainOneStepCellDIS and TrainOneStepCellGEN stay consistent, auto_prefix must be set to False. That way the TrainOneStepCell's namespace does not add a new prefix to the parameter names, and weight sharing works.
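    A quick hypothetical check (reusing the dis_train_step and gen_train_step names from the sketches above): with auto_prefix=False, the discriminator parameters carry identical names in both wrappers, so they resolve to the same variables in the static graph.

    # With auto_prefix=False, no wrapper-level prefix is added, so the
    # discriminator's parameter names are identical in both wrappers.
    dis_names = {p.name for p in dis_train_step.trainable_params()}
    gen_names = {p.name for p in gen_train_step.trainable_params()}
    print(dis_names & gen_names)   # the shared discriminator parameters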

    Sharing One Weight Across Multiple Operators in MindSpore

    Background

    During network training it is sometimes necessary for several layers to share weights: the forward pass uses the same weight, and the backward pass updates it only once.

    Experience Summary

    Example: the network uses a Conv+ReLU block in several places, and these blocks should share one set of weights.

    1. PyTorch implementation

    conv1x1 = nn.Conv2d(16, 16, 1, bias=True)
    self.predict_conv_relu = nn.Sequential(conv1x1, nn.ReLU())
    self.predict_conv_relu2 = nn.Sequential(conv1x1, nn.ReLU())
    self.predict_conv_relu3 = nn.Sequential(conv1x1, nn.ReLU())

    In PyTorch, if the module passed into Sequential is the same Module instance, its parameters are shared.
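    This can be checked directly, since reusing one Module instance means reusing its Parameter objects:

    import torch.nn as nn

    conv1x1 = nn.Conv2d(16, 16, 1, bias=True)
    head1 = nn.Sequential(conv1x1, nn.ReLU())
    head2 = nn.Sequential(conv1x1, nn.ReLU())
    # Both heads index the very same Parameter object.
    print(head1[0].weight is head2[0].weight)  # True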

    2. MindSpore implementation: initialize the weights of predict_conv_relu first, then assign them to the other layers that should share them:

    self.predict_conv_relu2[0].weight = self.predict_conv_relu[0].weight
    self.predict_conv_relu2[0].bias = self.predict_conv_relu[0].bias
    self.predict_conv_relu3[0].weight = self.predict_conv_relu[0].weight
    self.predict_conv_relu3[0].bias = self.predict_conv_relu[0].bias
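    Put together, a minimal sketch of the enclosing cell (the class name SharedHead and the layer shapes are assumptions, mirroring the PyTorch version above):

    import mindspore.nn as nn

    class SharedHead(nn.Cell):  # hypothetical enclosing cell
        def __init__(self):
            super(SharedHead, self).__init__()
            self.predict_conv_relu = nn.SequentialCell([nn.Conv2d(16, 16, 1, has_bias=True), nn.ReLU()])
            self.predict_conv_relu2 = nn.SequentialCell([nn.Conv2d(16, 16, 1, has_bias=True), nn.ReLU()])
            self.predict_conv_relu3 = nn.SequentialCell([nn.Conv2d(16, 16, 1, has_bias=True), nn.ReLU()])
            # Re-point the later convs at the first conv's Parameters.
            self.predict_conv_relu2[0].weight = self.predict_conv_relu[0].weight
            self.predict_conv_relu2[0].bias = self.predict_conv_relu[0].bias
            self.predict_conv_relu3[0].weight = self.predict_conv_relu[0].weight
            self.predict_conv_relu3[0].bias = self.predict_conv_relu[0].bias

        def construct(self, x):
            return self.predict_conv_relu3(self.predict_conv_relu2(self.predict_conv_relu(x)))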

    Manually Modifying Convolution Weights in a MindSpore Network's construct

    Background

    During the forward pass of network training there is sometimes a need to modify a convolution's weights, for example to apply a mask.

    Experience Summary

    Fetch the Conv layer's weight in __init__, then modify it in construct:

    import numpy as np
    import mindspore.nn as nn
    from mindspore import ParameterTuple, Tensor
    from mindspore.ops import operations as P

    class MaskedConv2d(nn.Cell):
        def __init__(self, in_channels, out_channels, kernel_size):
            super(MaskedConv2d, self).__init__()
            self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=kernel_size, weight_init='ones')
            # Reuse the underlying Conv2D primitive so we can feed it our own weight.
            self.conv_p = self.conv.conv2d
            self.p = ParameterTuple((self.conv.weight,))
            # Causal (PixelCNN-style) mask: in the middle row, zero the center
            # pixel and everything to its right, plus all rows below it.
            k = kernel_size
            mask = np.ones_like(self.conv.weight.data.asnumpy())
            mask[:, :, k // 2, k // 2:] = 0
            mask[:, :, k // 2 + 1:] = 0
            self.mask = Tensor(mask)
            self.mul = P.Mul()

        def construct(self, x):
            masked_weight = self.mul(self.p[0], self.mask)
            # Write the masked weight back into the Parameter, then re-read it
            # (the * 1 forces the read to happen after the Assign).
            P.Assign()(self.p[0], masked_weight)
            update_weight = self.p[0] * 1
            return self.conv_p(x, update_weight)

    class Context(nn.Cell):
        def __init__(self, N=3):
            super(Context, self).__init__()
            self.mask_conv = MaskedConv2d(N, N * 2, kernel_size=5)

        def construct(self, x):
            return self.mask_conv(x)
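    A quick smoke test (shapes are illustrative; with MindSpore's default 'same' padding the spatial size is preserved):

    import numpy as np
    from mindspore import Tensor

    net = Context(N=3)
    x = Tensor(np.ones((1, 3, 8, 8), dtype=np.float32))
    out = net(x)
    print(out.shape)   # expected: (1, 6, 8, 8)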
Original article: https://blog.csdn.net/Kenji_Shinji/article/details/127582920