Computing second derivatives with torch.autograd.grad

1 Usage

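In PyTorch, the torch.autograd.grad function computes and returns the sum of gradients of the outputs with respect to the inputs. Its main parameters are described below (the signature is abridged to the parameters discussed here, and exact defaults can vary between PyTorch versions):

torch.autograd.grad(outputs, inputs, grad_outputs=None, retain_graph=None, create_graph=False, allow_unused=None) $\longrightarrow$ tuple of Tensor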

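outputs (sequence of Tensor): the outputs of the function being differentiated
inputs (sequence of Tensor): the inputs with respect to which the gradient is taken
grad_outputs (sequence of Tensor): the vector in the vector-Jacobian product
retain_graph (bool, optional): whether to keep the computation graph after the call instead of freeing it; it defaults to the value of create_graph and must be True when computing second derivatives
create_graph (bool, optional): whether to build the graph of the gradient computation itself so that the returned gradients can be differentiated again; set it to True when computing higher-order derivatives or using the gradients in further computation
allow_unused (bool, optional): whether inputs that were not used to compute the outputs are allowed; when True, their gradient is returned as None instead of raising an error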
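
The parameters above can be illustrated with a small sketch. The function $y = x^3$ and the unused tensor `z` are assumptions chosen for illustration and are not part of the original example; the sketch only shows how grad_outputs, create_graph, and allow_unused behave.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x ** 3  # non-scalar output; its Jacobian is diag(3 * x**2)

# For a non-scalar output, grad_outputs supplies the vector v in v^T J.
v = torch.ones_like(y)
# create_graph=True keeps the first derivative differentiable.
(first,) = torch.autograd.grad(y, x, grad_outputs=v, create_graph=True)

# Differentiate the first derivative again to get the second derivative 6 * x.
(second,) = torch.autograd.grad(first, x, grad_outputs=torch.ones_like(first))

# z does not take part in computing w, so allow_unused=True is required;
# the gradient returned for z is None.
z = torch.tensor(4.0, requires_grad=True)
w = (x ** 2).sum()
gx, gz = torch.autograd.grad(w, (x, z), allow_unused=True)

print(first.detach(), second, gx, gz)
```

Without create_graph=True in the first call, the second call would fail because `first` would not be connected to `x` in any graph.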

2 Worked example

The following is a concrete mathematical example with an analytic solution that involves second derivatives.

Given a vector ${\bf{x}}=(x_1,x_2)^{\top}$, we obtain the vector ${\bf{y}}=(y_1,y_2)^{\top}=(x^2_1,x^2_2)^{\top}$. Averaging the elements of ${\bf{y}}$ gives the loss function $\mathrm{loss}_1$:

$\mathrm{loss}_1({\bf{x}})=\mathrm{mean}({\bf{y}})=\frac{x_1^2+x^2_2}{2}$

Taking the partial derivative of each component of ${\bf{y}}$ with respect to ${\bf{x}}$, subtracting the second result from the first, and averaging the elements gives the loss function $\mathrm{loss}_2$:

$\left\{\begin{array}{l}h_1({\bf{x}})=\frac{\partial y_1}{\partial {\bf{x}}}=(2x_1,0)^{\top}\\h_2({\bf{x}})=\frac{\partial y_2}{\partial {\bf{x}}}=(0,2x_2)^{\top}\end{array}\right.,\quad \mathrm{loss}_2({\bf{x}})=\mathrm{mean}(h_1({\bf{x}})-h_2({\bf{x}}))=x_1-x_2$

Adding $\mathrm{loss}_1$ and $\mathrm{loss}_2$ gives

$\mathrm{loss}({\bf{x}})=\mathrm{loss}_1({\bf{x}})+\mathrm{loss}_2({\bf{x}})=\frac{x_1^2+x_2^2}{2}+x_1-x_2$

Finally, the partial derivative of $\mathrm{loss}$ with respect to ${\bf{x}}$ is

$\frac{\partial {\mathrm{loss}}}{\partial{{\bf{x}}}}=(x_1+1,x_2-1)^{\top}$
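
As a quick sanity check of the closed-form result, the gradient can be approximated with central finite differences in plain Python, without autograd. This is only a sketch: the helper names `loss` and `numeric_grad` and the step size `eps` are assumptions introduced here for illustration.

```python
def loss(x1, x2):
    # loss1 = mean(y) with y = (x1^2, x2^2)
    loss1 = (x1 ** 2 + x2 ** 2) / 2
    # loss2 = mean(h1 - h2) with h1 = (2*x1, 0) and h2 = (0, 2*x2)
    loss2 = ((2 * x1 - 0) + (0 - 2 * x2)) / 2
    return loss1 + loss2


def numeric_grad(f, x1, x2, eps=1e-6):
    # Central differences approximate each partial derivative.
    g1 = (f(x1 + eps, x2) - f(x1 - eps, x2)) / (2 * eps)
    g2 = (f(x1, x2 + eps) - f(x1, x2 - eps)) / (2 * eps)
    return g1, g2


g1, g2 = numeric_grad(loss, 5.0, 7.0)
print(g1, g2)  # should be close to x1 + 1 and x2 - 1
```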

The following PyTorch code implements the example above:

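import torch

# Input vector x = (5, 7); requires_grad=True makes autograd track it.
x = torch.tensor([5.0, 7.0], requires_grad=True)
y = x**2

# loss1 = mean(y) = (x1^2 + x2^2) / 2
loss1 = torch.mean(y)

# h1 = dy1/dx and h2 = dy2/dx; create_graph=True keeps them differentiable.
h1 = torch.autograd.grad(y[0], x, retain_graph = True, create_graph=True)
h2 = torch.autograd.grad(y[1], x, retain_graph = True, create_graph=True)
# loss2 = mean(h1 - h2) = x1 - x2
loss2 = torch.mean(h1[0] - h2[0])

loss = loss1 + loss2

# Gradient of the total loss with respect to x.
result = torch.autograd.grad(loss, x)
print(result)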

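When ${\bf{x}}$ takes the value $(5,7)^{\top}$, the analytic solution gives the gradient $(6,6)^{\top}$, a result that requires differentiating through the first-order derivatives $h_1$ and $h_2$. Running the code produces the same values, $(6, 6)$.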
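
To check that the agreement is not specific to the point $(5,7)^{\top}$, the same computation can be wrapped in a function and compared against the analytic gradient $(x_1+1,x_2-1)^{\top}$ at other points. The function name `combined_loss_grad` and the extra test points are assumptions made for this sketch.

```python
import torch


def combined_loss_grad(values):
    # Rebuild the example for an arbitrary 2-element input.
    x = torch.tensor(values, requires_grad=True)
    y = x ** 2
    loss1 = y.mean()
    # create_graph=True also retains the graph, so both calls can reuse it.
    (h1,) = torch.autograd.grad(y[0], x, create_graph=True)
    (h2,) = torch.autograd.grad(y[1], x, create_graph=True)
    loss2 = (h1 - h2).mean()
    (g,) = torch.autograd.grad(loss1 + loss2, x)
    return g


results = {}
for values in ([5.0, 7.0], [1.0, -2.0], [-3.0, 0.5]):
    g = combined_loss_grad(values)
    # Analytic gradient: (x1 + 1, x2 - 1)
    expected = torch.tensor(values) + torch.tensor([1.0, -1.0])
    results[tuple(values)] = torch.allclose(g, expected)
    print(values, g, results[tuple(values)])
```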

Original article: https://blog.csdn.net/qq_38406029/article/details/126133345