I'm trying to wrap my head around how gradients are represented and how autograd works:
import torch
from torch.autograd import Variable
x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y
z.backward()
print(x.grad)
#Variable containing:
#32
#[torch.FloatTensor of size 1]
print(y.grad)
#None
Why isn't a gradient produced for y? If y.grad = dz/dy, shouldn't it at least produce a variable like y.grad = 2*y?

Best answer
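By default, gradients are only retained for leaf variables. The gradients of non-leaf variables are not retained for later inspection. This is by design, to save memory.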
-soumith
See: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94
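To make the leaf / non-leaf distinction concrete, here is a minimal sketch. It assumes PyTorch 0.4 or later, where Variable was merged into Tensor, so torch.tensor is used instead of the Variable wrapper from the question. It also uses torch.autograd.grad, which is not mentioned in the answer, to read dz/dy directly without storing anything in .grad.

```python
import torch

# x is created directly by the user, so it is a leaf.
x = torch.tensor([2.0], requires_grad=True)
# y and z are results of operations, so they are non-leaf tensors.
y = x * x
z = y * y

print(x.is_leaf, y.is_leaf)  # True False

# torch.autograd.grad returns the gradients instead of accumulating them in .grad.
# retain_graph=True keeps the graph alive for the second call below.
(dz_dy,) = torch.autograd.grad(z, y, retain_graph=True)
(dz_dx,) = torch.autograd.grad(z, x)

print(dz_dy)  # dz/dy = 2*y = 8 at x = 2
print(dz_dx)  # dz/dx = 4*x**3 = 32 at x = 2
```

So the gradient the question expects (2*y) is computed during the backward pass; it just isn't kept on y.grad unless you ask for it.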
Option 1:
Call y.retain_grad() before calling backward():
x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y
y.retain_grad()
z.backward()
print(y.grad)
#Variable containing:
# 8
#[torch.FloatTensor of size 1]
Source: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/16
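For reference, the same approach might look like this without the Variable wrapper. This is a sketch that assumes PyTorch 0.4 or later; the printed format will differ from the older output shown above.

```python
import torch

x = torch.tensor([2.0], requires_grad=True)
y = x * x
z = y * y

# Ask autograd to keep the gradient of this non-leaf tensor.
# This must be called before backward().
y.retain_grad()
z.backward()

print(y.grad)  # dz/dy = 2*y = 8
print(x.grad)  # dz/dx = 4*x**3 = 32
```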
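Option 2:
Register a hook, which is basically a function that gets called when that gradient is computed. You can then save it, assign it, print it, whatever you need...
from __future__ import print_function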
import torch
from torch.autograd import Variable
x = Variable(torch.Tensor([2]), requires_grad=True)
y = x * x
z = y * y
y.register_hook(print) ## this can be anything you need it to be
z.backward()
Output:
Variable containing:
 8
[torch.FloatTensor of size 1]
Source: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/2
See also: https://discuss.pytorch.org/t/why-cant-i-see-grad-of-an-intermediate-variable/94/7
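If you want to keep the gradient rather than just print it, a hook can store it somewhere. The sketch below is one possible way to do that, assuming PyTorch 0.4 or later; the grads dictionary and the save_grad helper are hypothetical names introduced for illustration, not part of PyTorch.

```python
import torch

grads = {}  # hypothetical container for captured gradients


def save_grad(name):
    """Return a hook that stores the incoming gradient under `name`."""
    def hook(grad):
        grads[name] = grad  # returning None leaves the gradient unchanged
    return hook


x = torch.tensor([2.0], requires_grad=True)
y = x * x
z = y * y

# register_hook returns a handle that can be used to remove the hook later.
handle = y.register_hook(save_grad("y"))
z.backward()
handle.remove()

print(grads["y"])  # dz/dy = 2*y = 8
```

Unlike retain_grad(), this leaves y.grad unset; the gradient only lives wherever the hook put it.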
Regarding "pytorch - Why does autograd not produce gradient for intermediate variables?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/45988168/