How to assign new values to a PyTorch variable without breaking backpropagation?


Question

I have a pytorch variable that is used as a trainable input for a model. At some point I need to manually reassign all values in this variable.

How can I do that without breaking the connections with the loss function?

Suppose the current values are [1.2, 3.2, 43.2] and I simply want them to become [1,2,3].
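For concreteness, here is a minimal sketch of the setup described in the question (the optimizer and the replacement values are my own placeholders, not from the original post):

import torch

# A trainable input: a leaf tensor whose values are optimized directly.
x = torch.tensor([1.2, 3.2, 43.2], requires_grad=True)
optimizer = torch.optim.SGD([x], lr=0.1)

# At some point the values should be replaced by hand, e.g. with [1, 2, 3],
# without disconnecting x from the loss computed on it.
new_values = torch.tensor([1.0, 2.0, 3.0])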

At the time I asked this question, I hadn't realized that PyTorch doesn't have a static graph as Tensorflow or Keras do.

In PyTorch, the training loop is written by hand and you call everything explicitly in each training step. (There is no notion of a placeholder plus static graph for feeding data in later.)

Consequently, we can't "break the graph", since the new variable will be used to perform all further computations again. I was worried about a problem that happens in Keras, not in PyTorch.
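To make the dynamic-graph point concrete, here is a minimal sketch of a hand-written training loop (the tensor shapes, learning rate, and loss are my own choices, not from the question). The graph is rebuilt from scratch on every iteration, so there is nothing that persists between steps to "break":

import torch
import torch.nn.functional as F

w = torch.rand(1, 3, requires_grad=True)   # trainable tensor
optimizer = torch.optim.SGD([w], lr=0.1)
target = torch.tensor([1])

for step in range(3):
    optimizer.zero_grad()
    # A fresh graph is built here on every iteration; there is no static
    # graph or placeholder that must be kept alive across steps.
    loss = F.cross_entropy(w, target)
    loss.backward()
    optimizer.step()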

Answer

You can use the data attribute of tensors to modify the values, since modifications on data do not affect the graph.
So the graph will still be intact and modifications of the data attribute itself have no influence on the graph. (Operations and changes on data are not tracked by autograd and thus are not present in the graph.)

Since you haven't given an example, this example is based on your comment statement:
'Suppose I want to change the weights of a layer.'
I used normal tensors here, but this works the same for the weight.data and bias.data attributes of a layer.
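As a hedged illustration of that comment (the layer and the replacement values below are my own invention, not from the question), the same trick applies to a layer's parameters:

import torch
import torch.nn as nn

layer = nn.Linear(3, 2)

# Overwrite the parameters in place via .data; autograd does not record this,
# so any existing graph built from this layer's outputs is left untouched.
layer.weight.data = torch.ones_like(layer.weight.data)
layer.bias.data = torch.zeros_like(layer.bias.data)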

Here is a short example:

import torch
import torch.nn.functional as F



# Test 1, random vector with CE
w1 = torch.rand(1, 3, requires_grad=True)
loss = F.cross_entropy(w1, torch.tensor([1]))
loss.backward()
print('w1.data', w1.data)
print('w1.grad', w1.grad)
print()

# Test 2, replacing values of w2 with w1, before CE
# to make sure that everything is exactly like in Test 1 after replacing the values
w2 = torch.zeros(1, 3, requires_grad=True)
w2.data = w1.data
loss = F.cross_entropy(w2, torch.tensor([1]))
loss.backward()
print('w2.data', w2.data)
print('w2.grad', w2.grad)
print()

# Test 3, replace data after computation
w3 = torch.rand(1, 3, requires_grad=True)
loss = F.cross_entropy(w3, torch.tensor([1]))
# setting values
# the graph of the previous computation is still intact, as you can see in the print-outs below
w3.data = w1.data
loss.backward()

# data were replaced with values from w1
print('w3.data', w3.data)
# gradient still shows results from computation with w3
print('w3.grad', w3.grad)

Output:

w1.data tensor([[ 0.9367,  0.6669,  0.3106]])
w1.grad tensor([[ 0.4351, -0.6678,  0.2326]])

w2.data tensor([[ 0.9367,  0.6669,  0.3106]])
w2.grad tensor([[ 0.4351, -0.6678,  0.2326]])

w3.data tensor([[ 0.9367,  0.6669,  0.3106]])
w3.grad tensor([[ 0.3179, -0.7114,  0.3935]])

The most interesting part here is w3. At the time backward is called, the values have been replaced by the values of w1.
But the gradients are calculated based on the CE function with the original values of w3. The replaced values have no effect on the graph, so the graph connection is not broken and the replacement has no influence on the graph. I hope this is what you were looking for!
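As a side note that is my own addition, not part of the original answer: in current PyTorch versions the same value replacement is usually written with torch.no_grad() and an in-place copy_, which achieves the same effect without touching .data directly:

import torch

w = torch.rand(1, 3, requires_grad=True)
with torch.no_grad():
    # Not recorded by autograd, so the graph of any earlier computation stays intact.
    w.copy_(torch.tensor([[1.0, 2.0, 3.0]]))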
