For example, I have a net that takes a tensor [N, 7] (N is the number of samples) as input and produces a tensor [N, 4] as output, where the "4" represents the probabilities of the different classes. The training data's labels are a tensor of shape [N], with values from 0 to 3 (representing the ground-truth class).

Here's my question: I've seen some demos that directly apply the loss function to the output tensor and the label tensor. I wonder why this works, since they have different sizes, and their sizes don't seem to fit the "broadcasting semantics".

Here's the minimal demo.

```python
import torch
import torch.nn as nn
import torch.optim as optim

if __name__ == '__main__':
    features = torch.randn(2, 7)
    gt = torch.tensor([1, 1])
    model = nn.Sequential(
        nn.Linear(7, 4),
        nn.ReLU(),
        nn.Linear(4, 4)
    )
    optimizer = optim.SGD(model.parameters(), lr=0.005)
    f = nn.CrossEntropyLoss()
    for epoch in range(1000):
        optimizer.zero_grad()
        output = model(features)
        loss = f(output, gt)
        loss.backward()
        optimizer.step()
```

Solution

In PyTorch the implementation is:

loss(x, class) = -log( exp(x[class]) / Σ_j exp(x[j]) ) = -x[class] + log( Σ_j exp(x[j]) )

Link to the documentation: https://pytorch.org/docs/stable/nn.html#torch.nn.CrossEntropyLoss

So implementing this formula in PyTorch you get:

```python
import torch
import torch.nn.functional as F

output = torch.tensor([0.1998, -0.2261, -0.0388, 0.1457])
target = torch.LongTensor([1])

# implementing the formula above
print('manual cross-entropy:', (-output[target] + torch.log(torch.sum(torch.exp(output))))[0])

# calling the built-in cross-entropy function to check the result
print('pytorch cross-entropy:', F.cross_entropy(output.unsqueeze(0), target))
```

Output:

```
manual cross-entropy: tensor(1.6462)
pytorch cross-entropy: tensor(1.6462)
```

I hope this helps!
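As the formula shows, the target enters only as an index into the logits (the `x[class]` term), not as a tensor that has to broadcast against them, which is why the [N, C] output and [N] label shapes are valid together. Here is a minimal sketch of the batched case (the names `logits` and `targets` are mine, not from the original demo):

```python
import torch
import torch.nn.functional as F

# a batch of N=2 samples over C=4 classes, matching the shapes in the question
logits = torch.randn(2, 4)       # shape [N, C]
targets = torch.tensor([1, 3])   # shape [N], each entry a class index in 0..C-1

# per-sample cross-entropy: pick row i's logit at column targets[i] by indexing
manual = -logits[torch.arange(2), targets] + torch.logsumexp(logits, dim=1)
print('manual (mean):  ', manual.mean())

# the built-in version averages over the batch by default (reduction='mean')
print('F.cross_entropy:', F.cross_entropy(logits, targets))
```

Both lines should print the same value, confirming that no broadcasting is involved: the label tensor is just a list of class indices used to look up one logit per row.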