BertForTokenClassification

This article walks through a point of confusion about the output of the BertForTokenClassification class in the Transformers library and how to resolve it; hopefully it serves as a useful reference for anyone facing the same question.

Problem description

This is the example given in the documentation of the Transformers PyTorch library:

from transformers import BertTokenizer, BertForTokenClassification
import torch

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForTokenClassification.from_pretrained('bert-base-uncased',
                                                   output_hidden_states=True,
                                                   output_attentions=True)

input_ids = torch.tensor(tokenizer.encode("Hello, my dog is cute",
                                          add_special_tokens=True)).unsqueeze(0)  # Batch size 1
labels = torch.tensor([1] * input_ids.size(1)).unsqueeze(0)  # Batch size 1
outputs = model(input_ids, labels=labels)

# Note: tuple unpacking like this assumes an older transformers release; on
# recent versions, pass return_dict=False in the model call, or read
# outputs.loss, outputs.logits, outputs.hidden_states and outputs.attentions.
loss, scores, hidden_states, attentions = outputs

Here hidden_states is a tuple of length 13, containing the hidden states of the model at the output of each layer plus the initial embedding outputs. I would like to know whether hidden_states[0] or hidden_states[12] represents the final hidden-state vectors.
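For reference, this structure is easy to confirm directly (a small sketch, assuming the example above has already run; the shapes shown are for bert-base-uncased on this sentence):

# Each entry has shape (batch_size, sequence_length, hidden_size).
print(len(hidden_states))        # 13: embedding output plus 12 encoder layers
print(hidden_states[0].shape)    # torch.Size([1, 8, 768])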

Recommended answer

If you check the source code, specifically BertEncoder, you can see that the returned states are initialized as an empty tuple and then appended to on each iteration over the layers.
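The relevant part of BertEncoder looks roughly like this (a paraphrased sketch of the source at the time of the question, not runnable in isolation; variable names and control flow differ slightly across transformers versions):

all_hidden_states = ()
for layer_module in self.layer:
    # The state appended here is the *input* to the layer, so on the first
    # iteration this records the embedding output.
    all_hidden_states = all_hidden_states + (hidden_states,)
    hidden_states = layer_module(hidden_states, attention_mask)[0]

# After the loop, the output of the final layer is appended as the last element.
all_hidden_states = all_hidden_states + (hidden_states,)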

The output of the final layer is appended as the last element after this loop (see here), so we can safely assume that hidden_states[12] is the final set of vectors.
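As a quick end-to-end check, the base BertModel exposes the final layer directly as last_hidden_state, and it is identical to the last element of the hidden-states tuple (a minimal runnable sketch; bert-base-uncased is used purely for illustration):

import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertModel.from_pretrained('bert-base-uncased', output_hidden_states=True)
model.eval()

inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# hidden_states[-1] (hidden_states[12] for a 12-layer model) is the output of
# the final encoder layer, which is exactly what last_hidden_state holds.
print(torch.equal(outputs.last_hidden_state, outputs.hidden_states[-1]))  # True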

That concludes this look at the confusion around the output of the BertForTokenClassification class in the Transformers library; hopefully the recommended answer above helps, and thanks for reading!
