java - How do I pass a String to AbstractSequenceClassifier.classifyAndWriteAnswersKBest in CoreNLP?

AbstractSequenceClassifier.classifyAndWriteAnswersKBest accepts either a file name or an ObjectBank<List<IN>>, but the ObjectBank documentation does not make clear how to create such an ObjectBank without involving a file.

I am using CoreNLP 3.7.0 with Java 8.

Best answer

You should just use this method instead:

Counter<List<IN>> classifyKBest(List<IN> doc, Class<? extends CoreAnnotation<String>> answerField, int k)

It returns a mapping from sequences to scores.

With the following line of code you can turn the Counter into a list of sequences sorted from highest score to lowest:

List<List<IN>> sorted = Counters.toSortedList(kBest);
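To make the ordering concrete without needing CoreNLP on the classpath, here is a minimal sketch that mimics the idea with plain Java collections. The Map<List<String>, Double> stands in for the Counter, the local toSortedList helper is a hypothetical stand-in for Counters.toSortedList, and the tag sequences and scores are made up for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class KBestSortSketch {

    // Hypothetical stand-in for Counters.toSortedList:
    // returns the keys ordered from highest score to lowest.
    public static <E> List<E> toSortedList(Map<E, Double> scores) {
        List<E> keys = new ArrayList<>(scores.keySet());
        keys.sort(Comparator.comparingDouble((E key) -> scores.get(key)).reversed());
        return keys;
    }

    public static void main(String[] args) {
        // Made-up label sequences and scores, standing in for a k-best Counter.
        Map<List<String>, Double> kBest = new HashMap<>();
        kBest.put(Arrays.asList("PERSON", "O", "O"), -1.5);
        kBest.put(Arrays.asList("O", "O", "O"), -0.2);
        kBest.put(Arrays.asList("ORGANIZATION", "O", "O"), -3.0);

        List<List<String>> sorted = toSortedList(kBest);
        // Index 0 is the best sequence, index 1 the second best, and so on.
        System.out.println(sorted.get(0));
        System.out.println(kBest.get(sorted.get(0)));
    }
}
```

With the real API the score lookup would go through the Counter itself, for example kBest.getCount(sorted.get(0)).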

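I'm not sure exactly what you are trying to do, but IN is usually CoreLabel. The key step is converting your String into a list of IN. That should be a list of CoreLabel, but I don't know the full details of the AbstractSequenceClassifier you are working with.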

If you want to run the sequence classifier on a sentence, you can first tokenize it with a pipeline and then pass the list of tokens to classifyKBest(...).

For example, if you are trying to get the k best named-entity tag sequences:

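import java.util.List;
import java.util.Properties;
import java.util.stream.Collectors;

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.stats.Counter;
import edu.stanford.nlp.stats.Counters;

// set up a pipeline that only tokenizes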
Properties props = new Properties();
props.setProperty("annotators", "tokenize");
StanfordCoreNLP tokenizerPipeline = new StanfordCoreNLP(props);

// get list of tokens for example sentence
String exampleSentence = "...";
// wrap sentence in an Annotation object
Annotation annotation = new Annotation(exampleSentence);
// tokenize sentence
tokenizerPipeline.annotate(annotation);
// get the list of tokens
List<CoreLabel> tokens = annotation.get(CoreAnnotations.TokensAnnotation.class);

//...
// classifier should be an AbstractSequenceClassifier

// get the k best sequences from your abstract sequence classifier
Counter<List<CoreLabel>> kBestSequences = classifier.classifyKBest(tokens, CoreAnnotations.NamedEntityTagAnnotation.class, 10);
// sort the k-best examples
List<List<CoreLabel>> sortedKBest = Counters.toSortedList(kBestSequences);
// example: getting the second best list
List<CoreLabel> secondBest = sortedKBest.get(1);
// example: print out the tags for the second best list
System.out.println(secondBest.stream().map(token->token.get(CoreAnnotations.NamedEntityTagAnnotation.class)).collect(Collectors.joining(" ")));
// example: print out the score for the second best list
System.out.println(kBestSequences.getCount(secondBest));
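The snippet above inspects only the second-best sequence. If the goal is to write out every one of the k results, which is what the question was after, the loop might look roughly like the sketch below. It is written against plain Java collections so it runs on its own: the Map<List<String>, Double> stands in for the Counter<List<CoreLabel>>, the strings stand in for the NamedEntityTagAnnotation values, and formatKBest is a hypothetical helper, not a CoreNLP method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class KBestReportSketch {

    // Builds one line per sequence: rank, score, then the space-joined tags.
    public static List<String> formatKBest(Map<List<String>, Double> scores) {
        List<Map.Entry<List<String>, Double>> entries = new ArrayList<>(scores.entrySet());
        // Highest score first, so rank 1 is the best sequence.
        entries.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));

        List<String> lines = new ArrayList<>();
        int rank = 1;
        for (Map.Entry<List<String>, Double> entry : entries) {
            lines.add(String.format(Locale.ROOT, "%d\t%.2f\t%s",
                    rank++, entry.getValue(), String.join(" ", entry.getKey())));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Made-up tag sequences and scores for illustration only.
        Map<List<String>, Double> kBest = new HashMap<>();
        kBest.put(Arrays.asList("PERSON", "O", "LOCATION"), -0.25);
        kBest.put(Arrays.asList("O", "O", "LOCATION"), -2.0);

        for (String line : formatKBest(kBest)) {
            System.out.println(line);
        }
    }
}
```

In the CoreNLP version, the tags for each sequence would come from token.get(CoreAnnotations.NamedEntityTagAnnotation.class) and the score from kBestSequences.getCount(sequence), as in the answer's own code.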

If you have any other questions, let me know and I'll help!

Regarding java - How do I pass a String to AbstractSequenceClassifier.classifyAndWriteAnswersKBest in CoreNLP?, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/42873268/