I am using the Stanford Parser to get the dependencies from a text, like this:
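import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import edu.stanford.nlp.ling.HasWord;
import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
import edu.stanford.nlp.process.DocumentPreprocessor;
import edu.stanford.nlp.trees.*;
// assumption: lp is the parser, loaded here from the standard English PCFG model
LexicalizedParser lp = LexicalizedParser.loadModel("edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz");
Reader reader = new StringReader("The room was not nice. It was bright, but cold.");
TreebankLanguagePack tlp = new PennTreebankLanguagePack();
GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
// the dependencies of the entire text
List<TypedDependency> textDependencies = new ArrayList<>();
// get the dependencies of each sentence and add them to the list
for (List<HasWord> sentence : new DocumentPreprocessor(reader)) {
    Tree parse = lp.apply(sentence);
    GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
    textDependencies.addAll(gs.typedDependenciesCCprocessed());
}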
After running the code above, the list named textDependencies will contain the following dependencies:
det(room-2, The-1)
nsubj(nice-5, room-2)
cop(nice-5, was-3)
neg(nice-5, not-4)
root(ROOT-0, nice-5)
nsubj(warm-3, It-1)
nsubj(noisy-6, It-1)
cop(warm-3, was-2)
root(ROOT-0, warm-3)
conj_but(warm-3, noisy-6)
Is there a way to find out what "it" refers to, so that I get something showing that it is actually the room?
Best answer
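What you are looking for is coreference resolution. Stanford CoreNLP does that already. I could not find a demo of doing it programmatically, but if you are running the precompiled executable, you need to add dcoref to the list of annotators, like this: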
java -cp <all jars> edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,parse,dcoref -file input.txt
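For anyone who wants to do the same thing from Java code rather than the command line, here is a minimal sketch. It assumes the CoreNLP jar and its English models jar are on the classpath, and it uses the package names of recent releases (edu.stanford.nlp.coref); in older 3.x releases these two classes lived under edu.stanford.nlp.dcoref instead. The class name CorefDemo and the method resolve are made up for this illustration, and I have not verified which links the model actually produces for this particular text.

```java
import edu.stanford.nlp.coref.CorefCoreAnnotations;
import edu.stanford.nlp.coref.data.CorefChain;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper class, not part of CoreNLP.
public class CorefDemo {

    // Returns one "mention (sentence n) -> representative mention" line
    // for every mention that is not itself the representative of its chain.
    static List<String> resolve(String text) {
        Properties props = new Properties();
        // Same annotator list as in the command-line call above.
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,ner,parse,dcoref");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);

        Annotation document = new Annotation(text);
        pipeline.annotate(document);

        // Each chain groups the mentions that refer to the same entity.
        Map<Integer, CorefChain> chains =
                document.get(CorefCoreAnnotations.CorefChainAnnotation.class);

        List<String> links = new ArrayList<>();
        if (chains == null) {
            return links;
        }
        for (CorefChain chain : chains.values()) {
            CorefChain.CorefMention representative = chain.getRepresentativeMention();
            for (CorefChain.CorefMention mention : chain.getMentionsInTextualOrder()) {
                if (mention.mentionID == representative.mentionID) {
                    continue; // skip the representative mention itself
                }
                links.add(mention.mentionSpan + " (sentence " + mention.sentNum + ") -> "
                        + representative.mentionSpan);
            }
        }
        return links;
    }

    public static void main(String[] args) {
        for (String link : resolve("The room was not nice. It was bright, but cold.")) {
            System.out.println(link);
        }
    }
}
```

If the resolver links the pronoun to the noun phrase, one of the printed lines will pair "It" with "The room". The sentNum and headIndex fields of each mention are 1-based, so they can be matched against the word indices in the dependency output (for example It-1) of the corresponding sentence.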
Regarding "java - NLP: Can I find out who "it" is?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/23339388/