java - NLP - Can I find out who "it" is?

I am using the Stanford parser to get the dependencies from a text, like this:

    // Assumption: the original snippet does not show where lp comes from.
    // Loading the English PCFG model bundled with the parser is one common way.
    LexicalizedParser lp = LexicalizedParser.loadModel("edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz");

    Reader reader = new StringReader("The room was not nice. It was bright, but cold.");
    TreebankLanguagePack tlp = new PennTreebankLanguagePack();
    GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();

    // the dependencies of the entire text
    List<TypedDependency> textDependencies = new ArrayList<TypedDependency>();
    // split the text into sentences, parse each one, and collect its dependencies
    for (List<HasWord> sentence : new DocumentPreprocessor(reader)) {
        Tree parse = lp.apply(sentence);
        GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
        textDependencies.addAll(gs.typedDependenciesCCprocessed());
    }

After running the code above, the list named textDependencies contains the following dependencies:

    det(room-2, The-1)
    nsubj(nice-5, room-2)
    cop(nice-5, was-3)
    neg(nice-5, not-4)
    root(ROOT-0, nice-5)
    nsubj(bright-3, It-1)
    nsubj(cold-6, It-1)
    cop(bright-3, was-2)
    root(ROOT-0, bright-3)
    conj_but(bright-3, cold-6)

Is there a way to find out who "it" is, that is, to get something showing that it actually refers to the room?

Best answer

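What you want is called coreference resolution. Stanford CoreNLP already does that. I couldn't find a demo of it being done programmatically, but if you are running the precompiled executable, you need to add dcoref to the list of annotators, like this: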

    java -cp <all jars> edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,parse,dcoref -file input.txt
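For programmatic use, the same annotator list is normally handed to the pipeline through a `java.util.Properties` object under the `annotators` key. The following is only a minimal sketch of that configuration step, not a full coreference demo: the class name `DcorefProps` and the helper `dcorefProperties` are hypothetical, and the CoreNLP calls appear only as comments so that the block compiles with the JDK alone.

```java
import java.util.Properties;

// Hypothetical helper class; only the annotator names come from the command above.
final class DcorefProps {

    // Annotator list from the command line above. dcoref comes last because it
    // relies on the output of the annotators listed before it.
    static final String ANNOTATORS = "tokenize,ssplit,pos,lemma,ner,parse,dcoref";

    /** Builds the properties a CoreNLP pipeline would be configured with. */
    static Properties dcorefProperties() {
        Properties props = new Properties();
        props.setProperty("annotators", ANNOTATORS);
        return props;
    }

    public static void main(String[] args) {
        Properties props = dcorefProperties();
        System.out.println(props.getProperty("annotators"));
        // With the CoreNLP jars on the classpath, the next steps would be roughly:
        //   StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        //   Annotation document = new Annotation("The room was not nice. It was bright, but cold.");
        //   pipeline.annotate(document);
        // They are left as comments so that this sketch has no CoreNLP dependency.
    }
}
```

After annotation, the coreference chains are stored on the document annotation under CoreNLP's `CorefChainAnnotation` key. The package that class lives in differs between CoreNLP releases, so check the javadoc of the version in use.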

Regarding "java - NLP - Can I find out who "it" is?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/23339388/