Going through the NLTK book, it's not clear how to generate a dependency tree from a given sentence.

The relevant section of the book, the sub-chapter on dependency grammar, gives an example figure, but it doesn't show how to parse a sentence to come up with those relationships. Or maybe I'm missing something fundamental in NLP?

I want something similar to what the Stanford parser does: given the sentence "I shot an elephant in my sleep", it should return something like:

nsubj(shot-2, I-1)
det(elephant-4, an-3)
dobj(shot-2, elephant-4)
prep(shot-2, in-5)
poss(sleep-7, my-6)
pobj(in-5, sleep-7)


We can use the Stanford Parser from NLTK.


You need to download two things from their website:

  1. The Stanford CoreNLP parser.
  2. The language model for your desired language (e.g. the English language model).


Make sure that your language model version matches your Stanford CoreNLP parser version!

The current CoreNLP version as of May 22, 2018 is 3.9.1.

After downloading the two files, extract the zip file anywhere you like.
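Before wiring the jars into NLTK, it can help to confirm that the paths you intend to use actually exist. The following is only a minimal sketch: `missing_files` is a hypothetical helper, and the two file names are placeholders rather than the real names inside the archives.

```python
import os


def missing_files(paths):
    """Return the subset of paths that do not point to an existing file."""
    return [p for p in paths if not os.path.isfile(p)]


# Placeholder locations (assumption): substitute the jars you extracted.
jars = [
    'path_to/parser.jar',
    'path_to/parser-models.jar',
]

for path in missing_files(jars):
    print('Not found: ' + path)
```

An empty result means every listed file is present. It says nothing about whether the parser and model versions match.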


Next, load the model and use it through NLTK:

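from nltk.parse.stanford import StanfordDependencyParser

# Adjust both paths to wherever you extracted the downloaded archives.
path_to_jar = 'path_to/stanford-parser-full-2014-08-27/stanford-parser.jar'
path_to_models_jar = 'path_to/stanford-parser-full-2014-08-27/stanford-parser-3.4.1-models.jar'

dependency_parser = StanfordDependencyParser(path_to_jar=path_to_jar, path_to_models_jar=path_to_models_jar)

# raw_parse returns an iterator of DependencyGraph objects.
result = dependency_parser.raw_parse('I shot an elephant in my sleep')
dep = next(result)  # the built-in next() works on both Python 2 and Python 3

# Each triple is ((head, head_tag), relation, (dependent, dependent_tag)).
list(dep.triples())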





The output of the last line is:

[((u'shot', u'VBD'), u'nsubj', (u'I', u'PRP')),
 ((u'shot', u'VBD'), u'dobj', (u'elephant', u'NN')),
 ((u'elephant', u'NN'), u'det', (u'an', u'DT')),
 ((u'shot', u'VBD'), u'prep', (u'in', u'IN')),
 ((u'in', u'IN'), u'pobj', (u'sleep', u'NN')),
 ((u'sleep', u'NN'), u'poss', (u'my', u'PRP$'))]


I think this is what you want.
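If you would rather see the result in the `relation(head, dependent)` style from the question, the triples can be reformatted with a few lines of plain Python. This is a rough sketch: `format_triples` is a hypothetical helper, not part of NLTK, and the triples are hard-coded from the output above so that it runs without the parser. The triples carry no token positions, so the `-2` style index suffixes are left out.

```python
def format_triples(triples):
    """Render ((head, tag), rel, (dependent, tag)) triples as 'rel(head, dependent)'."""
    lines = []
    for (head_word, _head_tag), rel, (dep_word, _dep_tag) in triples:
        lines.append('{}({}, {})'.format(rel, head_word, dep_word))
    return lines


# Triples copied from the parser output shown above.
triples = [
    (('shot', 'VBD'), 'nsubj', ('I', 'PRP')),
    (('shot', 'VBD'), 'dobj', ('elephant', 'NN')),
    (('elephant', 'NN'), 'det', ('an', 'DT')),
    (('shot', 'VBD'), 'prep', ('in', 'IN')),
    (('in', 'IN'), 'pobj', ('sleep', 'NN')),
    (('sleep', 'NN'), 'poss', ('my', 'PRP$')),
]

for line in format_triples(triples):
    print(line)
```

With a live parser you would pass `dep.triples()` in place of the hard-coded list.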

