The old oak tree from India fell down.

How can I get the following parse tree representation of this sentence using Python and NLTK?

(ROOT (S (NP (NP (DT The) (JJ old) (NN oak) (NN tree)) (PP (IN from) (NP (NNP India)))) (VP (VBD fell) (PRT (RP down)))))


I need a complete example, which I couldn't find on the web!



I have gone through this book chapter to learn about parsing with NLTK, but the problem is that I need a grammar to parse sentences or phrases, and I do not have one. I found this Stack Overflow post, which also asks about a grammar for parsing, but there is no convincing answer there.


So, I am looking for a complete answer that can give me the parse tree given a sentence.


Here is an alternative solution using StanfordCoreNLP instead of NLTK. There are a few libraries built on top of StanfordCoreNLP; I personally use pycorenlp to parse the sentence.

First, download the stanford-corenlp-full folder, which contains the *.jar files. Then run the server from inside that folder (the default port is 9000).

export CLASSPATH="`find . -name '*.jar'`"
java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer [port?] # run server
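
The server can take a little while to come up, so it may help to check that the port is open before sending requests from Python. The following is only a minimal sketch using the standard library; `wait_for_port` is a hypothetical helper name of my own, not part of pycorenlp or CoreNLP, and it only checks that something accepts TCP connections on the port, not that it is actually the CoreNLP server.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection to (host, port) succeeds.

    Retries every `interval` seconds and returns False if nothing
    accepts a connection before `timeout` seconds have passed.
    Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if nothing is listening yet
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("localhost", 9000)` would return True once the server started above is listening on the default port.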


Then, in Python, you can run the following to parse the sentence.

from pycorenlp import StanfordCoreNLP
nlp = StanfordCoreNLP('http://localhost:9000')

text = "The old oak tree from India fell down."

output = nlp.annotate(text, properties={
  'annotators': 'parse',
  'outputFormat': 'json'
})

print(output['sentences'][0]['parse']) # bracketed parse tree of the first sentence
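
The parse comes back as a bracketed string, so if you want to walk the tree rather than just print it, you need to turn that string into a data structure. Below is a rough standard-library sketch of one way to do that. The names `parse_bracketed`, `tagged_leaves` and `EXAMPLE_PARSE` are my own, and the example string is simply the tree from the question, not captured server output. It assumes every node has a label right after its opening parenthesis.

```python
import re

# The bracketed tree from the question, used here as sample input.
EXAMPLE_PARSE = (
    "(ROOT (S (NP (NP (DT The) (JJ old) (NN oak) (NN tree)) "
    "(PP (IN from) (NP (NNP India)))) (VP (VBD fell) (PRT (RP down)))))"
)


def parse_bracketed(text):
    """Turn a bracketed parse string into nested (label, children) tuples."""
    # Tokens are "(", ")", or any run of characters without spaces or parens.
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '(' at token %d" % pos)
        label = tokens[pos + 1]
        pos += 2
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())   # nested constituent
            else:
                children.append(tokens[pos])   # a word
                pos += 1
        pos += 1  # consume the closing ")"
        return (label, children)

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return tree


def tagged_leaves(node):
    """Collect (word, POS tag) pairs from the tree, left to right."""
    label, children = node
    if len(children) == 1 and isinstance(children[0], str):
        return [(children[0], label)]
    pairs = []
    for child in children:
        if isinstance(child, tuple):
            pairs.extend(tagged_leaves(child))
    return pairs


tree = parse_bracketed(EXAMPLE_PARSE)
print(tree[0])
print(tagged_leaves(tree))
```

If NLTK is installed, `nltk.Tree.fromstring(output['sentences'][0]['parse'])` should give you an NLTK `Tree` object from the same string, which you can then draw with `pretty_print()`.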

