Problem description
Following several other posts (e.g. Detect English verb tenses using NLTK, Identifying verb tenses in python, Python NLTK figure out tense), I wrote the following code to determine the tense of a sentence in Python using POS tagging:
from nltk import word_tokenize, pos_tag

def determine_tense_input(sentence):
    # Tokenize the sentence and tag each token with a Penn Treebank POS tag.
    text = word_tokenize(sentence)
    tagged = pos_tag(text)

    # Count the tags associated with each tense; modals (MD) stand in for the future.
    tense = {}
    tense["future"] = len([word for word in tagged if word[1] == "MD"])
    tense["present"] = len([word for word in tagged if word[1] in ["VBP", "VBZ", "VBG"]])
    tense["past"] = len([word for word in tagged if word[1] in ["VBD", "VBN"]])
    return tense
This returns a count of past, present, and future verb forms, and I typically take the tense with the highest count as the tense of the sentence. The accuracy is moderately decent, but I am wondering if there is a better way of doing this.
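The "take the max" step is not shown above, so here is a minimal sketch of how it might look. The helper name `pick_tense` and the choice to return `None` on a tie or when no verbs were counted are assumptions for illustration, not part of the original code.

```python
def pick_tense(counts):
    """Return the tense with the highest count, or None if there is no clear winner.

    `counts` is a dict like the one returned by determine_tense_input,
    e.g. {"future": 0, "present": 1, "past": 3}.
    """
    best = max(counts.values(), default=0)
    if best == 0:
        return None  # no verb tags were counted at all

    # Collect every tense that reaches the top count so ties can be detected.
    winners = [tense for tense, count in counts.items() if count == best]
    return winners[0] if len(winners) == 1 else None
```

Returning `None` on a tie avoids silently preferring whichever key happens to come first in the dict, which a plain `max(counts, key=counts.get)` would do.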
For example, is there by now a package dedicated to extracting the tense of a sentence? [Note: 2 of the 3 Stack Overflow posts are 4 years old, so things may have changed.] Alternatively, should I be using a different parser from within NLTK to increase accuracy? If not, I hope the above code helps someone else!
You could use the Stanford Parser to get a dependency parse of the sentence. The root of the dependency parse will be the 'primary' verb that defines the sentence (I'm not too sure what the specific linguistic term is). You can then use the POS tag on this verb to find its tense, and use that.
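As a rough sketch of this idea, the function below assumes the sentence has already been dependency-parsed (by the Stanford Parser or any other parser) and flattened into `(word, pos_tag, head_index, relation)` tuples, where a `head_index` of 0 marks the root. That tuple layout, the name `root_tense`, and the rule that a modal attached to the root means "future" (borrowed from the question's own heuristic) are all assumptions for illustration. The example parses are hand-written, not real parser output.

```python
# Penn Treebank verb tags mapped to a coarse tense, using the same grouping
# as the question's code.
TAG_TO_TENSE = {
    "VBD": "past", "VBN": "past",
    "VBP": "present", "VBZ": "present", "VBG": "present",
}

def root_tense(tokens):
    """Guess the sentence tense from the root of a dependency parse.

    `tokens` is a list of (word, pos_tag, head_index, relation) tuples with
    1-based token positions; head_index == 0 marks the root (assumed format).
    Returns "past", "present", "future", or None if undecidable.
    """
    # Find the 1-based position of the root token.
    root_pos = next(
        (i for i, tok in enumerate(tokens, start=1) if tok[2] == 0), None
    )
    if root_pos is None:
        return None

    # A modal hanging off the root is treated as future, as in the question.
    for _word, tag, head, _rel in tokens:
        if head == root_pos and tag == "MD":
            return "future"

    # Otherwise fall back to the root's own POS tag.
    return TAG_TO_TENSE.get(tokens[root_pos - 1][1])

# Hand-written illustrative parses (not actual parser output).
past_parse = [
    ("She", "PRP", 2, "nsubj"), ("finished", "VBD", 0, "root"),
    ("the", "DT", 4, "det"), ("report", "NN", 2, "obj"),
]
future_parse = [
    ("She", "PRP", 3, "nsubj"), ("will", "MD", 3, "aux"),
    ("finish", "VB", 0, "root"), ("the", "DT", 5, "det"),
    ("report", "NN", 3, "obj"),
]
print(root_tense(past_parse), root_tense(future_parse))
```

Compared with counting every verb tag in the sentence, this looks only at the main verb and its auxiliaries, so verbs inside subordinate clauses no longer sway the result.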