Problem description
I am trying to determine whether a word is singular or plural by using nltk pos_tag, but the results are not accurate.
So I need a way to determine whether a word is in singular or plural form. Moreover, I need to do it without using any Python package.
Recommended answer
For English, every word should have a root lemma whose default number is singular.
Assuming that you have only nouns in your list, you can try this:
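# Requires the WordNet corpus: run nltk.download('wordnet') once beforehand.
from nltk.stem import WordNetLemmatizer

wnl = WordNetLemmatizer()

def isplural(word):
    # Lemmatize as a noun; a lemma that differs from the input marks a plural.
    lemma = wnl.lemmatize(word, 'n')
    # Compare by value (!=); "is not" compares object identity and is unreliable for strings.
    plural = word != lemma
    return plural, lemma

nounls = ['geese', 'mice', 'bars', 'foos', 'foo',
          'families', 'family', 'dog', 'dogs']

for nn in nounls:
    isp, lemma = isplural(nn)
    print(nn, lemma, isp)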
You will have a problem when a word is not in WordNet: the lemmatizer returns such a word unchanged, so a plural like 'foos' is reported as singular. In that case you have to use a more sophisticated classifier or a finite-state machine instead.
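Since the question asks for something that works without any Python package, here is a minimal sketch of a suffix-based check in plain Python. It is not part of the original answer: the function name looks_plural, the small irregular-plural set, and the list of singular endings are all assumptions made for illustration, and the rules are far from exhaustive.

```python
# Hypothetical, dependency-free heuristic; the word lists are illustrative only.
IRREGULAR_PLURALS = {"geese", "mice", "men", "women", "children", "feet", "teeth"}

# Endings that usually belong to singular nouns even though they end in "s"
# (e.g. "glass", "bus", "analysis").
SINGULAR_S_ENDINGS = ("ss", "us", "is")

def looks_plural(word):
    """Guess whether an English noun is plural using simple suffix rules."""
    w = word.lower()
    if w in IRREGULAR_PLURALS:
        return True
    if w.endswith(SINGULAR_S_ENDINGS):
        return False
    # Treat any remaining word ending in "s" as plural.
    return len(w) > 2 and w.endswith("s")

for noun in ["geese", "dogs", "dog", "families", "family", "glass", "foos"]:
    print(noun, looks_plural(noun))
```

This heuristic handles made-up words such as 'foos', but it misclassifies nouns like 'news' or 'series' and cannot detect unchanged plurals such as 'sheep', so treat it as a rough fallback rather than a replacement for a lexicon.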