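I'd like a Python library function that translates/converts words across different parts of speech. Sometimes it should output multiple words (e.g. "coder" and "code" are both nouns derived from the verb "to code", one being the subject and the other the object).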
# :: String => List of String
print(verbify('writer'))     # => ['write']
print(nounize('written'))    # => ['writer']
print(adjectivate('write'))  # => ['written']
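I mostly care about verb/noun conversions, for a note-taking program I want to write. That is, I could write "caffeine antagonizes A1" or "caffeine is an A1 antagonist", and with some NLP the program could figure out that they mean the same thing. (I know that isn't easy, and that it takes NLP that parses rather than just tags, but I want to hack up a prototype.)
Similar questions...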
Converting adjectives and adverbs to their noun forms
(That answer only stems down to the root POS. I want to go between POS.)
P.S. In linguistics this is called conversion: http://en.wikipedia.org/wiki/Conversion_%28linguistics%29
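Best answer
This is more of a heuristic approach. I only just coded it, so the style is rough. It uses derivationally_related_forms() from WordNet. I have implemented nounify; I'd guess verbify works analogously. From what I've tested, it works pretty well: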
from nltk.corpus import wordnet as wn

def nounify(verb_word):
    """ Transform a verb to the closest noun: die -> death """
    verb_synsets = wn.synsets(verb_word, pos="v")

    # Word not found
    if not verb_synsets:
        return []

    # Get all verb lemmas of the word
    # (in NLTK 3, lemmas(), name() and synset() are methods, not attributes)
    verb_lemmas = [l for s in verb_synsets
                   for l in s.lemmas() if s.name().split('.')[1] == 'v']

    # Get related forms
    derivationally_related_forms = [(l, l.derivationally_related_forms())
                                    for l in verb_lemmas]

    # Filter only the nouns
    related_noun_lemmas = [l for drf in derivationally_related_forms
                           for l in drf[1] if l.synset().name().split('.')[1] == 'n']

    # Extract the words from the lemmas
    words = [l.name() for l in related_noun_lemmas]
    len_words = len(words)

    # Build the result as a list of tuples (word, probability), where the
    # "probability" is the word's share of all related noun lemmas found
    result = [(w, float(words.count(w)) / len_words) for w in set(words)]
    result.sort(key=lambda w: -w[1])

    # Return all the possibilities sorted by probability
    return result
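Since verbify is only guessed at above, here is a rough sketch of how the same heuristic might be generalized to any pair of parts of speech. It assumes NLTK 3 with the WordNet corpus already downloaded; convert_pos and rank_by_frequency are hypothetical helper names of mine, not part of NLTK, and I have not checked how good the results are for directions other than verb to noun.

```python
from collections import Counter


def rank_by_frequency(words):
    """Return (word, share) pairs, most frequent first; ties sorted alphabetically."""
    counts = Counter(words)
    total = sum(counts.values())
    pairs = [(w, c / total) for w, c in counts.items()]
    return sorted(pairs, key=lambda p: (-p[1], p[0]))


def convert_pos(word, from_pos, to_pos):
    """Hypothetical generalization of nounify.

    from_pos and to_pos are WordNet POS tags: 'n', 'v', 'a' or 'r'.
    """
    # Imported lazily so rank_by_frequency can be used without NLTK installed.
    from nltk.corpus import wordnet as wn

    # WordNet tags adjective synsets as either 'a' or 's' (satellite adjective).
    targets = {"a", "s"} if to_pos == "a" else {to_pos}

    related = [
        rel.name()
        for synset in wn.synsets(word, pos=from_pos)
        for lemma in synset.lemmas()
        for rel in lemma.derivationally_related_forms()
        if rel.synset().pos() in targets
    ]
    return rank_by_frequency(related)


def verbify(noun_word):
    """Noun -> related verbs, e.g. verbify('writer')."""
    return convert_pos(noun_word, "n", "v")


def adjectivate(verb_word):
    """Verb -> related adjectives, e.g. adjectivate('write')."""
    return convert_pos(verb_word, "v", "a")
```

As with nounify, the result is a list of (word, share) tuples rather than bare strings, so a caller who wants the List of String shape from the question would take the first element of each tuple. WordNet only records derivational links it knows about, so some conversions will probably come back empty.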
Regarding "python - converting words between verb/noun/adjective forms", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/14489309/