I have successfully retrieved synsets connected to a base synset via other semantic relations, as follows:

    wn.synset('good.a.01').also_sees()
    Out[63]:
    [Synset('best.a.01'),
     Synset('better.a.01'),
     Synset('favorable.a.01'),
     Synset('good.a.03'),
     Synset('obedient.a.01'),
     Synset('respectable.a.01')]

    wn.synset('good.a.01').similar_tos()
    Out[64]:
    [Synset('bang-up.s.01'),
     Synset('good_enough.s.01'),
     Synset('goodish.s.01'),
     Synset('hot.s.15'),
     Synset('redeeming.s.02'),
     Synset('satisfactory.s.02'),
     Synset('solid.s.01'),
     Synset('superb.s.02'),
     Synset('well-behaved.s.01')]

However, the antonym relation seems different. I managed to retrieve the lemma connected to my base synset, but was not able to retrieve the actual synset, like so:

    wn.synset('good.a.01').lemmas()[0].antonyms()
    Out[67]: [Lemma('bad.a.01.bad')]

How can I get the synset, and not the lemma, that is connected via antonymy to my base synset, wn.synset('good.a.01')?
TIA

Solution

For some reason, WordNet indexes antonymy relations at the Lemma level instead of the Synset level (see http://wordnetweb.princeton.edu/perl/webwn?o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=&o3=&o4=&s=good&i=8&h=00001000000000000000000000000000#c), so the question is whether Synsets and Lemmas have a many-to-many or a one-to-one relation.

In the case of ambiguous words (one word, many meanings), we have a one-to-many relation from String to Synset, e.g.

    >>> wn.synsets('dog')
    [Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'), Synset('andiron.n.01'), Synset('chase.v.01')]

In the case of one meaning/concept with multiple representations, we have a one-to-many relation from Synset to String (where String refers to Lemma names):

    >>> dog = wn.synset('dog.n.1')
    >>> dog.definition()
    'a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds'
    >>> dog.lemma_names()
    ['dog', 'domestic_dog', 'Canis_familiaris']

Note: up to this point we have been comparing the relationships between Strings and Synsets, not between Lemmas and Synsets.

The "cute" thing is that Lemma and String have a one-to-one relationship:

    >>> wn.synsets('dog')
    [Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'), Synset('andiron.n.01'), Synset('chase.v.01')]
    >>> wn.synsets('dog')[0]
    Synset('dog.n.01')
    >>> wn.synsets('dog')[0].definition()
    'a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds'
    >>> wn.synsets('dog')[0].lemmas()
    [Lemma('dog.n.01.dog'), Lemma('dog.n.01.domestic_dog'), Lemma('dog.n.01.Canis_familiaris')]
    >>> wn.synsets('dog')[0].lemmas()[0]
    Lemma('dog.n.01.dog')
    >>> wn.synsets('dog')[0].lemmas()[0].name()
    'dog'

The name() method, which is backed by the Lemma object's _name attribute, returns a single string (a unicode string under Python 2), not a list.
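The relationships described above can be checked programmatically. What follows is a minimal sketch and not part of the original answer: `lemma_synset_pairs` and `demo` are hypothetical helper names, and the code assumes only the `lemmas()`, `name()`, and `synset()` methods already used in the sessions above. The NLTK import is kept inside `demo` so the helper itself works on any objects with that interface.

```python
def lemma_synset_pairs(synsets):
    """Return (lemma name, owning synset name) pairs for the given synsets.

    Hypothetical helper: for each synset, walk its lemmas and ask every
    lemma which synset it belongs to via lemma.synset().
    """
    pairs = []
    for ss in synsets:
        for lemma in ss.lemmas():
            pairs.append((lemma.name(), lemma.synset().name()))
    return pairs


def demo(word="dog"):
    """Print the pairs for one word; needs nltk and its 'wordnet' corpus."""
    # Imported here (an assumption of this sketch) so that the helper above
    # can be used and tested without NLTK being installed.
    from nltk.corpus import wordnet as wn

    for lemma_name, synset_name in lemma_synset_pairs(wn.synsets(word)):
        print(f"{lemma_name:20} -> {synset_name}")
```

Calling `demo('dog')` lists several synsets for the one string, several lemma names under one synset, and exactly one owning synset per lemma.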
For the Lemma name, see these points in the source code: https://github.com/nltk/nltk/blob/develop/nltk/corpus/reader/wordnet.py#L202 and https://github.com/nltk/nltk/blob/develop/nltk/corpus/reader/wordnet.py#L444

It also seems that each Lemma belongs to exactly one Synset (while a Synset can hold several Lemmas). From the docstring at https://github.com/nltk/nltk/blob/develop/nltk/corpus/reader/wordnet.py#L220:

    Lemma attributes, accessible via methods with the same name::

    name: The canonical name of this lemma.
    synset: The synset that this lemma belongs to.
    syntactic_marker: For adjectives, the WordNet string identifying the syntactic position relative modified noun. See: http://wordnet.princeton.edu/man/wninput.5WN.html#sect10 For all other parts of speech, this attribute is None.
    count: The frequency of this lemma in wordnet.

So we can do this, knowing that each Lemma object will return exactly one synset:

    >>> wn.synsets('dog')[0].lemmas()[0]
    Lemma('dog.n.01.dog')
    >>> wn.synsets('dog')[0].lemmas()[0].synset()
    Synset('dog.n.01')

Assuming that you are trying to do some sentiment analysis and you need the antonyms of every adjective in WordNet, you can easily do this to collect the Synsets of the antonyms:

    >>> from itertools import chain
    >>> from nltk.corpus import wordnet as wn
    >>> all_adj_in_wn = wn.all_synsets(pos='a')
    >>> def get_antonyms(ss):
    ...     # Follow each lemma's antonym lemmas back to the synsets that own them.
    ...     return set(chain(*[[a.synset() for a in l.antonyms()] for l in ss.lemmas()]))
    ...
    >>> for ss in all_adj_in_wn:
    ...     print(ss, ':', get_antonyms(ss))
    ...
    Synset('able.a.01') : {Synset('unable.a.01')}
    ...
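Applied to the single synset from the question, the same idea can be written as a set comprehension. This is a hedged sketch rather than the answer's own code: `antonym_synsets` and `demo` are hypothetical names, and the function assumes only the `lemmas()`, `antonyms()`, and `synset()` methods shown above.

```python
def antonym_synsets(synset):
    """Return the set of synsets that own the antonym lemmas of `synset`.

    Antonymy is stored on lemmas, so the function goes
    synset -> lemmas -> antonym lemmas -> their synsets.
    """
    return {
        antonym.synset()
        for lemma in synset.lemmas()
        for antonym in lemma.antonyms()
    }


def demo():
    """Look up the question's synset; needs nltk and its 'wordnet' corpus."""
    # Imported here (an assumption of this sketch) so that antonym_synsets
    # can be used and tested without NLTK being installed.
    from nltk.corpus import wordnet as wn

    print(antonym_synsets(wn.synset('good.a.01')))
```

Given the lemma-level result in the question, `[Lemma('bad.a.01.bad')]`, the set printed by `demo()` should contain `Synset('bad.a.01')`. A synset whose lemmas have no antonyms yields an empty set.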