Comparing synonyms with NLTK

Problem description

I can't come up with a stranger problem, guess you'll help me.

from nltk.corpus import wordnet as wn

for p in wn.synsets('change'):
    print(p)

Getting:

Synset('change.n.01')
Synset('change.n.02')
Synset('change.n.03')
Synset('change.n.04')
Synset('change.n.05')
Synset('change.n.06')
Synset('change.n.07')
Synset('change.n.08')
Synset('change.n.09')
Synset('variety.n.06')
Synset('change.v.01')
Synset('change.v.02')
Synset('change.v.03')
Synset('switch.v.03')
Synset('change.v.05')
Synset('change.v.06')
Synset('exchange.v.01')
Synset('transfer.v.06')
Synset('deepen.v.04')
Synset('change.v.10')

For example, I have a string:

a = 'transfer'

I'd like to be able to identify all the synonyms of the word 'change' and know whether, for example, 'transfer' is one of them. How can I ask my program: "Is 'transfer' one of the synonyms of 'change'?"

Solution

First, WordNet indexes concepts (aka synsets) and links each concept to the words that can express it. The following code shows the concepts linked to the word 'change':

>>> from nltk.corpus import wordnet as wn
>>> wn.synsets('change')
[Synset('change.n.01'), Synset('change.n.02'), Synset('change.n.03'), Synset('change.n.04'), Synset('change.n.05'), Synset('change.n.06'), Synset('change.n.07'), Synset('change.n.08'), Synset('change.n.09'), Synset('variety.n.06'), Synset('change.v.01'), Synset('change.v.02'), Synset('change.v.03'), Synset('switch.v.03'), Synset('change.v.05'), Synset('change.v.06'), Synset('exchange.v.01'), Synset('transfer.v.06'), Synset('deepen.v.04'), Synset('change.v.10')]

A synset has several properties:

  • ID number
  • Part-of-Speech label
  • definition
  • lemma names, i.e. the possible words that can be used to instantiate the concept
  • links to other synsets through N-nymy relations (e.g. hypernym, hyponym, meronym)

Here's how to access the above properties in NLTK:

>>> wn.synsets('change')[0]
Synset('change.n.01')
>>> wn.synsets('change')[0].offset()
7296428
>>> wn.synsets('change')[0].pos()
u'n'
>>> wn.synsets('change')[0].definition()
u'an event that occurs when something passes from one state or phase to another'
>>> wn.synsets('change')[0].lemma_names()
[u'change', u'alteration', u'modification']
>>> wn.synsets('change')[0].hypernyms()
[Synset('happening.n.01')]

But a synset doesn't necessarily have synonymy relations. If we define synonyms as words that have similar meanings, then it is the words (i.e. lemmas) that have synonymy relations, not the synsets. In addition, the context in which a word is used determines whether it is a synonym of another word. A single word on its own has limited meaning; it's the "concept" that carries the meaning, and human words instantiate it. At least that's the typical theory of semantics, see chapter 2 in http://goo.gl/ZHzlNF
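If you ignore context for a moment, the loosest reading of "synonym" is "shares at least one synset". Here is a minimal sketch of that check. The helper names (`lemma_names_for`, `shares_a_synset`, `demo`) are my own and not part of NLTK; the only NLTK calls assumed are `wn.synsets()` and `Synset.lemma_names()`, both shown above. The lookup function is passed in as a parameter so the helpers don't depend on how the corpus is loaded.

```python
def lemma_names_for(word, synsets_fn):
    """Collect the lowercased lemma names of every synset returned for `word`.

    `synsets_fn` is any callable that maps a word to a list of synset-like
    objects with a lemma_names() method, e.g. NLTK's wn.synsets.
    """
    names = set()
    for synset in synsets_fn(word):
        names.update(name.lower() for name in synset.lemma_names())
    return names


def shares_a_synset(word, candidate, synsets_fn):
    """True if `candidate` is a lemma of at least one synset of `word`.

    This ignores context entirely, so it is a loose, word-level check.
    """
    return candidate.lower() in lemma_names_for(word, synsets_fn)


def demo():
    """Hypothetical usage; requires NLTK with the WordNet corpus downloaded."""
    from nltk.corpus import wordnet as wn
    return shares_a_synset('change', 'transfer', wn.synsets)
```

Since Synset('transfer.v.06') already appears in the output of wn.synsets('change'), 'transfer' and 'change' share at least that one concept. That only tells you the two words can mean the same thing in some sense, not that they do in your sentence.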

So when you want to ask whether 'transfer' is a synonym of 'change', you first have to:

  • define/select the concept of 'transfer' you're referring to and provide the context in which 'transfer' is used (search for "Word Sense Disambiguation")
  • define which concept of 'change' you're referring to.

Then a comparison of meanings is possible.
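The two steps above could be sketched roughly as follows. This is only an illustration: `is_synonym_in_context` and `demo` are hypothetical helpers of mine, the example sentence is made up, and I'm assuming the simple Lesk implementation in `nltk.wsd` as the disambiguator, which is a basic baseline and often picks a sense a human wouldn't. Any function that takes a token list and a word and returns a synset (or None) can be swapped in.

```python
def is_synonym_in_context(tokens, word, target, disambiguate, synsets_fn):
    """Check whether the sense of `word` chosen for this context is also a sense of `target`.

    tokens       -- the tokenized sentence in which `word` occurs
    disambiguate -- callable (tokens, word) -> synset or None, e.g. nltk.wsd.lesk
    synsets_fn   -- callable word -> list of synsets, e.g. wn.synsets
    """
    sense = disambiguate(tokens, word)
    if sense is None:
        # The disambiguator found no sense, so nothing can be compared.
        return False
    return sense in synsets_fn(target)


def demo():
    """Hypothetical usage; requires NLTK with the WordNet corpus downloaded."""
    from nltk.corpus import wordnet as wn
    from nltk.wsd import lesk
    tokens = "I had to transfer to another train at the station".split()
    return is_synonym_in_context(tokens, 'transfer', 'change', lesk, wn.synsets)
```

Here "which concept of 'change'" is left open: the check passes if the disambiguated sense of 'transfer' matches any synset of 'change'. If you have a specific sense of 'change' in mind, compare against that single synset instead.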

09-12 11:12