This post covers how to find synonyms with WordNet.

Problem description

I am looking for a way to find all the synonyms of a particular word using WordNet. I am using JAWS.

For example:

love(v): admire, adulate, be attached to, be captivated by, be crazy about, be enamored of, be enchanted by, be fascinated with, be fond of, be in love with, canonize, care for, cherish, choose, deify, delight in, dote on, esteem, exalt, fall for, fancy, glorify, go for, gone on....

love(n): adulation, affection, allegiance, amity, amorousness, amour, appreciation, ardency, ardor, attachment, case*, cherishing, crush, delight, devotedness, devotion, emotion, enchantment, enjoyment, fervor, fidelity, flame, fondness, friendship, hankering, idolatry, inclination, infatuation, involvement

In a related question, user Ram pointed to some code, but it does not suffice because it gives vastly different output:

love, passion: any object of warm affection or devotion
beloved, dear, dearest, honey, love: a beloved person; used as terms of endearment
love, sexual love, erotic love: a deep feeling of sexual desire and attraction
love: a score of zero in tennis or squash
sexual love, lovemaking, making love, love, love life: sexual activities (often including sexual intercourse) between two people
love: have a great affection or liking for

So how do I achieve this, and is WordNet suited to what I want to do?

Solution

Sticking with just WordNet, you could try to use semantic similarity to determine if two words (synsets) are similar enough to be synonyms. Below is a quick example that came from modifying another of my answers on semantic similarity using WordNet.

It does have its problems though:

  • Antonyms are mixed in with synonyms
  • It is slow! (as it has to check all ~117k synsets)

Still, it produces more synonyms than using lemma_names alone, so I leave it here in case it might be useful (in conjunction with something else perhaps).
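
For comparison, the lemma_names baseline mentioned above can be sketched as follows. This is a minimal sketch, not part of the original answer: lemma_synonyms is a hypothetical helper name, and it only assumes NLTK-style synset objects that expose lemma_names() and pos().

```python
def lemma_synonyms(synsets, pos=None):
    """Collect the lemma names of the given synsets, optionally for one part of speech.

    `synsets` is any iterable of NLTK-style Synset objects, for example the
    result of nltk.corpus.wordnet.synsets('love').
    """
    names = set()
    for synset in synsets:
        if pos is not None and synset.pos() != pos:
            continue
        for name in synset.lemma_names():
            # WordNet joins multi-word lemmas with underscores.
            names.add(name.replace('_', ' '))
    return sorted(names)
```

With NLTK and its WordNet corpus installed, a call such as lemma_synonyms(wn.synsets('love'), pos='v') returns only the lemmas that share a synset with the word, which is why the similarity-based approach can surface more candidates.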

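>>> from nltk.corpus import wordnet as wn
>>> from nltk.corpus.reader.wordnet import WordNetError
>>> def syn(word, lch_threshold=2.26):
...     for net1 in wn.synsets(word):
...         for net2 in wn.all_synsets():
...             try:
...                 lch = net1.lch_similarity(net2)
...             except WordNetError:
...                 # LCH is not defined across different parts of speech.
...                 continue
...             # The value to compare the LCH to was found empirically.
...             # (The value is very application dependent. Experiment!)
...             # lch can be None when no path exists, so guard the comparison.
...             if lch is not None and lch >= lch_threshold:
...                 yield (net1, net2, lch)
...
>>> for x in syn('love'):
...     print(x)
...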

The code above outputs:

(Synset('love.n.01'), Synset('feeling.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('conditioned_emotional_response.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('emotion.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('worship.n.02'), 2.9444389791664407)
(Synset('love.n.01'), Synset('anger.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('fear.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('fear.n.03'), 2.538973871058276)
(Synset('love.n.01'), Synset('anxiety.n.02'), 2.538973871058276)
(Synset('love.n.01'), Synset('joy.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('love.n.01'), 3.6375861597263857)
(Synset('love.n.01'), Synset('agape.n.02'), 2.9444389791664407)
(Synset('love.n.01'), Synset('agape.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('filial_love.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('ardor.n.02'), 2.9444389791664407)
(Synset('love.n.01'), Synset('amorousness.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('puppy_love.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('devotion.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('benevolence.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('beneficence.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('heartstrings.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('lovingness.n.01'), 2.9444389791664407)
(Synset('love.n.01'), Synset('warmheartedness.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('loyalty.n.02'), 2.9444389791664407)
(Synset('love.n.01'), Synset('hate.n.01'), 2.538973871058276)
(Synset('love.n.01'), Synset('emotional_state.n.01'), 2.538973871058276)
(Synset('love.n.02'), Synset('content.n.05'), 2.538973871058276)
(Synset('love.n.02'), Synset('object.n.04'), 2.9444389791664407)
(Synset('love.n.02'), Synset('antipathy.n.02'), 2.538973871058276)
(Synset('love.n.02'), Synset('bugbear.n.02'), 2.538973871058276)
(Synset('love.n.02'), Synset('execration.n.03'), 2.538973871058276)
(Synset('love.n.02'), Synset('center.n.06'), 2.538973871058276)
(Synset('love.n.02'), Synset('hallucination.n.03'), 2.538973871058276)
(Synset('love.n.02'), Synset('infatuation.n.03'), 2.538973871058276)
(Synset('love.n.02'), Synset('love.n.02'), 3.6375861597263857)
(Synset('beloved.n.01'), Synset('person.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('lover.n.01'), 2.9444389791664407)
(Synset('beloved.n.01'), Synset('admirer.n.03'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('beloved.n.01'), 3.6375861597263857)
(Synset('beloved.n.01'), Synset('betrothed.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('boyfriend.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('darling.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('girlfriend.n.02'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('idolizer.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('inamorata.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('inamorato.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('kisser.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('necker.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('petter.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('romeo.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('soul_mate.n.01'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('squeeze.n.04'), 2.538973871058276)
(Synset('beloved.n.01'), Synset('sweetheart.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('desire.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('sexual_desire.n.01'), 2.9444389791664407)
(Synset('love.n.04'), Synset('love.n.04'), 3.6375861597263857)
(Synset('love.n.04'), Synset('aphrodisia.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('anaphrodisia.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('passion.n.05'), 2.538973871058276)
(Synset('love.n.04'), Synset('sensuality.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('amorousness.n.02'), 2.538973871058276)
(Synset('love.n.04'), Synset('fetish.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('libido.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('lecherousness.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('nymphomania.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('satyriasis.n.01'), 2.538973871058276)
(Synset('love.n.04'), Synset('the_hots.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('bowling_score.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('football_score.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('baseball_score.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('basketball_score.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('number.n.02'), 2.538973871058276)
(Synset('love.n.05'), Synset('score.n.03'), 2.9444389791664407)
(Synset('love.n.05'), Synset('stroke.n.06'), 2.538973871058276)
(Synset('love.n.05'), Synset('birdie.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('bogey.n.02'), 2.538973871058276)
(Synset('love.n.05'), Synset('deficit.n.03'), 2.538973871058276)
(Synset('love.n.05'), Synset('double-bogey.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('duck.n.02'), 2.538973871058276)
(Synset('love.n.05'), Synset('eagle.n.02'), 2.538973871058276)
(Synset('love.n.05'), Synset('double_eagle.n.01'), 2.538973871058276)
(Synset('love.n.05'), Synset('game.n.06'), 2.538973871058276)
(Synset('love.n.05'), Synset('lead.n.07'), 2.538973871058276)
(Synset('love.n.05'), Synset('love.n.05'), 3.6375861597263857)
(Synset('love.n.05'), Synset('match.n.05'), 2.538973871058276)
(Synset('love.n.05'), Synset('par.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('bondage.n.03'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('outercourse.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('safe_sex.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('sexual_activity.n.01'), 2.9444389791664407)
(Synset('sexual_love.n.02'), Synset('conception.n.02'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('sexual_intercourse.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('pleasure.n.05'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('sexual_love.n.02'), 3.6375861597263857)
(Synset('sexual_love.n.02'), Synset('carnal_abuse.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('coupling.n.03'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('reproduction.n.05'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('foreplay.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('perversion.n.02'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('autoeroticism.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('promiscuity.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('lechery.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('homosexuality.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('bisexuality.n.02'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('heterosexuality.n.01'), 2.538973871058276)
(Synset('sexual_love.n.02'), Synset('bestiality.n.02'), 2.538973871058276)
# ...
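
The scores above take only three distinct values. That follows from how LCH similarity is defined: -log(p / (2 * D)), where p is the number of nodes on the shortest path between the two synsets and D is the maximum depth of the taxonomy. The sketch below reproduces the values by hand. The depth of 19 is an assumption inferred from the printed scores (3.6375861597263857 equals log(38)), not something stated in the answer, and lch_score is a hypothetical helper.

```python
import math

NOUN_DEPTH = 19  # assumption: inferred from the scores in the output above


def lch_score(edges, depth=NOUN_DEPTH):
    """LCH similarity for two synsets that are `edges` hypernym/hyponym links apart."""
    # The path length counts nodes, so it is one more than the number of links.
    return -math.log((edges + 1) / (2.0 * depth))


# Print the score for synsets that are 0, 1, 2 and 3 links apart.
for edges in range(4):
    print(edges, lch_score(edges))
```

Under that assumption, the default threshold of 2.26 keeps synsets at most two links away from a sense of the word (the sense itself, parents, children, grandparents, grandchildren, siblings) and rejects anything three links away. It also suggests why antonyms show up: hate.n.01 is listed with the same score as anger.n.01 and joy.n.01, which is the score of a synset two links away.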

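One way to reduce the first problem listed in the answer (antonyms mixed in with synonyms) is to drop candidates that WordNet itself links as antonyms. This is a minimal sketch, not part of the original answer: without_antonyms and antonym_synsets are hypothetical helpers, and they assume NLTK-style objects where Synset.lemmas() returns lemmas, Lemma.antonyms() returns antonym lemmas, and Lemma.synset() returns the owning synset.

```python
def antonym_synsets(synset):
    """Synsets reachable from `synset` through a direct lemma-level antonym link."""
    return {ant.synset() for lemma in synset.lemmas() for ant in lemma.antonyms()}


def without_antonyms(triples):
    """Yield only the (net1, net2, score) triples where net2 is not an antonym of net1."""
    cache = {}  # antonym sets per source synset, computed once each
    for net1, net2, score in triples:
        if net1 not in cache:
            cache[net1] = antonym_synsets(net1)
        if net2 not in cache[net1]:
            yield net1, net2, score
```

Wrapping the generator from the answer, as in without_antonyms(syn('love')), would filter a pair only if WordNet records the two synsets as direct antonyms. Merely contrasting terms that sit close together in the hierarchy, such as anger or fear in the output above, would still pass through.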

06-26 05:46