WordNet similarity in Java: JAWS, JWNL, or Java WN::Similarity?

Question

I need to use WordNet in a Java-based app. I want to:

search synsets

find similarity/relatedness between synsets

My app uses RDF graphs, and I know there are SPARQL endpoints for WordNet, but I think it's better to keep a local copy of the dataset, since it's not too big.

I found the following libraries:

General library - JAWS http://lyle.smu.edu/~tspell/jaws/index.html
General library - JWNL http://sourceforge.net/projects/jwordnet
Similarity library (Perl) - WordNet::Similarity http://wn-similarity.sourceforge.net/
Java version of WordNet::Similarity (beta) http://www.cogs.susx.ac.uk/users/drh21/

What would you recommend for my app?

Is it possible to use a Perl library from a Java app via some bindings?

Thanks!
Mulone

Recommended answer

I use JAWS for ordinary WordNet lookups because it's easy to use. For similarity metrics, though, I use the library located here. You'll also need to download this folder, which contains pre-processed WordNet and corpus data, for it to work. The code can be used like this, assuming you placed that folder inside another folder called "lib" in your project folder:

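import java.util.Map.Entry;
import java.util.TreeMap;

// "./lib" is the folder holding the pre-processed WordNet and corpus data; "3.0" is presumably the WordNet version
JWS ws = new JWS("./lib", "3.0");
Resnik res = ws.getResnik();
// word1, word2 and partOfSpeech are Strings, e.g. "hobby", "gardening" and "n"
TreeMap<String, Double> scores1 = res.res(word1, word2, partOfSpeech);
for (Entry<String, Double> e : scores1.entrySet())
    System.out.println(e.getKey() + "\t" + e.getValue());
System.out.println("\nhighest score\t=\t" + res.max(word1, word2, partOfSpeech) + "\n\n\n");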

This will print something like the following, showing the similarity score for each possible combination of synsets represented by the two words being compared:

hobby#n#1,gardening#n#1 2.6043996588901104
hobby#n#2,gardening#n#1 -0.0
hobby#n#3,gardening#n#1 -0.0
highest score   =   2.6043996588901104
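
The `-0.0` scores are consistent with how Resnik similarity is usually defined: the information content, -log p, of the most specific concept that subsumes both senses. If two senses share only a concept that covers everything (such as a root node), p is 1 and the negated logarithm evaluates to negative zero in Java. The sketch below illustrates that arithmetic only; the class name, method name and counts are hypothetical and are not taken from the similarity library's source.

```java
final class InformationContentDemo {
    /** Information content of a concept: -ln(conceptCount / totalCount). */
    static double informationContent(long conceptCount, long totalCount) {
        if (conceptCount <= 0 || conceptCount > totalCount) {
            throw new IllegalArgumentException("need 0 < conceptCount <= totalCount");
        }
        return -Math.log((double) conceptCount / totalCount);
    }

    public static void main(String[] args) {
        // A concept covering every counted occurrence (such as a root node) has p = 1.
        System.out.println(informationContent(1000, 1000)); // prints -0.0
        // A rarer concept carries more information, so the value is larger.
        System.out.println(informationContent(10, 1000));
    }
}
```

Since `-0.0 == 0.0` is true for primitive doubles in Java, such scores can simply be treated as zero similarity.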

There are also methods that let you specify the sense of either or both words: res(String word1, int senseNum1, String word2, partOfSpeech), etc. Unfortunately, the source documentation is not JavaDoc, so you'll need to inspect it manually. The source can be downloaded here.
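
Judging from the sample output, each key in the returned map looks like `word#pos#sense,word#pos#sense`. Assuming that format holds, the best-scoring sense pair can be picked out and split into sense numbers, for example to pass to the sense-specific methods. The helper below is a hypothetical sketch that uses only the standard library and runs on a hand-built map rather than on real library output.

```java
import java.util.Map;
import java.util.TreeMap;

final class BestSensePair {
    /** Returns the entry with the highest score, or null for an empty map. */
    static Map.Entry<String, Double> best(Map<String, Double> scores) {
        Map.Entry<String, Double> top = null;
        for (Map.Entry<String, Double> e : scores.entrySet()) {
            if (top == null || e.getValue() > top.getValue()) {
                top = e;
            }
        }
        return top;
    }

    /** Extracts the two sense numbers from a key such as "hobby#n#1,gardening#n#1". */
    static int[] senseNumbers(String key) {
        String[] sides = key.split(",");
        if (sides.length != 2) {
            throw new IllegalArgumentException("unexpected key: " + key);
        }
        int[] senses = new int[2];
        for (int i = 0; i < 2; i++) {
            String[] parts = sides[i].split("#"); // word, part of speech, sense number
            senses[i] = Integer.parseInt(parts[parts.length - 1].trim());
        }
        return senses;
    }

    public static void main(String[] args) {
        // Scores copied from the sample output shown earlier.
        Map<String, Double> scores = new TreeMap<>();
        scores.put("hobby#n#1,gardening#n#1", 2.6043996588901104);
        scores.put("hobby#n#2,gardening#n#1", -0.0);
        scores.put("hobby#n#3,gardening#n#1", -0.0);

        Map.Entry<String, Double> top = best(scores);
        int[] senses = senseNumbers(top.getKey());
        System.out.println(top.getKey() + " -> senses " + senses[0] + " and " + senses[1]);
    }
}
```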

The available algorithms are:

JWSRandom random = new JWSRandom(ws.getDictionary(), true, 16.0); // random scores, useful as a baseline
Resnik res = ws.getResnik();
LeacockAndChodorow lch = ws.getLeacockAndChodorow();
AdaptedLesk adLesk = ws.getAdaptedLesk();
AdaptedLeskTanimoto alt = ws.getAdaptedLeskTanimoto();
AdaptedLeskTanimotoNoHyponyms altnh = ws.getAdaptedLeskTanimotoNoHyponyms();
HirstAndStOnge hso = ws.getHirstAndStOnge();
JiangAndConrath jcn = ws.getJiangAndConrath();
Lin lin = ws.getLin();
WuAndPalmer wup = ws.getWuAndPalmer();
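
To give a feel for what one of these measures computes, here is a minimal sketch of the Wu and Palmer formula, 2 * depth(lcs) / (depth(a) + depth(b)), where lcs is the deepest concept that subsumes both inputs and depth is counted in nodes from the root. It runs on a tiny, made-up, single-parent taxonomy; the class and its methods are hypothetical, and this is not how the library itself is implemented (WordNet concepts can have several hypernyms, which the sketch ignores).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ToyWuPalmer {
    // child -> parent links of a tiny, made-up taxonomy (not WordNet data)
    private final Map<String, String> parent = new HashMap<>();

    void addEdge(String child, String parentNode) {
        parent.put(child, parentNode);
    }

    /** Path from a node up to the root, starting with the node itself. */
    private List<String> pathToRoot(String node) {
        List<String> path = new ArrayList<>();
        for (String n = node; n != null; n = parent.get(n)) {
            path.add(n);
        }
        return path;
    }

    /** Depth counted in nodes, so the root has depth 1. */
    int depth(String node) {
        return pathToRoot(node).size();
    }

    /** Deepest ancestor shared by a and b (a node counts as its own ancestor). */
    String lowestCommonSubsumer(String a, String b) {
        List<String> ancestorsOfB = pathToRoot(b);
        for (String n : pathToRoot(a)) {
            if (ancestorsOfB.contains(n)) {
                return n;
            }
        }
        throw new IllegalArgumentException("no common ancestor");
    }

    /** Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b)). */
    double similarity(String a, String b) {
        String lcs = lowestCommonSubsumer(a, b);
        return 2.0 * depth(lcs) / (depth(a) + depth(b));
    }

    public static void main(String[] args) {
        ToyWuPalmer t = new ToyWuPalmer();
        t.addEdge("animal", "entity");
        t.addEdge("plant", "entity");
        t.addEdge("dog", "animal");
        t.addEdge("cat", "animal");
        System.out.println(t.similarity("dog", "cat"));   // siblings under "animal"
        System.out.println(t.similarity("dog", "plant")); // share only the root
    }
}
```

In this toy tree the sibling pair scores higher than the pair that shares only the root, which is the behaviour the measure is meant to capture.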

Also, it requires the jar file for MIT's JWI (the MIT Java WordNet Interface).
