Question
There doesn't seem to be a changelog or anything of that sort available at wordnet.princeton.edu.
Recommended answer
To add to @abarisone's answer, the synset IDs themselves can differ between WordNet 3.0 and WordNet 3.1 :(
For example, in WordNet 3.1 a chair is 103005231-n.
However, in WordNet 3.0 it was 103001627-n. You cannot look that ID up at http://wordnet-rdf.princeton.edu/wn31/103001627-n or at http://wordnet-rdf.princeton.edu/wn30/103001627-n. Instead you need to use http://wordnet-rdf.princeton.edu/wn30/03001627-n, which incorrectly redirects to 102992974-n.
I think this is a bug in the WordNet RDF 3.1 online app, because 102992974-n doesn't officially exist. You can't even search for it (online or offline). And if you download the RDF/JSON-LD file from that page, it gives you 103005231-n.
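The URLs above suggest that the 9-digit IDs are the 8-digit database offset with one extra leading digit (103001627-n versus 03001627-n). A minimal Python sketch of that conversion follows. The helper name `rdf_id_to_offset` is made up for illustration, and the reading of the leading digit as a prefix added by the RDF site (it appears to mark the part of speech) is an assumption drawn only from the examples in this answer.

```python
def rdf_id_to_offset(rdf_id: str) -> str:
    """Turn a 9-digit RDF-style ID such as '103005231-n' into '03005231-n'."""
    number, pos = rdf_id.split("-")
    if len(number) != 9 or not number.isdigit():
        raise ValueError(f"expected 9 digits before the POS tag, got {number!r}")
    # Assumption: the first digit is a prefix added by the RDF site and the
    # remaining 8 digits are the offset used in the WordNet database files.
    return f"{number[1:]}-{pos}"


print(rdf_id_to_offset("103005231-n"))  # 03005231-n
print(rdf_id_to_offset("103001627-n"))  # 03001627-n
```

The second result is the form that the wn30 URL accepts in the example above.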
In wn3.1.dict/dict/index.noun:
chair n 5 4 @ ~ %p + 5 2 03005231 00599171 10488547 03275941 03005700
There's no mention of 02992974 anywhere in that file.
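To check an index line like the one above programmatically, here is a rough Python sketch. It assumes the field layout documented for the WordNet index files (lemma, pos, synset_cnt, p_cnt, the pointer symbols, sense_cnt, tagsense_cnt, then the synset offsets). The function name `parse_index_line` is hypothetical.

```python
def parse_index_line(line: str) -> dict:
    """Split one index.* line into named fields (layout assumed as described above)."""
    fields = line.split()
    lemma, pos = fields[0], fields[1]
    synset_cnt = int(fields[2])
    p_cnt = int(fields[3])
    pointers = fields[4:4 + p_cnt]      # p_cnt pointer symbols, e.g. '@', '~'
    rest = fields[4 + p_cnt:]           # sense_cnt, tagsense_cnt, offsets...
    tagsense_cnt = int(rest[1])
    offsets = rest[2:2 + synset_cnt]    # one 8-digit offset per synset
    return {
        "lemma": lemma,
        "pos": pos,
        "pointers": pointers,
        "tagsense_cnt": tagsense_cnt,
        "offsets": offsets,
    }


entry = parse_index_line(
    "chair n 5 4 @ ~ %p + 5 2 03005231 00599171 10488547 03275941 03005700"
)
print(entry["offsets"])
print("03005231" in entry["offsets"])  # True
print("02992974" in entry["offsets"])  # False
```

Running the same membership check over every line of the file is one way to confirm that an offset is absent.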
Both of these issues are confusing. I wonder why they changed synset IDs in a minor revision.
Regarding the status of WordNet synset IDs:
The conclusion is that, for now, using WordNet 3.0 synset IDs is safest.
For future work, consider using the Inter-Lingual Index from the Global Wordnet Association (coming soon), which will have IDs compatible with WordNet 3.0.
Quotes from the wn-users mailing list, Oct 30, 2015:
The URI is built from the "dblocation" field, which is a byte offset from the beginning of the relevant character-based database file (I’m not sure which). This will change from release to release as items are removed and added and moved around.
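A toy Python illustration of why a byte-offset identifier is unstable. The two "releases" below are invented in-memory byte strings, not real WordNet data, and the helper names are hypothetical.

```python
def line_offsets(data: bytes) -> dict:
    """Map each line to the zero-padded byte offset at which it starts."""
    result = {}
    pos = 0
    for line in data.splitlines(keepends=True):
        result[line.rstrip(b"\n")] = f"{pos:08d}"
        pos += len(line)
    return result


def line_at(data: bytes, offset: int) -> bytes:
    """Return the line that starts at byte `offset`."""
    end = data.index(b"\n", offset)
    return data[offset:end]


# Invented stand-ins for two releases of a data file.
old_release = b"alpha entry\nchair entry\n"
new_release = b"alpha entry\nnew entry\nchair entry\n"

print(line_offsets(old_release)[b"chair entry"])  # 00000012
print(line_offsets(new_release)[b"chair entry"])  # 00000022
print(line_at(new_release, 22))                   # b'chair entry'
```

Adding a single entry earlier in the file shifts the offset, and therefore the ID, of everything after it.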
To the best of my knowledge…. FYI a little known fact is that the sense keys (e.g., "ability%1:07:00::") are stable between releases, except when senses are split or merged. This provides a stable way to refer to synsets across releases, rather than use synset numbers. Also you can find the mappings between synset numbers in different releases by looking for the same sense keys. (sensekey->synset is a many-to-1 mapping: A synset may have multiple sense keys, one for each word+sense in the synset. But a sense key maps to exactly one synset). Best wishes, Pete
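The mapping described in the quote above could be sketched in Python roughly as follows. This assumes each line of a release's index.sense file has the form `sense_key synset_offset sense_number tag_cnt`, and that the digit right after `%` in a sense key is the synset type (1 for nouns). The two sample lines are illustrative only: the offsets come from this answer, while the sense key `chair%1:06:00::` and the trailing numbers are assumptions. The function names are hypothetical.

```python
def load_sense_index(text: str) -> dict:
    """Map sense_key -> (ss_type, synset_offset) from index.sense-style text."""
    mapping = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        sense_key, offset = parts[0], parts[1]
        # The digit after '%' is the synset type; keeping it avoids mixing up
        # equal offsets that belong to different part-of-speech data files.
        ss_type = sense_key.split("%", 1)[1].split(":", 1)[0]
        mapping[sense_key] = (ss_type, offset)
    return mapping


def map_synsets(old: dict, new: dict) -> dict:
    """Map each old synset to the set of new synsets sharing a sense key."""
    result = {}
    for sense_key, old_synset in old.items():
        if sense_key in new:
            result.setdefault(old_synset, set()).add(new[sense_key])
    return result


# Illustrative lines only (see the assumptions stated above).
wn30_text = "chair%1:06:00:: 03001627 1 0\n"
wn31_text = "chair%1:06:00:: 03005231 1 0\n"

mapping = map_synsets(load_sense_index(wn30_text), load_sense_index(wn31_text))
print(mapping)
```

The values are sets because a synset that was split between releases would map to more than one new synset, which is exactly the case where this approach needs manual review.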
Hi Hendy,
Yes, WordNet synset identifiers are based on the byte offset of the descriptor in a given release of WordNet; as such they are far from stable across versions of WordNet. The sense identifiers are more stable but can still be unreliable, as senses do get split and merged. Also, there are two slightly different versions of WordNet 3.1 and the WordNet RDF version accepts synset identifiers from either... this is of course, as others have commented, all very confusing.
For this reason, the Global WordNet Association has started work on an Inter-Lingual Index, which we expect to be online soon (i.e., in time for the Global WordNet Conference in January), and will give each synset a single unchanging URI.
Piek Vossen gave a good talk about this recently and the slides are online here: http://ldl2014.org/slides/Vossen-LOD-CILI.pdf
For the moment, I would recommend using WN 3.0 identifiers to link synsets, which the WordNet Interlingual Index will also be based on.
Regards, John