Representing natural language as RDF

Question

How many of the concepts conveyed in natural language can RDF/OWL represent? I'm still learning RDF and other semantic technologies, but as I currently understand it, information is typically represented as triples of the form (subject, predicate, object). So I can imagine how the sentence "Bob has a hat" might be represented. However, how would you represent a more complicated sentence like "Bob, over on 42nd street, will have a job at the Mall after the owner approves"? Are there conventions for tags representing nouns, verbs, ownership, causality, tense, etc.?

Note, I'm not asking how to automatically convert arbitrary natural language text to RDF (as this currently appears impossible). I'm just trying to understand how RDF might be used to represent the same information that natural language represents.
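As a point of reference for the simple case, here is a minimal sketch of how "Bob has a hat" might be written down as triples and printed in N-Triples form. The `http://example.org/` namespace, the `has` and `Hat` names, and the blank node label `_:hat1` are all assumptions made for illustration; nothing in the question fixes them.

```python
# Hypothetical namespace; the question does not prescribe any IRIs.
EX = "http://example.org/"
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"


def iri(local_name):
    """Wrap a local name in the (assumed) example namespace."""
    return f"<{EX}{local_name}>"


# "Bob has a hat": there is some hat (a blank node), and Bob has it.
triples = [
    (iri("Bob"), iri("has"), "_:hat1"),
    ("_:hat1", RDF_TYPE, iri("Hat")),
]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


print(to_ntriples(triples))
```

The indefinite "a hat" is modelled here as a blank node with a type, which is one common reading; naming a specific hat individual would be another.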

Answer

Maybe have a look at the Attempto project, whose goal is to define a fragment of English that can be automatically mapped to first-order logic. Part of this effort is a mapping to OWL 2 DL; see e.g. "Writing OWL ontologies in ACE".

Your example sentence

Bob, over on 42nd street, will have a job at the Mall after the owner approves

could be rewritten in Attempto Controlled English (ACE) as

If an owner of Mall approves John whose address is "42nd street"
    then he is employed by Mall.

(or something similar, depending on what exactly you intend to say.)

This sentence can be automatically mapped to an OWL 2 SubClassOf axiom:

   SubClassOf(
      ObjectIntersectionOf(
         ObjectOneOf(
            :Mall
         )
         ObjectSomeValuesFrom(
            :owner
            ObjectSomeValuesFrom(
               :approve
               ObjectIntersectionOf(
                  ObjectOneOf(
                     :John
                  )
                  DataHasValue(
                     :address
                     "42nd street"^^<http://www.w3.org/2001/XMLSchema#string>
                  )
               )
            )
         )
      )
      ObjectSomeValuesFrom(
         :employ
         ObjectOneOf(
            :John
         )
      )
   )
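To show how the nesting of this axiom fits together, the following is a rough Python sketch that builds the same expression as nested tuples and renders it as a single line of functional-style syntax. The tuple encoding and the `render` helper are illustrative assumptions, not part of Attempto or of any OWL library.

```python
def render(expr):
    """Render a nested tuple (constructor, *args) as functional-style syntax.

    Plain strings (names and literals) are emitted unchanged.
    """
    if isinstance(expr, str):
        return expr
    head, *args = expr
    return f"{head}({' '.join(render(a) for a in args)})"


XSD_STRING = "<http://www.w3.org/2001/XMLSchema#string>"

# John, restricted to the individual whose address is "42nd street".
john_at_42nd = (
    "ObjectIntersectionOf",
    ("ObjectOneOf", ":John"),
    ("DataHasValue", ":address", f'"42nd street"^^{XSD_STRING}'),
)

# Sub-class: Mall, with an owner that approves John.
# Super-class: things that employ John.
axiom = (
    "SubClassOf",
    ("ObjectIntersectionOf",
        ("ObjectOneOf", ":Mall"),
        ("ObjectSomeValuesFrom", ":owner",
            ("ObjectSomeValuesFrom", ":approve", john_at_42nd))),
    ("ObjectSomeValuesFrom", ":employ", ("ObjectOneOf", ":John")),
)

print(render(axiom))
```

Read this way, the axiom says that anything matching the left-hand class expression also belongs to the class of things that employ John, which is how the if/then of the ACE sentence is encoded.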

This mapping implements certain conventions about basic word classes:

  • common nouns map to OWL class names
  • proper names map to OWL individual names
  • transitive verbs, transitive adjectives, and of-constructions map to OWL property names: data property names if their argument is a number or string, object property names otherwise
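The conventions listed above can be summarised as a small lookup, sketched here in Python. The function name, the word-class labels, and the `argument_is_literal` flag are hypothetical; the actual ACE-to-OWL implementation may be organised quite differently.

```python
# Word classes that the mapping turns into OWL property names.
PROPERTY_WORD_CLASSES = {"transitive verb", "transitive adjective", "of-construction"}


def owl_entity_kind(word_class, argument_is_literal=False):
    """Return the kind of OWL name a word of the given class maps to.

    argument_is_literal: True when the word's argument is a number or a
    string, which selects a data property over an object property.
    """
    if word_class == "common noun":
        return "class"
    if word_class == "proper name":
        return "individual"
    if word_class in PROPERTY_WORD_CLASSES:
        return "data property" if argument_is_literal else "object property"
    # Word classes outside the three conventions have no mapping here.
    raise ValueError(f"no mapping for word class: {word_class!r}")


print(owl_entity_kind("proper name"))                          # John, Mall
print(owl_entity_kind("transitive verb"))                      # approve
print(owl_entity_kind("of-construction", argument_is_literal=True))
```

In the example sentence, `approve` takes an individual as its argument and so becomes an object property, while `address` takes the string "42nd street" and becomes a data property.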

Many word classes that ACE supports are not supported by this mapping, e.g. intransitive and ditransitive verbs, intransitive adjectives, and adverbs. The coverage could be extended, e.g. intransitive verbs could map to OWL classes (e.g. "John sleeps." could be taken to mean that the individual John belongs to the class of sleepers). It is less clear how to handle e.g. ditransitive verbs and adverbs.
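As a sketch of the suggested extension for intransitive verbs, "John sleeps." could be emitted as a class assertion. The helper below and the class name `sleeper` are assumptions chosen for illustration; how a verb would actually be turned into a class name is left open.

```python
def intransitive_verb_assertion(individual, class_name):
    """Encode '<individual> <verb>s.' as membership in a caller-chosen class.

    In OWL 2 functional-style syntax the class comes first, then the individual.
    """
    return f"ClassAssertion(:{class_name} :{individual})"


print(intransitive_verb_assertion("John", "sleeper"))
# prints: ClassAssertion(:sleeper :John)
```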

In general, English is much richer in terms of its building blocks (nouns, different types of adjectives, different types of verbs, ...) than OWL (which has classes, individuals, object and data properties, and (typed) data items such as strings and numbers). And this is just the "word vs entity" level. Things like tense are more complicated as they have many surface representations in English and lack any built-ins on the OWL side.


08-11 18:39