This article describes an approach to transforming XML with XSLT according to an XSD.

Problem description

I would like to create an XSLT stylesheet that transforms an XML document so that all elements and attributes that are not defined in the XSD are excluded from the output XML.

Let's say you have this XSD.

<xs:element name="parent">
    <xs:complexType>
        <xs:sequence>
            <xs:element name="keptElement1" />
            <xs:element name="keptElement2" />
        </xs:sequence>

        <xs:attribute name="keptAttribute1" />
        <xs:attribute name="keptAttribute2" />
    </xs:complexType>
</xs:element>

And you have this input XML.

<parent keptAttribute1="kept"
    keptAttribute2="kept"
    notKeptAttribute3="not kept"
    notKeptAttribute4="not kept">

    <notKeptElement0>not kept</notKeptElement0>
    <keptElement1>kept</keptElement1>
    <keptElement2>kept</keptElement2>
    <notKeptElement3>not kept</notKeptElement3>
</parent>

Then I would like the output XML to look like this.

<parent keptAttribute1="kept"
    keptAttribute2="kept">

    <keptElement1>kept</keptElement1>
    <keptElement2>kept</keptElement2>
</parent>

I am able to do this by naming the elements explicitly, but that is about as far as my XSLT skills reach. I have trouble doing this generically for all elements and all attributes.

Recommended answer

You have two challenges here: (1) identifying the set of element names and attributes declared in the schema, with appropriate context information for local declarations, and (2) writing XSLT to retain the elements and attributes which match those names or names-and-contexts.

There is also a third issue, namely specifying clearly what you mean by "elements and attributes that are (or are not) defined in the XSD schema". For purposes of discussion I'll assume you mean elements and attributes which could be bound to element or attribute declarations in the schema, in a validation episode (i) rooted at an arbitrary point in the input document tree and (ii) starting with a top-level element declaration or attribute declaration. This assumption has several consequences. (a) Local element declarations will only match things in context: in your example, keptElement1 and keptElement2 will be retained only when they are children of parent, not otherwise. (b) There is no guarantee that the elements in the input would in fact be bound to the element declarations in question: if one of their ancestors is locally invalid, things get complicated fast in both XSD 1.0 and 1.1. (c) We don't allow for starting validation from a named type definition; we could, but it doesn't sound as if that's what you're interested in. (d) We don't allow for starting validation from local element or attribute declarations.

With those assumptions explicit, we can turn to your problem.

The first task requires that you make a list of (a) all the elements and attributes with top-level declarations in your schema, and (b) all the elements and attributes reachable from them. For top-level declarations, all we need to record is the kind of object (element or attribute) and the expanded name. For local objects, we need the kind of object and the full path from a top-level element declaration. For your sample schema, list (a) consists of

  • element {}parent

(I am using the convention of writing expanded names with the namespace name in braces; some call this Clark notation, after James Clark. The empty braces here mean the name is in no namespace.)

List (b) consists of

  • element {}parent/{}keptElement1
  • element {}parent/{}keptElement2
  • attribute {}parent/{}keptAttribute1
  • attribute {}parent/{}keptAttribute2

In more complicated schemas, there will be a certain amount of bookkeeping as you go through the process of generating this list.
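Since an XSD is itself an XML document, the list can be generated with a small stylesheet run against the schema rather than against the instance document. The following XSLT 1.0 stylesheet is only a minimal sketch under assumptions that go beyond the answer: the schema has no target namespace, it has xs:schema as its root element, all complex types are declared inline (anonymous), and there are no ref attributes, named types, groups, or imports. It writes paths in XPath pattern form (parent/@keptAttribute1) instead of Clark notation so they can be pasted into match patterns.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <xsl:output method="text"/>

  <!-- List (a): start from the top-level declarations. -->
  <xsl:template match="/xs:schema">
    <xsl:apply-templates select="xs:element | xs:attribute"/>
  </xsl:template>

  <!-- Emit one line for this element declaration, then visit the local
       declarations whose nearest enclosing xs:element is this one. -->
  <xsl:template match="xs:element[@name]">
    <xsl:param name="path" select="''"/>
    <xsl:variable name="here" select="concat($path, @name)"/>
    <xsl:value-of select="concat('element ', $here, '&#10;')"/>
    <xsl:apply-templates select="
        xs:complexType//xs:element[@name]
          [generate-id(ancestor::xs:element[1]) = generate-id(current())]
      | xs:complexType//xs:attribute[@name]
          [generate-id(ancestor::xs:element[1]) = generate-id(current())]">
      <xsl:with-param name="path" select="concat($here, '/')"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- Emit one line for an attribute declaration. -->
  <xsl:template match="xs:attribute[@name]">
    <xsl:param name="path" select="''"/>
    <xsl:value-of select="concat('attribute ', $path, '@', @name, '&#10;')"/>
  </xsl:template>

</xsl:stylesheet>
```

For the sample schema this should produce one line per declaration, such as "element parent", "element parent/keptElement1" and "attribute parent/@keptAttribute1". Named types, element references and namespaces are exactly the bookkeeping this sketch leaves out.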

Your second task is to write an XSLT stylesheet that keeps the elements and attributes in the list and drops the rest. (I'm assuming here that when you drop an element, you drop all its contents, too; your question talks about elements, not tags.)

For each element in the list, write an appropriate identity transform, using the context given in the list:

<xsl:template match="parent">
  <xsl:copy>
    <xsl:apply-templates select="@* | node()"/>
  </xsl:copy>
</xsl:template>

You can write a separate template for each element, or you can combine several elements in one match pattern:

<xsl:template match="parent
                    | parent/keptElement1
                    | parent/keptElement2">
  <xsl:copy>
    <xsl:apply-templates select="@* | node()"/>
  </xsl:copy>
</xsl:template>

For each attribute in the list, do the same:

<xsl:template match="parent/@keptAttribute1
                    | parent/@keptAttribute2">
  <xsl:copy/>
</xsl:template>

Override the default templates for elements and attributes, to suppress all other elements and attributes:

<xsl:template match="*|@*"/>
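Assembled into one file, the pieces above might look like the following sketch for the sample schema. The xsl:stylesheet wrapper, the xsl:output settings and the xsl:strip-space instruction are assumptions added here to make it a complete XSLT 1.0 stylesheet; they are not part of the answer. Stripping whitespace-only text nodes simply keeps the indented output free of the blank lines left behind by dropped elements.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Elements from the list: copy them and keep processing
       their attributes and children. -->
  <xsl:template match="parent
                      | parent/keptElement1
                      | parent/keptElement2">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Attributes from the list: copy them as they are. -->
  <xsl:template match="parent/@keptAttribute1
                      | parent/@keptAttribute2">
    <xsl:copy/>
  </xsl:template>

  <!-- Everything else: drop the element (with its content) or the attribute.
       These patterns have lower default priority than the ones above. -->
  <xsl:template match="* | @*"/>

</xsl:stylesheet>
```

Text nodes are not matched by any of these templates, so the built-in template copies the text inside kept elements, while text inside dropped elements is never reached.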

[Alternatively, as suggested by DrMacro, you can write a function or named template in XSLT to consult the list you generated in task 1, instead of writing it out as repetitive templates with explicit match patterns. Depending on your background, you may find that this approach makes it easier, or harder, to understand what the stylesheet is doing.]
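One way the list-driven variant could look in XSLT 1.0 is sketched below. It is an illustration, not DrMacro's actual suggestion: the k namespace, the k:list and k:item elements and the path syntax are all hypothetical, and element names are assumed to be in no namespace. The list is embedded in the stylesheet and read back with document(''), which relies on the processor knowing the stylesheet's base URI (usually true when it is loaded from a file). A node is kept when some list entry, preceded by a slash, is a suffix of the node's own path, which mirrors the assumption that validation may be rooted at any point in the tree.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:k="urn:example:keep-list"
    exclude-result-prefixes="k">

  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Hypothetical keep list, one entry per declaration from task 1. -->
  <k:list>
    <k:item>parent</k:item>
    <k:item>parent/keptElement1</k:item>
    <k:item>parent/keptElement2</k:item>
    <k:item>parent/@keptAttribute1</k:item>
    <k:item>parent/@keptAttribute2</k:item>
  </k:list>

  <xsl:variable name="keep" select="document('')/*/k:list/k:item"/>

  <xsl:template match="*">
    <!-- Path of this element, e.g. /parent/keptElement1 -->
    <xsl:variable name="full">
      <xsl:for-each select="ancestor-or-self::*">
        <xsl:value-of select="concat('/', name())"/>
      </xsl:for-each>
    </xsl:variable>
    <!-- Keep the element if '/' + entry is a suffix of its path. -->
    <xsl:if test="$keep[substring($full,
                    string-length($full) - string-length(.)) = concat('/', .)]">
      <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>

  <xsl:template match="@*">
    <!-- Path of this attribute, e.g. /parent/@keptAttribute1 -->
    <xsl:variable name="full">
      <xsl:for-each select="ancestor::*">
        <xsl:value-of select="concat('/', name())"/>
      </xsl:for-each>
      <xsl:value-of select="concat('/@', name())"/>
    </xsl:variable>
    <xsl:if test="$keep[substring($full,
                    string-length($full) - string-length(.)) = concat('/', .)]">
      <xsl:copy/>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

The trade-off is the one noted above: the list becomes plain data that is easy to regenerate from the schema, at the cost of a string comparison that is less readable than explicit match patterns.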


08-20 19:21