Validating XML with JAXP DOM using an XSD, a catalog resolver, and XSLT

Problem description

Using JDK 6 to load XML files into DOM. The XML files must be validated against an XSD. The XSD file location differs depending on the running environment. Ensuring the XML can be validated against an XSD, regardless of directory structure, requires a catalog resolver. Once the XML is validated, it can then be transformed.

My understanding is that a DocumentBuilderFactory can be used to configure such validation. This is achieved by using a DocumentBuilder with an XMLCatalogResolver to find the XSD file (and any included files) associated with an XML file.

Questions about validating XML documents using a catalog-derived XSD include:

  • JAXP - debug XSD catalog look up
  • Java XML Schema validator with custom resource resolver fails to resolve elements
  • Can XMLCatalog be used for schema imports?
  • How to load XMLCatalog from classpath resources (inside a jar), reliably?
  • XMLSchema validation with Catalog.xml file for entity resolving
  • Resolving type definitions from imported schema in XJC fails
  • Find items that can be repeated in an xml schema using Java
  • Java servlets: xml validation against xsd

Most of these questions and answers reference a hard-coded XSD file path, or use SAX to perform the validation, or pertain to DTDs, or require JDOM dependencies, or have no transformation.

There is no canonical solution that describes how to employ an XML catalog for XSD validation using JAXP DOM, with the result subsequently transformed via XSLT. There are a number of snippets (for example, www.xml.com/lpt/a/1378), but no complete, standalone example that compiles and runs (under JDK 6).

I posted an answer that seems to work, technically, but is overly verbose.

What is the canonical way (using JDK 1.6 libraries) to validate and transform an XML document? Here is one possible algorithm:


  1. Create a catalog resolver.
  2. Create an XML parser.
  3. Associate the resolver with the parser.
  4. Parse an XML document containing an XSD reference.
  5. Terminate on validation errors.
  6. Transform the validated XML using an XSL template.
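The steps above can be sketched with JDK classes alone. This is a minimal illustration and not the approach used in the answer that follows: a plain in-memory map (the hypothetical CATALOG) stands in for a real XML catalog, the schema lookup happens after parsing instead of through a resolver attached to the parser (step 3), and the schema, stylesheet, and class name are all invented for the example.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.HashMap;
import java.util.Map;

import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical end-to-end sketch: look up, parse, validate, transform. */
final class PipelineSketch {
  // Step 1: a map from published URI to local content stands in for a catalog.
  static final Map<String, String> CATALOG = new HashMap<String, String>();
  static {
    CATALOG.put( "http://example.com/xsd/note.xsd",
      "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='note'><xs:complexType><xs:sequence>"
      + "<xs:element name='date' type='xs:date'/>"
      + "</xs:sequence></xs:complexType></xs:element></xs:schema>" );
  }

  static final String XSL =
    "<xsl:stylesheet version='1.0'"
    + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
    + "<xsl:output method='text'/>"
    + "<xsl:template match='/note'>Date: <xsl:value-of select='date'/>"
    + "</xsl:template></xsl:stylesheet>";

  /** Returns the transformed text; throws if the XML is invalid. */
  static String run( String xml ) {
    try {
      // Steps 2 and 4: create a namespace-aware parser and build the DOM.
      DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
      dbf.setNamespaceAware( true );
      Document doc = dbf.newDocumentBuilder().parse(
        new InputSource( new StringReader( xml ) ) );

      // Read the XSD reference from the root node and look it up.
      String hint = doc.getDocumentElement().getAttributeNS(
        XMLConstants.W3C_XML_SCHEMA_INSTANCE_NS_URI,
        "noNamespaceSchemaLocation" );
      String xsd = CATALOG.get( hint );
      if( xsd == null ) {
        throw new IllegalStateException( "No catalog entry for: " + hint );
      }

      Schema schema = SchemaFactory
        .newInstance( XMLConstants.W3C_XML_SCHEMA_NS_URI )
        .newSchema( new StreamSource( new StringReader( xsd ) ) );

      // Step 5: with no ErrorHandler set, validate() throws on an error.
      schema.newValidator().validate( new DOMSource( doc ) );

      // Step 6: transform the validated DOM.
      Transformer transformer = TransformerFactory.newInstance()
        .newTransformer( new StreamSource( new StringReader( XSL ) ) );
      StringWriter out = new StringWriter();
      transformer.transform( new DOMSource( doc ), new StreamResult( out ) );
      return out.toString();
    }
    catch( Exception e ) {
      throw new IllegalStateException( "Validation or transform failed", e );
    }
  }
}
```

Calling PipelineSketch.run with a note whose date is written without hyphens raises an exception before any transformation takes place.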


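Recommended answer

Source files

The source files include a catalog manager properties file, Java source code, a catalog file, XML data, XSL files, and an XSD file. All files are relative to the current working directory (./).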

This properties file is read by the CatalogResolver class; save as ./CatalogManager.properties:

catalogs=catalog.xml
relative-catalogs=yes
verbosity=99
prefer=system
static-catalog=yes
allow-oasis-xml-catalog-pi=yes
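As a rough illustration of how the catalogs setting is consumed, the sketch below loads the properties text and splits the semicolon-delimited catalogs value into file names. The class and method names are hypothetical, and the real CatalogManager does considerably more (for instance, it locates the file on the classpath and honours system property overrides).

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical helper: lists the catalog files named in the properties. */
final class CatalogSettings {
  static List<String> catalogFiles( String propertiesText ) {
    Properties properties = new Properties();

    try {
      properties.load( new StringReader( propertiesText ) );
    }
    catch( IOException e ) {
      throw new IllegalStateException( e );
    }

    List<String> files = new ArrayList<String>();

    // The catalogs property holds a semicolon-delimited list of files.
    for( String file : properties.getProperty( "catalogs", "" ).split( ";" ) ) {
      if( file.trim().length() > 0 ) {
        files.add( file.trim() );
      }
    }

    return files;
  }
}
```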



TestXSD.java



This is the main application; save as ./src/TestXSD.java:

package src;

import java.io.*;
import java.net.URI;
import java.util.*;
import java.util.regex.Pattern;
import java.util.regex.Matcher;

import javax.xml.parsers.*;
import javax.xml.xpath.*;
import javax.xml.XMLConstants;

import org.w3c.dom.*;
import org.xml.sax.*;

import org.apache.xml.resolver.tools.CatalogResolver;
import org.apache.xerces.util.XMLCatalogResolver;
import static org.apache.xerces.jaxp.JAXPConstants.JAXP_SCHEMA_LANGUAGE;
import static org.apache.xerces.jaxp.JAXPConstants.W3C_XML_SCHEMA;

import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Schema;
import javax.xml.validation.Validator;

import javax.xml.transform.Result;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;

import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXSource;

import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/**
 * Download http://xerces.apache.org/xml-commons/components/resolver/CatalogManager.properties
 */
public class TestXSD {
  private final static String ENTITY_RESOLVER =
    "http://apache.org/xml/properties/internal/entity-resolver";

  /**
   * This program reads an XML file, performs validation, reads an XSL
   * file, transforms the input XML, and then writes the transformed document
   * to standard output.
   *
   * args[0] - The XSL file used to transform the XML file
   * args[1] - The XML file to transform using the XSL file
   */
  public static void main( String args[] ) throws Exception {
    // For validation error messages.
    ErrorHandler errorHandler = new DocumentErrorHandler();

    // Read the CatalogManager.properties file.
    CatalogResolver resolver = new CatalogResolver();
    XMLCatalogResolver xmlResolver = createXMLCatalogResolver( resolver );

    logDebug( "READ XML INPUT SOURCE" );
    // Load an XML document in preparation to transform it. A byte stream
    // lets the parser honour the encoding named in the XML declaration.
    InputSource xmlInput = new InputSource( new FileInputStream( args[1] ) );

    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    dbFactory.setAttribute( JAXP_SCHEMA_LANGUAGE, W3C_XML_SCHEMA );
    dbFactory.setNamespaceAware( true );

    DocumentBuilder builder = dbFactory.newDocumentBuilder();
    builder.setEntityResolver( xmlResolver );
    builder.setErrorHandler( errorHandler );

    logDebug( "PARSE XML INTO DOCUMENT MODEL" );
    Document xmlDocument = builder.parse( xmlInput );

    logDebug( "CONVERT XML DOCUMENT MODEL INTO DOMSOURCE" );
    DOMSource xml = new DOMSource( xmlDocument );

    logDebug( "GET XML SCHEMA DEFINITION" );
    String schemaURI = getSchemaURI( xmlDocument );

    logDebug( "SCHEMA URI: " + schemaURI );

    if( schemaURI != null ) {
      logDebug( "CREATE SCHEMA FACTORY" );
      // Create a Schema factory to obtain a Schema for XML validation...
      SchemaFactory sFactory = SchemaFactory.newInstance( W3C_XML_SCHEMA );
      sFactory.setResourceResolver( xmlResolver );

      logDebug( "CREATE XSD INPUT SOURCE" );
      String xsdFileURI = xmlResolver.resolveURI( schemaURI );

      logDebug( "CREATE INPUT SOURCE XSD FROM: " + xsdFileURI );
      InputSource xsd = new InputSource(
        new FileInputStream( new File( new URI( xsdFileURI ) ) ) );

      logDebug( "CREATE SCHEMA OBJECT FOR XSD" );
      Schema schema = sFactory.newSchema( new SAXSource( xsd ) );

      logDebug( "CREATE VALIDATOR FOR SCHEMA" );
      Validator validator = schema.newValidator();

      logDebug( "VALIDATE XML AGAINST XSD" );
      validator.validate( xml );
    }

    logDebug( "READ XSL INPUT SOURCE" );
    // Load an XSL template for transforming XML documents. A byte stream
    // lets the parser honour the encoding named in the XML declaration.
    InputSource xslInput = new InputSource( new FileInputStream( args[0] ) );

    logDebug( "PARSE XSL INTO DOCUMENT MODEL" );
    Document xslDocument = builder.parse( xslInput );

    transform( xmlDocument, xslDocument, resolver );
    System.out.println();
  }

  private static void transform(
    Document xml, Document xsl, CatalogResolver resolver ) throws Exception
  {
    if( versionAtLeast( xsl, 2 ) ) {
      useXSLT2Transformer();
    }

    logDebug( "CREATE TRANSFORMER FACTORY" );
    // Create the transformer used for the document.
    TransformerFactory tFactory = TransformerFactory.newInstance();
    tFactory.setURIResolver( resolver );

    logDebug( "CREATE TRANSFORMER FROM XSL" );
    Transformer transformer = tFactory.newTransformer( new DOMSource( xsl ) );

    logDebug( "CREATE RESULT OUTPUT STREAM" );
    // This enables writing the results to standard output.
    Result out = new StreamResult( new OutputStreamWriter( System.out ) );

    logDebug( "TRANSFORM THE XML AND WRITE TO STDOUT" );
    // Transform the document using a given stylesheet.
    transformer.transform( new DOMSource( xml ), out );
  }

  /**
   * Answers whether the given XSL document version is greater than or
   * equal to the given required version number.
   *
   * @param xsl The XSL document to check for version compatibility.
   * @param version The version number to compare against.
   *
   * @return true iff the XSL document version is greater than or equal
   * to the version parameter.
   */
  private static boolean versionAtLeast( Document xsl, float version ) {
    Element root = xsl.getDocumentElement();
    float docVersion = Float.parseFloat( root.getAttribute( "version" ) );

    return docVersion >= version;
  }

  /**
   * Enables Saxon9's XSLT2 transformer for XSLT2 files.
   */
  private static void useXSLT2Transformer() {
    System.setProperty("javax.xml.transform.TransformerFactory",
      "net.sf.saxon.TransformerFactoryImpl");
  }

  /**
   * Creates an XMLCatalogResolver based on the file names found in
   * the given CatalogResolver. The resulting XMLCatalogResolver will
   * contain the absolute path to all the files known to the given
   * CatalogResolver.
   *
   * @param resolver The CatalogResolver to examine for catalog file names.
   * @return An XMLCatalogResolver instance with the same number of catalog
   * files as found in the given CatalogResolver.
   */
  private static XMLCatalogResolver createXMLCatalogResolver(
    CatalogResolver resolver ) {
    int index = 0;
    List files = resolver.getCatalog().getCatalogManager().getCatalogFiles();
    String catalogs[] = new String[ files.size() ];
    XMLCatalogResolver xmlResolver = new XMLCatalogResolver();

    for( Object file : files ) {
      catalogs[ index ] = (new File( file.toString() )).getAbsolutePath();
      index++;
    }

    xmlResolver.setCatalogList( catalogs );

    return xmlResolver;
  }

  private static String[] parseNameValue( String nv ) {
    Pattern p = Pattern.compile( "\\s*(\\w+)=\"([^\"]*)\"\\s*" );
    Matcher m = p.matcher( nv );
    String result[] = new String[2];

    if( m.find() ) {
      result[0] = m.group(1);
      result[1] = m.group(2);
    }

    return result;
  }

  /**
   * Retrieves the XML schema definition using an XSD.
   *
   * @param node The document (or child node) to traverse seeking processing
   * instruction nodes.
   * @return null if no XSD is present in the XML document.
   * @throws IOException Never thrown (uses StringReader).
   */
  private static String getSchemaURI( Node node ) throws IOException {
    String result = null;

    if( node.getNodeType() == Node.PROCESSING_INSTRUCTION_NODE ) {
      ProcessingInstruction pi = (ProcessingInstruction)node;

      logDebug( "NODE IS PROCESSING INSTRUCTION" );

      if( "xml-model".equals( pi.getNodeName() ) ) {
        logDebug( "PI IS XML MODEL" );

        // Hack to get the attributes.
        String data = pi.getData();

        if( data != null ) {
          final String attributes[] = data.trim().split( "\\s+" );

          // Guard against a PI with fewer than two pseudo-attributes.
          String type = parseNameValue( attributes[0] )[1];
          String href = attributes.length > 1
            ? parseNameValue( attributes[1] )[1] : null;

          // TODO: Schema should = http://www.w3.org/2001/XMLSchema
          //String schema = attributes.getNamedItem( "schematypens" );

          if( "application/xml".equalsIgnoreCase( type ) && href != null ) {
            result = href;
          }
        }
      }
    }
    else {
      // Try to get the schema type information.
      NamedNodeMap attrs = node.getAttributes();

      if( attrs != null ) {
        // TypeInfo.toString() returns values of the form:
        // schemaLocation="uri schemaURI"
        // The following loop extracts the schema URI.
        for( int i = 0; i < attrs.getLength(); i++ ) {
          Attr attribute = (Attr)attrs.item( i );
          TypeInfo typeInfo = attribute.getSchemaTypeInfo();
          String attr[] = parseNameValue( typeInfo.toString() );

          if( "schemaLocation".equalsIgnoreCase( attr[0] ) ) {
            result = attr[1].split( "\\s" )[1];
            break;
          }
        }
      }

      // Look deeper for the schema URI.
      if( result == null ) {
        NodeList list = node.getChildNodes();

        for( int i = 0; i < list.getLength(); i++ ) {
          result = getSchemaURI( list.item( i ) );

          if( result != null ) {
            break;
          }
        }
      }
    }

    return result;
  }

  /**
   * Writes a message to standard output.
   */
  private static void logDebug( String s ) {
    System.out.println( s );
  }
}
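The getSchemaURI method above splits the xml-model data on whitespace and assumes that type comes first and href second, so a value containing a space or a different attribute order would defeat it. A slightly more tolerant approach, sketched below under the assumption that character references inside values can be ignored, reads every name="value" pair into a map. The class name PseudoAttributes is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: parses pseudo-attributes from PI data. */
final class PseudoAttributes {
  // Matches name="value" or name='value', in any order.
  private static final Pattern PAIR =
    Pattern.compile( "([\\w-]+)\\s*=\\s*(\"([^\"]*)\"|'([^']*)')" );

  static Map<String, String> parse( String data ) {
    Map<String, String> result = new LinkedHashMap<String, String>();

    if( data == null ) {
      return result;
    }

    Matcher m = PAIR.matcher( data );

    while( m.find() ) {
      // Group 3 holds a double-quoted value, group 4 a single-quoted one.
      String value = m.group( 3 ) != null ? m.group( 3 ) : m.group( 4 );
      result.put( m.group( 1 ), value );
    }

    return result;
  }
}
```

With such a map, the href and type values can be looked up by name regardless of their position in the processing instruction.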



Error handler



This is the code for human-friendly error messages; save as ./src/DocumentErrorHandler.java:

package src;

import java.io.PrintStream;

import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;
import org.xml.sax.SAXException;

/**
 * Handles error messages during parsing and validating XML documents.
 */
public class DocumentErrorHandler implements ErrorHandler {
  private final static PrintStream OUTSTREAM = System.err;

  private void log( String type, SAXParseException e ) {
    OUTSTREAM.println( "SAX PARSE EXCEPTION " + type );
    OUTSTREAM.println( "  Public ID: " + e.getPublicId() );
    OUTSTREAM.println( "  System ID: " + e.getSystemId() );
    OUTSTREAM.println( "  Line     : " + e.getLineNumber() );
    OUTSTREAM.println( "  Column   : " + e.getColumnNumber() );
    OUTSTREAM.println( "  Message  : " + e.getMessage() );
  }

  @Override
  public void error( SAXParseException e ) throws SAXException {
    log( "ERROR", e );
  }

  @Override
  public void fatalError( SAXParseException e ) throws SAXException {
    log( "FATAL ERROR", e );
  }

  @Override
  public void warning( SAXParseException e ) throws SAXException {
    log( "WARNING", e );
  }
}
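The handler above only prints each problem. Where the caller needs to decide whether to go on to the transformation, a variant that records what it sees may help. The following is a sketch of that idea, not part of the original answer; the class name CollectingErrorHandler and its accessor methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

/** Hypothetical variant: records problems instead of only printing them. */
final class CollectingErrorHandler implements ErrorHandler {
  private final List<String> errors = new ArrayList<String>();

  private static String describe( SAXParseException e ) {
    return e.getSystemId() + ":" + e.getLineNumber() + ":"
      + e.getColumnNumber() + ": " + e.getMessage();
  }

  // Warnings are ignored in this sketch.
  public void warning( SAXParseException e ) {
  }

  // Recoverable errors (such as validation failures) are recorded.
  public void error( SAXParseException e ) {
    errors.add( describe( e ) );
  }

  // Fatal errors are recorded and rethrown to stop processing.
  public void fatalError( SAXParseException e ) throws SAXException {
    errors.add( describe( e ) );
    throw e;
  }

  boolean hasErrors() {
    return !errors.isEmpty();
  }

  List<String> getErrors() {
    return errors;
  }
}
```

An instance could be passed to DocumentBuilder.setErrorHandler or Validator.setErrorHandler, and the transformation skipped when hasErrors() returns true.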



Catalog file



Save as ./catalog.xml:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE catalog PUBLIC "-//OASIS//DTD XML Catalogs V1.1//EN" "http://www.oasis-open.org/committees/entity/release/1.1/catalog.dtd">
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
    <!-- XSDs linked through primary catalog -->
    <!-- catalog entry for good-note1.xml -->
    <rewriteSystem
        systemIdStartString="http://stackoverflow.com/schema"
        rewritePrefix="./ArbitraryFolder/schemas"
    />

    <!-- catalog entry for good-note2.xml, good-note3.xml, bad-note1.xml, bad-note2.xml -->
    <rewriteURI
        uriStartString="http://stackoverflow.com/2014/09/xsd"
        rewritePrefix="./ArbitraryFolder/schemas"
    />

    <!-- add a second catalog as a further test:
         XSL will be resolved through it -->
    <nextCatalog
        catalog="./ArbitraryFolder/catalog.xml"
    />
</catalog>
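The effect of a rewriteURI or rewriteSystem entry is essentially a prefix substitution. The sketch below shows only that core idea with a hypothetical RewriteRule class; a real catalog resolver additionally resolves the rewritePrefix against the catalog's base URI and, when several entries match, prefers the longest start string.

```java
/** Hypothetical illustration of a catalog rewrite entry. */
final class RewriteRule {
  private final String startString;
  private final String rewritePrefix;

  RewriteRule( String startString, String rewritePrefix ) {
    this.startString = startString;
    this.rewritePrefix = rewritePrefix;
  }

  /** Returns the rewritten URI, or null when the rule does not apply. */
  String apply( String uri ) {
    if( uri == null || !uri.startsWith( startString ) ) {
      return null;
    }

    // Swap the matched prefix and keep the remainder of the URI.
    return rewritePrefix + uri.substring( startString.length() );
  }
}
```

Applied to the second entry above, the URI http://stackoverflow.com/2014/09/xsd/notes/notes.xsd would become ./ArbitraryFolder/schemas/notes/notes.xsd.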



XML Data

The different test cases include XSDs referenced in either processing instructions or root nodes.

The schema can be provided using an xml-model processing instruction (PI). Save as ./Tests/good-note2.xml:

<?xml version="1.0" encoding="UTF-8"?>
<!-- Associating Schemas with XML documents: http://www.w3.org/TR/xml-model/ -->
<?xml-model type="application/xml" href="http://stackoverflow.com/2014/09/xsd/notes/notes.xsd"?>
<note>
    <title>Shopping List</title>
    <date>2014-08-30</date>
    <body>headlight fluid, flamgrabblit, exhaust coil</body>
</note>



Schema: root node



The schema can be provided in an attribute of the document's root node. Save as ./Tests/good-note3.xml:

<?xml version="1.0" encoding="UTF-8"?>
<!-- XML Schema Part 1: Structures:
     Schema-Related Markup in Documents Being Validated:
     http://www.w3.org/TR/xmlschema-1/#Instance_Document_Constructions -->
<note
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://stackoverflow.com http://stackoverflow.com/2014/09/xsd/notes/notes.xsd">
    <title>Shopping List</title>
    <date>2014-08-30</date>
    <body>Eggs, Milk, Carrots</body>
</note>
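TestXSD finds the schema location by parsing the text of each attribute's TypeInfo. An alternative, sketched here as an assumption and not as the answer's method, is to read the namespaced xsi:schemaLocation attribute directly from the root element and take every second token as a schema URI. The class name SchemaLocationReader is hypothetical.

```java
import java.io.StringReader;

import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Hypothetical helper: extracts schema URIs from xsi:schemaLocation. */
final class SchemaLocationReader {
  /** Returns the schema URIs, or an empty array when none are declared. */
  static String[] schemaURIs( String xml ) {
    try {
      DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
      // Namespace awareness is required for getAttributeNS to work.
      dbf.setNamespaceAware( true );

      Element root = dbf.newDocumentBuilder()
        .parse( new InputSource( new StringReader( xml ) ) )
        .getDocumentElement();

      // getAttributeNS returns an empty string when the attribute is absent.
      String value = root.getAttributeNS(
        XMLConstants.W3C_XML_SCHEMA_INSTANCE_NS_URI, "schemaLocation" ).trim();

      if( value.length() == 0 ) {
        return new String[0];
      }

      // The value is a list of "namespace location" pairs.
      String[] tokens = value.split( "\\s+" );
      String[] uris = new String[ tokens.length / 2 ];

      for( int i = 0; i < uris.length; i++ ) {
        uris[ i ] = tokens[ 2 * i + 1 ];
      }

      return uris;
    }
    catch( Exception e ) {
      throw new IllegalArgumentException( "Could not parse XML", e );
    }
  }
}
```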



Failing validation



The following should fail validation (the date needs hyphens); save as ./Tests/bad-note1.xml:

<?xml version="1.0" encoding="UTF-8"?>
<!-- Associating Schemas with XML documents: http://www.w3.org/TR/xml-model/ -->
<?xml-model type="application/xml" href="http://stackoverflow.com/2014/09/xsd/notes/notes.xsd"?>
<!-- FAILS SCHEMA: date is not valid; should use hyphens -->
<note>
    <title>Shopping List</title>
    <date>20140830</date>
    <body>headlight fluid, flamgrabblit, exhaust coil</body>
</note>



Transformation



Save this as ./Tests/note-to-html.xsl:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">
    <!-- is in the second catalog (../ArbitraryFolder/catalog.xml) -->
    <xsl:import href="http://stackoverflow.com/2014/09/xsl/notes/notes.xsl"/>
</xsl:stylesheet>
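TestXSD hands its CatalogResolver to TransformerFactory.setURIResolver so that the xsl:import above can be resolved through the catalog. The mechanism can be seen in isolation with a map-backed URIResolver. The sketch below is an assumption-laden illustration: the example.com URI, the inline stylesheets, and the class name are invented, and XSLT 1.0 is used so that the JDK's built-in processor suffices.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.HashMap;
import java.util.Map;

import javax.xml.transform.Source;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.URIResolver;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical sketch: resolves xsl:import from an in-memory map. */
final class ImportResolverSketch {
  static final String IMPORTED_URI = "http://example.com/xsl/notes.xsl";
  static final String XSL_NS = "http://www.w3.org/1999/XSL/Transform";

  // Stands in for the catalog: published URI to stylesheet content.
  static final Map<String, String> STYLESHEETS = new HashMap<String, String>();
  static {
    STYLESHEETS.put( IMPORTED_URI,
      "<xsl:stylesheet version='1.0' xmlns:xsl='" + XSL_NS + "'>"
      + "<xsl:template match='title'>[<xsl:value-of select='.'/>]"
      + "</xsl:template></xsl:stylesheet>" );
  }

  // The main stylesheet only imports the published URI.
  static final String MAIN =
    "<xsl:stylesheet version='1.0' xmlns:xsl='" + XSL_NS + "'>"
    + "<xsl:import href='" + IMPORTED_URI + "'/>"
    + "<xsl:output method='text'/></xsl:stylesheet>";

  static String transform( String xml ) {
    try {
      TransformerFactory factory = TransformerFactory.newInstance();

      // The factory-level resolver is consulted for xsl:import and xsl:include.
      factory.setURIResolver( new URIResolver() {
        public Source resolve( String href, String base ) {
          String body = STYLESHEETS.get( href );

          // Returning null lets the processor try its default resolution.
          return body == null
            ? null : new StreamSource( new StringReader( body ), href );
        }
      } );

      StringWriter out = new StringWriter();
      factory.newTransformer( new StreamSource( new StringReader( MAIN ) ) )
        .transform( new StreamSource( new StringReader( xml ) ),
          new StreamResult( out ) );

      return out.toString();
    }
    catch( TransformerException e ) {
      throw new IllegalStateException( "Transform failed", e );
    }
  }
}
```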



Arbitrary Folder

The arbitrary folder represents the path to files on a computer that can be located anywhere on the file system. The location of these files could differ, for example, between production, development, and the repository.

Save this file as ./ArbitraryFolder/catalog.xml:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE catalog PUBLIC "-//OASIS//DTD XML Catalogs V1.1//EN" "http://www.oasis-open.org/committees/entity/release/1.1/catalog.dtd">
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">

    <!-- catalog entry for all notes -->
    <rewriteURI
        uriStartString="http://stackoverflow.com/2014/09/xsl/"
        rewritePrefix="./XSL/"/>

</catalog>



Notes

There are two files in this example for transforming the notes: notes.xsl and note-body.xsl. The first includes the second.

Save this as ./ArbitraryFolder/XSL/notes/notes.xsl:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">

    <!-- will not be in catalog (though it could be):
         by convention, a relative path is assumed to be part of the static file structure -->
    <xsl:import href="note-body.xsl"/>

    <xsl:template match="/">
        <html>
            <head>
                <title>A Note</title>
            </head>
            <body>
                <xsl:apply-templates/>
            </body>
        </html>
    </xsl:template>
    <xsl:template match="note">
        <div>
            <xsl:apply-templates select="title, date, body"/>
        </div>
    </xsl:template>
    <xsl:template match="title">
        <h1><xsl:value-of select="."/></h1>
    </xsl:template>
    <xsl:template match="date">
        <p class="date"><xsl:value-of select="."/></p>
    </xsl:template>
</xsl:stylesheet>



Note body stylesheet



Save this as ./ArbitraryFolder/XSL/notes/note-body.xsl:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="2.0">

    <xsl:template match="body">
        <p class="notebody"><xsl:value-of select="."/></p>
    </xsl:template>

</xsl:stylesheet>



Schema



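The last file required is the schema; save as ./ArbitraryFolder/schemas/notes/notes.xsd (the location that the rewritePrefix entries in ./catalog.xml point to):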

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="note">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="title" type="xs:token"/>
                <xs:element name="date" type="xs:date"/>
                <xs:element name="body" type="xs:string"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>
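TestXSD validates the DOM after it has been built. JAXP also allows a compiled schema to be applied while parsing, through DocumentBuilderFactory.setSchema. The sketch below illustrates that alternative with the schema above inlined as a string; it is not what the answer's code does, and the class name ParseTimeValidation is hypothetical.

```java
import java.io.StringReader;

import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;

import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

/** Hypothetical sketch: validates against the notes schema while parsing. */
final class ParseTimeValidation {
  static final String XSD =
    "<xs:schema elementFormDefault='qualified'"
    + " xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
    + "<xs:element name='note'><xs:complexType><xs:sequence>"
    + "<xs:element name='title' type='xs:token'/>"
    + "<xs:element name='date' type='xs:date'/>"
    + "<xs:element name='body' type='xs:string'/>"
    + "</xs:sequence></xs:complexType></xs:element></xs:schema>";

  /** Returns true when the XML is well-formed and conforms to the schema. */
  static boolean isValid( String xml ) {
    try {
      Schema schema = SchemaFactory
        .newInstance( XMLConstants.W3C_XML_SCHEMA_NS_URI )
        .newSchema( new StreamSource( new StringReader( XSD ) ) );

      DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
      dbf.setNamespaceAware( true );
      // The parser validates against this schema as it builds the DOM.
      dbf.setSchema( schema );

      DocumentBuilder builder = dbf.newDocumentBuilder();

      // Without a handler that throws, validation errors would not stop parsing.
      builder.setErrorHandler( new ErrorHandler() {
        public void warning( SAXParseException e ) {
        }

        public void error( SAXParseException e ) throws SAXException {
          throw e;
        }

        public void fatalError( SAXParseException e ) throws SAXException {
          throw e;
        }
      } );

      builder.parse( new InputSource( new StringReader( xml ) ) );
      return true;
    }
    catch( SAXException e ) {
      return false;
    }
    catch( Exception e ) {
      throw new IllegalStateException( "Could not set up the parser", e );
    }
  }
}
```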



Building

This section details how to build the test application.

You will need Saxon 9 (for XSLT 2.0 documents), Xerces, Xalan, and the Resolver API. The build and run scripts expect the following JAR files in ./lib:

jaxen-1.1.6.jar
resolver.jar
saxon9he.jar
serializer.jar
xalan.jar
xercesImpl.jar
xml-apis.jar
xsltc.jar



Scripts



Save as ./build.sh:

#!/bin/bash
# Older javac versions require the output directory to exist.
mkdir -p bin
javac -d bin -cp ".:lib/*" src/TestXSD.java

Save as ./run.sh:

#!/bin/bash
java -cp ".:bin:lib/*" src.TestXSD Tests/note-to-html.xsl "$1"



Compiling



Use ./build.sh to compile the code.

Run using:

./run.sh filename.xml



Good Test

Test that the good note passes validation:

./run.sh Tests/good-note2.xml

No errors.

Test that the bad note's date does not pass validation:

./run.sh Tests/bad-note1.xml

As expected, this produces the desired error:

Exception in thread "main" org.xml.sax.SAXParseException; cvc-datatype-valid.1.2.1: '20140830' is not a valid value for 'date'.
    at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
    at org.apache.xerces.util.ErrorHandlerWrapper.error(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.xs.XMLSchemaValidator$XSIErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.xs.XMLSchemaValidator.reportSchemaError(Unknown Source)
    at org.apache.xerces.impl.xs.XMLSchemaValidator.elementLocallyValidType(Unknown Source)
    at org.apache.xerces.impl.xs.XMLSchemaValidator.processElementContent(Unknown Source)
    at org.apache.xerces.impl.xs.XMLSchemaValidator.handleEndElement(Unknown Source)
    at org.apache.xerces.impl.xs.XMLSchemaValidator.endElement(Unknown Source)
    at org.apache.xerces.jaxp.validation.DOMValidatorHelper.finishNode(Unknown Source)
    at org.apache.xerces.jaxp.validation.DOMValidatorHelper.validate(Unknown Source)
    at org.apache.xerces.jaxp.validation.DOMValidatorHelper.validate(Unknown Source)
    at org.apache.xerces.jaxp.validation.ValidatorImpl.validate(Unknown Source)
    at javax.xml.validation.Validator.validate(Validator.java:124)
    at src.TestXSD.main(TestXSD.java:103)
