How to ignore DTD validation but keep the Doctype when writing an XML file

Problem description

I am working on a system that should be able to read any (or at least, any well-formed) XML file, manipulate a few nodes and write them back into that same file. I want my code to be as generic as possible and I don't want


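  • hardcoded references to schema/Doctype information anywhere in my code. The doctype information is in the source document, and I want to keep exactly that doctype information rather than provide it again from within my code. If a document has no Doctype, I won't add one. I do not care about the form or content of these files at all, except for my few nodes.

  • custom EntityResolvers or StreamFilters to omit or otherwise manipulate the source information. (It is already a pity that namespace information seems somehow inaccessible from the document file in which it is declared, but I can manage by using uglier XPaths.)

  • DTD validation. I do not have the referenced DTDs, I do not want to include them, and node manipulation is perfectly possible without knowing about them.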

The aim is to have the source file entirely unchanged except for the changed Nodes, which are retrieved via XPath. I would like to get away with the standard javax.xml stuff.

My progress so far:

    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

    factory.setAttribute("http://xml.org/sax/features/namespaces", true);
    factory.setAttribute("http://xml.org/sax/features/validation", false);
    factory.setAttribute("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
    factory.setAttribute("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);

    factory.setNamespaceAware(true);
    factory.setIgnoringElementContentWhitespace(false);
    factory.setIgnoringComments(false);
    factory.setValidating(false);
    DocumentBuilder builder = factory.newDocumentBuilder();
    Document document = builder.parse(new InputSource(inStream));

This loads the XML source into an org.w3c.dom.Document successfully, ignoring DTD validation. I can do my replacements and then I use

    Source source = new DOMSource(document);
    Result result = new StreamResult(getOutputStream(getPath()));

    // Write the DOM document to the file
    Transformer xformer = TransformerFactory.newInstance().newTransformer();
    xformer.transform(source, result);

to write it back. Which is nearly perfect. But the Doctype tag is gone, no matter what I do. While debugging, I saw that there is a DeferredDoctypeImpl [log4j:configuration: null] object in the Document object after parsing, but it is somehow wrong, empty or ignored. The file I tested on starts like this (but it is the same for other file types):

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
    <log4j:configuration xmlns:log4j="http://jakarta.apache.org/log4j/" debug="false">
    [...]

I think there are a lot of (easy?) ways involving hacks or pulling additional JARs into the project. But I would rather like to have it with the tools I already use.

Recommended answer

Sorry, got it just now, using an XMLSerializer instead of the Transformer...
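The answer gives no code, and XMLSerializer is presumably the Xerces serializer class rather than part of the javax.xml API. For readers who want to stay with the Transformer from the question, here is a minimal sketch of a different technique (not the answerer's): copy the public and system identifiers from document.getDoctype() into the Transformer's output properties, so nothing about the doctype is hardcoded. The class name DoctypeRoundTrip and the method roundTrip are hypothetical, and the sketch assumes the JDK's bundled parser, which accepts the Xerces load-external-dtd feature. An internal DTD subset, if present, may not be carried over this way.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
public class DoctypeRoundTrip {

    /** Parses XML without loading its DTD and writes it back, keeping the DOCTYPE identifiers. */
    public static String roundTrip(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setValidating(false);
        // Xerces-specific feature; assumed to be supported by the parser in use.
        factory.setFeature(
                "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);

        DocumentBuilder builder = factory.newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));

        // ... node manipulation via XPath would go here ...

        Transformer transformer = TransformerFactory.newInstance().newTransformer();

        // Copy the identifiers from the parsed document; add nothing if there is no DOCTYPE.
        DocumentType doctype = document.getDoctype();
        if (doctype != null) {
            if (doctype.getPublicId() != null) {
                transformer.setOutputProperty(OutputKeys.DOCTYPE_PUBLIC, doctype.getPublicId());
            }
            if (doctype.getSystemId() != null) {
                transformer.setOutputProperty(OutputKeys.DOCTYPE_SYSTEM, doctype.getSystemId());
            }
        }

        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<!DOCTYPE log4j:configuration SYSTEM \"log4j.dtd\">\n"
                + "<log4j:configuration xmlns:log4j=\"http://jakarta.apache.org/log4j/\""
                + " debug=\"false\"/>";
        System.out.println(roundTrip(xml));
    }
}
```

In the question's own code, the same effect would come from setting these two output properties on xformer before calling xformer.transform(source, result).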

08-22 16:25