java - Handling CDATA from XML with a DOM parser

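I have never worked with XML before, so I'm not sure how to handle CDATA in an XML file. I'm getting lost among nodes, parent nodes, child nodes, nList, and so on.

Can anyone tell me what my problem is from these code snippets?

My getTagValue() method works for every tag except "Details", which is the tag that contains CDATA.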

.....
NodeList nList = doc.getElementsByTagName("Assignment");
for (int temp = 0; temp < nList.getLength(); temp++) {
    Node nNode = nList.item(temp);
    if (nNode.getNodeType() == Node.ELEMENT_NODE) {
        Element eElement = (Element) nNode;
        results = ("Class : " + getTagValue("ClassName", eElement)) +
                  ("Period : " + getTagValue("Period", eElement)) +
                  ("Assignment : " + getTagValue("Details", eElement));
        myAssignments.add(results);
    }
}
.....
private String getTagValue(String sTag, Element eElement) {
    NodeList nlList = eElement.getElementsByTagName(sTag).item(0).getChildNodes();

    Node nValue = (Node) nlList.item(0);
    if (nValue instanceof CharacterData) {
        return ((CharacterData) nValue).getData();
    }
    return nValue.getNodeValue();
}

Best answer

I suspect your problem is in the following line of the getTagValue method:

Node nValue = (Node) nlList.item(0);

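You always take the first child! But there may be more than one.

The following example has 3 child nodes: the text node "detail ", the CDATA node "with cdata", and the text node " here":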

<Details>detail <![CDATA[with cdata]]> here</Details>

If you run your code on it, you only get "detail " and lose the rest.

The following example has 1 child node: a CDATA node "detail with cdata here":

<Details><![CDATA[detail with cdata here]]></Details>

If you run your code on it, you get everything.

But the same example can also be written like this:

<Details>
   <![CDATA[detail with cdata here]]>
</Details>

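Now there are 3 children, because the whitespace and line breaks are picked up as text nodes. If you run your code on it, you get the first text node, which contains only a line break and whitespace, and the rest is lost.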
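The child counts described above can be checked directly. The following is a minimal sketch that is not part of the original answer: the class name ChildCountDemo and the helper countChildren are hypothetical, and it assumes the JDK's default DOM parser with coalescing left at its default of false.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class ChildCountDemo {

    // Hypothetical helper: parses the XML string and returns how many
    // direct child nodes the root element has.
    static int countChildren(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Text + CDATA + text
        String mixed = "<Details>detail <![CDATA[with cdata]]> here</Details>";
        // A single CDATA section
        String cdataOnly = "<Details><![CDATA[detail with cdata here]]></Details>";
        // The same CDATA section, but surrounded by line breaks and indentation
        String indented = "<Details>\n   <![CDATA[detail with cdata here]]>\n</Details>";

        System.out.println("mixed:     " + countChildren(mixed));
        System.out.println("cdataOnly: " + countChildren(cdataOnly));
        System.out.println("indented:  " + countChildren(indented));
    }
}
```

With the three Details variants from the answer, this should report 3, 1, and 3 children respectively, which is why reading only item(0) works for the second form and fails for the other two.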

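Either you loop through all the children (however many there are) and concatenate each child's value to get the full result, or, if it does not matter to you whether text is plain text or text inside CDATA, you set the coalescing property on the document builder factory first: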

DocumentBuilderFactory docFactory = DocumentBuilderFactory.newInstance();
docFactory.setCoalescing(true);
...

Coalescing specifies that the parser produced by this code will convert CDATA nodes to Text nodes and append them to the adjacent text node, if any. By default, this value is set to false.
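Both options can be sketched in one place. This is an illustration under stated assumptions rather than the asker's actual code: the class name CdataTagValue and the helpers parse and firstChildValue are hypothetical, getTagValue is a rewritten variant of the method from the question, and trimming the result is a design choice of this sketch so that the indented form yields the same string as the compact one.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class CdataTagValue {

    // Hypothetical helper: parses an XML string, optionally with coalescing enabled.
    static Document parse(String xml, boolean coalescing) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setCoalescing(coalescing);
            return factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Option 1: concatenate every Text and CDATA child instead of reading only item(0).
    static String getTagValue(String tag, Element parent) {
        NodeList matches = parent.getElementsByTagName(tag);
        if (matches.getLength() == 0) {
            return ""; // tag not present; the original code would throw here
        }
        NodeList children = matches.item(0).getChildNodes();
        StringBuilder value = new StringBuilder();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                value.append(child.getNodeValue());
            }
        }
        // Assumption of this sketch: surrounding whitespace is not wanted.
        return value.toString().trim();
    }

    // Option 2: with coalescing on, the first child alone holds the merged text.
    static String firstChildValue(String tag, Document doc) {
        return doc.getElementsByTagName(tag).item(0).getFirstChild().getNodeValue();
    }

    public static void main(String[] args) {
        String xml = "<Assignment><Details>detail <![CDATA[with cdata]]> here</Details></Assignment>";

        Document plain = parse(xml, false);
        System.out.println(getTagValue("Details", plain.getDocumentElement()));

        Document coalesced = parse(xml, true);
        System.out.println(firstChildValue("Details", coalesced));
    }
}
```

Calling getTextContent() on the Details element is another way to get the concatenated text of all its children, if a standard DOM Level 3 method is preferred over a hand-written loop.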

Source: the original question on Stack Overflow, "java - Handling CDATA from XML with a DOM parser": https://stackoverflow.com/questions/10038747/