Parsing XML to get only comment and date values

Here is an online example: https://repl.it/repls/MedicalIgnorantEfficiency

Here is an example of my XML to parse:

    <?xml version="1.0" encoding="UTF-8"?>
    <ncc:Message xmlns:ncc="http://blank/1.0.6"
                 xmlns:cs="http://blank/1.0.0"
                 xmlns:jx="http://blank/1.0.0"
                 xmlns:jm="http://blank/1.0.0"
                 xmlns:n-p="http://blank/1.0.0"
                 xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
                 xsi:schemaLocation="http://blank/1.0.6/person person.xsd">
      <ncc:DataSection>
        <ncc:PersonResponse>
          <cs:CText cs:type="No">NO WANT</cs:CText>
          <jm:CaseID>
            <jm:ID>ABC123</jm:ID>
          </jm:CaseID>
          <jx:PersonName>
            <jx:GivenName>Arugula</jx:GivenName>
            <jx:MiddleName>Pibb</jx:MiddleName>
            <jx:SurName>Atari</jx:SurName>
          </jx:PersonName>
          <!-- DOB -->
          <ncc:PersonBirthDateText>1948-05-11</ncc:PersonBirthDateText>
          <jx:PersonDetails>
            <jx:PersonSSN>
              <jx:ID/>
            </jx:PersonSSN>
          </jx:PersonDetails>
          <n-p:Activity>
            <!-- DOZ -->
            <jx:ActivityDate>1996-04-04</jx:ActivityDate>
            <jx:HomeAgency xsi:type="cs:Organization">
              <jx:Organization>
                <jx:ID>ZR5981034</jx:ID>
              </jx:Organization>
            </jx:HomeAgency>
          </n-p:Activity>
          <jx:PersonName>
            <!-- DOB Newest -->
            <ncc:BirthDateText>1993-05-12</ncc:BirthDateText>
            <ncc:BirthDateText>1993-05-13</ncc:BirthDateText>
            <ncc:BirthDateText>1993-05-14</ncc:BirthDateText>
            <jx:IDDetails xsi:type="cs:IDDetails">
              <jx:SSNID>
                <jx:ID/>
              </jx:SSNID>
            </jx:IDDetails>
          </jx:PersonName>
        </ncc:PersonResponse>
      </ncc:DataSection>
    </ncc:Message>

I want to get the date value(s) and the comment above those date values.
So something like this for the example XML above:

    Comment: <!-- DOB --> (ncc:DataSection/ncc:PersonResponse)
    Date: 1948-05-11 (ncc:DataSection/ncc:PersonResponse/ncc:PersonBirthDateText)

    Comment: <!-- DOZ --> (ncc:DataSection/ncc:PersonResponse/n-p:Activity)
    Date: 1996-04-04 (ncc:DataSection/ncc:PersonResponse/n-p:Activity/jx:ActivityDate)

    Comment: <!-- DOB Newest --> (ncc:DataSection/ncc:PersonResponse/jx:PersonName)
    Date: 1993-05-12 (ncc:DataSection/ncc:PersonResponse/jx:PersonName/ncc:BirthDateText)
          1993-05-13 (ncc:DataSection/ncc:PersonResponse/jx:PersonName/ncc:BirthDateText)
          1993-05-14 (ncc:DataSection/ncc:PersonResponse/jx:PersonName/ncc:BirthDateText)

The code I am trying to use is:

    public static void xpathNodes() throws ParserConfigurationException, SAXException, IOException, XPathExpressionException {
        File file = new File(base_);
        XPath xPath = XPathFactory.newInstance().newXPath();
        //String expression = "//*[not(*)]";
        String expression = "([0-9]{4})-([0-9]{2})-([0-9]{2})";

        DocumentBuilderFactory builderFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = builderFactory.newDocumentBuilder();
        Document document = builder.parse(file);
        document.getDocumentElement().normalize();

        NodeList nodeList = (NodeList) xPath.compile(expression).evaluate(document, XPathConstants.NODESET);
        for (int i = 0; i < nodeList.getLength(); i++) {
            System.out.println(getXPath(nodeList.item(i)));
        }
    }

    private static String getXPath(Node node) {
        Node parent = node.getParentNode();
        if (parent == null) {
            return node.getNodeName();
        }
        return getXPath(parent) + "/" + node.getNodeName();
    }

    public static void main(String[] args) throws Exception {
        xpathNodes();
    }

I know the regex ([0-9]{4})-([0-9]{2})-([0-9]{2}) works, because I have used it in Notepad++ and it finds the dates in the opened XML file just fine.

I am currently getting this error:

    Exception in thread "main" javax.xml.transform.TransformerException: A location path was expected, but the following token was encountered: [

This doesn't even take the comments into consideration yet.

Any help would be great!

Recommended answer

For an XPath 1.0 expression without RegEx you might well use:

    //*[string-length()=10]
       [number(substring(.,1,4))=substring(.,1,4)]
       [substring(.,5,1)='-']
       [number(substring(.,6,2))=substring(.,6,2)]
       [substring(.,8,1)='-']
       [number(substring(.,9,2))=substring(.,9,2)]
    |
    //*[string-length()=10]
       [number(substring(.,1,4))=substring(.,1,4)]
       [substring(.,5,1)='-']
       [number(substring(.,6,2))=substring(.,6,2)]
       [substring(.,8,1)='-']
       [number(substring(.,9,2))=substring(.,9,2)]
       /preceding-sibling::node()[normalize-space()][1][self::comment()]

Do note: part of the expression is duplicated because you want to select both the elements and the comment nodes. The expression uses the well-known idiom for number testing: number(x) = x holds only when x is numeric, because NaN never equals itself. Finally, because there is no guarantee about how the parser is set up to handle whitespace-only text nodes, the normalize-space() predicate is applied before the positional predicate so that such nodes are skipped.

Edit: enforced the string length.
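To show how the answer's expression could be plugged into the asker's Java code, here is a minimal sketch. The class name `DateCommentFinder` and the helpers `find` and `path` are made up for illustration and are not part of the original question or answer. The XPath string is the one from the answer, assembled from a shared prefix, and the parser is left at its default (non-namespace-aware) setting, as in the question.

```java
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, named here only for illustration.
class DateCommentFinder {

    // Predicates from the answer: a 10-character string value shaped like NNNN-NN-NN.
    private static final String DATE_ELEMENT =
        "//*[string-length()=10]"
        + "[number(substring(.,1,4))=substring(.,1,4)]"
        + "[substring(.,5,1)='-']"
        + "[number(substring(.,6,2))=substring(.,6,2)]"
        + "[substring(.,8,1)='-']"
        + "[number(substring(.,9,2))=substring(.,9,2)]";

    // Union of the date elements and the comment that directly precedes one of them
    // (ignoring whitespace-only text nodes in between).
    private static final String EXPRESSION = DATE_ELEMENT + " | " + DATE_ELEMENT
        + "/preceding-sibling::node()[normalize-space()][1][self::comment()]";

    // Returns one formatted line per selected comment or date element.
    static List<String> find(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document document = builder.parse(new InputSource(new StringReader(xml)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xPath.evaluate(EXPRESSION, document, XPathConstants.NODESET);

        List<String> lines = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node node = nodes.item(i);
            if (node.getNodeType() == Node.COMMENT_NODE) {
                // A comment is reported with the path of the element that contains it.
                lines.add("Comment: <!--" + node.getNodeValue() + "--> ("
                    + path(node.getParentNode()) + ")");
            } else {
                lines.add("Date: " + node.getTextContent() + " (" + path(node) + ")");
            }
        }
        return lines;
    }

    // Builds a slash-separated path of element names, starting at the root element.
    static String path(Node node) {
        Node parent = node.getParentNode();
        if (parent == null || parent.getNodeType() == Node.DOCUMENT_NODE) {
            return node.getNodeName();
        }
        return path(parent) + "/" + node.getNodeName();
    }

    public static void main(String[] args) throws Exception {
        // Assumes the first argument is the path of the XML file to scan.
        String xml = new String(Files.readAllBytes(Paths.get(args[0])), "UTF-8");
        for (String line : find(xml)) {
            System.out.println(line);
        }
    }
}
```

Two caveats on this sketch. The paths it prints start at the root element, so they are longer than the ones in the desired output. Also, the expression tests the string value of every element, so a wrapper element whose only text content is a single date would match as well; appending `[not(*)]` to the element step should restrict the match to leaf elements if that matters for your data.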