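I'm trying to use @XmlAnyElement together with a DomHandler to capture the unparsed text inside a specific field, as in this example by Blaise Doughan. However, when I parse more than one customer, the bio content from all previous records keeps being sent to my DomHandler!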

Here is the sample document I'm trying to parse:

<?xml version="1.0" encoding="UTF-8"?>
<customers>
   <customer>
     <name>Jane Doe</name>
     <bio>
       <html>Jane's bio</html>
     </bio>
   </customer>
   <customer>
     <name>John Doe</name>
     <bio>
       <html>John's bio</html>
     </bio>
   </customer>
</customers>


But the output is:

 Name:  Jane Doe
 Bio:   <html>Jane's bio</html>
 Name:  John Doe
 Bio:   <html>Jane's bio</html>


BioHandler (same as in the previous example)

package blog.domhandler;

import java.io.StringReader;
import java.io.StringWriter;

import javax.xml.bind.ValidationEventHandler;
import javax.xml.bind.annotation.DomHandler;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class BioHandler implements DomHandler<String, StreamResult> {

    private static final String BIO_START_TAG = "<bio>";
    private static final String BIO_END_TAG = "</bio>";

    private StringWriter xmlWriter = new StringWriter();

    public StreamResult createUnmarshaller(ValidationEventHandler errorHandler) {
        return new StreamResult(xmlWriter);
    }

    public String getElement(StreamResult rt) {
        String xml = rt.getWriter().toString();
        int beginIndex = xml.indexOf(BIO_START_TAG) + BIO_START_TAG.length();
        int endIndex = xml.indexOf(BIO_END_TAG);
        return xml.substring(beginIndex, endIndex);
    }

    public Source marshal(String n, ValidationEventHandler errorHandler) {
        try {
            String xml = BIO_START_TAG + n.trim() + BIO_END_TAG;
            StringReader xmlReader = new StringReader(xml);
            return new StreamSource(xmlReader);
        } catch(Exception e) {
            throw new RuntimeException(e);
        }
    }

}


Customer (unchanged from the previous example)

package blog.domhandler;

import javax.xml.bind.annotation.XmlAnyElement;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.XmlType;

@XmlRootElement
@XmlType(propOrder={"name", "bio"})
public class Customer {

    private String name;
    private String bio;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @XmlAnyElement(BioHandler.class)
    public String getBio() {
        return bio;
    }

    public void setBio(String bio) {
        this.bio = bio;
    }

}


Customers

package blog.domhandler;

import java.util.List;

import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement
public class Customers {

    private List<Customer> customers;

    public List<Customer> getCustomer() {
        return customers;
    }

    public void setCustomer(List<Customer> c) {
        this.customers = c;
    }

}


Demo (driver)

package blog.domhandler;

import java.io.File;

import javax.xml.bind.JAXBContext;
import javax.xml.bind.Unmarshaller;

public class Demo {

    public static void main(String[] args) throws Exception {
        JAXBContext jc = JAXBContext.newInstance(Customers.class);

        Unmarshaller unmarshaller = jc.createUnmarshaller();
        Customers customers = (Customers) unmarshaller.unmarshal(new File("src/blog/domhandler/input.xml"));

        for (Customer customer : customers.getCustomer()) {
            System.out.println("Name:  " + customer.getName());
            System.out.println("Bio:   " + customer.getBio());
        }

    }
}


When I put a breakpoint in BioHandler.getElement(), I see that the first time it is called, String xml has the value

<?xml version="1.0" encoding="UTF-8"?><bio><html>Jane's bio</html>
    </bio>


and the second time it is called, String xml has the value

<?xml version="1.0" encoding="UTF-8"?><bio><html>Jane's bio</html>
    </bio><?xml version="1.0" encoding="UTF-8"?><bio><html>John's bio</html>
    </bio>


Is there some way to tell the parser that this content should be discarded after each call to BioHandler.getElement()?
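The symptom can be reproduced without JAXB, which helps show why the first bio keeps coming back. The following is only a minimal sketch: `SharedWriterDemo` and `extractBio` are hypothetical names, `extractBio` copies the index logic of BioHandler.getElement(), and the two `write` calls stand in for whatever JAXB writes into the shared `xmlWriter` for each record.

```java
import java.io.StringWriter;

class SharedWriterDemo {

    private static final String BIO_START_TAG = "<bio>";
    private static final String BIO_END_TAG = "</bio>";

    // Same extraction logic as BioHandler.getElement(), applied to a plain string.
    static String extractBio(String xml) {
        int beginIndex = xml.indexOf(BIO_START_TAG) + BIO_START_TAG.length();
        int endIndex = xml.indexOf(BIO_END_TAG);
        return xml.substring(beginIndex, endIndex);
    }

    public static void main(String[] args) {
        // One writer shared across records, like the xmlWriter field in BioHandler.
        StringWriter xmlWriter = new StringWriter();

        xmlWriter.write("<bio><html>Jane's bio</html></bio>");
        System.out.println(extractBio(xmlWriter.toString()));

        // The second record is appended because nothing cleared the buffer.
        xmlWriter.write("<bio><html>John's bio</html></bio>");
        // indexOf() finds the first <bio>...</bio> pair, so Jane's bio is returned again.
        System.out.println(extractBio(xmlWriter.toString()));
    }
}
```

Both lines print `<html>Jane's bio</html>`, because `indexOf` always matches the first start and end tags in the accumulated buffer.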

Best answer

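It turns out my question was already answered by the first comment on the blog post that the example came from. The code for BioHandler.createUnmarshaller() should be: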

public StreamResult createUnmarshaller(ValidationEventHandler errorHandler) {
    xmlWriter.getBuffer().setLength(0);
    return new StreamResult(xmlWriter);
}
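The effect of `getBuffer().setLength(0)` can be checked in isolation using only JDK classes. This is a rough sketch rather than the handler itself: `ResetWriterDemo`, `newResult`, and `roundTrip` are hypothetical names, and `roundTrip` merely imitates JAXB writing one fragment into the StreamResult and reading it back.

```java
import java.io.IOException;
import java.io.StringWriter;
import javax.xml.transform.stream.StreamResult;

class ResetWriterDemo {

    // Shared writer, as in BioHandler.
    private final StringWriter xmlWriter = new StringWriter();

    // Mirrors the corrected createUnmarshaller(): empty the buffer, then reuse the writer.
    StreamResult newResult() {
        xmlWriter.getBuffer().setLength(0);
        return new StreamResult(xmlWriter);
    }

    // Hypothetical stand-in for JAXB writing one <bio> fragment and the handler reading it.
    String roundTrip(String fragment) throws IOException {
        StreamResult result = newResult();
        result.getWriter().write(fragment);
        return result.getWriter().toString();
    }

    public static void main(String[] args) throws IOException {
        ResetWriterDemo demo = new ResetWriterDemo();
        System.out.println(demo.roundTrip("<bio><html>Jane's bio</html></bio>"));
        System.out.println(demo.roundTrip("<bio><html>John's bio</html></bio>"));
    }
}
```

Each call now sees only its own fragment. Returning `new StreamResult(new StringWriter())` from createUnmarshaller() should have the same effect, since getElement() reads from the StreamResult it is handed, but that variant is an assumption and not part of the original answer. Either way, resetting a shared writer presumes that createUnmarshaller() and getElement() calls are not interleaved across records or threads.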

Regarding "java - DomHandler capturing text from multiple records", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/23550197/
