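I am trying to use @XmlAnyElement together with a DomHandler to capture the unparsed text inside a particular field, as in this example by Blaise Doughan. However, when I unmarshal multiple customers, the bio content from all previous records keeps being passed to my DomHandler!
Here is the sample document I am trying to parse: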
<?xml version="1.0" encoding="UTF-8"?>
<customers>
    <customer>
        <name>Jane Doe</name>
        <bio>
            <html>Jane's bio</html>
        </bio>
    </customer>
    <customer>
        <name>John Doe</name>
        <bio>
            <html>John's bio</html>
        </bio>
    </customer>
</customers>
But the output is:
Name: Jane Doe
Bio: <html>Jane's bio</html>
Name: John Doe
Bio: <html>Jane's bio</html>
BioHandler (unchanged from the previous example)
package blog.domhandler;

import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.bind.ValidationEventHandler;
import javax.xml.bind.annotation.DomHandler;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class BioHandler implements DomHandler<String, StreamResult> {

    private static final String BIO_START_TAG = "<bio>";
    private static final String BIO_END_TAG = "</bio>";

    private StringWriter xmlWriter = new StringWriter();

    public StreamResult createUnmarshaller(ValidationEventHandler errorHandler) {
        return new StreamResult(xmlWriter);
    }

    public String getElement(StreamResult rt) {
        String xml = rt.getWriter().toString();
        int beginIndex = xml.indexOf(BIO_START_TAG) + BIO_START_TAG.length();
        int endIndex = xml.indexOf(BIO_END_TAG);
        return xml.substring(beginIndex, endIndex);
    }

    public Source marshal(String n, ValidationEventHandler errorHandler) {
        try {
            String xml = BIO_START_TAG + n.trim() + BIO_END_TAG;
            StringReader xmlReader = new StringReader(xml);
            return new StreamSource(xmlReader);
        } catch(Exception e) {
            throw new RuntimeException(e);
        }
    }

}
Customer (unchanged from the previous example)
package blog.domhandler;

import javax.xml.bind.annotation.XmlAnyElement;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.XmlType;

@XmlRootElement
@XmlType(propOrder={"name", "bio"})
public class Customer {

    private String name;
    private String bio;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @XmlAnyElement(BioHandler.class)
    public String getBio() {
        return bio;
    }

    public void setBio(String bio) {
        this.bio = bio;
    }

}
Customers
package blog.domhandler;

import java.util.List;
import javax.xml.bind.annotation.XmlAnyElement;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.XmlType;

@XmlRootElement
public class Customers {

    private List<Customer> customers;

    public List<Customer> getCustomer() {
        return customers;
    }

    public void setCustomer(List<Customer> c) {
        this.customers = c;
    }

}
Demo (driver)
package blog.domhandler;

import java.io.File;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.Marshaller;
import javax.xml.bind.Unmarshaller;

public class Demo {

    public static void main(String[] args) throws Exception {
        JAXBContext jc = JAXBContext.newInstance(Customers.class);
        Unmarshaller unmarshaller = jc.createUnmarshaller();
        Customers customers = (Customers) unmarshaller.unmarshal(new File("src/blog/domhandler/input.xml"));
        for( Customer customer: customers.getCustomer() ) {
            System.out.println("Name: " + customer.getName());
            System.out.println("Bio: " + customer.getBio());
        }
    }

}
When I put a breakpoint in BioHandler.getElement(), I see that the first time it is called, String xml has the value
<?xml version="1.0" encoding="UTF-8"?><bio><html>Jane's bio</html>
</bio>
whereas the second time it is called, String xml has the value
<?xml version="1.0" encoding="UTF-8"?><bio><html>Jane's bio</html>
</bio><?xml version="1.0" encoding="UTF-8"?><bio><html>John's bio</html>
</bio>
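Is there some way to tell the parser that this content should be discarded after each call to BioHandler.getElement()?
Best answer
It turns out my question was already answered by the first comment on the blog post this example comes from. The code of BioHandler.createUnmarshaller() should be: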
public StreamResult createUnmarshaller(ValidationEventHandler errorHandler) {
    // Discard whatever the previous bio element left in the shared writer.
    xmlWriter.getBuffer().setLength(0);
    return new StreamResult(xmlWriter);
}
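The breakpoint values in the question suggest that the same BioHandler instance, and therefore the same StringWriter, is reused for every bio element, so the buffer keeps growing and indexOf() always finds the first <bio>. The following is a minimal sketch of that effect using only the JDK, with no JAXB involved. The class name BufferResetDemo and the helpers extractBio and run are hypothetical names made up for this illustration, and the two fragments are simplified stand-ins for what the unmarshaller writes.

```java
import java.io.StringWriter;

public class BufferResetDemo {

    private static final String BIO_START_TAG = "<bio>";
    private static final String BIO_END_TAG = "</bio>";

    // Same extraction logic as BioHandler.getElement():
    // it returns the content of the FIRST <bio>...</bio> in the string.
    static String extractBio(String xml) {
        int beginIndex = xml.indexOf(BIO_START_TAG) + BIO_START_TAG.length();
        int endIndex = xml.indexOf(BIO_END_TAG);
        return xml.substring(beginIndex, endIndex);
    }

    // Simulates two successive bio elements being written into one shared
    // StringWriter (an assumption about how the handler is reused).
    static String[] run(boolean resetBeforeEachWrite) {
        StringWriter xmlWriter = new StringWriter();
        String[] fragments = {
            "<bio><html>Jane's bio</html></bio>",
            "<bio><html>John's bio</html></bio>"
        };
        String[] results = new String[fragments.length];
        for (int i = 0; i < fragments.length; i++) {
            if (resetBeforeEachWrite) {
                // The fix from the answer: empty the underlying StringBuffer.
                xmlWriter.getBuffer().setLength(0);
            }
            xmlWriter.write(fragments[i]);
            results[i] = extractBio(xmlWriter.toString());
        }
        return results;
    }

    public static void main(String[] args) {
        for (String bio : run(false)) {
            System.out.println("without reset: " + bio);
        }
        for (String bio : run(true)) {
            System.out.println("with reset:    " + bio);
        }
    }
}
```

Without the reset, both results are Jane's bio, which matches the output in the question. With the reset, the second result is John's bio.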
Regarding java - DomHandler capturing text from multiple records, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/23550197/