Problem description
How to do HTML to XML conversion to generate closed tags?
The context is explained here: Error while generating pdf from Html file in Java using iText
When I try converting HTML to PDF using iText and XML Worker, I'm asked to provide closing tags for the <hr> and <br> tags. It works if I add them manually: the conversion to PDF succeeds. But I don't want to add each closing tag by hand. How can I do this in an automated way?
Solution
You are experiencing this problem because you are feeding HTML to iText's XML Worker. XML Worker requires XML, so you need to convert your HTML into XHTML.
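To make the difference concrete, here is a minimal sketch that checks whether a snippet is well-formed XML. It uses the JDK's built-in XML parser as a stand-in, not XML Worker's own parser, and the class name WellFormedCheck and the method isWellFormed are hypothetical names chosen for illustration. The point is only that an unclosed <br> or <hr> is legal HTML but is rejected by an XML parser.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: not part of iText, XML Worker, or Jsoup.
class WellFormedCheck {

    // Returns true if the markup parses as well-formed XML, false otherwise.
    public static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler still throws on fatal errors but does not log to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new ByteArrayInputStream(
                    markup.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException e) {
            // Raised for well-formedness errors such as a missing end tag.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Plain HTML style: <br> and <hr> are never closed.
        System.out.println(isWellFormed("<p>line one<br>line two<hr></p>"));
        // XHTML style: the same elements written as self-closing tags.
        System.out.println(isWellFormed("<p>line one<br/>line two<hr/></p>"));
    }
}
```

The first input mirrors the unclosed tags from the question; the second is the kind of markup that a conversion to XHTML produces.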
There is an example of how to do this on the official iText site: D00_XHTML
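import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import org.jsoup.Jsoup;

public static void tidyUp(String path) throws IOException {
    File html = new File(path);
    // Parse the (possibly unclosed) HTML and serialize it again as a byte array.
    // Note: newer Jsoup versions default to HTML output syntax; calling
    // outputSettings().syntax(Document.OutputSettings.Syntax.xml) on the parsed Document emits <br /> rather than <br>.
    byte[] xhtml = Jsoup.parse(html, "US-ASCII").html().getBytes();
    File dir = new File("results/xml");
    dir.mkdirs();
    // Write the result under results/xml, keeping the original file name.
    FileOutputStream fos = new FileOutputStream(new File(dir, html.getName()));
    fos.write(xhtml);
    fos.close();
}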
In this example, we get a path to an ordinary HTML file (similar to what you have). We then use the Jsoup library to parse the HTML into an XHTML byte array. The example writes that byte array to disk as an XHTML file, but you can also use the byte array directly as input for XML Worker.