Problem description
I am trying to convert PDF to PDF/A. Currently I can do this using the OpenOffice PDF viewer plugin together with JODConverter 2, but this is pretty cumbersome.
Does anybody know of any open source / free Java libraries I can use to do this?
I have found these open source libraries so far, but none of them supports converting PDF to PDF/A:
iText
gnujpdf
PDF Box
FOP
JFreeReport
PJX
JPedal
PDFjet
jPod
PDF Renderer
Update
It seems that Apache FOP can convert a document (though not a PDF document) to PDF/A.
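Recommended answer
Converting from PDF to PDF/A
This answers the question as you originally stated it.
For a solution that does not involve potentially lossy re-rendering, see Zoltan's solution.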
If Zoltan's solution is not acceptable or sufficient for your requirements, then you are stuck with re-rendering. You could stick with OpenOffice/JODConverter, or reduce overhead by using Ghostscript (the mother of them all), piping pdf2ps back into a PDF/A-enabled ps2pdf.
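As a rough illustration of the Ghostscript route, here is a minimal Java sketch that shells out to Ghostscript. It rests on several assumptions: a `gs` executable is installed, a copy of Ghostscript's `PDFA_def.ps` has been edited to point at an ICC profile, and a single `pdfwrite` pass (Ghostscript reading the PDF directly) stands in for the pdf2ps/ps2pdf pipe described above. The class and method names are made up for this example, and the switches follow the PDF/A section of the Ghostscript documentation, so check them against your installed version.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library: it only wraps the Ghostscript CLI.
class GhostscriptPdfA {

    // Builds the Ghostscript argument list for a PDF/A-1 re-rendering pass.
    // pdfaDef is the path to your edited copy of PDFA_def.ps (an assumption of this sketch).
    static List<String> buildCommand(String gsExe, String pdfaDef, String input, String output) {
        List<String> cmd = new ArrayList<>();
        cmd.add(gsExe);
        cmd.add("-dPDFA=1");                       // request PDF/A-1 output
        cmd.add("-dBATCH");
        cmd.add("-dNOPAUSE");
        cmd.add("-sColorConversionStrategy=RGB");  // convert colors to one device space
        cmd.add("-sDEVICE=pdfwrite");
        cmd.add("-dPDFACompatibilityPolicy=1");    // drop features PDF/A does not allow
        cmd.add("-sOutputFile=" + output);
        cmd.add(pdfaDef);                          // must come before the input file
        cmd.add(input);
        return cmd;
    }

    // Runs Ghostscript and returns its exit code (0 normally means success).
    static int convert(String gsExe, String pdfaDef, String input, String output)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(gsExe, pdfaDef, input, output))
                .inheritIO()   // show Ghostscript's own messages on our console
                .start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.out.println("usage: GhostscriptPdfA <PDFA_def.ps> <input.pdf> <output.pdf>");
            return;
        }
        int exitCode = convert("gs", args[0], args[1], args[2]);
        System.out.println("Ghostscript exited with code " + exitCode);
    }
}
```

A zero exit code does not by itself prove that the output conforms to PDF/A, so it is worth running the result through a validator such as veraPDF. Because this is a re-rendering pass, the caveat about potentially lossy output still applies.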
Other respondents have suggested Apache FOP, which in the context of PDF to PDF/A conversion has the following advantages and disadvantages:
- Advantage: fewer "moving parts" than an OpenOffice/JODConverter combination (e.g. compare in-process FOP with a daemonized OpenOffice).
- Disadvantage: you are responsible for converting from PDF to XSL-FO or otherwise rendering to FOP (more coding and/or integration work required of you), whereas OpenOffice/JODConverter and Ghostscript can require less additional coding.
However, if I am not mistaken, it appears that you are using PDF as an intermediate format, i.e. what you are really trying to achieve is XHTML to PDF to PDF/A conversion. By converting directly from XHTML to PDF/A, the process will be faster, will use fewer resources (e.g. memory), and will neither needlessly degrade output quality (as re-rendering solutions can) nor require intimate knowledge of the PDF format (as Zoltan's solution does).
In this case, directly converting from XHTML to PDF/A would be an ideal solution, either using iText directly (the example uses iTextSharp, a .NET port of iText, but it works the same way in Java), or using Apache FOP as others have suggested. FOP also uses iText internally when outputting to PDF, and although it is more bloated, less efficient and more complicated to set up than using iText directly, it might produce better results than the iText example. There is only one way to settle that: try it out on a few of your XHTML files as samples. :)
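To give a feel for the extra work the FOP route involves, here is a minimal sketch of its first step, turning XHTML into XSL-FO with the JDK's built-in XSLT support. The stylesheet is a deliberately tiny, hypothetical one that handles only `h1` and `p`, and the class and method names are invented for this example. A real document would need a far more complete stylesheet, and input carrying an XHTML DOCTYPE may make the parser try to fetch the DTD.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper for illustration only.
class XhtmlToFo {

    // Toy stylesheet: one A4 page master, headings and paragraphs mapped to fo:block.
    static final String XSLT =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:fo='http://www.w3.org/1999/XSL/Format'"
        + " xmlns:h='http://www.w3.org/1999/xhtml'"
        + " exclude-result-prefixes='h'>"
        + "<xsl:template match='/'>"
        + "<fo:root><fo:layout-master-set>"
        + "<fo:simple-page-master master-name='page'"
        + " page-height='297mm' page-width='210mm' margin='20mm'>"
        + "<fo:region-body/></fo:simple-page-master>"
        + "</fo:layout-master-set>"
        + "<fo:page-sequence master-reference='page'>"
        + "<fo:flow flow-name='xsl-region-body'>"
        + "<xsl:apply-templates select='//h:body/*'/>"
        + "</fo:flow></fo:page-sequence></fo:root>"
        + "</xsl:template>"
        + "<xsl:template match='h:h1'>"
        + "<fo:block font-size='18pt' font-weight='bold'><xsl:value-of select='.'/></fo:block>"
        + "</xsl:template>"
        + "<xsl:template match='h:p'>"
        + "<fo:block><xsl:value-of select='.'/></fo:block>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Applies the toy stylesheet to an XHTML string and returns the XSL-FO as text.
    static String toFo(String xhtml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xhtml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XHTML to XSL-FO transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xhtml = "<html xmlns='http://www.w3.org/1999/xhtml'><body>"
                     + "<h1>Title</h1><p>Hello PDF/A</p></body></html>";
        System.out.println(toFo(xhtml));
    }
}
```

The resulting XSL-FO would then be handed to FOP. FOP's documentation describes a `-pdfprofile PDF/A-1b` command-line option for PDF/A output, and since PDF/A requires every font to be embedded, FOP's font configuration needs attention as well.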