I have a PDF with embedded fonts that I can't seem to work with. Right now, I'm using GhostScript and trying to do 2 things:
Minimize filesize of PDF:
gswin32c -dSAFER -dBATCH -dNOPAUSE -dQUIET -sDEVICE=pdfwrite -sOutputFile=output.pdf input.pdf
Convert PDF to PNG (super sample, to be used for creating other thumbnails):
gswin32c -dSAFER -dBATCH -dNOPAUSE -dQUIET -dFirstPage=1 -dLastPage=1 -r288 -sDEVICE=png16m -sOutputFile=output.png input.pdf
The above works well when working on scanned documents. But when I run them against PDFs with embedded fonts (the PDF is generated on the fly by an application), it fails. Here's the error I get:
GPL Ghostscript 8.71: Warning: 'loca' length 274 is greater than numGlyphs 136 in the font UUGHDE+ArialMT.
GPL Ghostscript 8.71: Warning: 'loca' length 188 is greater than numGlyphs 93 in the font UUGHDE+Arial-BoldMT.
Aside from GhostScript, I also have access to PDFTK and ImageMagick (which might be replaced with GraphicsMagick). I'm also open to other solutions.
Development is on WAMP. Deployment is to LAMP.
The fonts used inside your PDFs seem to be OpenType fonts. The software that created these PDFs seems to have subsetted them. During font embedding and subsetting by this software (which "generates the PDFs on the fly"; was it also Ghostscript?!?), a problem seems to have occurred that left the fonts not 100% compliant with the specification.
'loca' tables are part of OpenType Font descriptions. They represent an index to all glyph locations.
Now you process these not completely 'kosher' PDFs with Ghostscript. Ghostscript issues warnings, but no errors.
GS errors usually mean: "I'll abort further processing. I can't work around a problem or repair this corrupt file. Should I have written output files already, they will be useless."
GS warnings usually mean: "I've encountered a problem. But I'll continue to process the input and work around it. I've written a valid output file. But you better check it, especially its fidelity!"
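If you call Ghostscript from a script, the distinction above can be checked automatically. The following is only a rough sketch in Python, under two assumptions that are mine and not taken from the Ghostscript documentation quoted here: that a fatal error makes Ghostscript exit with a non-zero status, and that warnings contain the text "Warning:" as in the output shown in the question. The helper names `classify` and `run_ghostscript` are made up for illustration.

```python
import subprocess


def classify(returncode: int, output: str) -> str:
    """Classify a Ghostscript run as 'error', 'warning' or 'ok'.

    Assumption: a non-zero exit status means processing was aborted,
    and warnings are recognisable by the substring 'Warning:'.
    """
    if returncode != 0:
        return "error"
    if "Warning:" in output:
        return "warning"
    return "ok"


def run_ghostscript(args, executable="gs"):
    """Run Ghostscript and return (classification, combined output).

    stdout and stderr are merged so that a message is caught no matter
    which of the two streams it is written to.
    """
    proc = subprocess.run(
        [executable, *args],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return classify(proc.returncode, proc.stdout), proc.stdout
```

On Windows you would pass `executable="gswin32c"`. A result of `"warning"` means an output file was written but should be inspected visually before it is used any further.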
The warnings (not errors!) you see mean this:
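- One of the problematic subsetted fonts declares a glyph count of 188 according to its 'loca' table.
- But in reality, the actual font description contains definitions for only 93 glyphs.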
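To inspect such a mismatch yourself, you can compare the size of the 'loca' table with the glyph count stored in the 'maxp' table. The sketch below is a minimal illustration, not Ghostscript's own check: it assumes the font has already been extracted from the PDF into a standalone TrueType-flavoured file (the extraction step is not shown), and the function names are hypothetical. It relies on the OpenType layout, where `numGlyphs` sits at byte offset 4 of 'maxp', `indexToLocFormat` at byte offset 50 of 'head', and a well-formed 'loca' table holds `numGlyphs + 1` entries of 2 bytes (short format) or 4 bytes (long format).

```python
import struct
import sys


def read_table_directory(data: bytes) -> dict:
    """Map each 4-character table tag to its (offset, length) in the font."""
    num_tables = struct.unpack_from(">H", data, 4)[0]
    tables = {}
    for i in range(num_tables):
        # Each table record: tag, checksum, offset, length (16 bytes).
        tag, _checksum, offset, length = struct.unpack_from(
            ">4sLLL", data, 12 + 16 * i
        )
        tables[tag.decode("latin-1")] = (offset, length)
    return tables


def loca_report(data: bytes) -> dict:
    """Compare the 'loca' table size with maxp.numGlyphs."""
    tables = read_table_directory(data)
    maxp_offset, _ = tables["maxp"]
    head_offset, _ = tables["head"]
    _, loca_length = tables["loca"]

    num_glyphs = struct.unpack_from(">H", data, maxp_offset + 4)[0]
    index_to_loc_format = struct.unpack_from(">h", data, head_offset + 50)[0]
    entry_size = 4 if index_to_loc_format else 2  # 0 = short, 1 = long
    loca_entries = loca_length // entry_size

    return {
        "num_glyphs": num_glyphs,
        "loca_bytes": loca_length,
        "loca_entries": loca_entries,
        "consistent": loca_entries == num_glyphs + 1,
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python loca_check.py extracted_font.ttf  (hypothetical file name)
    with open(sys.argv[1], "rb") as fh:
        print(loca_report(fh.read()))
```

The report only shows whether the two tables agree with each other; it does not tell you whether the glyph outlines themselves are intact, so a visual check of the rendered output is still worthwhile.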