Problem description
I have about 50-60 pdf files (images) that are 1.5MB large each. Now I don't want to have such large pdf files in my thesis as that would make downloading, reading and printing a pain in the rear. So I tried using ghostscript to do the following:
gs \
-dNOPAUSE -dBATCH \
-sDEVICE=pdfwrite \
-dCompatibilityLevel=1.4 \
-dPDFSETTINGS="/screen" \
-sOutputFile=output.pdf \
L_2lambda_max_1wl_E0_1_zg.pdf
However, now my 1.4MB pdf is 1.5MB large.
What did I do wrong? Is there some way I can check the resolution of the PDF file? I just need 300 dpi images, so would anyone suggest using convert to change the resolution, or is there some way I could change the image resolution (reduce it) with gs, since the image is very grainy when I use convert?
How I use convert:
convert \
-units PixelsPerInch \
~/Desktop/L_2lambda_max_1wl_E0_1_zg.pdf \
-density 600 \
~/Desktop/output.pdf
Example File
http://dl.dropbox.com/u/13223318/L_2lambda_max_1wl_E0_1_zg.pdf
If you run Ghostscript with -dPDFSETTINGS=/screen, this is just a sort of shortcut. You implicitly get a whole bunch of settings, which you can query with the following command:
gs \
-dNODISPLAY \
-c ".distillersettings {exch ==only ( ) print ===} forall quit" \
| grep '/screen'
On my Ghostscript (v9.06prerelease) I get the following output (slightly edited to increase readability):
/screen
<< /DoThumbnails false
/MonoImageResolution 300
/ColorImageDownsampleType /Average
/PreserveEPSInfo false
/ColorConversionStrategy /sRGB
/GrayImageDownsampleType /Average
/EmbedAllFonts true
/CannotEmbedFontPolicy /Warning
/PreserveOPIComments false
/GrayImageResolution 72
/GrayACSImageDict <<
/ColorTransform 1
/QFactor 0.76
/Blend 1
/HSamples [2 1 1 2]
/VSamples [2 1 1 2]
>>
/ColorImageResolution 72
/PreserveOverprintSettings false
/CreateJobTicket false
/AutoRotatePages /PageByPage
/MonoImageDownsampleType /Average
/NeverEmbed [/Courier
/Courier-Bold
/Courier-Oblique
/Courier-BoldOblique
/Helvetica
/Helvetica-Bold
/Helvetica-Oblique
/Helvetica-BoldOblique
/Times-Roman
/Times-Bold
/Times-Italic
/Times-BoldItalic
/Symbol
/ZapfDingbats]
/ColorACSImageDict <<
/ColorTransform 1
/QFactor 0.76
/Blend 1
/HSamples [2 1 1 2]
/VSamples [2 1 1 2] >>
/CompatibilityLevel 1.3
/UCRandBGInfo /Remove
>>
I'm wondering whether your PDFs are image-heavy, and whether this sort of conversion does unwelcome things (e.g. re-sampling images with the 'wrong' parameters) that increase the file size...
If this is the case (image-heavy PDF), say so, and I'll update this answer with a few suggestions.
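For the image-heavy case, one possible direction (a sketch only, not something tested against the asker's files) is to skip the /screen preset and set the downsampling parameters explicitly, so that images end up at 300 dpi rather than the 72 dpi listed in the settings dump. The function name downsample_pdf and the GS and DPI variables are made up for this illustration; the -d switches themselves are pdfwrite distiller parameters.

```shell
#!/bin/sh
# Hypothetical helper: re-distill a PDF while downsampling embedded raster
# images to a chosen resolution instead of relying on a -dPDFSETTINGS preset.
# GS and DPI are illustrative variables, not Ghostscript conventions.
GS="${GS:-gs}"
DPI="${DPI:-300}"

downsample_pdf() {
    # $1 = input PDF, $2 = output PDF
    "$GS" -q -o "$2" \
        -sDEVICE=pdfwrite \
        -dCompatibilityLevel=1.4 \
        -dDownsampleColorImages=true -dColorImageResolution="$DPI" \
        -dDownsampleGrayImages=true -dGrayImageResolution="$DPI" \
        -dDownsampleMonoImages=true -dMonoImageResolution="$DPI" \
        "$1"
}

# Example call:
#   downsample_pdf input.pdf output.pdf
```

This only helps when the PDF really contains raster images; vector drawing commands are not affected by these settings.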
Update
I had a look at the sample file provided by DNA. Interesting...
No, it does not contain any image.
Instead, it contains one large stream (compressed using /FlateDecode) which consists of roughly 700,000+ (!!) operations, mostly single vector operations in PDF language, such as m (moveto), l (lineto), d (setdash), w (setlinewidth), S (stroke), s (closepath and stroke), W* (eoclip), rg and RG (setrgbcolor), and a few more.
(That PDF code is written very inefficiently AFAICS, though it does its job: it concatenates many short strokes instead of drawing 'long' ones, nearly every stroke defines the color again (even if it is the same as before), and each stroke carries all the other overhead (start stroke, end stroke, ...).)
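To run the same kind of check on other files, here is a rough sketch. pdfimages -list (from poppler-utils) prints one row per embedded image, so an empty table suggests pure vector content, and qpdf can rewrite a file with uncompressed streams so that the drawing operators can be tallied. The helper name count_pdf_ops is made up, and the usage lines assume qpdf and poppler-utils are installed. The tally is crude: it counts only the operators named above and treats any whitespace-separated token that matches one of those names as an operator.

```shell
#!/bin/sh
# Hypothetical helper: read uncompressed PDF content from stdin and count how
# often each of a few path/colour operators occurs.
count_pdf_ops() {
    awk '
        BEGIN {
            # Operators to tally (the ones listed in the answer).
            n = split("m l d w S s W* rg RG", names, " ")
            for (i = 1; i <= n; i++) wanted[names[i]] = 1
        }
        {
            # Check every token on the line, since several operators
            # may share one line.
            for (i = 1; i <= NF; i++)
                if ($i in wanted) count[$i]++
        }
        END {
            for (op in count) printf "%s %d\n", op, count[op]
        }
    ' | sort
}

# Example calls (assuming the tools are installed):
#   pdfimages -list L_2lambda_max_1wl_E0_1_zg.pdf
#   qpdf --qdf --object-streams=disable L_2lambda_max_1wl_E0_1_zg.pdf - | count_pdf_ops
```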
Ghostscript's -dPDFSETTINGS=/screen does not have any effect here (there are no images to downsample, for example). The increased file size (+48 kByte to be precise) is probably due to Ghostscript re-organizing some of the internal stroking etc. commands into a different order when it interprets the file.
So there is not much you can do about the PDF file size, unless you convert each of these PDF pages into a REAL image such as PNG:
gs \
-o out72.png \
-sDEVICE=pngalpha \
L_2lambda_max_1wl_E0_1_zg.pdf
(I used the pngalpha output to get a transparent background.) The image dimensions of 'out72.png' are 259x213 px, and the file size is now 70 kByte. But I'm sure you'll not like the quality :-)
The output quality is 'bad' because Ghostscript uses a default resolution of 72 dpi.
Since you said you'd like to have 300dpi, the command becomes this:
gs \
-o out300.png \
-sDEVICE=pngalpha \
-r300 \
L_2lambda_max_1wl_E0_1_zg.pdf
The file size is now 750 kByte, and the image dimensions are 1080x889 pixels.
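Since the question mentions 50-60 such files, the 300 dpi command could be wrapped in a loop. This is only a sketch: the function name render_all and the GS and DPI variables are invented for illustration, and it assumes every input is a single-page PDF whose name ends in .pdf.

```shell
#!/bin/sh
# Hypothetical batch wrapper around the pngalpha command shown above.
# GS and DPI are illustrative variables, not Ghostscript conventions.
GS="${GS:-gs}"
DPI="${DPI:-300}"

render_all() {
    for pdf in "$@"; do
        # Skip arguments that are not existing files (e.g. an unmatched glob).
        [ -f "$pdf" ] || continue
        # foo.pdf -> foo.png, written next to the input file.
        # Assumes one page per PDF; a multi-page input would need a
        # %d-style pattern in the output name.
        "$GS" -q -o "${pdf%.pdf}.png" -sDEVICE=pngalpha -r"$DPI" "$pdf"
    done
}

# Example call:
#   render_all ./*.pdf
```

Raster size scales linearly with the -r value, which matches the jump from 259x213 px at 72 dpi to 1080x889 px at 300 dpi.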
Update 2
Since Curiosity is en vogue these days... :-) ...I tried to bring down the file size with the help of Adobe Acrobat X Pro on Mac.
You wanna know the results?
Performing a 'Save as... (PDF with reduced filesize)', which in the past always yielded very good results for me, created a 1.8+ MByte file (+29%). I guess this definitely puts Ghostscript's performance (file size increase of +3%) into a realistic perspective!