Creating a TIFF that contains only the text, without images, from a PostScript file with Ghostscript

Question

Is it possible to convert a PostScript file (created from a PDF document with readable text and images) into a TIFF file that contains only the text, without the images?

For example, by adding something like a maxbuffer setting, so that the images are removed and only the text remains?

And if the boxes and lines around the text could be removed as well, that would be awesome.

Best regards!

Recommended answer

You can redefine the various 'image' operators so that they don't do anything:

/image {
  type /dicttype eq not {  % 'type' consumes the top operand: the dictionary, or the data source
    pop pop pop pop        % non-dictionary form: drop width, height, bits/sample and matrix
  } if
} bind def

/imagemask {
  type /dicttype eq not {  % 'type' consumes the top operand: the dictionary, or the data source
    pop pop pop pop        % non-dictionary form: drop width, height, polarity and matrix
  } if
} bind def

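/colorimage {
  % operands: width height bits/comp matrix src0 ... src(n-1) multi ncomp
  exch                   % bring 'multi' to the top, leaving 'ncomp' beneath it
  {
    { pop } repeat       % multi is true: one data source per colour component
  } {
    pop pop              % multi is false: drop ncomp and the single data source
  } ifelse
  pop pop pop pop        % drop matrix, bits/comp, height and width
} bind def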

Save that as a file, and add the file to your GS invocation, ahead of the PostScript file you want to convert.
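As a minimal sketch of such an invocation: the file names strip.ps, input.ps and output.tif are hypothetical, and the tiffg4 device and 300 dpi resolution are arbitrary choices made for illustration, not something the answer prescribes. Only one override is written here to keep the example short.

```shell
#!/bin/sh
# Hypothetical file names: strip.ps (overrides), input.ps, output.tif.
cat > strip.ps <<'EOF'
% One of the overrides from the answer; paste the remaining ones below it.
/image {
  type /dicttype eq not { pop pop pop pop } if
} bind def
EOF

# -o names the output file and implies -dBATCH -dNOPAUSE.
# strip.ps is listed before input.ps so the overrides are defined first.
if command -v gs >/dev/null 2>&1 && [ -f input.ps ]; then
  gs -sDEVICE=tiffg4 -r300 -o output.tif strip.ps input.ps
fi
```

Ghostscript reads the files in the order given, which is why the override file has to come first.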

You can remove linework similarly by redefining the stroke operator:

/stroke {
  newpath
} bind def

rectstroke is harder, because it accepts several different operand forms; I suggest you read the PLRM if you need that one.
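As an untested sketch of what a rectstroke override could look like, the shell snippet below writes one to a hypothetical file, no-rectstroke.ps. It relies on the operand forms described in the PLRM (four numbers, a number array, or an encoded number string, each optionally followed by a matrix) and assumes that a six-element array on top of the stack is the matrix, since a rectangle array has to hold a multiple of four numbers.

```shell
#!/bin/sh
# no-rectstroke.ps is a hypothetical file name; pass it to gs before the input file.
cat > no-rectstroke.ps <<'EOF'
% Sketch: make rectstroke discard its operands without painting anything.
/rectstroke {
  % An optional matrix is a 6-element array on top of the stack; drop it.
  dup type dup /arraytype eq exch /packedarraytype eq or {
    dup length 6 eq { pop } if
  } if
  % What remains is either x y width height (four numbers) ...
  dup type dup /integertype eq exch /realtype eq or {
    pop pop pop pop
  } {
    pop    % ... or a single number array / encoded number string
  } ifelse
} bind def
EOF
```

Unlike stroke, rectstroke leaves the current path untouched, so the override does not call newpath.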

Possibly the fill operators as well:

/fill {
  newpath
} bind def

/eofill {
  newpath
} bind def

Beware! Some text is not drawn using the text 'show' operators, but is constructed from linework or drawn as images. Text produced that way will disappear too if you redefine the operators as shown above.

Note that the PDF interpreter often doesn't allow redefinition of operators, so you may first have to convert your PDF file to PostScript, using the ps2write device, and then run the resulting file through GS to get a TIFF file.
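The two-step route can be sketched as a small shell function. The function name pdf_to_text_tiff, its argument order and the intermediate file name are hypothetical, and tiffg4 at 300 dpi is again just an illustrative choice.

```shell
#!/bin/sh
# Hypothetical helper; names and defaults are illustrative only.
# Usage: pdf_to_text_tiff input.pdf strip.ps output.tif
pdf_to_text_tiff() {
  pdf=$1; overrides=$2; tiff=$3
  ps=${tiff%.tif}.ps    # intermediate PostScript file next to the output
  # Step 1: PDF -> PostScript with the ps2write device.
  gs -q -sDEVICE=ps2write -o "$ps" "$pdf" || return 1
  # Step 2: PostScript -> TIFF, loading the operator overrides first.
  gs -q -sDEVICE=tiffg4 -r300 -o "$tiff" "$overrides" "$ps"
}
```

Calling `pdf_to_text_tiff report.pdf strip.ps report.tif` would leave report.ps behind as the intermediate file, which can be handy for checking how the text was actually drawn.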

09-09 06:37