PDF使用iTextSharp进行压缩

本文介绍了PDF使用iTextSharp进行压缩的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我目前正在尝试重新压缩已经创建的pdf，我正在尝试找到一种方法来重新压缩文档中的图像，以减小文件大小。

I am currently trying to recompress a pdf that has already been created, I am trying to find a way to recompress the images that are in the document, to reduce the file size.

我一直在尝试使用DataLogics PDE和iTextSharp库来执行此操作，但我找不到对项目进行流重新压缩的方法。

I have been trying to do this with the DataLogics PDE and iTextSharp libraries but I can not find a way to do the stream recompression of the items.

我有关于循环遍历xobject并获取图像然后将DPI降低到96或使用libjpeg C＃implimentation来改变图像的质量但是获得它回到pdf流似乎总是最终，内存损坏或其他一些问题。

I have though about looping over the xobjects and getting the images and then dropping the DPI down to 96 or using the libjpeg C# implimentation to change the quality of the image but getting it back into the pdf stream seems to always end up, with memory corruption or some other issue.

任何样本都将受到赞赏。

Any samples will be appreciated.

谢谢

推荐答案

iText和iTextSharp有一些替代间接的方法对象。具体来说， PdfReader.KillIndirect（）执行它所说的并且 PdfWriter.AddDirectImageSimple（iTextSharp.text.Image，PRIndirectReference）然后你可以使用它来代替你杀掉的东西。

iText and iTextSharp have some methods for replacing indirect objects. Specifically there's PdfReader.KillIndirect() which does what it says and PdfWriter.AddDirectImageSimple(iTextSharp.text.Image, PRIndirectReference) which you can then use to replace what you killed off.

在伪C＃代码中你会做：

In pseudo C# code you'd do:

var oldImage = PdfReader.GetPdfObject();
var newImage = YourImageCompressionFunction(oldImage);
PdfReader.KillIndirect(oldImage);
yourPdfWriter.AddDirectImageSimple(newImage, (PRIndirectReference)oldImage);

将原始字节转换为.Net图像可能很棘手，我会把它留给你或者你可以在这里搜索Mark在这里有一个。此外，从技术上讲，PDF没有DPI的概念，这主要适用于打印机。获取更多相关信息。

Converting the raw bytes to a .Net image can be tricky, I'll leave that up to you or you can search here. Mark has a good description here. Also, technically PDFs don't have a concept of DPI, that's for printers mostly. See the answer here for more on that.

使用压缩算法上面的方法实际上可以做两件事，物理缩小图像以及应用JPEG压缩。当您物理缩小图像并将其添加回来时，它将占用与原始图像相同的空间量，但使用的像素更少。这将为您提供您认为的DPI减少。 JPEG压缩说明了一切。

Using the method above your compression algorithm can actually do two things, physically shrink the image as well as apply JPEG compression. When you physically shrink the image and add it back it will occupy the same amount of space as the original image but with less pixels to work with. This will get you what you consider to be DPI reduction. The JPEG compression speaks for itself.

下面是一个针对iTextSharp 5.1.1.0的全功能C＃2010 WinForms应用程序。它会在您的桌面上使用名为LargeImage.jpg的现有JPEG，并从中创建一个新的PDF。然后它打开PDF，提取图像，将其物理缩小到原始大小的90％，应用85％JPEG压缩并将其写回PDF。有关更多说明，请参阅代码中的注释。代码需要更多的空/错误检查。还要查找注意注释，您需要扩展以处理其他情况。

Below is a full working C# 2010 WinForms app targeting iTextSharp 5.1.1.0. It takes an existing JPEG on your desktop called "LargeImage.jpg" and creates a new PDF from it. Then it opens the PDF, extracts the image, physically shrinks it to 90% of the original size, applies 85% JPEG compression and writes it back to the PDF. See the comments in the code for more of an explanation. The code needs lots more null/error checking. Also looks for NOTE comments where you'll need to expand to handle other situations.

using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Drawing.Drawing2D;
using System.Windows.Forms;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

namespace WindowsFormsApplication1 {
    public partial class Form1 : Form {
        public Form1() {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e) {
            //Our working folder
            string workingFolder = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
            //Large image to add to sample PDF
            string largeImage = Path.Combine(workingFolder, "LargeImage.jpg");
            //Name of large PDF to create
            string largePDF = Path.Combine(workingFolder, "Large.pdf");
            //Name of compressed PDF to create
            string smallPDF = Path.Combine(workingFolder, "Small.pdf");

            //Create a sample PDF containing our large image, for demo purposes only, nothing special here
            using (FileStream fs = new FileStream(largePDF, FileMode.Create, FileAccess.Write, FileShare.None)) {
                using (Document doc = new Document()) {
                    using (PdfWriter writer = PdfWriter.GetInstance(doc, fs)) {
                        doc.Open();

                        iTextSharp.text.Image importImage = iTextSharp.text.Image.GetInstance(largeImage);
                        doc.SetPageSize(new iTextSharp.text.Rectangle(0, 0, importImage.Width, importImage.Height));
                        doc.SetMargins(0, 0, 0, 0);
                        doc.NewPage();
                        doc.Add(importImage);

                        doc.Close();
                    }
                }
            }

            //Now we're going to open the above PDF and compress things

            //Bind a reader to our large PDF
            PdfReader reader = new PdfReader(largePDF);
            //Create our output PDF
            using (FileStream fs = new FileStream(smallPDF, FileMode.Create, FileAccess.Write, FileShare.None)) {
                //Bind a stamper to the file and our reader
                using (PdfStamper stamper = new PdfStamper(reader, fs)) {
                    //NOTE: This code only deals with page 1, you'd want to loop more for your code
                    //Get page 1
                    PdfDictionary page = reader.GetPageN(1);
                    //Get the xobject structure
                    PdfDictionary resources = (PdfDictionary)PdfReader.GetPdfObject(page.Get(PdfName.RESOURCES));
                    PdfDictionary xobject = (PdfDictionary)PdfReader.GetPdfObject(resources.Get(PdfName.XOBJECT));
                    if (xobject != null) {
                        PdfObject obj;
                        //Loop through each key
                        foreach (PdfName name in xobject.Keys) {
                            obj = xobject.Get(name);
                            if (obj.IsIndirect()) {
                                //Get the current key as a PDF object
                                PdfDictionary imgObject = (PdfDictionary)PdfReader.GetPdfObject(obj);
                                //See if its an image
                                if (imgObject.Get(PdfName.SUBTYPE).Equals(PdfName.IMAGE)) {
                                    //NOTE: There's a bunch of different types of filters, I'm only handing the simplest one here which is basically raw JPG, you'll have to research others
                                    if (imgObject.Get(PdfName.FILTER).Equals(PdfName.DCTDECODE)) {
                                        //Get the raw bytes of the current image
                                        byte[] oldBytes = PdfReader.GetStreamBytesRaw((PRStream)imgObject);
                                        //Will hold bytes of the compressed image later
                                        byte[] newBytes;
                                        //Wrap a stream around our original image
                                        using (MemoryStream sourceMS = new MemoryStream(oldBytes)) {
                                            //Convert the bytes into a .Net image
                                            using (System.Drawing.Image oldImage = Bitmap.FromStream(sourceMS)) {
                                                //Shrink the image to 90% of the original
                                                using (System.Drawing.Image newImage = ShrinkImage(oldImage, 0.9f)) {
                                                    //Convert the image to bytes using JPG at 85%
                                                    newBytes = ConvertImageToBytes(newImage, 85);
                                                }
                                            }
                                        }
                                        //Create a new iTextSharp image from our bytes
                                        iTextSharp.text.Image compressedImage = iTextSharp.text.Image.GetInstance(newBytes);
                                        //Kill off the old image
                                        PdfReader.KillIndirect(obj);
                                        //Add our image in its place
                                        stamper.Writer.AddDirectImageSimple(compressedImage, (PRIndirectReference)obj);
                                    }
                                }
                            }
                        }
                    }
                }
            }

            this.Close();
        }

        //Standard image save code from MSDN, returns a byte array
        private static byte[] ConvertImageToBytes(System.Drawing.Image image, long compressionLevel) {
            if (compressionLevel < 0) {
                compressionLevel = 0;
            } else if (compressionLevel > 100) {
                compressionLevel = 100;
            }
            ImageCodecInfo jgpEncoder = GetEncoder(ImageFormat.Jpeg);

            System.Drawing.Imaging.Encoder myEncoder = System.Drawing.Imaging.Encoder.Quality;
            EncoderParameters myEncoderParameters = new EncoderParameters(1);
            EncoderParameter myEncoderParameter = new EncoderParameter(myEncoder, compressionLevel);
            myEncoderParameters.Param[0] = myEncoderParameter;
            using (MemoryStream ms = new MemoryStream()) {
                image.Save(ms, jgpEncoder, myEncoderParameters);
                return ms.ToArray();
            }

        }
        //standard code from MSDN
        private static ImageCodecInfo GetEncoder(ImageFormat format) {
            ImageCodecInfo[] codecs = ImageCodecInfo.GetImageDecoders();
            foreach (ImageCodecInfo codec in codecs) {
                if (codec.FormatID == format.Guid) {
                    return codec;
                }
            }
            return null;
        }
        //Standard high quality thumbnail generation from http://weblogs.asp.net/gunnarpeipman/archive/2009/04/02/resizing-images-without-loss-of-quality.aspx
        private static System.Drawing.Image ShrinkImage(System.Drawing.Image sourceImage, float scaleFactor) {
            int newWidth = Convert.ToInt32(sourceImage.Width * scaleFactor);
            int newHeight = Convert.ToInt32(sourceImage.Height * scaleFactor);

            var thumbnailBitmap = new Bitmap(newWidth, newHeight);
            using (Graphics g = Graphics.FromImage(thumbnailBitmap)) {
                g.CompositingQuality = CompositingQuality.HighQuality;
                g.SmoothingMode = SmoothingMode.HighQuality;
                g.InterpolationMode = InterpolationMode.HighQualityBicubic;
                System.Drawing.Rectangle imageRectangle = new System.Drawing.Rectangle(0, 0, newWidth, newHeight);
                g.DrawImage(sourceImage, imageRectangle);
            }
            return thumbnailBitmap;
        }
    }
}

这篇关于PDF使用iTextSharp进行压缩的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持！