Here is a code sample:

import PyPDF2
import numpy as np
# creating a pdf file object
pdfFileObj = open('original.pdf', 'rb')

pdfFileObj_1 = open('tutorial.pdf', 'rb')
# creating a pdf reader object
pdfReader = PyPDF2.PdfFileReader(pdfFileObj)
pdfReader_1 = PyPDF2.PdfFileReader(pdfFileObj_1)
# creating a pdf writer object for new pdf
pdfWriter = PyPDF2.PdfFileWriter()
for i in range(100):
    page = pdfReader.getPage(i)
    page_1 = pdfReader_1.getPage(i)
    pdfWriter.addPage(page)
    pdfWriter.addPage(page_1)

#print(pdfWriter.getNumPages())
# new pdf file object
newFile = open('replaced_pdf_1.pdf', 'wb')


pdfWriter.write(newFile)

# closing the original pdf file object
pdfFileObj.close()
pdfFileObj_1.close()
# closing the new pdf file object
newFile.close()


The error I get:


  PdfReadWarning: Object 321 0 not defined. [pdf.py:1629]
  Traceback (most recent call last):
    File "test.py", line 22, in <module>
      pdfWriter.write(newFile)
    File "/home/ubuntu/Ritesh/working/lib/python3.5/site-packages/PyPDF2/pdf.py", line 482, in write
      self._sweepIndirectReferences(externalReferenceMap, self._root)
    File "/home/ubuntu/Ritesh/working/lib/python3.5/site-packages/PyPDF2/pdf.py", line 571, in _sweepIndirectReferences
      self._sweepIndirectReferences(externMap, realdata)
    File "/home/ubuntu/Ritesh/working/lib/python3.5/site-packages/PyPDF2/pdf.py", line 547, in _sweepIndirectReferences
      value = self._sweepIndirectReferences(externMap, value)
    File "/home/ubuntu/Ritesh/working/lib/python3.5/site-packages/PyPDF2/pdf.py", line 571, in _sweepIndirectReferences
      self._sweepIndirectReferences(externMap, realdata)
    File "/home/ubuntu/Ritesh/working/lib/python3.5/site-packages/PyPDF2/pdf.py", line 547, in _sweepIndirectReferences
      value = self._sweepIndirectReferences(externMap, value)
    File "/home/ubuntu/Ritesh/working/lib/python3.5/site-packages/PyPDF2/pdf.py", line 556, in _sweepIndirectReferences
      value = self._sweepIndirectReferences(externMap, data[i])
    File "/home/ubuntu/Ritesh/working/lib/python3.5/site-packages/PyPDF2/pdf.py", line 571, in _sweepIndirectReferences
      self._sweepIndirectReferences(externMap, realdata)
    File "/home/ubuntu/Ritesh/working/lib/python3.5/site-packages/PyPDF2/pdf.py", line 547, in _sweepIndirectReferences
      value = self._sweepIndirectReferences(externMap, value)
    File "/home/ubuntu/Ritesh/working/lib/python3.5/site-packages/PyPDF2/pdf.py", line 577, in _sweepIndirectReferences
      newobj = data.pdf.getObject(data)
    File "/home/ubuntu/Ritesh/working/lib/python3.5/site-packages/PyPDF2/pdf.py", line 1631, in getObject
      raise utils.PdfReadError("Could not find object.")
  PyPDF2.utils.PdfReadError: Could not find object.


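From varying the number of pages added to the PdfFileWriter object pdfWriter, what I understand is this: if more than 5 pages are added, the error above appears; otherwise it works fine. I need to replace more than 100 pages. Can anyone help with this?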

Best answer

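I ran this sample code on Windows 10 and on Red Hat Enterprise Linux 6.
On both platforms I used Python 2.7 (I don't have Python 3.5 on my workstation).
Since you did not provide your original.pdf and tutorial.pdf, I used two e-books in PDF format instead, with 686 and 1014 pages respectively.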

I cannot confirm your observation. With

for i in range(100):

replaced by

for i in range(600):

I got an output PDF with 1200 pages.
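The page arithmetic behind that result can be sketched as follows. This is only an illustration under stated assumptions, not a fix for the PdfReadError, which could not be reproduced here. The helper names interleaved_page_plan and merge_interleaved are hypothetical. The merge function reuses the legacy PyPDF2 1.x calls from the question (PdfFileReader, getNumPages, getPage, PdfFileWriter, addPage), which newer PyPDF2 releases have renamed, and it caps the loop at the shorter document so that no page index runs past the end.

```python
def interleaved_page_plan(num_pages_a, num_pages_b, limit=None):
    """Return (source, page_index) pairs alternating between two documents.

    The count is capped at the shorter document so no index runs past the end.
    """
    count = min(num_pages_a, num_pages_b)
    if limit is not None:
        count = min(count, limit)
    plan = []
    for i in range(count):
        plan.append(("a", i))
        plan.append(("b", i))
    return plan


def merge_interleaved(path_a, path_b, out_path, limit=None):
    """Write the interleaved pages with the legacy PyPDF2 1.x API (hypothetical helper)."""
    import PyPDF2  # imported here so the planning helper works without PyPDF2

    with open(path_a, "rb") as file_a, open(path_b, "rb") as file_b:
        readers = {
            "a": PyPDF2.PdfFileReader(file_a),
            "b": PyPDF2.PdfFileReader(file_b),
        }
        plan = interleaved_page_plan(
            readers["a"].getNumPages(), readers["b"].getNumPages(), limit
        )
        writer = PyPDF2.PdfFileWriter()
        for source, index in plan:
            writer.addPage(readers[source].getPage(index))
        # Write while the input files are still open, as the question's code does.
        with open(out_path, "wb") as out_file:
            writer.write(out_file)
    return len(plan)


# Page counts reported above: 686 and 1014 pages, 600 loop iterations.
print(len(interleaved_page_plan(686, 1014, limit=600)))  # prints 1200
```

Each loop iteration adds one page from each file, so 600 iterations give 1200 pages, which matches the output size reported above.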

Regarding "python - Replacing at least 100 pages in a PDF from another PDF", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/52384798/
