Question
I'm trying to write a function that splits a PDF into separate pages. From this SO answer, I copied a simple function that does this:
def splitPdf(file_):
    pdf = PdfFileReader(file_)
    pages = []
    for i in range(pdf.getNumPages()):
        output = PdfFileWriter()
        output.addPage(pdf.getPage(i))
        with open("document-page%s.pdf" % i, "wb") as outputStream:
            output.write(outputStream)
    return pages
However, this writes the new PDFs to files instead of returning a list of the new PDFs as file variables. So I changed the line output.write(outputStream) to:
pages.append(outputStream)
However, when I try to write the elements of the pages list, I get ValueError: I/O operation on closed file.
Does anybody know how I can add the new files to the list and return them, instead of writing them to file? All tips are welcome!
Recommended answer
It is not completely clear what you mean by "list of PDFs as file variables". If you want to create strings holding the PDF contents instead of files, and return a list of such strings, replace open() with StringIO and call getvalue() to obtain the contents:
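import cStringIO

# PdfFileReader and PdfFileWriter are assumed to be imported already
# (for example from pyPdf or PyPDF2); the original snippet does not show this.

def splitPdf(file_):
    pdf = PdfFileReader(file_)
    pages = []
    for i in range(pdf.getNumPages()):
        output = PdfFileWriter()
        output.addPage(pdf.getPage(i))
        # Write into an in-memory buffer instead of a file on disk.
        io = cStringIO.StringIO()
        output.write(io)
        # getvalue() returns the whole buffer contents as a string.
        pages.append(io.getvalue())
    return pages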
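The cStringIO module only exists in Python 2. In Python 3 the closest equivalent for binary data is io.BytesIO. The sketch below shows the same pattern using only the standard library. FakeWriter is a hypothetical stand-in for PdfFileWriter, and write_to_variables is an illustrative name; neither comes from the original answer. The only assumption is that the real writer exposes a write(stream) method, as the code above does. It also shows why the attempt in the question failed: a handle opened in a with block is closed as soon as the block ends, whereas the bytes returned by getvalue() are an ordinary value that stays usable.

```python
import io


class FakeWriter:
    """Hypothetical stand-in for PdfFileWriter: write(stream) emits bytes."""

    def __init__(self, payload):
        self.payload = payload

    def write(self, stream):
        # A real PDF writer would serialize a document here.
        stream.write(self.payload)


def write_to_variables(writers):
    """Return each writer's output as a bytes object instead of a file."""
    pages = []
    for writer in writers:
        buffer = io.BytesIO()            # file-like object backed by memory
        writer.write(buffer)             # same call shape as output.write(io)
        pages.append(buffer.getvalue())  # copy the contents out as bytes
    return pages


pages = write_to_variables([FakeWriter(b"page 0"), FakeWriter(b"page 1")])

# If a file-like object is needed again later, wrap the bytes in a new buffer.
first_page = io.BytesIO(pages[0]).read()
print(pages, first_page)
```

Returning bytes rather than open buffers keeps the result independent of any stream's open or closed state, which is the problem the ValueError in the question pointed to.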