Problem description
I want to read a PDF and get a list of its pages and the size of each page. I don't need to manipulate it in any way, just read it.
I'm currently trying out pyPdf, and it does everything I need except provide a way to get page sizes. I understand that I will probably have to iterate through the pages, since page sizes can vary within a PDF document. Is there another library or method I can use?
I tried using PIL; some online recipes even show d = Image(imagefilename) usage, but it never reads any of my PDFs. It reads everything else I throw at it, even some formats I didn't know PIL could handle.
Any guidance is appreciated. I'm on Windows 7 64-bit with Python 2.5 (because I also do GAE stuff), but I'm happy to do it on Linux or with a more modern Python.
Recommended answer
This can be done with PyPDF2:
>>> from PyPDF2 import PdfFileReader
>>> input1 = PdfFileReader(open('example.pdf', 'rb'))
>>> input1.getPage(0).mediaBox
RectangleObject([0, 0, 612, 792])
(PyPDF2 was formerly known as pyPdf and still refers to pyPdf's documentation.)
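Since page sizes can vary within one document, the same call can be repeated for every page. The following is only a sketch, assuming the older PyPDF2 API used in this answer (PdfFileReader, getPage, getNumPages); newer releases are understood to have renamed these, so adjust as needed. The helper names box_size, points_to_inches, and page_sizes are made up for illustration. The box values are in PDF points, which are 1/72 inch by default, so [0, 0, 612, 792] corresponds to 8.5 x 11 inches.

```python
def box_size(box):
    """Return (width, height) in points for a [llx, lly, urx, ury] box."""
    llx, lly, urx, ury = (float(v) for v in box)
    return abs(urx - llx), abs(ury - lly)


def points_to_inches(points):
    """Convert PDF points to inches, assuming the default 72 points per inch."""
    return points / 72.0


def page_sizes(path):
    """Return a list of (width, height) tuples in points, one per page.

    Hypothetical helper; assumes the older PyPDF2 API shown in the answer.
    """
    # Imported here so the two helpers above work without PyPDF2 installed.
    from PyPDF2 import PdfFileReader

    with open(path, 'rb') as f:
        reader = PdfFileReader(f)
        return [box_size(reader.getPage(i).mediaBox)
                for i in range(reader.getNumPages())]


if __name__ == '__main__':
    # The box from the answer above, converted to inches.
    width, height = box_size([0, 0, 612, 792])
    print(points_to_inches(width), points_to_inches(height))
```

Calling page_sizes('example.pdf') would then give one (width, height) pair per page. Note that this reads only the mediaBox and ignores any per-page rotation or crop box, which may matter for some documents.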