Question
I'm trying to load a matplotlib object into reportlab. Here is my code:
import os
from tempfile import NamedTemporaryFile

from reportlab.pdfgen import canvas
from reportlab.lib.utils import ImageReader
from reportlab.platypus import Paragraph, SimpleDocTemplate, Spacer, Image
from matplotlib import pyplot as plt


def __get_img_data():
    """
    returns the binary image data of the plot
    """
    img_file = NamedTemporaryFile(delete=False)
    # savefig appends '.png' because the temporary name has no extension
    plt.savefig(img_file.name)
    img_data = open(img_file.name + '.png', 'rb').read()
    os.remove(img_file.name)
    os.remove(img_file.name + '.png')
    return img_data


def get_plot():
    # HERE I PLOT SOME STUFF
    img_data = __get_img_data()
    plt.close()
    return img_data


class NumberedCanvas(canvas.Canvas):
    def __init__(self):
        pass


class ReportTemplate:
    def __init__(self):
        pass

    def _header_footer(self, canvas, doc):
        pass

    def get_data(self):
        elements = []
        elements.append('hello')
        ## HERE I WANT TO ADD THE IMAGE
        imgdata = get_plot()
        with open('/tmp/x.png', 'wb') as fh:
            fh.write(imgdata)
        im = Image('/tmp/x.png', width=usable_width, height=usable_width)
        elements.append(im)
        os.remove('/tmp/x.png')
        ######
        doc.build(elements, onFirstPage=self._header_footer,
                  onLaterPages=self._header_footer,
                  canvasmaker=NumberedCanvas)
        # blah blah
        return obj
My goal is to insert the plot image into the report. This works fine, but I do not want to write to a temporary file. I tried installing PIL because I've read that some people do this with PIL's image library, but as soon as I install PIL, another part of my code breaks due to incompatible Pillow versions.
Recommended answer
pdfrw documentation sucks
The sole reason the pdfrw example discussed in the first answer to this question is a bit clunky is that the pdfrw documentation sucks badly. Because of the poor docs, that example's author, @Larry-Meyn, used the vectorpdf extension for rst2pdf as a starting point. That extension is not really documented either, and it has to deal with the quirks of rst2pdf as well as pdfrw (it is also more general than you need, in that it lets rst2pdf display an arbitrary rectangle from an arbitrary page of a preexisting PDF). It's amazing that Larry managed to make it work at all, and my hat's off to him.
I am perfectly qualified to say this, because I am the author of pdfrw and made a few contributions to rst2pdf, including that vectorpdf extension.
I wasn't really paying attention to Stack Overflow until a month ago, and pdfrw itself languished for a few years, but I'm here now, and I think it would behoove you to take another look at pdfrw, even though the documentation still sucks.
Why? Because if you output to a png file, your image will be rasterized, and if you use pdfrw, it will remain in vector format, which means that it will look nice at any scale.
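As a minimal sketch of that difference, and of rendering without a temporary file, the snippet below saves the same figure to an in-memory buffer in both formats. It assumes Python 3 and a headless matplotlib backend; the helper name render_plot and the sample data are made up for illustration and are not part of the original program.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display needed
from matplotlib import pyplot as plt


def render_plot(fmt):
    """Render a small sample plot into memory and return the raw bytes.

    render_plot is a hypothetical helper; fmt is passed straight to savefig
    ('png' gives a raster image, 'pdf' keeps the plot as vector drawing).
    """
    fig, ax = plt.subplots(figsize=(4, 3))
    ax.plot([1, 2, 3, 4], [1, 4, 9, 16])
    ax.set_ylabel("some numbers")
    buf = io.BytesIO()            # in-memory file, nothing touches the disk
    fig.savefig(buf, format=fmt)
    plt.close(fig)                # free the figure once it is rendered
    return buf.getvalue()


png_bytes = render_plot("png")
pdf_bytes = render_plot("pdf")
# Each format is recognizable by its leading signature bytes.
print(png_bytes[:4], pdf_bytes[:4])
```

The PNG bytes hold a fixed grid of pixels, while the PDF bytes hold drawing operators that a PDF viewer re-renders at whatever zoom level is requested.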
Your png example wasn't quite a complete program: the parameters to doc.build weren't defined, styles wasn't defined, it was missing a few imports, and so on. But it was close enough to glean the intent and get it working.
Edit -- I just noticed that this example was actually a modified version of Larry's example, so that example is still very valuable because it's a bit more full-featured than this in some ways.
After I fixed those issues and got some output, I added an option to be able to use png or pdf, so you can see the difference. The program below will create two different PDF files, and you can compare the results for yourself.
import io  # io.BytesIO replaces the Python 2-only cStringIO

from matplotlib import pyplot as plt

from reportlab.pdfgen import canvas
from reportlab.lib.utils import ImageReader
from reportlab.platypus import Paragraph, SimpleDocTemplate, Spacer, Image, Flowable
from reportlab.lib.units import inch
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.lib.enums import TA_LEFT, TA_CENTER, TA_RIGHT  # used in drawOn

from pdfrw import PdfReader, PdfDict
from pdfrw.buildxobj import pagexobj
from pdfrw.toreportlab import makerl

styles = getSampleStyleSheet()
style = styles['Normal']


def form_xo_reader(imgdata):
    # Expect a single-page PDF and turn that page into a form XObject
    page, = PdfReader(imgdata).pages
    return pagexobj(page)


class PdfImage(Flowable):
    """Flowable that draws either a pdfrw form XObject or a raster image."""

    def __init__(self, img_data, width=200, height=200):
        self.img_width = width
        self.img_height = height
        self.img_data = img_data

    def wrap(self, width, height):
        return self.img_width, self.img_height

    def drawOn(self, canv, x, y, _sW=0):
        # Honor horizontal alignment when there is spare width
        if _sW > 0 and hasattr(self, 'hAlign'):
            a = self.hAlign
            if a in ('CENTER', 'CENTRE', TA_CENTER):
                x += 0.5*_sW
            elif a in ('RIGHT', TA_RIGHT):
                x += _sW
            elif a not in ('LEFT', TA_LEFT):
                raise ValueError("Bad hAlign value " + str(a))
        canv.saveState()
        img = self.img_data
        if isinstance(img, PdfDict):
            # Vector path: scale the form XObject to the requested size
            xscale = self.img_width / img.BBox[2]
            yscale = self.img_height / img.BBox[3]
            canv.translate(x, y)
            canv.scale(xscale, yscale)
            canv.doForm(makerl(canv, img))
        else:
            # Raster path: let reportlab draw the image directly
            canv.drawImage(img, x, y, self.img_width, self.img_height)
        canv.restoreState()


def make_report(outfn, use_pdfrw):
    fig = plt.figure(figsize=(4, 3))
    plt.plot([1,2,3,4],[1,4,9,26])
    plt.ylabel('some numbers')
    # Render the figure into memory instead of a temporary file
    imgdata = io.BytesIO()
    fig.savefig(imgdata, format='pdf' if use_pdfrw else 'png')
    imgdata.seek(0)
    reader = form_xo_reader if use_pdfrw else ImageReader
    image = reader(imgdata)

    doc = SimpleDocTemplate(outfn)
    style = styles["Normal"]
    story = [Spacer(0, inch)]
    img = PdfImage(image, width=200, height=200)

    for i in range(10):
        bogustext = ("Paragraph number %s. " % i)
        p = Paragraph(bogustext, style)
        story.append(p)
        story.append(Spacer(1,0.2*inch))

    story.append(img)

    for i in range(10):
        bogustext = ("Paragraph number %s. " % i)
        p = Paragraph(bogustext, style)
        story.append(p)
        story.append(Spacer(1,0.2*inch))

    doc.build(story)


make_report("hello_png.pdf", False)
make_report("hello_pdf.pdf", True)
What are the downsides to this approach?
The first obvious downside is that there is now a requirement for pdfrw, but that's available from PyPI.
The next downside is that if you are putting a lot of matplotlib plots into a document, I think this technique will replicate resources such as fonts, because I don't believe that reportlab is smart enough to notice the duplicates.
I believe this problem can be solved by outputting all your plots to different pages of a single PDF. I haven't actually tried that with matplotlib, but pdfrw is perfectly capable of converting each page of an existing pdf to a separate flowable.
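A rough sketch of the first half of that idea, writing several plots to separate pages of one in-memory PDF, might look like the following. It uses matplotlib's PdfPages class and assumes Python 3 with a headless backend. The function name plots_to_single_pdf and the sample data are hypothetical, and whether this actually avoids duplicated font resources in the final report is untested here.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no display needed
from matplotlib import pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages


def plots_to_single_pdf(series_list):
    """Write one plot per page into a single in-memory PDF.

    plots_to_single_pdf is a hypothetical helper. It returns the buffer
    (rewound to the start) and the number of pages written.
    """
    buf = io.BytesIO()
    with PdfPages(buf) as pdf:
        for ys in series_list:
            fig, ax = plt.subplots(figsize=(4, 3))
            ax.plot(range(1, len(ys) + 1), ys)
            pdf.savefig(fig)      # each savefig call adds a new page
            plt.close(fig)
        page_count = pdf.get_pagecount()
    buf.seek(0)                   # rewind so a reader can start at byte 0
    return buf, page_count


pdf_buffer, n_pages = plots_to_single_pdf([[1, 4, 9, 16], [2, 3, 5, 7]])
print(n_pages)
```

The returned buffer could then be handed to PdfReader, and each entry of its pages list passed through pagexobj, the same two calls that form_xo_reader uses for a single page, to get one PdfImage flowable per plot.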
So if you have a lot of plots and it's making your final PDF too big, you could look into that, or just try one of the PDF optimizers out there and see if it helps. In any case, that's a different problem for a different day.