Problem description
I have a dataframe that I'm exporting to Excel, and people want it in .xlsx. I use to_excel, but when I change the extension from .xls to .xlsx, the export step takes about 9 seconds instead of 1 second. Exporting to .csv is even faster, which I believe is because it's just a specially formatted text file.
Perhaps .xlsx files simply support many more features and so take longer to write, but I'm hoping there is something I can do to prevent this.
Recommended answer
Pandas defaults to using OpenPyXL for writing xlsx files, which can be slower than the xlwt module used for writing xls files.
Try it with XlsxWriter as the xlsx output engine instead:
df.to_excel('file.xlsx', sheet_name='Sheet1', engine='xlsxwriter')
It should be about as fast as the xls engine.
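To check the difference on your own machine, something like the following rough timing sketch may help. The helper name time_exports, the random test data, and the row and column counts are all assumptions made for illustration, not part of the original question. Engines that are not installed are skipped, and the actual timings will depend on your data and library versions.

```python
import importlib.util
import tempfile
import time
from pathlib import Path

import numpy as np
import pandas as pd


def time_exports(n_rows=2000, n_cols=10):
    """Return {label: seconds} for CSV and each installed xlsx engine.

    Hypothetical helper for illustration; the data is random floats.
    """
    df = pd.DataFrame(
        np.random.rand(n_rows, n_cols),
        columns=[f"col{i}" for i in range(n_cols)],
    )
    timings = {}
    with tempfile.TemporaryDirectory() as tmp:
        # Baseline: plain-text CSV export.
        start = time.perf_counter()
        df.to_csv(str(Path(tmp) / "out.csv"), index=False)
        timings["csv"] = time.perf_counter() - start

        # Time each xlsx engine that is importable in this environment.
        for engine in ("openpyxl", "xlsxwriter"):
            if importlib.util.find_spec(engine) is None:
                continue  # engine not installed, skip it
            path = str(Path(tmp) / f"{engine}.xlsx")
            start = time.perf_counter()
            df.to_excel(path, sheet_name="Sheet1", engine=engine, index=False)
            timings[engine] = time.perf_counter() - start
    return timings


if __name__ == "__main__":
    for label, seconds in time_exports().items():
        print(f"{label}: {seconds:.3f} s")
```

Running this a few times with a dataframe close to your real size should give a fairer picture than a single measurement.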