Slow XSSFWorkbook and WorkbookFactory when reading XLSX files



Problem description

I've seen developers run into this problem for several years now. I have studied many forums and the official POI documentation, but I haven't found an answer yet. The problem is this: I have tried the following two snippets:

Workbook wb = WorkbookFactory.create(new File("spreadsheet.xlsx"));

File file = new File("C:\\spreadsheet.xlsx");
OPCPackage opcPackage = OPCPackage.open(file.getAbsolutePath());
XSSFWorkbook workbook = new XSSFWorkbook(opcPackage);

Either approach takes about 5-6 minutes (if the application doesn't run out of memory first) to process a simple and fairly small spreadsheet.xlsx file (200 KB).

What do I need to do to fix this? (I'm using Apache POI 3.9.)

/*****************************/

The process spends a long time at the following location:

public class XSSFSheet extends POIXMLDocumentPart implements Sheet{
...
protected void read(InputStream is) throws IOException {
    try {
        worksheet = WorksheetDocument.Factory.parse(is).getWorksheet(); // <-- the time is spent here
    } catch (XmlException e){
        throw new POIXMLException(e);
    }
}
...

I can't debug any further. VisualVM points to the same place.

Recommended answer

One factor that may be contributing to the load time is that data was pasted into the worksheet in a way that makes the used range include every row; that is, sheet.UsedRange.Rows.Count returns more than 1,000,000 rows. I'm not sure how this happens, but I found that I needed an intermediate step: before loading the workbook, I 'clean' it with a VBA script. The workbook has around 20 sheets of around 5,000 rows each, each filled out by a different part of the business. It takes a fairly long time to load (maybe 4 minutes), but that is acceptable in this case. Before I added the cleaning stage it ran for over 30 minutes, which was not acceptable.
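To check whether a sheet really carries an inflated used range before handing it to POI, one option is to stream the worksheet XML inside the .xlsx archive and count its row elements. The following is only a rough sketch using JDK classes; the class name SheetRowProbe is hypothetical, and the entry name xl/worksheets/sheet1.xml mentioned afterwards is an assumption about how the file is laid out. A row count far above the number of rows that actually hold data would be consistent with the situation described above, although it does not prove it.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of Apache POI.
final class SheetRowProbe {

    /** Streams worksheet XML and counts <row> elements without building an object model. */
    static long countRows(InputStream sheetXml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Worksheet parts do not need DTDs, so switch DTD support off.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        XMLStreamReader reader = factory.createXMLStreamReader(sheetXml);
        long rows = 0;
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "row".equals(reader.getLocalName())) {
                    rows++;
                }
            }
        } finally {
            reader.close();
        }
        return rows;
    }

    /** Opens the .xlsx as a zip archive and counts rows in one worksheet part. */
    static long countRows(String xlsxPath, String entryName)
            throws IOException, XMLStreamException {
        try (ZipFile zip = new ZipFile(xlsxPath)) {
            ZipEntry entry = zip.getEntry(entryName);
            if (entry == null) {
                throw new IOException("No such entry in archive: " + entryName);
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return countRows(in);
            }
        }
    }
}
```

For example, SheetRowProbe.countRows("spreadsheet.xlsx", "xl/worksheets/sheet1.xml") would report the row count of the first worksheet part, assuming the archive uses that entry name.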

A user runs the process I am referring to by pressing two buttons. The first cleans the workbook; the second does the rest. The first process is triggered using Runtime.getRuntime().exec and creates an empty text file; the second process will not run unless that text file exists.
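The two-button arrangement could look roughly like the sketch below. It is not the answerer's code: the class name CleanThenLoadGate, the marker path, and the cleaning command are all hypothetical, and it uses ProcessBuilder rather than Runtime.getRuntime().exec so that the child process's output is inherited instead of left unread. It also has the launcher create the marker file after a successful exit, whereas in the answer the cleaning process creates the file itself.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical sketch of a marker-file gate between a cleaning step and a loading step.
final class CleanThenLoadGate {

    /** Button 1: runs the external cleaning command and records success with a marker file. */
    static boolean runCleaning(List<String> command, Path marker)
            throws IOException, InterruptedException {
        // Remove any stale marker so a failed run cannot unlock the second step.
        Files.deleteIfExists(marker);
        Process process = new ProcessBuilder(command).inheritIO().start();
        if (process.waitFor() != 0) {
            return false;
        }
        markCleaned(marker);
        return true;
    }

    /** Creates the empty marker file if it is not already present. */
    static void markCleaned(Path marker) throws IOException {
        if (Files.notExists(marker)) {
            Files.createFile(marker);
        }
    }

    /** Button 2 checks this before loading the workbook. */
    static boolean mayLoad(Path marker) {
        return Files.isRegularFile(marker);
    }
}
```

The second button's handler would call mayLoad first and refuse to open the workbook when it returns false.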


07-30 19:52