Problem description
I have 13 .xlsx files with about 1000 rows each. I want to merge them into a single .xlsx file with one sheet. I'm using the code from here: https://blog.sodhanalibrary.com/2014/11/merge-excel-files-using-java.html#.Vi9ns36rSUk.
Here's my code (only a few changes; the addSheet method is unchanged):
try {
    FileInputStream excellFile1 = new FileInputStream(new File("tmp_testOut1000.xlsx"));
    XSSFWorkbook workbook1 = new XSSFWorkbook(excellFile1);
    XSSFSheet sheet1 = workbook1.getSheetAt(0);
    // append the first sheet of files 2..13 to sheet1
    for (int i = 2; i < 14; i++) {
        FileInputStream excellFile2 = new FileInputStream(new File("tmp_testOut" + i + "000.xlsx"));
        XSSFWorkbook workbook2 = new XSSFWorkbook(excellFile2);
        XSSFSheet sheet2 = workbook2.getSheetAt(0);
        System.out.println("add " + i);
        addSheet(sheet1, sheet2);
        // the workbook is fully loaded by now, so the stream can be released
        excellFile2.close();
    }
    excellFile1.close();
    // save merged file
    System.out.println("merging");
    File mergedFile = new File("merged.xlsx");
    if (!mergedFile.exists()) {
        mergedFile.createNewFile();
    }
    FileOutputStream out = new FileOutputStream(mergedFile);
    System.out.println("write");
    workbook1.write(out);
    out.close();
    System.out.println("Files were merged successfully");
} catch (Exception e) {
    e.printStackTrace();
}
All files are loaded and merged, but after the "write" line is printed I get:
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.apache.xmlbeans.impl.store.Xobj.new_cursor(Xobj.java:1829)
at org.apache.xmlbeans.impl.values.XmlObjectBase.newCursor(XmlObjectBase.java:293)
at org.apache.xmlbeans.impl.values.XmlComplexContentImpl.arraySetterHelper(XmlComplexContentImpl.java:1151)
at org.openxmlformats.schemas.spreadsheetml.x2006.main.impl.CTFontsImpl.setFontArray(Unknown Source)
at org.apache.poi.xssf.model.StylesTable.writeTo(StylesTable.java:424)
at org.apache.poi.xssf.model.StylesTable.commit(StylesTable.java:496)
at org.apache.poi.POIXMLDocumentPart.onSave(POIXMLDocumentPart.java:341)
at org.apache.poi.POIXMLDocumentPart.onSave(POIXMLDocumentPart.java:345)
at org.apache.poi.POIXMLDocument.write(POIXMLDocument.java:206)
at Start.main(Start.java:275)
What can I do? Why is this happening, and how can I prevent it?
Recommended answer
POI is notoriously memory-hungry, so running out of memory is not uncommon when handling large Excel files.
Since you are able to load all the original files and only run into trouble when writing the merged file, you could try using an SXSSFWorkbook instead of an XSSFWorkbook for the output and flush regularly after adding a certain amount of content (see the POI documentation for the org.apache.poi.xssf.streaming package). That way you do not have to keep the whole generated file in memory, only small portions of it.
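To make the "only small portions in memory" idea concrete, here is a minimal plain-Java sketch of a row window. It is not POI's API: WindowedRowWriter and WindowedMergeDemo are hypothetical names invented for illustration, rows are modelled as plain strings, and a StringWriter stands in for the output file. In POI itself, the comparable knob is the row access window size passed to the SXSSFWorkbook constructor.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.io.Writer;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration class; NOT part of Apache POI.
// Keeps at most windowSize rows buffered and writes older rows out as it goes.
class WindowedRowWriter {
    private final PrintWriter out;
    private final int windowSize;
    private final Deque<String> window = new ArrayDeque<>();
    private int maxHeld = 0;
    private int written = 0;

    WindowedRowWriter(Writer target, int windowSize) {
        if (windowSize < 1) {
            throw new IllegalArgumentException("windowSize must be >= 1");
        }
        this.out = new PrintWriter(target);
        this.windowSize = windowSize;
    }

    // If the window is full, flush the oldest row first, then buffer the new one.
    void addRow(String row) {
        if (window.size() == windowSize) {
            writeOldest();
        }
        window.addLast(row);
        maxHeld = Math.max(maxHeld, window.size());
    }

    // Write out whatever is still buffered and flush the underlying writer.
    void finish() {
        while (!window.isEmpty()) {
            writeOldest();
        }
        out.flush();
    }

    private void writeOldest() {
        out.print(window.removeFirst());
        out.print('\n');
        written++;
    }

    int maxRowsHeld() { return maxHeld; }
    int rowsWritten() { return written; }
}

class WindowedMergeDemo {
    // Append the rows of every source, in order, through one windowed writer.
    static String merge(List<List<String>> sources, int windowSize) {
        StringWriter sink = new StringWriter(); // stands in for the output file
        WindowedRowWriter writer = new WindowedRowWriter(sink, windowSize);
        for (List<String> source : sources) {
            for (String row : source) {
                writer.addRow(row);
            }
        }
        writer.finish();
        return sink.toString();
    }

    public static void main(String[] args) {
        List<List<String>> sources = Arrays.asList(
                Arrays.asList("a1", "a2"),
                Arrays.asList("b1", "b2", "b3"));
        System.out.print(merge(sources, 2));
    }
}
```

The point of the sketch is that the number of buffered rows is bounded by the window size no matter how many rows pass through, which is the property the streaming workbook relies on. The trade-off is the same as well: a row that has already been flushed can no longer be read back or modified.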