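I don't understand it. The XLSX table is only about 3MB, yet even 1024MB of RAM is not enough for PHPExcel to load it into memory?

I might be doing something badly wrong here: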

function ReadXlsxTableIntoArray($theFilePath)
{
    require_once('PHPExcel/Classes/PHPExcel.php');
    $inputFileType = 'Excel2007';
    $objReader = PHPExcel_IOFactory::createReader($inputFileType);
    $objReader->setReadDataOnly(true);
    $objPHPExcel = $objReader->load($theFilePath);
    $rowIterator = $objPHPExcel->getActiveSheet()->getRowIterator();
    $arrayData = $arrayOriginalColumnNames = $arrayColumnNames = array();
    foreach($rowIterator as $row){
        $cellIterator = $row->getCellIterator();
        $cellIterator->setIterateOnlyExistingCells(false); // Loop all cells, even if it is not set
        if(1 == $row->getRowIndex ()) {
            foreach ($cellIterator as $cell) {
                $value = $cell->getCalculatedValue();
                $arrayOriginalColumnNames[] = $value;
                // try to remove diacritics (transliterate to ISO-8859-1)
                $value = iconv('UTF-8', 'ISO-8859-1//TRANSLIT', $value);
                // and white spaces
                $valueExploded = explode(' ', $value);
                $value = '';
                // capitalize the first letter of each word
                foreach ($valueExploded as $word) {
                    $value .= ucfirst($word);
                }
                $arrayColumnNames[] = $value;
            }
            continue;
        } else {
            $rowIndex = $row->getRowIndex();
            reset($arrayColumnNames);
            foreach ($cellIterator as $cell) {
                $arrayData[$rowIndex][current($arrayColumnNames)] = $cell->getCalculatedValue();
                next($arrayColumnNames);
            }
        }
    }
    return array($arrayOriginalColumnNames, $arrayColumnNames, $arrayData);
}

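The function above reads data from an Excel table into an array.

Any suggestions?

At first, I allowed PHP to use 256MB of RAM. That was not enough. I then doubled the amount, and after that tried 1024MB as well. It still runs out of memory with this error: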
Fatal error: Allowed memory size of 1073741824 bytes exhausted (tried to allocate 50331648 bytes) in D:\data\o\WebLibThirdParty\src\PHPExcel\Classes\PHPExcel\Reader\Excel2007.php on line 688

Fatal error (shutdown): Allowed memory size of 1073741824 bytes exhausted (tried to allocate 50331648 bytes) in D:\data\o\WebLibThirdParty\src\PHPExcel\Classes\PHPExcel\Reader\Excel2007.php on line 688

Best answer

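Plenty has been written about PHPExcel's memory usage on the PHPExcel forum, so reading through some of those earlier discussions might give you a few ideas. PHPExcel holds an "in memory" representation of a spreadsheet, and is therefore subject to PHP's memory limits.

The physical size of the file is largely irrelevant... it is much more important to know how many cells it contains (rows * columns on each worksheet).

The "rule of thumb" that I have always used is an average of about 1k per cell, so a 5M-cell workbook is going to require 5GB of memory. However, there are a number of ways to reduce that requirement. They can be combined, depending on exactly what information you need to access within the workbook and what you want to do with it.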
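To make the rule of thumb concrete, here is a minimal sketch that turns sheet dimensions into a rough memory estimate. The function name `estimateWorkbookMemory()` and its parameters are hypothetical (they are not part of PHPExcel), and the 1024 bytes per cell default is only the approximate average quoted above, not a measured value.

```php
<?php
// Rough estimate only: assumes the ~1 KB per cell rule of thumb.
// estimateWorkbookMemory() is a hypothetical helper, not a PHPExcel function.
// $sheetDimensions is a list of array($rows, $columns) pairs, one per worksheet.
function estimateWorkbookMemory(array $sheetDimensions, $bytesPerCell = 1024)
{
    $cells = 0;
    foreach ($sheetDimensions as $dimension) {
        list($rows, $columns) = $dimension;
        $cells += $rows * $columns;
    }
    return $cells * $bytesPerCell;
}

// One worksheet of 100,000 rows x 50 columns = 5,000,000 cells.
$bytes = estimateWorkbookMemory(array(array(100000, 50)));
printf("%.2f GiB\n", $bytes / (1024 * 1024 * 1024));
```

Passing a smaller `$bytesPerCell` shows how much a per-cell saving would matter before you change any loading code.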

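If you have multiple worksheets but don't need to load all of them, you can limit the worksheets that the Reader will load using the setLoadSheetsOnly() method.
To load a single named worksheet: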

$inputFileType = 'Excel5';
$inputFileName = './sampleData/example1.xls';
$sheetname = 'Data Sheet #2';
/**  Create a new Reader of the type defined in $inputFileType  **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/**  Advise the Reader of which WorkSheets we want to load  **/
$objReader->setLoadSheetsOnly($sheetname);
/**  Load $inputFileName to a PHPExcel Object  **/
$objPHPExcel = $objReader->load($inputFileName);

Or you can specify several worksheets with a single call to setLoadSheetsOnly() by passing an array of names:
$inputFileType = 'Excel5';
$inputFileName = './sampleData/example1.xls';
$sheetnames = array('Data Sheet #1','Data Sheet #3');
/** Create a new Reader of the type defined in $inputFileType **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/** Advise the Reader of which WorkSheets we want to load **/
$objReader->setLoadSheetsOnly($sheetnames);
/**  Load $inputFileName to a PHPExcel Object  **/
$objPHPExcel = $objReader->load($inputFileName);

If you only need to access part of a worksheet, you can define a Read Filter to identify just the cells you actually want to load:
$inputFileType = 'Excel5';
$inputFileName = './sampleData/example1.xls';
$sheetname = 'Data Sheet #3';

/**  Define a Read Filter class implementing PHPExcel_Reader_IReadFilter  */
class MyReadFilter implements PHPExcel_Reader_IReadFilter {
    public function readCell($column, $row, $worksheetName = '') {
        //  Read rows 1 to 7 and columns A to E only
        if ($row >= 1 && $row <= 7) {
           if (in_array($column,range('A','E'))) {
              return true;
           }
        }
        return false;
    }
}

/**  Create an Instance of our Read Filter  **/
$filterSubset = new MyReadFilter();
/** Create a new Reader of the type defined in $inputFileType **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/**  Advise the Reader of which WorkSheets we want to load
     It's more efficient to limit sheet loading in this manner rather than coding it into a Read Filter  **/
$objReader->setLoadSheetsOnly($sheetname);
echo 'Loading Sheet using filter';
/**  Tell the Reader that we want to use the Read Filter that we've Instantiated  **/
$objReader->setReadFilter($filterSubset);
/**  Load only the rows and columns that match our filter from $inputFileName to a PHPExcel Object  **/
$objPHPExcel = $objReader->load($inputFileName);

Using Read Filters, you can also read a workbook in "chunks", so that only a single chunk is memory-resident at any one time:
$inputFileType = 'Excel5';
$inputFileName = './sampleData/example2.xls';

/**  Define a Read Filter class implementing PHPExcel_Reader_IReadFilter  */
class chunkReadFilter implements PHPExcel_Reader_IReadFilter {
    private $_startRow = 0;
    private $_endRow = 0;

    /**  Set the list of rows that we want to read  */
    public function setRows($startRow, $chunkSize) {
        $this->_startRow    = $startRow;
        $this->_endRow      = $startRow + $chunkSize;
    }

    public function readCell($column, $row, $worksheetName = '') {
        //  Only read the heading row, and the rows that are configured in $this->_startRow and $this->_endRow
        if (($row == 1) || ($row >= $this->_startRow && $row < $this->_endRow)) {
           return true;
        }
        return false;
    }
}

/**  Create a new Reader of the type defined in $inputFileType  **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/**  Define how many rows we want to read for each "chunk"  **/
$chunkSize = 20;
/**  Create a new Instance of our Read Filter  **/
$chunkFilter = new chunkReadFilter();
/**  Tell the Reader that we want to use the Read Filter that we've Instantiated  **/
$objReader->setReadFilter($chunkFilter);

/**  Loop to read our worksheet in "chunk size" blocks  **/
/**  $startRow is set to 2 initially because we always read the headings in row #1  **/
for ($startRow = 2; $startRow <= 65536; $startRow += $chunkSize) {
    /**  Tell the Read Filter, the limits on which rows we want to read this iteration  **/
    $chunkFilter->setRows($startRow,$chunkSize);
    /**  Load only the rows that match our filter from $inputFileName to a PHPExcel Object  **/
    $objPHPExcel = $objReader->load($inputFileName);
    //    Do some processing here

    //    Free up some of the memory
    $objPHPExcel->disconnectWorksheets();
    unset($objPHPExcel);
}
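The loop above always runs up to row 65536 and reloads the file on every iteration, even when the sheet is much shorter. Note too that an .xlsx worksheet can hold up to 1,048,576 rows, so that bound is too small for the Excel2007 reader used in the question. If the row count is known in advance, the chunk boundaries can be computed up front. The sketch below assumes you already have that count; `chunkStartRows()` is a hypothetical helper, not part of PHPExcel. I believe later PHPExcel readers offer a listWorksheetInfo() method that reports the total rows without loading the workbook, but treat that as an assumption to check against your version.

```php
<?php
// Hypothetical helper (not part of PHPExcel): returns the first row of each chunk.
// Row 1 holds the headings, so data rows start at row 2 by default.
function chunkStartRows($totalRows, $chunkSize, $firstDataRow = 2)
{
    $starts = array();
    for ($startRow = $firstDataRow; $startRow <= $totalRows; $startRow += $chunkSize) {
        $starts[] = $startRow;
    }
    return $starts;
}

// 45 rows in total (1 heading row + 44 data rows), read 20 rows at a time.
foreach (chunkStartRows(45, 20) as $startRow) {
    // In the loop above this is where you would call:
    //   $chunkFilter->setRows($startRow, 20);
    //   $objPHPExcel = $objReader->load($inputFileName);
    echo 'Chunk starts at row ', $startRow, PHP_EOL;
}
```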

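If you don't need to load formatting information, but only the worksheet data, the setReadDataOnly() method tells the Reader to load only cell values and ignore any cell formatting: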
$inputFileType = 'Excel5';
$inputFileName = './sampleData/example1.xls';
/** Create a new Reader of the type defined in $inputFileType **/
$objReader = PHPExcel_IOFactory::createReader($inputFileType);
/** Advise the Reader that we only want to load cell data, not formatting **/
$objReader->setReadDataOnly(true);
/**  Load $inputFileName to a PHPExcel Object  **/
$objPHPExcel = $objReader->load($inputFileName);

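Use cell caching. This is a method for reducing the PHP memory required for each cell, at a cost in speed. It works by storing the cell objects in a compressed format, or outside of PHP's memory (e.g. disk, APC, memcache)... but the more memory you save, the slower your scripts will execute. You can, however, reduce the memory required by each cell to about 300 bytes, so the hypothetical 5M cells would require about 1.4GB of PHP memory.

Cell caching is described in section 4.2.1 of the Developer Documentation.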
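As a rough illustration, cell caching is configured through PHPExcel_Settings before any workbook is loaded or created. The sketch below assumes the php://temp cache method with an 8MB in-memory threshold. The include path is taken from the question, while the file name is a placeholder and the threshold is an arbitrary choice to tune, not a recommendation.

```php
<?php
require_once 'PHPExcel/Classes/PHPExcel.php';

// Keep cell data in php://temp, spilling to a temporary file once
// roughly 8MB of cell data is held in memory (assumed threshold).
$cacheMethod   = PHPExcel_CachedObjectStorageFactory::cache_to_phpTemp;
$cacheSettings = array('memoryCacheSize' => '8MB');

// Must be called before the workbook is loaded; returns false if the
// chosen cache method is not available in this environment.
if (!PHPExcel_Settings::setCacheStorageMethod($cacheMethod, $cacheSettings)) {
    die('Requested cell caching method is not available' . PHP_EOL);
}

$objReader = PHPExcel_IOFactory::createReader('Excel2007');
$objReader->setReadDataOnly(true);
// Placeholder file name for illustration only.
$objPHPExcel = $objReader->load('./sampleData/example1.xlsx');
```

Caching combines with setReadDataOnly(), setLoadSheetsOnly(), and Read Filters, so it is worth measuring peak memory with memory_get_peak_usage() after each change.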

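EDIT

Looking at your code, you are using the iterators, which aren't particularly efficient, and building up an array of cell data. You might want to look at the toArray() method, which is already built into PHPExcel and does this for you. Also take a look at this recent discussion on SO about the new variant method rangeToArray(), which can be used to build an associative array of row data.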
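A minimal sketch of that idea: once toArray() has returned plain rows, turning them into associative arrays keyed by the heading row is ordinary PHP. `rowsToAssociative()` is a hypothetical helper, not part of PHPExcel, and it assumes the zero-indexed row and column layout that toArray() produces when its last argument ($returnCellRef) is false. Note that toArray() still needs the whole worksheet in memory, so this replaces the iterator code but does not reduce memory use on its own.

```php
<?php
// Hypothetical helper: $rows is assumed to be shaped like the result of
//   $objPHPExcel->getActiveSheet()->toArray(null, true, false, false)
// i.e. a zero-indexed list of rows, each a zero-indexed list of cell values,
// with the column headings in the first row.
function rowsToAssociative(array $rows)
{
    if (!$rows) {
        return array();
    }
    $headings = array_shift($rows);
    $keys = array();
    foreach ($headings as $heading) {
        // "first name" becomes "FirstName", mirroring the question's key format
        $keys[] = str_replace(' ', '', ucwords((string) $heading));
    }
    $data = array();
    foreach ($rows as $row) {
        $data[] = array_combine($keys, $row);
    }
    return $data;
}

// Stand-in data in place of a real worksheet.
$rows = array(
    array('first name', 'age'),
    array('Ann', 30),
    array('Bob', 25),
);
print_r(rowsToAssociative($rows));
```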

Regarding "php - PHPExcel runs out of 256, 512 and also 1024MB of RAM", the original question can be found on Stack Overflow: https://stackoverflow.com/questions/4817651/
