This article looks at "Symfony: Doctrine data fixtures: how to handle a large CSV file?" and walks through a working answer; hopefully it is a useful reference for anyone hitting the same problem.

Problem description



I am trying to insert (into a MySQL database) data from a "large" CSV file (3 MB / 37,000 rows / 7 columns) using Doctrine data fixtures.

The process is very slow, and so far it has not completed (maybe I just had to wait a little longer).

I suppose Doctrine data fixtures are not intended to manage such an amount of data? Maybe the solution would be to import my CSV directly into the database?

Any idea of how to proceed?

Here is the code :

<?php

namespace FBN\GuideBundle\DataFixtures\ORM;

use Doctrine\Common\DataFixtures\AbstractFixture;
use Doctrine\Common\DataFixtures\OrderedFixtureInterface;
use Doctrine\Common\Persistence\ObjectManager;
use FBN\GuideBundle\Entity\CoordinatesFRCity as CoordFRCity;

class CoordinatesFRCity extends AbstractFixture implements OrderedFixtureInterface
{
    public function load(ObjectManager $manager)
    {
        $csv = fopen(dirname(__FILE__).'/Resources/Coordinates/CoordinatesFRCity.csv', 'r');

        $i = 0;

        while (!feof($csv)) {
            $line = fgetcsv($csv);

            $coordinatesfrcity[$i] = new CoordFRCity();
            $coordinatesfrcity[$i]->setAreaPre2016($line[0]);
            $coordinatesfrcity[$i]->setAreaPost2016($line[1]);
            $coordinatesfrcity[$i]->setDeptNum($line[2]);
            $coordinatesfrcity[$i]->setDeptName($line[3]);
            $coordinatesfrcity[$i]->setdistrict($line[4]);
            $coordinatesfrcity[$i]->setpostCode($line[5]);
            $coordinatesfrcity[$i]->setCity($line[6]);

            $manager->persist($coordinatesfrcity[$i]);

            $this->addReference('coordinatesfrcity-'.$i, $coordinatesfrcity[$i]);


            $i = $i + 1;
        }

        fclose($csv);

        $manager->flush();
    }

    public function getOrder()
    {
        return 1;
    }
}
Solution

Two rules to follow when you create big batch imports like this:

  • Disable SQL logging ($manager->getConnection()->getConfiguration()->setSQLLogger(null);) to avoid the huge memory overhead of logging every query.

  • Flush and clear frequently instead of only once at the end. I suggest you add if ($i % 25 == 0) { $manager->flush(); $manager->clear(); } inside your loop, to flush every 25 INSERTs.

EDIT: One last thing I forgot: don't keep your entities in variables once you no longer need them. Here, in your loop, you only need the entity currently being processed, so don't store the previous entities in a $coordinatesfrcity array. Keeping them all around can lead to a memory overflow.
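Putting these rules together, the fixture from the question might look like the sketch below (it reuses the entity and setter names from the question as-is). Two things to note: the loop condition is changed to test the return value of fgetcsv(), since with while (!feof(...)) the last iteration can hand you false instead of a row; and the addReference() calls are dropped, because $manager->clear() detaches all managed entities, so references held across batches would point at detached objects.

```php
<?php

namespace FBN\GuideBundle\DataFixtures\ORM;

use Doctrine\Common\DataFixtures\AbstractFixture;
use Doctrine\Common\DataFixtures\OrderedFixtureInterface;
use Doctrine\Common\Persistence\ObjectManager;
use FBN\GuideBundle\Entity\CoordinatesFRCity as CoordFRCity;

class CoordinatesFRCity extends AbstractFixture implements OrderedFixtureInterface
{
    const BATCH_SIZE = 25;

    public function load(ObjectManager $manager)
    {
        // Rule 1: disable SQL logging so the logger does not
        // accumulate every INSERT statement in memory.
        $manager->getConnection()->getConfiguration()->setSQLLogger(null);

        $csv = fopen(dirname(__FILE__).'/Resources/Coordinates/CoordinatesFRCity.csv', 'r');

        $i = 0;

        // fgetcsv() returns false at EOF, so test it directly
        // instead of checking feof() first.
        while (($line = fgetcsv($csv)) !== false) {
            // Rule 3: keep only the current entity in scope,
            // not an ever-growing array of previous ones.
            $city = new CoordFRCity();
            $city->setAreaPre2016($line[0]);
            $city->setAreaPost2016($line[1]);
            $city->setDeptNum($line[2]);
            $city->setDeptName($line[3]);
            $city->setdistrict($line[4]);
            $city->setpostCode($line[5]);
            $city->setCity($line[6]);

            $manager->persist($city);

            $i++;

            // Rule 2: flush and clear in small batches instead of
            // only once at the end, so the unit of work stays small.
            if ($i % self::BATCH_SIZE === 0) {
                $manager->flush();
                $manager->clear();
            }
        }

        fclose($csv);

        // Flush whatever remains from the last partial batch.
        $manager->flush();
        $manager->clear();
    }

    public function getOrder()
    {
        return 1;
    }
}
```

If even this stays too slow, the direct-import route the question mentions is also reasonable: MySQL's LOAD DATA INFILE bypasses the ORM entirely and is typically much faster for a one-off bulk load of 37,000 rows.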

That concludes this article on "Symfony: Doctrine data fixtures: how to handle a large CSV file?". We hope the answer above is helpful.
