This article describes an approach to adding millions of nodes to a Neo4j spatial layer using Cypher and APOC; it may be a useful reference if you are facing the same problem.

Problem description

I have a data set of 3.8 million nodes and I'm trying to load all of these into Neo4j Spatial. The nodes are going into a simple point layer, so they have the required latitude and longitude fields. I've tried:

MATCH (d:pointnode)
WITH collect(d) AS pn
CALL spatial.addNodes("point_geom", pn) YIELD count
RETURN count

But this just keeps spinning without anything happening. I've also tried (I've been running the next query all on one line, but I've just split it up for ease of reading):

CALL apoc.periodic.iterate(
  "MATCH (d:pointnode) WITH collect(d) AS pnodes RETURN pnodes",
  "CALL spatial.addNodes('point_geom', pnodes) YIELD count RETURN count",
  {batchSize:10000, parallel:false, listIterate:true})

But again there was a lot of spinning and the occasional Java heap error.
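
For what it's worth, the likely reason this form still exhausts the heap is that the driving statement collects every node into a single row, so batchSize never takes effect and spatial.addNodes still receives all 3.8 million nodes in one transaction. Below is a sketch of a driving statement that streams one node per row instead; this is an assumption on my part rather than part of the original question, and iterateList/batch semantics vary between APOC versions:

CALL apoc.periodic.iterate(
  // driving statement: one row per node, so batchSize actually batches
  "MATCH (d:pointnode) RETURN d",
  // inner statement: collect only the current batch and add it to the layer
  "WITH collect(d) AS pnodes
   CALL spatial.addNodes('point_geom', pnodes) YIELD count
   RETURN count",
  {batchSize:10000, parallel:false, iterateList:true})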

The final approach I tried was to use FME with the HTTP caller. This works, but it is exceptionally slow, so it doesn't scale well to millions of nodes.

Any advice or suggestions would be much appreciated. Would apoc.periodic.commit or apoc.periodic.rock_n_roll be a better choice than apoc.periodic.iterate?

Recommended answer

After a bit of trial and error, periodic commit led to a relatively quick solution (still going to take 2-3 hours):

CALL apoc.periodic.commit("
  MATCH (n:pointnode)
  WHERE NOT (n)-[:RTREE_REFERENCE]-()
  WITH n LIMIT {limit}
  WITH collect(n) AS pnodes
  CALL spatial.addNodes('point_geom', pnodes) YIELD count
  RETURN count",
  {limit:1000})
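
This works because apoc.periodic.commit re-runs the statement until it returns 0: each pass adds at most {limit} nodes to the layer, and the NOT (n)-[:RTREE_REFERENCE]-() filter skips nodes the RTree index already references, so the remaining set shrinks on every pass. As a rough progress check from another session (my own addition, not part of the original answer), you can count the nodes still waiting to be added:

MATCH (n:pointnode)
WHERE NOT (n)-[:RTREE_REFERENCE]-()
RETURN count(n) AS remaining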

It will probably be faster with a bigger batch size.

EDIT: with a batch size of 5000 it takes 45 minutes.
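
For reference, that 45-minute run only changes the parameter; assuming the same statement as above, it would look like:

CALL apoc.periodic.commit("
  MATCH (n:pointnode)
  WHERE NOT (n)-[:RTREE_REFERENCE]-()
  WITH n LIMIT {limit}
  WITH collect(n) AS pnodes
  CALL spatial.addNodes('point_geom', pnodes) YIELD count
  RETURN count",
  {limit:5000})

Bigger batches mean fewer transactions but more heap held per commit, so pushing the limit much higher risks the same Java heap errors as the earlier attempts.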

That concludes this look at adding millions of nodes to a Neo4j spatial layer using Cypher and APOC; hopefully the recommended answer is of some help.
