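I have a JSON file with data for about 1.4 million nodes, and I want to build a Neo4j graph database from it. I tried to use py2neo's batch submit feature. My code is as follows: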

# the variable words is a list containing node names
from py2neo import neo4j
batch = neo4j.WriteBatch(graph_db)
nodedict = {}
# I decided to use a dictionary because I would be creating relationships
# by referring to the dictionary entries later
for i in words:
    nodedict[i] = batch.create({"name":i})
results = batch.submit()


The error shown is:

Traceback (most recent call last):
  File "test.py", line 36, in <module>
    results = batch.submit()
  File "/usr/lib/python2.6/site-packages/py2neo/neo4j.py", line 2116, in submit
    for response in self._submit()
  File "/usr/lib/python2.6/site-packages/py2neo/neo4j.py", line 2085, in _submit
    for id_, request in enumerate(self.requests)
  File "/usr/lib/python2.6/site-packages/py2neo/rest.py", line 427, in _send
    return self._client().send(request)
  File "/usr/lib/python2.6/site-packages/py2neo/rest.py", line 364, in send
    return Response(request.graph_db, rs.status, request.uri, rs.getheader("Loc$
  File "/usr/lib/python2.6/site-packages/py2neo/rest.py", line 278, in __init__
    raise SystemError(body)
SystemError: None


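Can someone tell me what exactly is happening here? Does it have to do with the batch being so large? If so, what can I do about it? Thanks in advance! :)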

Best answer

So here is what I figured out (thanks to this question: py2neo - Neo4j - System Error - Create Batch Nodes/Relationships):

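py2neo's batch submit feature has its own limits on how many queries it can handle in a single batch. I could not find the exact upper bound, but I tried limiting each batch to 5000 queries. So I ran the following code: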

# the variable words is a list containing node names
from py2neo import neo4j
batch = neo4j.WriteBatch(graph_db)
nodedict = {}
# I decided to use a dictionary because I would be creating relationships
# by referring to the dictionary entries later

for index, i in enumerate(words):
    nodedict[i] = batch.create({"name":i})
    # index is 0-based, so test index + 1; otherwise the first batch holds a single node
    if (index + 1) % 5000 == 0:
        batch.submit()
        batch = neo4j.WriteBatch(graph_db) # As stated by Nigel below, I'm creating a new batch
batch.submit() #for the final batch


This way, I sent the batch requests in chunks of 5k queries each and successfully created the whole graph!
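For anyone who wants the chunking separated from the py2neo calls, here is a minimal sketch of the same idea in plain Python. The names `chunked`, `create_in_batches`, and `submit_chunk` are hypothetical helpers of mine, not part of py2neo, and `fake_submit` only stands in for a real submission so the sketch runs without a database. It also assumes the submit step returns one result per name, in the same order, which I have not verified for every py2neo version.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def create_in_batches(names, submit_chunk, size=5000):
    """Call `submit_chunk` once per chunk and map each name to its result.

    `submit_chunk` is a hypothetical callback: it receives a list of names
    and is assumed to return one result per name, in the same order.
    """
    created = {}
    for chunk in chunked(names, size):
        results = submit_chunk(chunk)
        created.update(zip(chunk, results))
    return created


# Stand-in for a real submission, so the sketch runs without Neo4j.
calls = []


def fake_submit(chunk):
    calls.append(len(chunk))  # record how many items each batch carried
    return ["node:" + name for name in chunk]


words = ["w%d" % i for i in range(12)]
nodes = create_in_batches(words, fake_submit, size=5)
print(calls)
```

With py2neo, the callback would presumably build a fresh `neo4j.WriteBatch(graph_db)`, call `batch.create({"name": name})` for each name in the chunk, and return `batch.submit()`. Keeping the submitted results in the dictionary, rather than the values returned by `batch.create`, may also be safer if you plan to create relationships in later batches.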

09-27 11:54