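I have a JSON file containing about 1.4 million nodes, and I want to build a Neo4j graph database from it. I tried to use py2neo's batch submit feature. My code is as follows: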
# the variable words is a list containing node names
from py2neo import neo4j
batch = neo4j.WriteBatch(graph_db)
nodedict = {}
# I decided to use a dictionary because I would be creating relationships
# by referring to the dictionary entries later
for i in words:
    nodedict[i] = batch.create({"name": i})
results = batch.submit()
The error shown is as follows:
Traceback (most recent call last):
  File "test.py", line 36, in <module>
    results = batch.submit()
  File "/usr/lib/python2.6/site-packages/py2neo/neo4j.py", line 2116, in submit
    for response in self._submit()
  File "/usr/lib/python2.6/site-packages/py2neo/neo4j.py", line 2085, in _submit
    for id_, request in enumerate(self.requests)
  File "/usr/lib/python2.6/site-packages/py2neo/rest.py", line 427, in _send
    return self._client().send(request)
  File "/usr/lib/python2.6/site-packages/py2neo/rest.py", line 364, in send
    return Response(request.graph_db, rs.status, request.uri, rs.getheader("Loc$
  File "/usr/lib/python2.6/site-packages/py2neo/rest.py", line 278, in __init__
    raise SystemError(body)
SystemError: None
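Can anyone tell me what exactly is happening here? Is it related to the batch being so large? If so, what can I do about it? Thanks in advance! :)
Best answer
So, here is what I figured out (thanks to this question: py2neo - Neo4j - System Error - Create Batch Nodes/Relationships):
py2neo's batch submit has its own limit on how many queries a single batch can carry. I couldn't find the exact upper bound, so I tried capping each batch at 5000 queries. I therefore ran the following code: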
# the variable words is a list containing node names
from py2neo import neo4j
batch = neo4j.WriteBatch(graph_db)
nodedict = {}
# I decided to use a dictionary because I would be creating relationships
# by referring to the dictionary entries later
for index, i in enumerate(words):
    nodedict[i] = batch.create({"name": i})
    # Submit after every 5000th item (index + 1, so the first batch is not a single node)
    if (index + 1) % 5000 == 0:
        batch.submit()
        batch = neo4j.WriteBatch(graph_db)  # As stated by Nigel below, I'm creating a new batch
batch.submit()  # for the final batch
This way, I sent the batch requests in chunks of 5k queries each and successfully created the whole graph!
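The chunking idea itself does not depend on py2neo, so here is a minimal sketch of it in plain Python. The names `chunked` and `submit_in_chunks` are hypothetical helpers of my own, not part of py2neo, and `submit` stands in for whatever actually sends one batch to the server.

```python
def chunked(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def submit_in_chunks(items, size, submit):
    """Call `submit(chunk)` once per chunk and return the number of calls.

    `submit` is a placeholder (an assumption of this sketch) for the code
    that builds and sends one batch to the database.
    """
    calls = 0
    for chunk in chunked(items, size):
        submit(chunk)
        calls += 1
    return calls


# Stand-in data: 12 names split into chunks of 5; `sent.append` plays the
# role of the real submit function so the chunk sizes can be inspected.
words = ["w%d" % n for n in range(12)]
sent = []
calls = submit_in_chunks(words, 5, sent.append)
print(calls, [len(c) for c in sent])
```

In the py2neo case, `submit` would presumably be a small function that creates a fresh `neo4j.WriteBatch(graph_db)`, calls `batch.create({"name": name})` for each name in the chunk, and then calls `batch.submit()`, with `size` set to 5000 as above. Slicing this way also avoids submitting an empty batch at the end.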