Avoiding duplicate entries when inserting Excel or CSV-like rows into a neo4j graph


Problem description

I have the following .xlsx file:

Regardless of the language it is written in, my software will produce the following graph:

My software iterates over the file line by line and, on each iteration, executes the following query:

MERGE (A:POINT {x:{xa},y:{ya}})
MERGE (B:POINT {x:{xb},y:{yb}})
MERGE (C:POINT {x:{xc},y:{yc}})
MERGE (A)-[:LINKS]->(B)-[:LINKS]->(C)
MERGE (C)-[:LINKS]->(A)

Will this avoid inserting duplicate entries?

Recommended answer

According to this question, yes, it will avoid writing duplicate entries.

Each MERGE in the query above matches an existing node where one exists and creates it only otherwise, so no duplicates are written.

A good rule of thumb: write a separate MERGE clause for each node that may be a duplicate, and afterwards write a MERGE clause for each relationship between two nodes.
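The rule of thumb can be sketched as a small helper that builds the query for one row. This is only an illustration: the function name `buildTriangleQuery` and the row keys `xa` to `yc` are assumptions borrowed from the parameter names in the question, and the query uses the `$name` parameter form that newer neo4j versions expect in place of `{name}`.

```javascript
// Hypothetical helper: builds the query and parameters for one spreadsheet row.
function buildTriangleQuery(row) {
  const query = [
    // One MERGE per node that may already exist
    'MERGE (a:POINT {x: $xa, y: $ya})',
    'MERGE (b:POINT {x: $xb, y: $yb})',
    'MERGE (c:POINT {x: $xc, y: $yc})',
    // One MERGE per relationship between two nodes
    'MERGE (a)-[:LINKS]->(b)',
    'MERGE (b)-[:LINKS]->(c)',
    'MERGE (c)-[:LINKS]->(a)',
  ].join('\n');
  const { xa, ya, xb, yb, xc, yc } = row;
  return { query, params: { xa, ya, xb, yb, xc, yc } };
}

const built = buildTriangleQuery({ xa: 0, ya: 0, xb: 1, yb: 0, xc: 0, yc: 1 });
console.log(built.query);
```

Splitting the path into one MERGE per relationship should let an existing A-to-B link be reused even when the B-to-C link does not exist yet, since MERGE matches or creates its pattern as a whole.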

From experience: when using asynchronous technologies such as Node.js, or even parallel threads, you must make sure that you read the next line only AFTER you have inserted the previous one. Multiple insertions running asynchronously may leave several nodes in your graph that are actually the same one.

In a Node.js project of mine I read the Excel file like this:

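// lodash is assumed here (underscore offers the same _.range and _.each)
const _ = require('lodash');
// `config` is assumed to be loaded elsewhere; it must provide
// config.excell.maxColumn and config.excell.columnMap.

const iterateWorksheet=function(worksheet,maxRows,row,callback){
  process.nextTick(function(){
    // Skip the first row
    if(row==1){
      return iterateWorksheet(worksheet,maxRows,2,callback);
    }

    // Stop once every row has been visited
    if(row > maxRows){
      return;
    }

    // Char codes of the columns to read; note that _.range excludes its end value
    const alphas=_.range('A'.charCodeAt(0),config.excell.maxColumn.charCodeAt(0));

    let rowData={};

    _.each(alphas,(column) => {
      column=String.fromCharCode(column);
      const item=column+row; // cell address, e.g. "A2"
      const key=config.excell.columnMap[column];
      if(worksheet[item] && key ){
        rowData[key]=worksheet[item].v;
      }
    });

    // The callback performs the insertion into the neo4j db;
    // the next row is visited only after it reports no error
    return callback(rowData,(error)=>{
      if(!error){
        return iterateWorksheet(worksheet,maxRows,row+1,callback);
      }
    });
  });
};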

As you can see, I visit the next line only once the previous one has been inserted successfully. I have not yet found a way to serialize the inserts the way most conventional RDBMSs do.
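The same one-at-a-time pattern can also be written with async/await, which some readers may find easier to follow than the recursive callback. The sketch below is not the code from my project: `insertRow` is a hypothetical stand-in for the real database call and simply resolves after a random delay.

```javascript
// Hypothetical stand-in for the real insert: resolves after a random delay
// and records the order in which rows were actually written.
const log = [];
function insertRow(rowData) {
  return new Promise((resolve) => {
    setTimeout(() => {
      log.push(rowData.id);
      resolve();
    }, Math.random() * 20);
  });
}

// Insert rows strictly one after another: the next insert starts
// only after the previous promise has resolved.
async function insertSequentially(rows) {
  for (const row of rows) {
    await insertRow(row);
  }
}

const done = insertSequentially([{ id: 1 }, { id: 2 }, { id: 3 }]).then(() => {
  console.log(log); // [ 1, 2, 3 ]
  return log;
});
```

Because each `await` finishes before the loop advances, the random delays cannot reorder or overlap the inserts.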

For web or server applications, another UNTESTED approach is to use a queue server such as RabbitMQ, or something similar, to queue the queries. The code responsible for insertion then reads from the queue, so the whole isolation happens in the queue.
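To make the queue idea concrete without a broker, here is a minimal in-memory sketch. It is not RabbitMQ and has not been tried against neo4j: `SerialQueue` and `fakeInsert` are hypothetical names, and the queue only illustrates the principle that many producers may enqueue work while a single consumer runs it in arrival order.

```javascript
// In-memory stand-in for a queue server: tasks may be enqueued from anywhere,
// but they run one at a time, in the order they arrived.
class SerialQueue {
  constructor() {
    this.tail = Promise.resolve();
  }
  // `task` is a function that returns a promise
  enqueue(task) {
    const result = this.tail.then(() => task());
    // Keep the chain usable even if a task fails
    this.tail = result.catch(() => {});
    return result;
  }
}

const order = [];
const queue = new SerialQueue();
// Hypothetical insert: earlier rows deliberately take longer than later ones
const fakeInsert = (name, ms) => () =>
  new Promise((resolve) => setTimeout(() => { order.push(name); resolve(name); }, ms));

const finished = Promise.all([
  queue.enqueue(fakeInsert('row1', 30)),
  queue.enqueue(fakeInsert('row2', 10)),
  queue.enqueue(fakeInsert('row3', 0)),
]).then(() => order);
```

Even though `row3` would finish first if all three ran at once, the queue completes them as `row1`, `row2`, `row3`.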

Furthermore, ensure that all inserts are made inside a transaction.
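As a rough sketch of what that could look like, assuming the official `neo4j-driver` package in its 5.x line: the helper name `insertInTransaction` is hypothetical, and `driver` is assumed to be a driver instance created elsewhere.

```javascript
// Runs one query inside a managed write transaction.
// Assumption: `driver` is a neo4j-driver Driver created elsewhere,
// e.g. with neo4j.driver(url, neo4j.auth.basic(user, password)).
async function insertInTransaction(driver, query, params) {
  const session = driver.session();
  try {
    // executeWrite commits when the callback succeeds and rolls back if it throws
    // (5.x driver; 4.x exposes the same idea as session.writeTransaction).
    return await session.executeWrite((tx) => tx.run(query, params));
  } finally {
    // Always release the session, whether the transaction succeeded or not
    await session.close();
  }
}
```

A caller would pass the MERGE query and the row's values as `query` and `params`, one row per call.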
