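I have two hash maps, finalOldCsv and finalNewCsv. They store the values read from the old CSV and the new CSV. For CSVs with only a few rows, my code works fine. However, when I run the same code on a CSV with a million rows, it produces incorrect results.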
    private static void findDiff(LinkedHashMap<String, Integer> finalOldCsv,
            LinkedHashMap<String, Integer> finalNewCsv) {
        for (String test : finalOldCsv.keySet())
        {
            System.out.println("first row from old=" + finalOldCsv.get(test));
        }
        for (String test1 : finalNewCsv.keySet())
        {
            System.out.println("first row from new=" + finalNewCsv.get(test1));
        }
        ArrayList<String> temp = new ArrayList<String>();
        for (String oldMatch : finalNewCsv.keySet())
        {
            if (oldMatch.contains(column[0]))
                continue;
            else
            {
                if (finalNewCsv.containsKey(oldMatch) && finalOldCsv.containsKey(oldMatch))
                {
                    System.out.println("Match Found");
                    writeCsv(writer, "Result/" + prefix + "_", oldMatch, "Common Rows");
                    temp.add(oldMatch);
                }
            }
        }
        System.out.println("before old csv size=" + finalOldCsv.size());
        for (String t : temp)
        {
            finalNewCsv.remove(t);
            finalOldCsv.remove(t);
        }
        System.out.println("after old csv size=" + finalOldCsv.size());
        temp.clear();
        for (String newMatch : finalNewCsv.keySet())
        {
            if (newMatch.contains(column[0]))
                continue;
            else
            {
                if (!finalOldCsv.containsKey(newMatch) && finalNewCsv.containsKey(newMatch))
                {
                    writeCsv(writer, "Result/" + prefix + "_", newMatch, "New Rows in New Table");
                    temp.add(newMatch);
                }
            }
        }
        for (String t : temp)
        {
            finalNewCsv.remove(t);
        }
        temp.clear();
        System.out.println("finalOldCsv.keySet().size()" + finalOldCsv.keySet().size());
        for (String restFromOldTable : finalOldCsv.keySet())
        {
            if (restFromOldTable.contains(column[0]))
                continue;
            else
                // if()
                writeCsv(writer, "Result/" + prefix + "_", restFromOldTable, "Rows from Old Table");
        }
    }
Best answer
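I think you are making this more complicated than it needs to be.
For example, while you iterate over finalNewCsv, the if statement checks finalNewCsv.containsKey(oldMatch). That check is unnecessary, because every key you get from finalNewCsv.keySet() is in finalNewCsv, so it is always true.
The whole method can be simplified to: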
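    // Iterate with an explicit Iterator so entries can be removed safely during the loop.
    Iterator<Map.Entry<String, Integer>> it = oldMap.entrySet().iterator();
    while (it.hasNext()) {
        Map.Entry<String, Integer> entry = it.next();
        if (newMap.containsKey(entry.getKey())) {
            // Copy the entry before it.remove(): the entry must not be used after removal.
            commonEntries.put(entry.getKey(), entry.getValue());
            it.remove();
            newMap.remove(entry.getKey());
        }
    }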
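This moves every key that appears in both oldMap and newMap into the commonEntries map and removes it from both source maps, so oldMap ends up holding only the keys unique to the old CSV and newMap only the keys unique to the new CSV. I'm not entirely sure whether this is what findDiff() is supposed to do (the method name is misleading).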
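To show how the three categories from the question (common rows, rows only in the new table, rows only in the old table) fall out of a single pass, here is a minimal self-contained sketch. The class name CsvDiffSketch, the Result holder, and the sample keys are assumptions made for illustration and do not come from the original code. It also leaves out the header check against column[0] and the writeCsv calls, which depend on code not shown in the question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class: the name and the three-bucket result are illustrative only.
final class CsvDiffSketch {

    /** Holds the three buckets produced by the diff. */
    static final class Result {
        final Map<String, Integer> common = new LinkedHashMap<>();
        final Map<String, Integer> onlyInOld = new LinkedHashMap<>();
        final Map<String, Integer> onlyInNew = new LinkedHashMap<>();
    }

    static Result diff(Map<String, Integer> oldMap, Map<String, Integer> newMap) {
        Result result = new Result();
        // Work on a copy so the caller's maps are left untouched.
        Map<String, Integer> remainingNew = new LinkedHashMap<>(newMap);

        for (Map.Entry<String, Integer> entry : oldMap.entrySet()) {
            if (remainingNew.containsKey(entry.getKey())) {
                // Key exists in both maps: record it once and drop it from the "new" side.
                result.common.put(entry.getKey(), entry.getValue());
                remainingNew.remove(entry.getKey());
            } else {
                result.onlyInOld.put(entry.getKey(), entry.getValue());
            }
        }
        // Whatever was never matched exists only in the new map.
        result.onlyInNew.putAll(remainingNew);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> oldMap = new LinkedHashMap<>();
        oldMap.put("a,1", 1);
        oldMap.put("b,2", 2);

        Map<String, Integer> newMap = new LinkedHashMap<>();
        newMap.put("b,2", 1);
        newMap.put("c,3", 2);

        Result result = diff(oldMap, newMap);
        System.out.println("Common rows: " + result.common.keySet());
        System.out.println("Rows from old table: " + result.onlyInOld.keySet());
        System.out.println("New rows in new table: " + result.onlyInNew.keySet());
    }
}
```

Each key is looked up a constant number of times, so the work grows linearly with the number of rows. Copying newMap avoids mutating the caller's data, at the cost of extra memory, which may matter for a million-row input.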