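I have two CSV files:

csv1
csv2

(*Note: the headers can differ.)

csv1 has a single column and csv2 has 5 columns.

Column 1 of csv1 has some values that match column 2 of csv2.

My question is how to write a CSV containing the rows where column 1 of csv1 has no matching value in column 2 of csv2.

I have attached three files: csv1, csv2, and the expected output.

Expected output: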

ProfileID,id,name,class ,rollnumber
1,lkha,prince,sfasd,DAS
2,hgfhfk,kabir,AD,AD
5,jlh,antriskh,ASDA,AD


CSV 1:

id,name
10927,prince
109582,kabir
f546416,rahul
g44674,saini
r7341,antriskh


CSV 2:

ProfileID,id,name,class ,rollnumber
1,lkha,prince,sfasd,DAS
2,hgfhfk,kabir,AD,AD
3,f546416,rahul,AD,FF
44,g44674,saini,DD,FF
5,jlh,antriskh,ASDA,AD


I tried converting both files into dictionaries and matching the csv1 keys against the csv2 values, but it does not work:

def read_csv1(filename):
    prj_structure = {}
    f = open(filename, "r")
    data = f.read()
    f.close()
    lst = data.split("\n")
    prj = ""
    for i in range(0, len(lst)):
        val = lst[i].split(",")
        if len(val)>0:
            prj = val[0]
        if prj!="":
            if prj not in prj_structure.keys():
                prj_structure[prj] = []
            prj_structure[prj].append([val[1], val[2], val[3], val[4]])
    return prj_structure


def read_csv2(filename):
    prj_structure = {}
    f = open(filename, "r")
    data = f.read()
    f.close()
    lst = data.split("\n")
    prj = ""
    for i in range(0, len(lst)):
        val = lst[i].split(",")
        if len(val)>0:
            prj = val[0]
        if prj!="":
            if prj not in prj_structure.keys():
                prj_structure[prj] = []
            prj_structure[prj].append([val[0]])
    return prj_structure


csv1_data = read_csv1("csv1.csv")
csv2_data = read_csv2("csv2.csv")

for k, v in csv1_data.items():
    for ks, vs in csv2_data.items():
        if k==vs[0][0]:
            #here it is not working
            sublist = []
            sublist.append(k)

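Best answer

Use DictReader from the csv module. It reads each row as a dict keyed by the header names, so the two files can be compared by column name instead of by position.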
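As a quick illustration of what DictReader yields, here is a minimal sketch. It assumes two sample rows copied from CSV 2 above and uses io.StringIO in place of a real file; the name CSV2_TEXT is only for this example.

```python
import csv
import io

# Sample rows copied from CSV 2 in the question; io.StringIO stands in for a real file.
CSV2_TEXT = """ProfileID,id,name,class ,rollnumber
1,lkha,prince,sfasd,DAS
3,f546416,rahul,AD,FF
"""

reader = csv.DictReader(io.StringIO(CSV2_TEXT, newline=""))
rows = list(reader)

# Field names are taken verbatim from the header line, so the trailing
# space in "class " is part of the key.
print(reader.fieldnames)

# Each row is a dict, so columns are addressed by header name.
print(rows[0]["name"], rows[0]["id"])
```

Because the keys come straight from the header, a lookup such as row["class"] (without the trailing space) would raise a KeyError on this data.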

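import csv

# Index the rows of csv1 by name so each csv2 row can be looked up directly.
# The csv module docs recommend opening files with newline=''.
with open('csv1.csv', newline='') as f1:
    first_dict = {row['name']: row for row in csv.DictReader(f1)}

with open('csv2.csv', newline='') as f2, \
        open('output.csv', 'w', newline='') as f_out:
    csv_2 = csv.DictReader(f2)
    # Reuse the csv2 header so the output has the same columns.
    csv_out = csv.DictWriter(f_out, csv_2.fieldnames)
    csv_out.writeheader()

    for second_row in csv_2:
        # Keep a csv2 row only if its name appears in csv1
        # and its id differs from the id csv1 has for that name.
        first_row = first_dict.get(second_row['name'])
        if first_row is not None and first_row['id'] != second_row['id']:
            csv_out.writerow(second_row)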
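
Read literally, the question only asks for the csv2 rows whose id (column 2) has no match among the ids in csv1 (column 1). The following is a minimal sketch of that reading, assuming the id column alone decides a match and ignoring name. The helper name rows_without_match and the in-memory samples (copied from the question) are illustrative and not part of the original answer.

```python
import csv
import io

# Sample data copied from the question; io.StringIO stands in for real files.
CSV1_TEXT = """id,name
10927,prince
109582,kabir
f546416,rahul
g44674,saini
r7341,antriskh
"""

CSV2_TEXT = """ProfileID,id,name,class ,rollnumber
1,lkha,prince,sfasd,DAS
2,hgfhfk,kabir,AD,AD
3,f546416,rahul,AD,FF
44,g44674,saini,DD,FF
5,jlh,antriskh,ASDA,AD
"""


def rows_without_match(csv1_file, csv2_file, out_file):
    """Write the csv2 rows whose 'id' is absent from csv1's 'id' column.

    Hypothetical helper: it takes open file objects, so it works with
    real files (opened with newline='') as well as io.StringIO.
    """
    # A set gives constant-time membership tests for the csv1 ids.
    known_ids = {row["id"] for row in csv.DictReader(csv1_file)}

    reader = csv.DictReader(csv2_file)
    writer = csv.DictWriter(out_file, fieldnames=reader.fieldnames)
    writer.writeheader()
    for row in reader:
        if row["id"] not in known_ids:
            writer.writerow(row)


out = io.StringIO(newline="")
rows_without_match(io.StringIO(CSV1_TEXT), io.StringIO(CSV2_TEXT), out)
result = out.getvalue()
print(result)
```

On the sample data this keeps the rows with ProfileID 1, 2 and 5, the same rows listed in the expected output. Unlike the answer above, it would also keep a csv2 row whose name does not appear in csv1 at all, so which variant fits depends on whether name is meant to take part in the comparison.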

Regarding "python - How to write csv basis [column based] on comparing two csv files", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57625869/
