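Final edit: It works! Thanks to everyone for the help, and special thanks to Padraic for helping me before work.
First, apologies if this has been asked before. I did a fair amount of searching, but the answer may be phrased in a way I didn't think to look for.
The CSV file I'm working with looks like this: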
0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5,21171,7.25,S
I have to parse this file and then write part of it to another CSV, which I've started with this code:

import csv
infile = open('data/data.csv', 'r')
incsv = csv.reader(infile, delimiter = ',')
outfile = open('data/output.csv', 'w', newline = '')
outcsv = csv.writer(outfile, delimiter = ',')

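The problem is that the "name" field is formatted as "Lastname, othernames", and I need to split it into two fields: "lastname" and "othernames".
I can't seem to find a way to make it ignore the quotes and split the name on the delimiter (','). The row is a list, so .strip() doesn't work, and I can't tell whether QUOTE_NONE would do it or whether I just don't have the syntax right.
It probably goes without saying, but I'm new to all of this.
Edit: I was getting errors with these solutions, so I'm including the rest of my code in the hope that it shows what's going wrong.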
import csv

infile = open('data/titanic.csv', 'r')
incsv = csv.reader(infile, delimiter = ',')
outfile = open('data/survivors.csv', 'w', newline = '')
outcsv = csv.writer(outfile, delimiter = ',')

dict ={}

for row in incsv:
    survived, pclass, name, sex, age, sibsp, parch, ticket, fare, cabin, embarked = row
    if survived == "1":
        if name not in dict:
            dict[name] = name, pclass, sex, age

names = dict.keys()
sorted_names = sorted(names)

for name in sorted_names:
    (name, pclass, sex, age) = dict[name]
    rowOutput = (name, pclass, sex, age)
    outcsv.writerow(rowOutput)

outfile.close()
infile.close()

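So this parses the original CSV, filters by survived == '1', adds the names to a dict (I know I'll need to adjust this once the name field is split), and sorts the dictionary keys alphabetically.
Edit: Here is more of the original file, as requested. Sorry for not including more at the start.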
survived,pclass,name,sex,age,sibsp,parch,ticket,fare,cabin,embarked
0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S
1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C
1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,,S
1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35,1,0,113803,53.1,C123,S
0,3,"Allen, Mr. William Henry",male,35,0,0,373450,8.05,,S
0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,,Q
0,1,"McCarthy, Mr. Timothy J",male,54,0,0,17463,51.8625,E46,S
0,3,"Palsson, Master. Gosta Leonard",male,2,3,1,349909,21.075,,S
1,3,"Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)",female,27,0,2,347742,11.1333,,S

That's the first 10 of 892 lines (891 not counting the header).

Best answer

If the data is always in the same column, you can split it:

In [20]: s = '0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5,21171,7.25,S'

In [21]: import csv

In [22]: row = next(csv.reader([s]))

In [23]: row
Out[23]: ['0', '3', 'Braund, Mr. Owen Harris', 'male', '22', '1', '0', 'A/5', '21171', '7.25', 'S']

In [24]: last, first = row[2].split(",")

In [25]: last, first.strip()
Out[25]: ('Braund', 'Mr. Owen Harris')

Assuming you want to use the last name as the key:
import csv
from operator import itemgetter

dct = {}

with open('data/titanic.csv') as infile, open('data/survivors.csv', 'w', newline='') as outfile:
    incsv = csv.reader(infile)
    outcsv = csv.writer(outfile)
    # Keep only the first five columns of each row.
    for survived, pclass, name, sex, age in map(itemgetter(0, 1, 2, 3, 4), incsv):
        if survived == "1":
            last, first = name.split(",")
            # Strip the space that follows the comma in "Last, Others".
            dct[last] = [first.strip(), pclass, sex, age]

    # Iterating a dict yields its keys, so this sorts by last name.
    sorted_names = sorted(dct)
    for last_name in sorted_names:
        outcsv.writerow([last_name] + dct[last_name])

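itemgetter(0, 1, 2, 3, 4) pulls out just the first five columns, which are the ones we care about. The for loop unpacks those five values, then we split the name and use the last name as the key.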
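To make the itemgetter step concrete, here is a small sketch on a single line taken from the sample data in the question. The variable name first_five is only an illustrative choice.

```python
import csv
from operator import itemgetter

# One line from the question's sample data.
line = '1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35,1,0,113803,53.1,C123,S'

# csv.reader keeps the quoted name together as a single field.
row = next(csv.reader([line]))

# itemgetter with several indexes returns a tuple of those items.
first_five = itemgetter(0, 1, 2, 3, 4)
survived, pclass, name, sex, age = first_five(row)

print(survived, pclass, name, sex, age)
```

The remaining columns (sibsp, parch, ticket, fare, cabin, embarked) are simply never unpacked, so the loop does not need a name for each of them.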
If the part after the comma might be missing, you can use str.partition:
        last, _, first = name.partition(",")
        dct[last] = first.strip(), pclass, sex, age
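As a rough illustration of why partition is the safer choice, the sketch below compares it with a two-name unpacking of split on a value that has no comma. The helper split_name is hypothetical and not part of the code above.

```python
def split_name(name):
    """Hypothetical helper: split 'Last, others' into (last, others)."""
    # partition always returns three parts, even when there is no comma.
    last, _, others = name.partition(",")
    return last.strip(), others.strip()

with_comma = split_name("Braund, Mr. Owen Harris")
without_comma = split_name("Braund")

# Unpacking split(",") into two names fails when there is no comma.
try:
    last, first = "Braund".split(",")
    split_failed = False
except ValueError:
    split_failed = True

print(with_comma)     # ('Braund', 'Mr. Owen Harris')
print(without_comma)  # ('Braund', '')
print(split_failed)   # True
```

partition also splits only at the first comma, so a name containing more than one comma would still unpack cleanly.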

The final output format is:
last_name, other_names, pclass, sex, age

Output on some sample lines:
In [2]: cat test.csv
1,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5,21171,7.25,S
0,3,"Braund1, Mr. Owen Harris",male,22,1,0,A/5,21171,7.25,S
1,3,"Braund3, Mr. Owen2 Harris2",male,22,1,0,A/5,21171,7.25,S
0,3,"Braund2, Mr. Owen2 Harris2",male,22,1,0,A/5,21171,7.25,S
In [3]: cat survivors.csv

In [4]: paste
from operator import itemgetter
import csv
dct = {}
with open('test.csv') as infile, open('survivors.csv', 'w', newline='') as outfile:
    incsv = csv.reader(infile)
    outcsv = csv.writer(outfile)
    for survived, pclass, name, sex, age in map(itemgetter(0, 1, 2, 3, 4), incsv):
        if survived == "1":
            last, first = name.split(",")
            dct[last] = [first.strip(), pclass, sex, age]
    sorted_names = sorted(dct)
    for last_name in sorted_names:
        outcsv.writerow([last_name] + dct[last_name])

## -- End pasted text --

In [5]: cat survivors.csv
Braund,Mr. Owen Harris,3,male,22
Braund3,Mr. Owen2 Harris2,3,male,22
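One thing to keep in mind: keying the dict on the last name alone means a later survivor with the same surname overwrites an earlier one. If that matters for your data, a possible variation is to collect rows in a list and sort the list instead. This is only a sketch: surviving_rows is a hypothetical helper, and the second sample line is made up for the demonstration.

```python
import csv

def surviving_rows(lines):
    """Hypothetical helper: return sorted [last, others, pclass, sex, age] rows."""
    rows = []
    for row in csv.reader(lines):
        survived, pclass, name, sex, age = row[:5]
        if survived == "1":  # the header row is skipped by this check too
            last, _, others = name.partition(",")
            rows.append([last.strip(), others.strip(), pclass, sex, age])
    # Lists sort element by element: last name first, then other names.
    return sorted(rows)

# The second data line is invented, to show two survivors sharing a surname.
sample = [
    'survived,pclass,name,sex,age',
    '1,3,"Braund, Mr. Owen Harris",male,22',
    '1,3,"Braund, Mrs. Example Name",female,30',
    '0,3,"Allen, Mr. William Henry",male,35',
]

for out_row in surviving_rows(sample):
    print(out_row)
```

Each returned row can be passed straight to outcsv.writerow, so the rest of the script stays the same.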
