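I want to fill grundformen with the matching entries for the items in list_of_occurences.
My for loop doesn't work as expected. It doesn't restart from the beginning; it only goes through the rows in the reader once, so the list is not filled completely.
This is what it prints (you can see the missing part, because the search does not start again from the top of the list):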
# List_of_occurrences (1 line - wrapped for easier reading)
[['NN', 1328, ('Ziel',)], ['ART', 771, ('der',)],
['$.', 732, ('_',)], ['VVFIN', 682, ('schlagen',)],
['PPER', 592, ('sie',)], ['$,', 561, ('_',)],
['ADV', 525, ('So',)], ['APPR', 507, ('in',)],
['NE', 433, ('Johanna',)], ['$(', 363, ('_',)],
['VAFIN', 334, ('haben',)], ['ADJA', 307, ('tragisch',)],
['ADJD', 278, ('recht',)], ['KON', 228, ('Doch',)],
['VVPP', 194, ('reichen',)], ['VVINF', 161, ('stören',)],
['KOUS', 151, ('Während',)], ['PPOSAT', 120, ('ihr',)],
['PTKVZ', 104, ('weiter',)], ['PRF', 98, ('sich',)],
['APPRART', 90, ('zu',)], ['PTKNEG', 87, ('nicht',)],
['VMFIN', 76, ('sollen',)], ['PIAT', 66, ('kein',)],
['PIS', 65, ('etwas',)], ['PTKZU', 52, ('zu',)],
['PRELS', 51, ('wer',)], ['PROAV', 42, ('dabei',)],
['PDS', 38, ('jener',)], ['PDAT', 37, ('dieser',)],
['PWAV', 30, ('wie',)], ['PWS', 26, ('Was',)],
['CARD', 24, ('drei',)], ['KOKOM', 21, ('wie',)],
['VAINF', 18, ('werden',)], ['KOUI', 15, ('um',)],
['VMINF', 10, ('können',)], ['VVIZU', 10, ('aufklären',)],
['VAPP', 10], ['PTKA', 6], ['PTKANT', 6], ['PWAT', 4],
['VVIMP', 4], ['PRELAT', 4], ['APZR', 3], ['APPO', 2],
['FM', 1]]
# Grundformen (1 line, wrapped for reading)
['Ziel', 'der', '_', 'schlagen', 'sie', '_', 'So', 'in', 'Johanna',
'_', 'haben', 'tragisch', 'recht', 'Doch', 'reichen', 'stören',
'Während', 'ihr', 'weiter', 'sich', 'zu', 'nicht', 'sollen', 'kein',
'etwas', 'zu', 'wer', 'dabei', 'jener', 'dieser', 'wie', 'Was',
'drei', 'wie', 'werden', 'um', 'können', 'aufklären']
import collections
import csv

occurences = collections.Counter()
with open("material-2.csv", mode='r', newline='', encoding="utf-8") as material:
    reader = csv.reader(material, delimiter='\t', quotechar="\t")
    for line in reader:
        if line:
            occurences[line[5]] += 1
        else:
            pass

list_of_occurences = [list(elem) for elem in occurences.most_common()]
grundformen = []
with open('material-2.csv', mode='r', newline='', encoding="utf-8") as material:
    reader = csv.reader(material, delimiter='\t', quotechar="\t")
    for elem in list_of_occurences:
        for row in reader:
            if row != [] and row[5] == elem[0]:
                grundformen.append(row[2])
                break

iterator = 0
for elem in grundformen:
    list_of_occurences[iterator].insert(2, elem)
    iterator = iterator + 1
    pass

print(list_of_occurences)
print(grundformen)
The complete input file: https://www.dropbox.com/sh/xyktjk4ycm8x6v0/AACou438_eEWx-ZYmByBiqp_a/material-2.csv?dl=0
Part of my input file:
1 Als Als _ _ KOUS _ _ _ 6 6 CP CP _ _ _
2 es es _ _ PPER _ 3 | Nom | Sg | Neut 6 6 SB SB _ _
3 zu _ _ PTKA _ _ 4 4 MO MO _ _
4 schneien schneien _ _ ADJD _ Comp | Dat | Sg | Fem 5 5 MO MO _ _
5 aufgehört aufhören _ _ VVPP _ Psp 6 6 OC OC _ _
6 hatte haben _ _ VAFIN _ 3 | Sg | Past | Ind 8 8 MO MO _ _
7 , _ _ _ $, _ _ 8 8 PUNC PUNC _ _
8 verließ verlassen _ _ VVFIN _ 3 | Sg | Past | Ind 0 0 ROOT ROOT _ _
9 Johanna Johanna _ _ NE _ Nom | Sg | Masc 8 8 SB SB _ _
10 von von _ _ APPR _ _ 5 5 SBP SBP _ _
11 Rotenhoff Rotenhoff _ _ NE _ Dat | Sg | Neut 10 10 NK NK _ _
12 , _ _ _ $, _ _ 8 8 PUNC PUNC _ _
13 ohne ohne _ _ KOUI _ _ _ 18 18 CP CP _ _
14 ein ein _ _ ART _ Nom | Sg | Neut 16 16 NK NK _ _
15 rechtes recht _ _ ADJA _ Pos | Nom | Sg | Neut 16 16 NK NK _ _
16 Ziel Ziel _ _ NN _ Nom | Sg | Neut 18 18 OA OA _ _
17 zu _ _ _ PTKZU _ _ 18 18 PM PM _ _
18 haben haben _ _ VAINF _ Inf 8 8 MO MO _ _
19 , _ _ _ $, _ _ 18 18 PUNC PUNC _ _
20 das der _ _ ART _ Nom | Sg | Neut 21 21 NK NK _ _
21 Gutshaus Gutshaus _ _ NN _ Nom | Sg | Neut 16 16 APP APP _ _
22 . _ _ _ $. _ _ 8 8 PUNC PUNC _ _
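How can I change the loop so that everything gets filled in?
Best answer
The problem lies in how you read the csv data.
Here the data is read into a list, so a second loop would be possible without opening another file object, but you don't even need to iterate over the csv data twice: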
import csv
import collections

occurences = collections.Counter()
grundformen = collections.defaultdict(list)

with open("material-2.csv", mode='r', newline='', encoding="utf-8") as material:
    # Read all non-empty rows into a list so they can be iterated repeatedly
    reader = [ln for ln in csv.reader(material, delimiter='\t', quotechar="\t") if ln]

for line in reader:
    occurences[line[5]] += 1               # count each POS tag
    grundformen[line[5]].append(line[2])   # collect the lemmas seen for that tag

list_of_occurences = list(map(list, occurences.most_common()))
for elem in list_of_occurences:
    elem.append(grundformen[elem[0]][0])   # first lemma seen for this tag

print(list_of_occurences)
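Since only the first lemma per tag ends up in the result, a variant that stores just that one lemma might be enough. The following is a minimal sketch, not the answer's exact method: the function name count_tags_with_first_lemma and the in-memory SAMPLE data are made up for illustration, and it assumes the tag is in column 5 and the lemma in column 2, as in the question.

```python
import collections
import csv
import io

# Hypothetical in-memory sample standing in for material-2.csv (tab-separated).
SAMPLE = (
    "1\tAls\tAls\t_\t_\tKOUS\n"
    "2\tes\tes\t_\t_\tPPER\n"
    "\n"
    "3\tohne\tohne\t_\t_\tKOUS\n"
    "4\tsie\tsie\t_\t_\tPPER\n"
    "5\tweil\tweil\t_\t_\tKOUS\n"
)


def count_tags_with_first_lemma(lines):
    """Return [[tag, count, first_lemma], ...], most frequent tag first."""
    counts = collections.Counter()
    first_lemma = {}
    # QUOTE_NONE: no quote handling, fields are split on tabs only.
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if not row:  # blank lines come back as empty lists; skip them
            continue
        tag, lemma = row[5], row[2]
        counts[tag] += 1
        first_lemma.setdefault(tag, lemma)  # keeps only the first lemma seen
    return [[tag, n, first_lemma[tag]] for tag, n in counts.most_common()]


result = count_tags_with_first_lemma(io.StringIO(SAMPLE))
print(result)  # [['KOUS', 3, 'Als'], ['PPER', 2, 'es']]
```

With a real file, the open file object from the with block could be passed in place of the io.StringIO object.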
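By creating a list from the csv data, you can call break and still start the next loop at the beginning of the list. A csv.reader, on the other hand, is an iterator, so even after a break the next loop resumes where the previous one stopped, and it keeps going only until its data is exhausted.
Source: a similar question on Stack Overflow, "python - My for loop doesn't work as expected": https://stackoverflow.com/questions/31121307/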