I have a tab-delimited text file that looks like this:
23 Hello How are you?
23 What's up?
24 I am using Python
I want to split and group the data above so that it looks like this:
23 Hello How are you? What's up?
24 I am using Python
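Basically, I want to group the text by the value in the first column (and then write each group to its own text file, 23.txt and 24.txt).
I have the following code: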
def data_extraction(inputfile):
    ifile = open(inputfile, "r")
    lines = ifile.readlines()
    for value in lines:
        each_line = value.split('\t')
        service_order = each_line[0]
        text = each_line[-1]
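The code above gives me multiple lists inside the for loop (
each_line = ['23', 'Hello How are you?']
and so on). How can I group the rows that share the same first column together with their corresponding text?
Best answer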
>>> from collections import defaultdict
>>> data = """23\tHello How are you?
... 23\tWhat's up?
... 24\tI am using Python"""
>>> new_dict = defaultdict(str)  # missing keys start out as ''
>>> data = data.split('\n')
>>> for line in data:
...     each_line = line.split('\t')
...     # append this line's text to whatever is already stored for the key
...     new_dict[int(each_line[0])] += " " + each_line[-1]
...
>>> print(new_dict)
defaultdict(<class 'str'>, {23: " Hello How are you? What's up?", 24: ' I am using Python'})
Output:
>>> for key in sorted(new_dict):
...     print(str(key) + "\t" + new_dict[key].strip())
...
23 Hello How are you? What's up?
24 I am using Python
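As a variation on the same idea (a sketch, not part of the original answer), the fragments can be collected in a list and joined at the end, which avoids the leading space that strip() has to remove. The helper name group_lines is hypothetical, and the sketch assumes Python 3 and that every non-blank line has the form key, tab, text with an integer key.

```python
from collections import defaultdict


def group_lines(lines):
    """Group tab-separated lines by their first column (hypothetical helper)."""
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Assumption: everything after the first tab is the text
        key, _, text = line.partition("\t")
        groups[int(key)].append(text)
    # Join each key's fragments with single spaces
    return {key: " ".join(parts) for key, parts in groups.items()}


data = "23\tHello How are you?\n23\tWhat's up?\n24\tI am using Python"
grouped = group_lines(data.split("\n"))
for key in sorted(grouped):
    print(f"{key}\t{grouped[key]}")
```

Because group_lines accepts any iterable of strings, an open file object can be passed to it directly.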
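You should also avoid readlines, which loads the whole file into memory at once. Iterate over the file object line by line instead, and open the file in a with block (a context manager) so that it is closed automatically:
with open('filename', 'r') as f:
    for line in f:
        each_line = line.rstrip('\n').split('\t')
        # ...then accumulate into new_dict as in the code above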
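The question also asks for each group to be written to its own file (23.txt, 24.txt), which the answer does not show. The following is a minimal sketch of one way to do that, assuming Python 3. The function name split_by_first_column and the output_dir parameter are hypothetical, and it assumes each group should be written as a single space-joined line.

```python
import os
from collections import defaultdict


def split_by_first_column(input_path, output_dir="."):
    """Write each key's grouped text to <output_dir>/<key>.txt.

    Hypothetical helper; assumes every non-blank line is "<key><TAB><text>".
    Returns the list of files written.
    """
    groups = defaultdict(list)
    with open(input_path, "r") as f:
        for line in f:  # iterate lazily instead of calling readlines()
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            key, _, text = line.partition("\t")
            groups[key].append(text)  # keys stay strings; they become file names

    written = []
    for key, parts in groups.items():
        out_path = os.path.join(output_dir, key + ".txt")
        with open(out_path, "w") as out:
            # Assumption: one space-joined line per group
            out.write(" ".join(parts) + "\n")
        written.append(out_path)
    return written
```

For example, split_by_first_column("input.txt") would create 23.txt and 24.txt in the current directory for the sample data, where input.txt is a placeholder file name.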
Regarding "python - Python: grouping multiple lists based on the same element", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/24819063/