So I have a tab-delimited text file that looks like this:

23      Hello How are you?
23      What's up?
24      I am using Python


I want to split and group the data above so that it looks like this:

23      Hello How are you? What's up?
24      I am using Python


Basically, I want to group the text by the value in the first column (and then write each group to its own text file, 23.txt and 24.txt).

I have the following code:

def data_extraction(inputfile):

    ifile = open(inputfile, "r")
    lines = ifile.readlines()

    for value in lines:
        each_line = value.split('\t')
        service_order = each_line[0]
        text = each_line[-1]


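The code above gives me multiple lists inside the for loop (each_line = ['23', 'Hello How are you?'], etc.). How do I group the rows that share the same first column together with their corresponding text?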

Best answer

>>> from collections import defaultdict
>>> data = """23\tHello How are you?
23\tWhat's up?
24\tI am using Python"""
>>> new_dict = defaultdict(str)
>>> data = data.split('\n')
>>> for line in data:
    each_line = line.split('\t')
    new_dict[int(each_line[0])] += " " + each_line[-1]

>>> print(new_dict)
defaultdict(<class 'str'>, {23: " Hello How are you? What's up?", 24: ' I am using Python'})


Output:

>>> for key in sorted(new_dict):
    print(str(key) + "\t" + new_dict[key].strip())


23  Hello How are you? What's up?
24  I am using Python
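
As a variation on the approach above (a sketch, not the answerer's code), the text for each key can be collected in a list and joined once at the end, which avoids the leading space and the later strip(). The function name group_lines is hypothetical, and this version keeps the key as a string and takes everything after the first tab as the text, which differs slightly from each_line[-1] when a line contains more than two columns.

```python
from collections import defaultdict


def group_lines(lines):
    """Group tab-separated lines by their first column.

    Returns a dict mapping each key (kept as a string) to the joined text.
    """
    groups = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Split only on the first tab so any tabs inside the text are preserved.
        key, _, text = line.partition("\t")
        groups[key].append(text)
    return {key: " ".join(texts) for key, texts in groups.items()}


data = "23\tHello How are you?\n23\tWhat's up?\n24\tI am using Python"
grouped = group_lines(data.split("\n"))
for key in sorted(grouped, key=int):
    print(key + "\t" + grouped[key])
```

Sorting with key=int assumes every first-column value is numeric, as in the sample data.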


You also shouldn't use readlines(); iterate over the file line by line instead, and use a context manager when reading the file.

with open('filename', 'r') as f:
    for line in f:
        # Use the above code
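
Putting the pieces together, here is a minimal sketch that reads the file inside a with block and also covers the part of the question about writing 23.txt and 24.txt, which the answer does not show. The names split_by_first_column, input_path and output_dir are hypothetical, and the sketch assumes the first-column values are safe to use as file names.

```python
import os
from collections import defaultdict


def split_by_first_column(input_path, output_dir):
    """Read a tab-separated file and write one <key>.txt file per first-column value.

    Returns the list of paths that were written.
    """
    groups = defaultdict(list)
    with open(input_path, "r") as f:
        for line in f:  # iterate lazily instead of calling readlines()
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            key, _, text = line.partition("\t")
            groups[key].append(text)

    written = []
    for key, texts in groups.items():
        # Assumption: the key contains no path separators or other unsafe characters.
        out_path = os.path.join(output_dir, key + ".txt")
        with open(out_path, "w") as out:
            out.write(" ".join(texts) + "\n")
        written.append(out_path)
    return written
```

Calling split_by_first_column("input.txt", ".") on the sample data would be expected to create 23.txt and 24.txt in the current directory; the input file name here is only an example.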

Regarding "python - Python: grouping multiple lists based on the same element", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24819063/
