python - Group lines by their first two words

I want to group the lines of a file by their first two words (and then rearrange and print them).

I tried this:

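    lines = file.readlines()  # `file` is an already-open file object
    for i, line in enumerate(lines):
        word1 = line.split()[0]
        word2 = line.split()[1]
        # Compare with both neighbours: i-1 wraps around to the last line
        # on the first iteration, and i+1 raises IndexError on the last one
        if word1 == lines[i+1].split()[0] and word1 == lines[i-1].split()[0]:
            if word2 == lines[i-1].split()[1] and word2 == lines[i+1].split()[1]:
                print(line)
        else:
            print("***new block of lines \n***")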

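However, this is a very poor solution: it does not work for the first or last line, and it does not work well in general. A better solution would be appreciated.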

Best answer

If you want to group consecutive lines that share the same first two words, this is a use case for itertools.groupby, for example:

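from itertools import groupby

with open('somefile') as fin:
    # Pair each non-blank line with its key: up to the first two words
    lines = ((line.split(None, 2)[:2], line) for line in fin if line.strip())
    for k, g in groupby(lines, lambda L: L[0]):
        # Collect the original lines of this run of consecutive matches
        lines = [el[1] for el in g]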

Here, k is the grouping key (up to the first two words), and lines is the list of lines in the file that share that key.
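To try the same idea without a file on disk, the grouping can be wrapped in a small function that accepts any iterable of lines. This is only a sketch: the names first_two_words and group_consecutive are made up for illustration, and the key is converted to a tuple (rather than the list used above) so that it is hashable.

```python
from itertools import groupby


def first_two_words(line):
    """Return up to the first two whitespace-separated words as a tuple."""
    return tuple(line.split(None, 2)[:2])


def group_consecutive(lines):
    """Yield (key, group) pairs for runs of consecutive lines sharing a key.

    Blank lines are skipped, as in the answer's generator expression.
    """
    non_blank = (line for line in lines if line.strip())
    for key, group in groupby(non_blank, key=first_two_words):
        yield key, list(group)


if __name__ == "__main__":
    sample = [
        "one two three four five\n",
        "one two five six seven\n",
        "three four something\n",
    ]
    for key, group in group_consecutive(sample):
        print(key, group)
```

Because an open file is itself an iterable of lines, group_consecutive(fin) should work the same way inside a with open(...) block.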

Example somefile input:

one two three four five
one two five six seven
three four something
three four something else
one two start of new one two block

Output of print(k, lines) for each group:

['one', 'two'] ['one two three four five\n', 'one two five six seven\n']
['three', 'four'] ['three four something\n', 'three four something else\n']
['one', 'two'] ['one two start of new one two block\n']
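Note that the key ['one', 'two'] appears twice in the output above, because groupby only merges consecutive lines. If the goal is instead to collect every line with the same first two words regardless of where it appears (for example, before rearranging the lines), a dictionary keyed on the two words is one option. A minimal sketch, where group_all is a hypothetical helper name and not part of the original answer:

```python
from collections import defaultdict


def group_all(lines):
    """Map each (word1, word2) key to every line starting with those words."""
    groups = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        key = tuple(line.split(None, 2)[:2])
        groups[key].append(line)
    # Plain dicts keep insertion order (Python 3.7+), so keys appear
    # in the order they were first seen in the input
    return dict(groups)
```

With the sample input, this would put all three "one two" lines under a single key, at the cost of holding every line in memory.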

To exclude the first two words from each line, use:

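from itertools import groupby

with open('somefile') as fin:
    lines = (line.split(None, 2) for line in fin if line.strip())
    for k, g in groupby(lines, lambda L: L[:2]):
        # el[2:] is empty when a line has only two words, so join avoids an IndexError
        lines = [''.join(el[2:]) for el in g]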
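To get back to the question's goal of printing each block with a separator, the groups can be written out one after another. The sketch below assumes Python 3; format_blocks and the separator text are illustrative choices rather than part of the answer.

```python
from itertools import groupby


def format_blocks(lines, separator="*** new block of lines ***"):
    """Return output lines, with a separator before each run of
    consecutive lines that share their first two words."""
    out = []
    # Drop blank lines and trailing newlines before grouping
    non_blank = (line.rstrip("\n") for line in lines if line.strip())
    for _key, group in groupby(non_blank, key=lambda s: s.split(None, 2)[:2]):
        out.append(separator)
        out.extend(group)
    return out
```

For a file, something like print("\n".join(format_blocks(fin))) inside a with open('somefile') as fin: block should print the blocks, and unlike the index-based attempt it needs no special handling for the first or last line.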

Regarding python - Group lines by their first two words, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/28759687/