Question
This code:
from itertools import groupby, count
L = [38, 98, 110, 111, 112, 120, 121, 898]
groups = groupby(L, key=lambda item, c=count():item-next(c))
tmp = [list(g) for k, g in groups]
It takes the list [38, 98, 110, 111, 112, 120, 121, 898], groups it by runs of consecutive numbers, and merges each run into this final output:
['38', '98', '110,112', '120,121', '898']
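The snippet above stops at tmp, which holds lists of integers rather than the strings shown. A minimal sketch of the missing formatting step follows, assuming (from the displayed output only) that each run is written as "first,last" and a single value is written on its own. The name result is introduced here for illustration.

```python
from itertools import groupby, count

L = [38, 98, 110, 111, 112, 120, 121, 898]

# item - position is constant inside a run of consecutive numbers
groups = groupby(L, key=lambda item, c=count(): item - next(c))
tmp = [list(g) for k, g in groups]

# Assumed formatting: "first,last" for a run, the bare value otherwise
result = [str(g[0]) if len(g) == 1 else f"{g[0]},{g[-1]}" for g in tmp]
print(result)
```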
How can the same be done with a list of lists with multiple columns, like the list below, grouping the rows by name and by runs of consecutive values in the second column, and then merging each group?
In other words, this data:
L= [
['Italy','1','3'],
['Italy','2','1'],
['Spain','4','2'],
['Spain','5','8'],
['Italy','3','10'],
['Spain','6','4'],
['France','5','3'],
['Spain','20','2']]
should produce the following output:
[['Italy','1-2-3','3-1-10'],
['France','5','3'],
['Spain','4-5-6','2-8-4'],
['Spain','20','2']]
Would more-itertools be better suited for this task?
Group and combine items of multiple-column lists with itertools/more-itertools in Python
Recommended answer
This is essentially the same grouping technique, but rather than using itertools.count it uses enumerate to produce the indices.
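To make the swap concrete, here is a small sketch that applies the enumerate variant to the flat list from the question. Each element seen by the key function is an (index, value) pair, so the key is value minus index, and the index has to be stripped off again when collecting each group.

```python
from itertools import groupby

L = [38, 98, 110, 111, 112, 120, 121, 898]

# enumerate yields (index, value); value - index is constant within a run
groups = groupby(enumerate(L), key=lambda t: t[1] - t[0])

# Drop the index again when building each group
tmp = [[value for _, value in g] for _, g in groups]
print(tmp)
```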
First, we sort the data so that all rows for a given country are adjacent and ordered within each country. Then we use groupby to make a group for each country, and a second groupby in the inner loop to group together the consecutive rows within each country. Finally, we use zip and .join to rearrange the data into the desired output format.
from itertools import groupby
from operator import itemgetter
lst = [
['Italy','1','3'],
['Italy','2','1'],
['Spain','4','2'],
['Spain','5','8'],
['Italy','3','10'],
['Spain','6','4'],
['France','5','3'],
['Spain','20','2'],
]
newlst = [[country] + ['-'.join(s) for s in zip(*[v[1][1:] for v in g])]
    for country, u in groupby(sorted(lst), itemgetter(0))
        for _, g in groupby(enumerate(u), lambda t: int(t[1][1]) - t[0])]

for row in newlst:
    print(row)
Output
['France', '5', '3']
['Italy', '1-2-3', '3-1-10']
['Spain', '20', '2']
['Spain', '4-5-6', '2-8-4']
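Note that 'Spain', '20' comes out before 'Spain', '4-5-6' because sorted(lst) compares the second column as strings, and '20' sorts before '4'. That is harmless for this data, but it can split a run that crosses a digit boundary such as 9 to 10. The sketch below is one possible way around it, assuming the second column always parses as an integer. The helper merge_runs and the three-row sample are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

def merge_runs(rows, sort_key=None):
    # Same nested groupby as above, with a pluggable sort key
    ordered = sorted(rows, key=sort_key)
    return [[country] + ['-'.join(s) for s in zip(*[v[1][1:] for v in g])]
            for country, u in groupby(ordered, itemgetter(0))
            for _, g in groupby(enumerate(u), lambda t: int(t[1][1]) - t[0])]

# Hypothetical rows whose second column crosses from one digit to two
sample = [['Spain', '9', 'a'], ['Spain', '10', 'b'], ['Spain', '8', 'c']]

# Default sort compares strings, so '10' lands before '8' and '9'
string_sorted = merge_runs(sample)

# Sorting on (country, int(second column)) keeps 8, 9, 10 together
numeric_sorted = merge_runs(sample, sort_key=lambda r: (r[0], int(r[1])))

print(string_sorted)
print(numeric_sorted)
```

With the numeric key, the Spain rows from the question would also come out as 4-5-6 first and 20 second.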
I admit that the lambda is a bit cryptic; it would probably be better to use a proper def function instead. I'll add that here in a few minutes.
Here's the same thing using a much more readable key function.
def keyfunc(t):
    # Unpack the index and data
    i, data = t
    # Get the 2nd column from the data, as an integer
    val = int(data[1])
    # The difference between val & i is constant in a consecutive group
    return val - i

newlst = [[country] + ['-'.join(s) for s in zip(*[v[1][1:] for v in g])]
    for country, u in groupby(sorted(lst), itemgetter(0))
        for _, g in groupby(enumerate(u), keyfunc)]
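On the more-itertools part of the question: that package provides consecutive_groups, which wraps the same enumerate and groupby trick and accepts an ordering function. The following is only a sketch of how it might be used here, not a claim that it is the better tool. It assumes more-itertools is installed, and falls back to a local stand-in with the same behavior for this use so the snippet still runs without it. The numeric sort key is also an assumption added here.

```python
from itertools import groupby
from operator import itemgetter

try:
    from more_itertools import consecutive_groups
except ImportError:
    # Local stand-in so the sketch runs without the third-party package
    def consecutive_groups(iterable, ordering=lambda x: x):
        for _, g in groupby(enumerate(iterable),
                            key=lambda t: t[0] - ordering(t[1])):
            yield map(itemgetter(1), g)

lst = [
    ['Italy', '1', '3'], ['Italy', '2', '1'], ['Spain', '4', '2'],
    ['Spain', '5', '8'], ['Italy', '3', '10'], ['Spain', '6', '4'],
    ['France', '5', '3'], ['Spain', '20', '2'],
]

# Sort by country, then by the second column as a number (assumption)
ordered = sorted(lst, key=lambda r: (r[0], int(r[1])))

newlst = []
for country, rows in groupby(ordered, itemgetter(0)):
    for run in consecutive_groups(rows, ordering=lambda r: int(r[1])):
        run = list(run)  # consume the run before moving to the next one
        merged = ['-'.join(col) for col in zip(*[r[1:] for r in run])]
        newlst.append([country] + merged)

for row in newlst:
    print(row)
```

The outer groupby per country is still needed, so the saving is mostly that the index arithmetic is hidden inside consecutive_groups.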