Problem description
Suppose we start from this simple table, stored in a pandas dataframe:
name age family
0 john 1 1
1 jason 36 1
2 jane 32 1
3 jack 26 2
4 james 30 2
Then I do
group_df = df.groupby('family')
group_df = group_df.aggregate({'name': name_join, 'age': 'mean'})
where name_join is a simple aggregating function for the names:
def name_join(list_names, concat='-'):
    return concat.join(list_names)
The output is:
age name
family
1 23 john-jason-jane
2 28 jack-james
Now the question.
Is there a quick, efficient way to get to the following from the aggregated table?
name age family
0 john 23 1
1 jason 23 1
2 jane 23 1
3 jack 28 2
4 james 28 2
(Note: the numbers are just examples; I don't care about the information I lose by averaging in this specific example.)
The way I thought I could do it does not look too efficient:
- create an empty dataframe
- from every line in group_df, separate the names and return a dataframe with as many rows as there are names in the starting row
- append the output to the empty dataframe
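The steps above can be sketched roughly as follows. This is only an illustration of the row-by-row approach, not code from the original post: the aggregated table is rebuilt by hand so the snippet runs on its own, and the pieces are combined with pd.concat because DataFrame.append is no longer available in pandas 2.x.

```python
import pandas as pd

# Aggregated table from the question, rebuilt by hand so the sketch is self-contained.
group_df = pd.DataFrame(
    {"age": [23.0, 28.0], "name": ["john-jason-jane", "jack-james"]},
    index=pd.Index([1, 2], name="family"),
)

pieces = []
for family, row in group_df.iterrows():
    # Split the joined string back into individual names.
    names = row["name"].split("-")
    # One small dataframe per aggregated row; the scalars are repeated for each name.
    pieces.append(pd.DataFrame({"name": names, "age": row["age"], "family": family}))

# Stack the per-family pieces into a single table with a fresh 0..n-1 index.
result = pd.concat(pieces, ignore_index=True)
print(result)
```

Building one dataframe per row and concatenating them is what makes this approach slow on large tables.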
It may not be helpful to think of the operation as the "opposite" of groupby.
You are splitting a string into pieces and maintaining each piece's association with 'family'. This old answer of mine does the job.
Just set 'family' as the index column first, refer to the link above, and then call reset_index() at the end to get your desired result.
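The linked answer is not reproduced on this page, so the following is only a minimal sketch of the same idea (split each string, keep the association with 'family', then reset the index). It assumes pandas 0.25 or later, where DataFrame.explode is available, and may differ from the approach in the linked answer. The aggregated table is rebuilt by hand here; in it, 'family' is already the index, as it would be after the groupby.

```python
import pandas as pd

# Aggregated table from the question; 'family' is already the index after groupby.
group_df = pd.DataFrame(
    {"age": [23.0, 28.0], "name": ["john-jason-jane", "jack-james"]},
    index=pd.Index([1, 2], name="family"),
)

ungrouped = (
    group_df
    .assign(name=group_df["name"].str.split("-"))  # each cell becomes a list of names
    .explode("name")       # one row per list element; the 'family' index label repeats
    .reset_index()         # turn 'family' back into an ordinary column
    [["name", "age", "family"]]  # column order used in the question
)
print(ungrouped)
```

Because the split and the row expansion are done by pandas on whole columns, no explicit Python loop over the rows is needed.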