Melting categorical columns in pandas
Question
I am trying to create the dataframe called out, shown below, from the dataframe called df. I have a very manual and slow way of doing it, but I am hoping it can be done with, for example, a combination of groupby() and melt().
import itertools

import pandas as pd

def expand_grid(data_dict):
    # Build one row for every combination of the supplied column values.
    rows = itertools.product(*data_dict.values())
    return pd.DataFrame.from_records(rows, columns=list(data_dict.keys()))

data_dict = dict(
    id1=list('ab'),
    id2=list('cd'),
    col1=list('ef'),
    col2=list('gh'),
    col3=list('ij'),
)
df = expand_grid(data_dict)
df['value'] = range(1, 33)

# Unique (id1, id2, col, level) combinations.
out = pd.melt(df.drop('value', axis=1), id_vars=['id1', 'id2'], var_name='col', value_name='level')
out = out.drop_duplicates().reset_index(drop=True)

# Slow part: filter df once per output row and sum the matching values.
myvals = []
for r in out.index:
    out_row = out.loc[r]
    df_sub = df.loc[(df.id1 == out_row['id1']) & (df.id2 == out_row['id2']) & (df[out_row['col']] == out_row['level'])]
    myvals.append(df_sub.value.sum())
out['value'] = myvals
Thanks!
Recommended answer
With melt and groupby:
df.melt(
    ['id1', 'id2', 'value'],
    ['col1', 'col2', 'col3'],
    value_name='level', var_name='col'
).groupby(['id1', 'id2', 'col', 'level'], as_index=False).sum()
   id1 id2   col level  value
0    a   c  col1     e     10
1    a   c  col1     f     26
2    a   c  col2     g     14
3    a   c  col2     h     22
4    a   c  col3     i     16
5    a   c  col3     j     20
6    a   d  col1     e     42
7    a   d  col1     f     58
8    a   d  col2     g     46
9    a   d  col2     h     54
10   a   d  col3     i     48
11   a   d  col3     j     52
12   b   c  col1     e     74
13   b   c  col1     f     90
14   b   c  col2     g     78
15   b   c  col2     h     86
16   b   c  col3     i     80
17   b   c  col3     j     84
18   b   d  col1     e    106
19   b   d  col1     f    122
20   b   d  col2     g    110
21   b   d  col2     h    118
22   b   d  col3     i    112
23   b   d  col3     j    116
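To see why this works, it can help to split the one-liner into its two steps. The following is a minimal, self-contained sketch rather than part of the original answer: the names levels and long_df are illustrative, and the example frame is rebuilt from the question's description. Keeping value in id_vars means every melted row carries its original value along, so the groupby only has to sum it.

```python
import itertools

import pandas as pd

# Rebuild the question's example frame: 32 rows, value = 1..32.
# (`levels` is an illustrative name, not from the original post.)
levels = dict(id1=list("ab"), id2=list("cd"), col1=list("ef"), col2=list("gh"), col3=list("ij"))
df = pd.DataFrame(list(itertools.product(*levels.values())), columns=list(levels))
df["value"] = range(1, 33)

# Step 1: melt to long form. Each original row appears once per melted
# column, and "value" is repeated because it is listed in id_vars.
long_df = df.melt(
    id_vars=["id1", "id2", "value"],
    value_vars=["col1", "col2", "col3"],
    var_name="col",
    value_name="level",
)

# Step 2: sum value within each (id1, id2, col, level) group.
out = long_df.groupby(["id1", "id2", "col", "level"], as_index=False)["value"].sum()

print(long_df.shape)  # (96, 5): 32 rows x 3 melted columns
print(out.shape)      # (24, 5)
```

Because each original row is counted once per melted column, the grand total of out's value column is three times the total of df's value column, which makes a quick sanity check.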