How to convert pandas DataFrame columns using a transpose?
Problem description
I have a dataframe like this:
   A   B   C   D   E   F   G   H
----------------------------------
0  xx  s   1   d   f   df  f   54
1                          g   g4
2                          x   r4
3                          r   43
4  ds  a   s   d   f   ds  f   64
5                          d   43
6                          s   se
7                          1   gf
8                          3   s3
9  as  t   r   a   2   ds  k   s4
How can I convert it to this format:
   A   B   C   D   E   F   f   g   x   r   d   s   1   3   k
--------------------------------------------------------------
0  xx  s   1   d   f   df  54  g4  r4  43
1  ds  a   s   d   f   ds  64              43  se  gf  s3
2  as  t   r   a   2   ds                                  s4
The first dataframe will contain more values.
Recommended answer
First forward fill the missing values in columns A-F, then reshape with set_index and unstack. This assumes the blank cells are NaN, because ffill does not treat empty strings as missing:
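cols = list('ABCDEF')
# Forward fill the key columns so every row carries its group's A-F values
df[cols] = df[cols].ffill()
# Turn the values of G into column headers, filled with the matching H values
df1 = df.set_index(cols + ['G'])['H'].unstack().reset_index().rename_axis(None, axis=1)
print(df1)
    A  B  C  D  E   F    1    3    d    f    g    k    r    s    x
0  as  t  r  a  2  ds  NaN  NaN  NaN  NaN  NaN   s4  NaN  NaN  NaN
1  ds  a  s  d  f  ds   gf   s3   43   64  NaN  NaN  NaN   se  NaN
2  xx  s  1  d  f  df  NaN  NaN  NaN   54   g4  NaN   43  NaN   r4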
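To try the approach end to end, the following is a minimal sketch that rebuilds the sample data from the question and applies the forward fill and reshape. The names `rows`, `blank`, and `wide` are illustrative and not part of the original answer, and the blank cells are assumed to be missing values (`None`/NaN). If they were empty strings, they would need to be converted to NaN before `ffill`.

```python
import pandas as pd

# Sample data reconstructed from the question.
# Assumption: blank cells are missing values (None), not empty strings.
blank = [None] * 6
rows = [
    ["xx", "s", "1", "d", "f", "df", "f", "54"],
    blank + ["g", "g4"],
    blank + ["x", "r4"],
    blank + ["r", "43"],
    ["ds", "a", "s", "d", "f", "ds", "f", "64"],
    blank + ["d", "43"],
    blank + ["s", "se"],
    blank + ["1", "gf"],
    blank + ["3", "s3"],
    ["as", "t", "r", "a", "2", "ds", "k", "s4"],
]
df = pd.DataFrame(rows, columns=list("ABCDEFGH"))

cols = list("ABCDEF")
# Propagate each group's A-F values down to its continuation rows.
df[cols] = df[cols].ffill()

# Index by the key columns plus G, then move G into the column headers.
wide = (
    df.set_index(cols + ["G"])["H"]
      .unstack()
      .reset_index()
      .rename_axis(None, axis=1)  # drop the leftover "G" label on the columns
)
print(wide)
```

Note that `unstack` sorts both the rows and the new columns, so the result has one row per A-F group but not in the order the groups appear in the input.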
If the column order matters, add reindex with the unique values of G:
# Take the order from the forward-filled (not yet reshaped) dataframe
s = df['G'].unique()
df2 = df.set_index(cols + ['G'])['H'].unstack().reindex(columns=s).reset_index().rename_axis(None, axis=1)
print(df2)
    A  B  C  D  E   F    f    g    x    r    d    s    1    3    k
0  as  t  r  a  2  ds  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN   s4
1  ds  a  s  d  f  ds   64  NaN  NaN  NaN   43   se   gf   s3  NaN
2  xx  s  1  d  f  df   54   g4   r4   43  NaN  NaN  NaN  NaN  NaN
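The output above keeps the column order, but the rows are still sorted by the key columns rather than by first appearance, unlike the layout requested in the question. One possible way to keep both orders is sketched below. It swaps in `DataFrame.pivot` with a list-valued `index` (assumed to be available, pandas 1.1 or later) in place of `set_index` plus `unstack`, and the names `groups`, `col_order`, `row_order`, and `wide` are illustrative rather than taken from the answer.

```python
import pandas as pd

cols = list("ABCDEF")

# Sample data from the question, written with A-F already forward filled.
groups = {
    ("xx", "s", "1", "d", "f", "df"): [("f", "54"), ("g", "g4"), ("x", "r4"), ("r", "43")],
    ("ds", "a", "s", "d", "f", "ds"): [("f", "64"), ("d", "43"), ("s", "se"), ("1", "gf"), ("3", "s3")],
    ("as", "t", "r", "a", "2", "ds"): [("k", "s4")],
}
records = [key + pair for key, pairs in groups.items() for pair in pairs]
df = pd.DataFrame(records, columns=cols + ["G", "H"])

# Remember the order of first appearance for the new columns and for the rows.
col_order = df["G"].unique()
row_order = pd.MultiIndex.from_frame(df[cols].drop_duplicates())

wide = (
    df.pivot(index=cols, columns="G", values="H")
      .reindex(index=row_order, columns=col_order)  # restore input order
      .reset_index()
      .rename_axis(None, axis=1)
)
print(wide)
```

Like `unstack`, `pivot` requires each combination of A-F and G to be unique. If the real data contains duplicates, an aggregation step would be needed first.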