How to convert pandas DataFrame columns using transpose?

Question

I have a DataFrame like this:

      A   B  C  D  E   F  G  H
     --------------------------
   0  xx  s  1  d  f  df  f 54
   1                      g g4
   2                      x r4
   3                      r 43
   4  ds  a  s  d  f  ds  f 64
   5                      d 43
   6                      s se
   7                      1 gf
   8                      3 s3
   9  as  t  r  a  2  ds  k s4

How can I reshape it into this format:

      A   B  C  D  E  F    f   g   x   r   d   s   1   3   k
     ---------------------------------------------------------
   0  xx  s  1  d  f  df  54  g4  r4  43
   1  ds  a  s  d  f  ds  64              43  se  gf  s3
   2  as  t  r  a  2  ds                                   s4

The real first DataFrame will contain more values than shown here.

Recommended answer

First replace the missing values in columns A-F by forward filling, then reshape with set_index followed by unstack:

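cols = list('ABCDEF')
# Fill the blank cells in A-F with the value from the row above
df[cols] = df[cols].ffill()

# Move A-F and G into the index, then pivot the G labels into columns.
# axis must be passed by keyword in recent pandas versions.
df1 = df.set_index(cols + ['G'])['H'].unstack().reset_index().rename_axis(None, axis=1)
print(df1)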
    A  B  C  D  E   F    1    3    d    f    g    k    r    s    x
0  as  t  r  a  2  ds  NaN  NaN  NaN  NaN  NaN   s4  NaN  NaN  NaN
1  ds  a  s  d  f  ds   gf   s3   43   64  NaN  NaN  NaN   se  NaN
2  xx  s  1  d  f  df  NaN  NaN  NaN   54   g4  NaN   43  NaN   r4
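
For anyone who wants to run the steps above end to end, here is a minimal sketch that rebuilds the question's table by hand. It assumes the blank cells are NaN and that every value is a string; the names `blank`, `data` and `wide` are only illustrative. If the blanks were read in as empty strings instead, the `replace` call converts them before forward filling.

```python
import numpy as np
import pandas as pd

# Sample data typed in from the question; blank cells are assumed to be NaN.
blank = [np.nan] * 6
data = [
    ["xx", "s", "1", "d", "f", "df", "f", "54"],
    blank + ["g", "g4"],
    blank + ["x", "r4"],
    blank + ["r", "43"],
    ["ds", "a", "s", "d", "f", "ds", "f", "64"],
    blank + ["d", "43"],
    blank + ["s", "se"],
    blank + ["1", "gf"],
    blank + ["3", "s3"],
    ["as", "t", "r", "a", "2", "ds", "k", "s4"],
]
df = pd.DataFrame(data, columns=list("ABCDEFGH"))

cols = list("ABCDEF")
# Turn empty strings (if any) into NaN, then copy each value down its group
df[cols] = df[cols].replace("", np.nan).ffill()

wide = (
    df.set_index(cols + ["G"])["H"]  # A-F plus G become the index
      .unstack()                     # innermost level (G) becomes the columns
      .reset_index()                 # A-F back to ordinary columns
      .rename_axis(None, axis=1)     # drop the leftover "G" columns name
)
print(wide)
```

Note that unstack sorts both the rows and the new columns, which is why the result starts with the `as` row rather than `xx`.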

If the column order matters, add reindex using the unique values of G:

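# Unique G labels in order of first appearance, taken from the long
# (forward-filled, not yet reshaped) frame
s = df['G'].unique()
df2 = df.set_index(cols + ['G'])['H'].unstack().reindex(columns=s).reset_index().rename_axis(None, axis=1)
print(df2)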
    A  B  C  D  E   F    f    g    x    r    d    s    1    3    k
0  as  t  r  a  2  ds  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN   s4
1  ds  a  s  d  f  ds   64  NaN  NaN  NaN   43   se   gf   s3  NaN
2  xx  s  1  d  f  df   54   g4   r4   43  NaN  NaN  NaN  NaN  NaN
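
The output above still lists the rows in sorted order (as, ds, xx), while the format asked for in the question starts with xx. One possible way to keep the original row order as well is to reindex the rows by the unique key combinations. This is a sketch on a smaller, hypothetical frame with only two key columns; `keys`, `row_order` and `col_order` are illustrative names, not part of the original answer.

```python
import numpy as np
import pandas as pd

# Smaller hypothetical frame with the same shape as the question's data
df = pd.DataFrame({
    "A": ["xx", np.nan, "ds", np.nan, "as"],
    "B": ["s", np.nan, "a", np.nan, "t"],
    "G": ["f", "g", "f", "d", "k"],
    "H": ["54", "g4", "64", "43", "s4"],
})
keys = ["A", "B"]
df[keys] = df[keys].ffill()

# Orders of first appearance for the new columns and for the row groups
col_order = df["G"].unique()
row_order = pd.MultiIndex.from_frame(df[keys].drop_duplicates())

wide = (
    df.set_index(keys + ["G"])["H"]
      .unstack()
      .reindex(index=row_order, columns=col_order)  # undo unstack's sorting
      .reset_index()
      .rename_axis(None, axis=1)
)
print(wide)
```

The same idea should carry over to the full A-F key list by setting `keys = list('ABCDEF')`.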
