Pandas: pivot table or group by?

Question

I have a hopefully straightforward question that has been giving me a lot of difficulty for the last 3 hours. It should be easy.

Here is the challenge.

I have a pandas DataFrame:

+--------------------------+
|     Col 'X'    Col 'Y'  |
+--------------------------+
|     class 1      cat 1  |
|     class 2      cat 1  |
|     class 3      cat 2  |
|     class 2      cat 3  |
+--------------------------+

This is what I want to transform the DataFrame into:

+------------------------------------------+
|                  cat 1    cat 2    cat 3 |
+------------------------------------------+
|     class 1         1        0        0  |
|     class 2         1        0        1  |
|     class 3         0        1        0  |
+------------------------------------------+

where the values are value counts. Does anybody have any insight? Thanks!

Recommended answer

Here are a couple of ways to reshape your data df:

In [27]: df
Out[27]:
     Col X  Col Y
0  class 1  cat 1
1  class 2  cat 1
2  class 3  cat 2
3  class 2  cat 3
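
To try the approaches in this answer, the frame first has to exist. A minimal sketch for rebuilding it is shown here. It assumes the column names are exactly 'Col X' and 'Col Y', as printed in the output above.

```python
import pandas as pd

# Rebuild the example frame; column names are taken from the printed output.
df = pd.DataFrame({
    "Col X": ["class 1", "class 2", "class 3", "class 2"],
    "Col Y": ["cat 1", "cat 1", "cat 2", "cat 3"],
})
print(df)
```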

1) Using pd.crosstab()

In [28]: pd.crosstab(df['Col X'], df['Col Y'])
Out[28]:
Col Y    cat 1  cat 2  cat 3
Col X
class 1      1      0      0
class 2      1      0      1
class 3      0      1      0

2) Or, use groupby on 'Col X' and 'Col Y', then unstack over Col Y, passing fill_value=0 so that missing combinations become zeros instead of NaN.

In [29]: df.groupby(['Col X','Col Y']).size().unstack('Col Y', fill_value=0)
Out[29]:
Col Y    cat 1  cat 2  cat 3
Col X
class 1      1      0      0
class 2      1      0      1
class 3      0      1      0
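
Approaches 1 and 2 should produce the same table. The sketch below is one way to check that on the example data. The variable names by_crosstab, by_groupby and same are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Col X": ["class 1", "class 2", "class 3", "class 2"],
    "Col Y": ["cat 1", "cat 1", "cat 2", "cat 3"],
})

by_crosstab = pd.crosstab(df["Col X"], df["Col Y"])
by_groupby = df.groupby(["Col X", "Col Y"]).size().unstack("Col Y", fill_value=0)

# Compare the row labels, the column labels, and the counts themselves.
same = (
    by_crosstab.index.equals(by_groupby.index)
    and by_crosstab.columns.equals(by_groupby.columns)
    and (by_crosstab.values == by_groupby.values).all()
)
print(same)
```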

3) Or, use pd.pivot_table() with index=Col X, columns=Col Y, and aggfunc=len

In [30]: pd.pivot_table(df, index=['Col X'], columns=['Col Y'], aggfunc=len, fill_value=0)
Out[30]:
Col Y    cat 1  cat 2  cat 3
Col X
class 1      1      0      0
class 2      1      0      1
class 3      0      1      0
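
Depending on the pandas version, the call above may have no value column left to aggregate once both columns are used as keys, and the result may then come back empty. A hedged variant is to give pivot_table an explicit column to count. Here 'n' is a hypothetical helper column added only for this sketch and is not part of the original data.

```python
import pandas as pd

df = pd.DataFrame({
    "Col X": ["class 1", "class 2", "class 3", "class 2"],
    "Col Y": ["cat 1", "cat 1", "cat 2", "cat 3"],
})

# 'n' is a hypothetical helper column: one entry per row to be counted.
counts = pd.pivot_table(
    df.assign(n=1),
    index="Col X",
    columns="Col Y",
    values="n",
    aggfunc="count",
    fill_value=0,
)
print(counts)
```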

4) Or, use set_index and unstack

In [492]: df.assign(v=1).set_index(['Col X', 'Col Y'])['v'].unstack(fill_value=0)
Out[492]:
Col Y    cat 1  cat 2  cat 3
Col X
class 1      1      0      0
class 2      1      0      1
class 3      0      1      0
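
One caveat with approach 4: unstack cannot reshape an index with repeated entries, so it only works when each ('Col X', 'Col Y') pair appears at most once. The sketch below uses a hypothetical variant of the data with one repeated pair to show the difference from a counting approach such as pd.crosstab().

```python
import pandas as pd

# Hypothetical data: the ('class 2', 'cat 1') pair appears twice.
dup = pd.DataFrame({
    "Col X": ["class 1", "class 2", "class 2", "class 3"],
    "Col Y": ["cat 1", "cat 1", "cat 1", "cat 2"],
})

try:
    dup.assign(v=1).set_index(["Col X", "Col Y"])["v"].unstack(fill_value=0)
    reshaped = True
except ValueError:
    # unstack refuses to put two values into the same cell.
    reshaped = False

# A counting approach aggregates the repeated pair instead.
counts = pd.crosstab(dup["Col X"], dup["Col Y"])
print(reshaped)
print(counts)
```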
