This post covers how to build a pivot table from a pandas DataFrame.

Question

I have a DataFrame similar to this:

id    name    value
a     Adam    5
b     Eve     6
c     Adam    4
a     Eve     3
d     Seth    2
b     Adam    4
a     Adam    2

I am trying to see how many ids are associated with how many names, and the overlap between them. I did a groupby on the id column, which shows how many ids have a given number of distinct names associated with them.

df.groupby('id')['name'].nunique().value_counts()
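To make the line above reproducible, here is a minimal sketch that rebuilds the sample frame and splits the groupby into two steps. The variable names `names_per_id` and `name_count_distribution` are illustrative and do not come from the original question.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "id": ["a", "b", "c", "a", "d", "b", "a"],
    "name": ["Adam", "Eve", "Adam", "Eve", "Seth", "Adam", "Adam"],
    "value": [5, 6, 4, 3, 2, 4, 2],
})

# Number of distinct names seen for each id.
names_per_id = df.groupby("id")["name"].nunique()

# How many ids have 1 distinct name, how many have 2, and so on.
name_count_distribution = names_per_id.value_counts()

print(names_per_id)
print(name_count_distribution)
```

This tells you how many names each id has, but not which names or what their values sum to, which is what the rest of the question asks about.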

What I would like now is a table where the names are the column names, the index is the id, and each value is the sum of `value` for that id and name. I could do it with a for loop, by initializing a DataFrame whose columns are the values in the name column, but I am wondering whether there is a pandas way of accomplishing this.

Recommended answer

Is this what you want?

In [54]: df.pivot_table(index='id', columns='name', values='value', aggfunc='sum')
Out[54]:
name  Adam  Eve  Seth
id
a      7.0  3.0   NaN
b      4.0  6.0   NaN
c      4.0  NaN   NaN
d      NaN  NaN   2.0

Or without NaN:

In [56]: df.pivot_table(index='id', columns='name', values='value', aggfunc='sum', fill_value=0)
Out[56]:
name  Adam  Eve  Seth
id
a        7    3     0
b        4    6     0
c        4    0     0
d        0    0     2
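For anyone who wants to run this end to end, the following is a self-contained sketch that rebuilds the sample data and applies the same `pivot_table` call. It also shows a `groupby` plus `unstack` variant, which is not part of the original answer and is included only as a possible alternative that should give the same numbers on this data.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "id": ["a", "b", "c", "a", "d", "b", "a"],
    "name": ["Adam", "Eve", "Adam", "Eve", "Seth", "Adam", "Adam"],
    "value": [5, 6, 4, 3, 2, 4, 2],
})

# Same call as in the answer: one row per id, one column per name,
# summed values, and 0 where an (id, name) pair never occurs.
pivoted = df.pivot_table(
    index="id", columns="name", values="value", aggfunc="sum", fill_value=0
)

# Alternative (not from the original answer): sum per (id, name) pair,
# then move the name level of the index into the columns.
alternative = df.groupby(["id", "name"])["value"].sum().unstack(fill_value=0)

print(pivoted)
print(alternative)
```

Without `fill_value=0`, missing pairs show up as NaN, which is why the first output in the answer displays floats instead of integers.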

