Problem description
I have a DataFrame similar to this:
id name value
a Adam 5
b Eve 6
c Adam 4
a Eve 3
d Seth 2
b Adam 4
a Adam 2
I am trying to see how many ids are associated with how many names, and the overlap between them. I did a groupby on the id column, which shows how many ids have how many names associated with them.
df.groupby('id')['name'].nunique().value_counts()
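For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the sample table above and runs the same groupby. The intermediate variable names (names_per_id, counts) are only illustrative and are not part of the original post.

```python
import pandas as pd

# Rebuild the sample data shown in the question.
df = pd.DataFrame({
    "id": ["a", "b", "c", "a", "d", "b", "a"],
    "name": ["Adam", "Eve", "Adam", "Eve", "Seth", "Adam", "Adam"],
    "value": [5, 6, 4, 3, 2, 4, 2],
})

# Number of distinct names seen for each id.
names_per_id = df.groupby("id")["name"].nunique()

# How many ids have 1 distinct name, how many have 2, and so on.
counts = names_per_id.value_counts()
print(counts)
```

With this sample, ids a and b each have two distinct names, while c and d have one each.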
What I would like now is a table where the names are the column names, the index is the id, and each value is the sum of value for that id and name. I could build it with a for loop, initializing a DataFrame whose columns are the values in the name column, but I am wondering whether there is a pandas way to do this?
Recommended answer
Is this what you want?
In [54]: df.pivot_table(index='id', columns='name', values='value', aggfunc='sum')
Out[54]:
name Adam Eve Seth
id
a 7.0 3.0 NaN
b 4.0 6.0 NaN
c 4.0 NaN NaN
d NaN NaN 2.0
Or without NaN:
In [56]: df.pivot_table(index='id', columns='name', values='value', aggfunc='sum', fill_value=0)
Out[56]:
name Adam Eve Seth
id
a 7 3 0
b 4 6 0
c 4 0 0
d 0 0 2
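As a self-contained sketch, the pivot can be run end to end on the sample data from the question. The groupby plus unstack variant shown next to it is an alternative added here for comparison, not part of the original answer; it should produce the same table for this data.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "id": ["a", "b", "c", "a", "d", "b", "a"],
    "name": ["Adam", "Eve", "Adam", "Eve", "Seth", "Adam", "Adam"],
    "value": [5, 6, 4, 3, 2, 4, 2],
})

# Sum of value for each (id, name) pair; missing pairs become 0.
table = df.pivot_table(index="id", columns="name", values="value",
                       aggfunc="sum", fill_value=0)

# Alternative formulation: group on both keys, sum, then move
# the name level of the index into the columns.
alt = df.groupby(["id", "name"])["value"].sum().unstack(fill_value=0)

print(table)
print(alt)
```

pivot_table reads closer to the stated goal (index, columns, values), while the groupby form can be handier when the grouped sums are needed for other steps as well.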