Problem description
I have a DataFrame:
import pandas as pd
df = pd.DataFrame([["foo","fizz",1],["foo","fizz",2],["foo","buzz",3],["foo","buzz",4],["bar","fizz",6],["bar","buzz",8]],columns=["a","b","c"])
     a     b  c
0  foo  fizz  1
1  foo  fizz  2
2  foo  buzz  3
3  foo  buzz  4
4  bar  fizz  6
5  bar  buzz  8
I can group it:
df2 = df.groupby(["a","b"]).sum()
          c
a   b
bar buzz  8
    fizz  6
foo buzz  7
    fizz  3
Which is awesome! But what I really need, instead of the "c" column, is two columns, "foo" and "bar":
      foo  bar
b
buzz    7    8
fizz    3    6
Can someone suggest a way to do this? I tried searching, but I guess I don't have the correct terminology for this, so I couldn't find anything.
Recommended answer
You can use unstack for this:
df2.unstack(level='a')
Example:
In [146]: df2.unstack(level='a')
Out[146]:
       c
a    bar foo
b
buzz   8   7
fizz   6   3
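The steps so far can be reproduced end to end with the short sketch below. It uses the same data as the question; the variable name `wide` is only an illustrative choice and does not appear in the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    [["foo", "fizz", 1], ["foo", "fizz", 2], ["foo", "buzz", 3],
     ["foo", "buzz", 4], ["bar", "fizz", 6], ["bar", "buzz", 8]],
    columns=["a", "b", "c"],
)

# Group by both keys; the result has a two-level (a, b) row index.
df2 = df.groupby(["a", "b"]).sum()

# Move index level "a" into the columns; "b" stays as the row index.
# "wide" is a hypothetical name used only in this sketch.
wide = df2.unstack(level="a")

print(wide)
# The column labels are now tuples such as ("c", "bar"), i.e. a MultiIndex.
print(wide.columns.tolist())
```

Because df2 still carries the column "c", the unstacked columns end up with two levels: the original column name and the values of "a".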
After that you'll get MultiIndex columns. If you need a flat DataFrame, you can use the MultiIndex's droplevel:
df3 = df2.unstack(level='a')
df3.columns = df3.columns.droplevel()
In [177]: df3
Out[177]:
a     bar  foo
b
buzz    8    7
fizz    6    3
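As a possible alternative (not part of the original answer), selecting the single column "c" before summing gives a Series, so unstack yields flat columns directly and droplevel is not needed. The sketch below also reorders the columns to match the layout asked for in the question; the name `flat` is hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    [["foo", "fizz", 1], ["foo", "fizz", 2], ["foo", "buzz", 3],
     ["foo", "buzz", 4], ["bar", "fizz", 6], ["bar", "buzz", 8]],
    columns=["a", "b", "c"],
)

# Selecting "c" first makes the grouped sum a Series indexed by (a, b),
# so unstacking "a" produces ordinary single-level columns.
flat = df.groupby(["a", "b"])["c"].sum().unstack("a")

# Optional: put the columns in the order shown in the question
# and remove the leftover "a" label on the column axis.
flat = flat[["foo", "bar"]]
flat.columns.name = None

print(flat)
```

Whether this or unstack followed by droplevel is preferable likely depends on whether the grouped frame has other value columns that should be kept.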
Edit
droplevel drops a level from the MultiIndex that your columns become after unstack. By default it drops level 0, which is what you need for this DataFrame.
Copied from help(pd.MultiIndex.droplevel):
droplevel(self, level=0)
Return Index with requested level removed. If MultiIndex has only 2 levels, the result will be of Index type not MultiIndex.
Parameters
----------
level : int/level name or list thereof
Notes
-----
Does not check if result index is unique or not
Returns
-------
index : Index or MultiIndex
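To see the documented behaviour in isolation, here is a small sketch that builds a two-level MultiIndex by hand, mirroring the columns produced by unstack above. The names `cols`, `by_default`, and `by_name` are illustrative only.

```python
import pandas as pd

# Two-level column index like the one unstack produced:
# outer level holds "c", inner level (named "a") holds "bar"/"foo".
cols = pd.MultiIndex.from_tuples(
    [("c", "bar"), ("c", "foo")], names=[None, "a"]
)

# Default level=0 removes the outer "c" level. With only two levels,
# the result is a plain Index rather than a MultiIndex.
by_default = cols.droplevel()

# A level can also be dropped by name; here the inner "a" level goes away.
by_name = cols.droplevel("a")

print(by_default)
print(by_name)
```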