This article covers how to create one dataframe from another using pivot.

Problem description

I'm having a problem with pandas. I have a dataframe with three columns: 'id1','id2','amount'.

From this, I would like to create another dataframe whose index is 'id1', whose columns are the 'id2' values, and whose cells contain the corresponding 'amount'.

Let's go for an example:

import pandas as pd

df = pd.DataFrame(
    [['first_person', 'first_item', 10],
     ['first_person', 'second_item', 6],
     ['second_person', 'first_item', 18],
     ['second_person', 'second_item', 36]],
    columns=['id1', 'id2', 'amount'])

which yields:

     id1              id2             amount
0    first_person     first_item      10
1    first_person     second_item     6
2    second_person    first_item      18
3    second_person    second_item     36

And from this I would like to create a second dataframe which is:

                 first_item    second_item
first_person     10            6
second_person    18            36

Before posting I worked on this for a while, but all I've managed to come up with is a double 'for' loop, which is far too slow for the size of my dataframes. Would you know how to do this in a more pythonic way (one that would presumably be much more efficient than 'for' loops)?

Solution

I think you can use pivot with rename_axis (new in pandas 0.18.0):

print(df)
             id1          id2  amount
0   first_person   first_item      10
1   first_person  second_item       6
2  second_person   first_item      18
3  second_person  second_item      36

# Wrap the chain in parentheses so it can span several lines.
print(df.pivot(index='id1', columns='id2', values='amount')
        .rename_axis(None)
        .rename_axis(None, axis=1))

               first_item  second_item
first_person           10            6
second_person          18           36
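
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same pivot on the sample data from the question. The names `wide`, `dup` and `summed` are just illustrative. The second half is an assumption that goes beyond the question: if an ('id1', 'id2') pair could appear more than once, pivot raises a ValueError, and pivot_table with an aggregation (sum here, picked arbitrarily) is one possible alternative.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame(
    [['first_person', 'first_item', 10],
     ['first_person', 'second_item', 6],
     ['second_person', 'first_item', 18],
     ['second_person', 'second_item', 36]],
    columns=['id1', 'id2', 'amount'],
)

# Reshape long -> wide, then drop the leftover axis names ('id1' and 'id2').
wide = (
    df.pivot(index='id1', columns='id2', values='amount')
      .rename_axis(None)
      .rename_axis(None, axis=1)
)
print(wide)

# Assumption (not part of the question): duplicated (id1, id2) pairs.
# Here the first row is repeated on purpose to create such a duplicate.
dup = pd.concat([df, df.iloc[[0]]], ignore_index=True)

# pivot would raise ValueError on `dup`; pivot_table aggregates instead.
summed = dup.pivot_table(index='id1', columns='id2',
                         values='amount', aggfunc='sum')
print(summed)
```

Whether sum is the right aggregation depends on what duplicated rows mean in your data, so treat that part as a starting point only.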


08-11 13:49