如何从pandas groupby().sum()的输出中创建新列?

本文介绍了如何从pandas groupby().sum()的输出中创建新列?的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

尝试根据groupby计算创建新列.在下面的代码中，我获得了每个日期的正确计算值(请参阅下面的组)，但是当我尝试用它创建一个新列(df['Data4'])时，我得到了NaN.因此，我试图在数据框中为所有日期创建一个总和为Data3的新列，并将其应用于每个日期行.例如，2015-05-08位于2行中(总计为50 + 5 = 55)，在此新列中，我希望两行中均具有55.

Trying to create a new column from the groupby calculation. In the code below, I get the correct calculated values for each date (see group below) but when I try to create a new column (df['Data4']) with it I get NaN. So I am trying to create a new column in the dataframe with the sum of Data3 for the all dates and apply that to each date row. For example, 2015-05-08 is in 2 rows (total is 50+5 = 55) and in this new column I would like to have 55 in both of the rows.

import pandas as pd
import numpy as np
from pandas import DataFrame

df = pd.DataFrame({
    'Date' : ['2015-05-08', '2015-05-07', '2015-05-06', '2015-05-05', '2015-05-08', '2015-05-07', '2015-05-06', '2015-05-05'],
    'Sym'  : ['aapl', 'aapl', 'aapl', 'aapl', 'aaww', 'aaww', 'aaww', 'aaww'],
    'Data2': [11, 8, 10, 15, 110, 60, 100, 40],
    'Data3': [5, 8, 6, 1, 50, 100, 60, 120]
})

group = df['Data3'].groupby(df['Date']).sum()

df['Data4'] = group

推荐答案

您要使用 transform ，这将返回一个序列，其索引与df对齐，因此您可以将其添加为新列:

You want to use transform this will return a Series with the index aligned to the df so you can then add it as a new column:

In [74]:

df = pd.DataFrame({'Date': ['2015-05-08', '2015-05-07', '2015-05-06', '2015-05-05', '2015-05-08', '2015-05-07', '2015-05-06', '2015-05-05'], 'Sym': ['aapl', 'aapl', 'aapl', 'aapl', 'aaww', 'aaww', 'aaww', 'aaww'], 'Data2': [11, 8, 10, 15, 110, 60, 100, 40],'Data3': [5, 8, 6, 1, 50, 100, 60, 120]})

df['Data4'] = df['Data3'].groupby(df['Date']).transform('sum')
df
Out[74]:
   Data2  Data3        Date   Sym  Data4
0     11      5  2015-05-08  aapl     55
1      8      8  2015-05-07  aapl    108
2     10      6  2015-05-06  aapl     66
3     15      1  2015-05-05  aapl    121
4    110     50  2015-05-08  aaww     55
5     60    100  2015-05-07  aaww    108
6    100     60  2015-05-06  aaww     66
7     40    120  2015-05-05  aaww    121

这篇关于如何从pandas groupby().sum()的输出中创建新列?的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持！