Problem description
I had asked this question before: "python pandas: applying different aggregate functions to different columns", but the latest changes to pandas (https://github.com/pandas-dev/pandas/pull/15931) mean that what I thought was an elegant and pythonic solution is deprecated, for reasons I genuinely fail to understand.
The question was, and still is: when doing a groupby, how can I apply different aggregate functions to different fields (e.g. sum of x, avg of x, min of y, max of z, etc.) and rename the resulting fields, all in one go, or at least in a possibly pythonic and not-too-cumbersome way? I.e. sum_x won't do, I need to rename the fields explicitly.
This approach, which I liked:
df.groupby('qtr').agg({"realgdp": {"mean_gdp": "mean", "std_gdp": "std"},
"unemp": {"mean_unemp": "mean"}})
will be deprecated and now produces this warning:
FutureWarning: using a dict with renaming is deprecated and will be removed in a future version
Thanks!
Solution
agg() itself is not deprecated, but renaming with agg() via a nested dict is.
Do go through the documentation: https://pandas.pydata.org/pandas-docs/stable/whatsnew.html#deprecate-groupby-agg-with-a-dictionary-when-renaming
What is deprecated:
1. Passing a dict to a grouped/rolled/resampled Series, which allowed one to rename the resulting aggregation.
2. Passing a dict-of-dicts to a grouped/rolled/resampled DataFrame.
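This will work, though it's not a single line of code:
# Assign the result: groupby().agg() returns a new DataFrame and does not modify df
df = df.groupby('qtr').agg({"realgdp": ["mean", "std"], "unemp": "mean"})
# Flatten the two-level column index into names such as 'realgdp_mean'
df.columns = df.columns.map('_'.join)
df = df.rename(columns={'realgdp_mean': 'mean_gdp', 'realgdp_std': 'std_gdp', 'unemp_mean': 'mean_unemp'})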
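For what it's worth, pandas 0.25 and later also offer "named aggregation", which may get closer to the one-step rename the question asks for. The sketch below is only an illustration: the sample values are made up, and it assumes a pandas version recent enough to support the `new_name=(column, function)` keyword form.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "qtr": [1, 1, 2, 2],
    "realgdp": [100.0, 110.0, 120.0, 140.0],
    "unemp": [5.0, 6.0, 7.0, 9.0],
})

# Named aggregation (assumes pandas >= 0.25):
# each keyword is the output column name, each value is (source column, function).
result = df.groupby("qtr").agg(
    mean_gdp=("realgdp", "mean"),
    std_gdp=("realgdp", "std"),
    mean_unemp=("unemp", "mean"),
)
print(result)
```

This yields flat, already-renamed columns, so neither the column flattening nor the separate rename step is needed.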