pandas: group by one column and keep the rows with the maximum date in another column
Problem description
I have a DataFrame with the following data:
invoice_no  dealer  billing_change_previous_month  date
110         1       0                              2016-12-31
100         1       -41981                         2017-01-30
5505        2       0                              2017-01-30
5635        2       58730                          2016-12-31
I want to keep only one row per dealer: the row with that dealer's maximum date. The desired output should look like this:
invoice_no  dealer  billing_change_previous_month  date
100         1       -41981                         2017-01-30
5505        2       0                              2017-01-30
Each dealer should appear only once, with its maximum date. Thanks in advance for your help.
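Recommended answer
You can use boolean indexing with groupby and transform. transform('max') returns each dealer's maximum date repeated on every row of that dealer, so comparing it with the date column gives a mask that selects the latest rows: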
df_new = df[df.groupby('dealer').date.transform('max') == df['date']]
   invoice_no  dealer  billing_change_previous_month       date
1         100       1                         -41981 2017-01-30
2        5505       2                              0 2017-01-30
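The one-liner above can be unpacked into separate steps to make the mechanism visible. This is a minimal sketch, not the answerer's exact code: the DataFrame constructor call and the intermediate names `latest` and `mask` are assumptions added for illustration, since the question only shows the printed table.

```python
import pandas as pd

# Data copied from the question; the constructor call is an assumption,
# because the question only shows the printed table.
df = pd.DataFrame({
    'invoice_no': [110, 100, 5505, 5635],
    'dealer': [1, 1, 2, 2],
    'billing_change_previous_month': [0, -41981, 0, 58730],
    'date': pd.to_datetime(['2016-12-31', '2017-01-30',
                            '2017-01-30', '2016-12-31']),
})

# transform('max') returns a Series aligned with df:
# every row receives the latest date of its own dealer.
latest = df.groupby('dealer')['date'].transform('max')

# Keep the rows whose own date equals that per-dealer maximum.
mask = df['date'] == latest
df_new = df[mask]
print(df_new)
```

Because `latest` has the same index as `df`, the comparison is row by row and no merge or join is needed.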
The same expression works when there are more than two dealers:
import pandas as pd

df = pd.DataFrame({'invoice_no':[110,100,5505,5635,10000,10001], 'dealer':[1,1,2,2,3,3],'billing_change_previous_month':[0,-41981,0,58730,9000,100], 'date':['2016-12-31','2017-01-30','2017-01-30','2016-12-31', '2019-12-31', '2020-01-31']})
df['date'] = pd.to_datetime(df['date'])  # parse the strings so that max compares real dates
df[df.groupby('dealer')['date'].transform('max') == df['date']]
   invoice_no  dealer  billing_change_previous_month       date
1         100       1                         -41981 2017-01-30
2        5505       2                              0 2017-01-30
5       10001       3                            100 2020-01-31
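One caveat with the mask approach: if a dealer has several rows sharing its maximum date, all of them are kept. When exactly one row per dealer is required, as the question asks, a possible variant is to select by `idxmax`, which returns the index label of the first maximum in each group. The sketch below is an alternative, not part of the original answer, and the extra invoice 5636 is invented purely to create a tie.

```python
import pandas as pd

# Invoice 5636 is a made-up row that ties with 5505 on dealer 2's latest date.
df = pd.DataFrame({
    'invoice_no': [110, 100, 5505, 5635, 5636],
    'dealer': [1, 1, 2, 2, 2],
    'billing_change_previous_month': [0, -41981, 0, 58730, 10],
    'date': pd.to_datetime(['2016-12-31', '2017-01-30', '2017-01-30',
                            '2016-12-31', '2017-01-30']),
})

# The transform mask keeps every row that ties for the maximum date.
tied = df[df.groupby('dealer')['date'].transform('max') == df['date']]

# idxmax gives one index label per dealer (the first row holding the maximum),
# so .loc returns exactly one row for each dealer.
one_per_dealer = df.loc[df.groupby('dealer')['date'].idxmax()]
print(one_per_dealer)
```

Which of the tied rows should win is a business decision; `idxmax` simply takes the first one in the existing row order, so sort the frame beforehand if another rule applies.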