This post shows how to group by a minimum value and fill NA entries with values from another column.
Problem description
I have a sample data frame that looks like this.
import pandas as pd
df = pd.DataFrame(data={'uid': [1, 1, 1, 2, 2, 3],
                        'pagename': ['home', 'blah', 'blah', 'home', 'blah', 'blah'],
                        'startpage': ['NA', 'NA', 'NA', 'home', 'home', 'blah'],
                        'date_time': [0, 1, 2, 5, 9, 1]})
What I want to do is group by the UID and find the min date_time. If the startpage of the min date_time is Null (I put string 'NA' for Null) then I want to use the pagename from that row to populate the startpage column. I also want the startpage to be populated for all rows with the same UID.
This is the ending dataframe that I want.
df = pd.DataFrame(data={'uid': [1, 1, 1, 2, 2, 3],
                        'pagename': ['home', 'blah', 'blah', 'home', 'blah', 'blah'],
                        'startpage': ['home', 'home', 'home', 'home', 'home', 'blah'],
                        'date_time': [0, 1, 2, 5, 9, 1]})
Recommended answer
fillna with transform
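# Index label of the earliest row in each uid group, repeated for every row of that group
i = df.groupby('uid').date_time.transform('idxmin')
# 'NA' is a literal string here, so mask it to real missing values before calling fillna
df['startpage'] = df.startpage.mask(df.startpage.eq('NA')).fillna(i.map(df.pagename))
print(df)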
   date_time pagename startpage  uid
0          0     home      home    1
1          1     blah      home    1
2          2     blah      home    1
3          5     home      home    2
4          9     blah      home    2
5          1     blah      blah    3
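To see what the one-liner does, it may help to split it into its intermediate steps. The sketch below is not part of the original answer, and the names `first_idx`, `first_page`, `first_start` and `filled` are made up for illustration. It also takes a slightly stricter reading of the question: the group is rewritten only when the earliest row's own startpage is the 'NA' placeholder, and then every row of that uid receives the earliest row's pagename.

```python
import pandas as pd

df = pd.DataFrame({
    'uid': [1, 1, 1, 2, 2, 3],
    'pagename': ['home', 'blah', 'blah', 'home', 'blah', 'blah'],
    'startpage': ['NA', 'NA', 'NA', 'home', 'home', 'blah'],
    'date_time': [0, 1, 2, 5, 9, 1],
})

# For every row, the index label of the earliest (min date_time) row in its uid group.
first_idx = df.groupby('uid')['date_time'].transform('idxmin')

# Look up the pagename and startpage of that earliest row, aligned to every row.
first_page = first_idx.map(df['pagename'])
first_start = first_idx.map(df['startpage'])

# Assumption: 'NA' is the string placeholder used in the question, not a real NaN.
# Where the earliest row's startpage is 'NA', use its pagename for the whole group;
# all other rows keep their existing startpage.
filled = df['startpage'].mask(first_start.eq('NA'), first_page)

print(df.assign(startpage=filled))
```

On the sample data both readings give the same result. They would differ only if a group mixed 'NA' and non-'NA' startpage values.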