This post covers how to group rows by a key, find the row with the minimum value in each group, and fill NA entries from another column.

Problem description

I have a sample data frame that looks like this.

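import pandas as pd
# Note: 'NA' in startpage is a literal string, not a real missing value.
df = pd.DataFrame(data={'uid': [1, 1, 1, 2, 2, 3],
    'pagename': ['home', 'blah', 'blah', 'home', 'blah', 'blah'],
    'startpage': ['NA', 'NA', 'NA', 'home', 'home', 'blah'],
    'date_time': [0, 1, 2, 5, 9, 1]})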

What I want to do is group by uid and find the row with the minimum date_time. If the startpage of that row is null (I used the string 'NA' to stand for null), I want to use the pagename from that row to populate the startpage column. I also want startpage to be populated for all rows with the same uid.

This is the resulting data frame that I want.

df = pd.DataFrame(data={'uid': [1, 1, 1, 2, 2, 3],
    'pagename': ['home', 'blah', 'blah', 'home', 'blah', 'blah'],
    'startpage': ['home', 'home', 'home', 'home', 'home', 'blah'],
    'date_time': [0, 1, 2, 5, 9, 1]})

Recommended answer

fillna + transform

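# Index label of the row with the smallest date_time in each uid group, repeated for every row.
i = df.groupby('uid').date_time.transform('idxmin')
# startpage holds the string 'NA', which fillna ignores, so mask it to real NaN first.
df.startpage = df.startpage.mask(df.startpage.eq('NA')).fillna(i.map(df.pagename))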

print(df)

   date_time pagename startpage  uid
0          0     home      home    1
1          1     blah      home    1
2          2     blah      home    1
3          5     home      home    2
4          9     blah      home    2
5          1     blah      blah    3
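
The answer above fills each missing startpage row by row. As a variation, here is a minimal sketch that follows the rule in the question more literally: it looks at the earliest row of each uid and, only when that row's startpage is the 'NA' placeholder, overwrites startpage for the whole group with that row's pagename. The names first_idx, first_start, first_page, needs_fill and result are illustrative and not part of the original answer, and the sketch assumes missing values are still encoded as the string 'NA'.

```python
import pandas as pd

df = pd.DataFrame({
    'uid': [1, 1, 1, 2, 2, 3],
    'pagename': ['home', 'blah', 'blah', 'home', 'blah', 'blah'],
    'startpage': ['NA', 'NA', 'NA', 'home', 'home', 'blah'],
    'date_time': [0, 1, 2, 5, 9, 1],
})

# Index label of the earliest row in each uid group, repeated for every row of that group.
first_idx = df.groupby('uid')['date_time'].transform('idxmin')

# startpage and pagename of that earliest row, looked up by index label.
first_start = first_idx.map(df['startpage'])
first_page = first_idx.map(df['pagename'])

# True for every row whose group's earliest row has the 'NA' placeholder.
needs_fill = first_start.eq('NA')

# Overwrite startpage with the earliest row's pagename only in those groups.
result = df.assign(startpage=df['startpage'].mask(needs_fill, first_page))
print(result)
```

On the sample data both approaches give the same startpage column. They would differ if a group had a real startpage on its earliest row but 'NA' on a later one: this sketch leaves such later rows untouched, while the fillna version fills them.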
