What is a better way to chain multiple combine_first() statements?


I have parsed some data and ended up with 3 different columns for cc-email. This works, but is there a cleaner way?

df['cc-email2'] = df['cc-email'].combine_first(
df['cc-email_cc-email'].combine_first(
df['cc-emails_cc-email']))


For example:

import numpy as np
import pandas as pd

df = pd.DataFrame([])
df['cc-email'] = ('[email protected]', np.nan, np.nan, np.nan)
df['cc-email_cc-email'] = (np.nan, '[email protected]', np.nan, np.nan)
df['cc-emails_cc-email'] = ('[email protected]', np.nan, np.nan, '[email protected]')


Resulting df:

     cc-email           cc-email_cc-email   cc-emails_cc-email    cc-email2
0    [email protected]  NaN                 [email protected]     [email protected]
1    NaN                [email protected]   NaN                   [email protected]
2    NaN                NaN                 NaN                   NaN
3    NaN                NaN                 [email protected]     [email protected]

Best answer

I think you can use reduce:

from functools import reduce

dfs = [df['cc-email'], df['cc-email_cc-email'], df['cc-emails_cc-email']]
df['cc-email2'] = reduce(lambda l,r: l.combine_first(r), dfs)
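To check that the reduce version matches the nested call from the question, here is a minimal self-contained sketch. The example.com addresses are hypothetical placeholders (the real ones are obscured in the post), and the names cols, chained and nested are introduced only for illustration.

```python
from functools import reduce

import numpy as np
import pandas as pd

# Hypothetical placeholder addresses; the originals are obscured in the post.
df = pd.DataFrame({
    'cc-email': ['a@example.com', np.nan, np.nan, np.nan],
    'cc-email_cc-email': [np.nan, 'b@example.com', np.nan, np.nan],
    'cc-emails_cc-email': ['c@example.com', np.nan, np.nan, 'd@example.com'],
})

# Columns in priority order: earlier columns win, later ones only fill gaps.
cols = ['cc-email', 'cc-email_cc-email', 'cc-emails_cc-email']

# Fold the columns left to right with combine_first.
chained = reduce(lambda l, r: l.combine_first(r), [df[c] for c in cols])

# The nested form from the question, for comparison.
nested = df['cc-email'].combine_first(
    df['cc-email_cc-email'].combine_first(df['cc-emails_cc-email']))

print(chained.equals(nested))
```

Keeping the column names in a list means adding a fourth source column is a one-line change instead of another level of nesting.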


But it seems that ffill combined with selecting the last column should also work:

df['cc-email2'] = df.ffill(axis=1).iloc[:, -1]
print (df)
          cc-email    cc-email_cc-email cc-emails_cc-email  \
0  [email protected]                  NaN      [email protected]
1              NaN  [email protected]                NaN
2              NaN                  NaN                NaN
3              NaN                  NaN     [email protected]

             cc-email2
0        [email protected]
1  [email protected]
2                  NaN
3       [email protected]
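One caveat worth checking: ffill(axis=1).iloc[:, -1] keeps the rightmost non-null value in each row, while chained combine_first keeps the leftmost. The two differ whenever a row has more than one populated column, as row 0 does here. A sketch of the comparison, assuming hypothetical example.com placeholder addresses and illustrative names cols, first_valid and last_valid:

```python
import numpy as np
import pandas as pd

# Hypothetical placeholder addresses; the originals are obscured in the post.
df = pd.DataFrame({
    'cc-email': ['a@example.com', np.nan, np.nan, np.nan],
    'cc-email_cc-email': [np.nan, 'b@example.com', np.nan, np.nan],
    'cc-emails_cc-email': ['c@example.com', np.nan, np.nan, 'd@example.com'],
})

# Restrict to the source columns so unrelated columns are not mixed in.
cols = ['cc-email', 'cc-email_cc-email', 'cc-emails_cc-email']

# Back-fill across columns, then take the first column:
# leftmost non-null value per row, same priority as combine_first.
first_valid = df[cols].bfill(axis=1).iloc[:, 0]

# Forward-fill across columns, then take the last column:
# rightmost non-null value per row.
last_valid = df[cols].ffill(axis=1).iloc[:, -1]

# Row 0 differs between the two; rows 1 to 3 agree.
print(pd.DataFrame({'first_valid': first_valid, 'last_valid': last_valid}))
```

If the left-to-right priority of the original combine_first chain matters, the bfill form is the closer equivalent. Selecting df[cols] first also avoids picking up other columns that happen to be in df.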
