I am trying to recode loan status data so that every observation is either 'Default' or 'Fully Paid'. Specifically, I want to recode any value != 'Fully Paid' as 'Default'.

Here are my values:

df.loan_status.unique()

array(['Fully Paid', 'Charged Off', 'Default', 'Late (31-120 days)',
   'In Grace Period', 'Late (16-30 days)',
   'Does not meet the credit policy. Status:Fully Paid',
   'Does not meet the credit policy. Status:Charged Off', 'Issued'], dtype=object)


I tried the following code, but every observation was recoded to 'Default':

statuses= df['loan_status'].unique()
for status in statuses:
    if status!='Fully Paid':
        df['loan_status']='Default'


Any suggestions on how to do this would be greatly appreciated!

Best answer

I like this approach.

Option 1 (Andras Deak / MaxU)

df.loc[df.loan_status.ne('Fully Paid'), 'loan_status'] = 'Default'
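
For context, the loop in the question overwrites everything because `df['loan_status'] = 'Default'` assigns to the whole column each time the condition is met, not just to the matching rows. A minimal sketch of the masked assignment, using a small made-up DataFrame (the sample values are an assumption for illustration, not the asker's data):

```python
import pandas as pd

# Hypothetical sample data, not the asker's real DataFrame.
df = pd.DataFrame({
    "loan_status": [
        "Fully Paid", "Charged Off", "Late (31-120 days)", "Fully Paid", "Issued",
    ]
})

# Boolean mask: True wherever the status is anything other than 'Fully Paid'.
not_paid = df["loan_status"].ne("Fully Paid")

# .loc with a row mask assigns only to the selected rows, whereas
# df['loan_status'] = 'Default' (as in the question) overwrites every row.
df.loc[not_paid, "loan_status"] = "Default"

print(df["loan_status"].tolist())
# ['Fully Paid', 'Default', 'Default', 'Fully Paid', 'Default']
```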


Option 2
pd.Series.where

ls = df.loan_status
df.update(ls.where(ls.eq('Fully Paid'), 'Default'))


Option 3
pd.Series.mask

ls = df.loan_status
df.update(ls.mask(ls.ne('Fully Paid')).fillna('Default'))
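
`where` and `mask` are mirror images: `where` keeps values where the condition is True, while `mask` replaces them. The sketch below, on a made-up Series, shows that both give the same result here. It also passes the replacement directly to `mask` as its second argument, which should make the separate `fillna` step unnecessary.

```python
import pandas as pd

# Hypothetical sample column; the name matches the DataFrame column so that
# df.update could align on it.
ls = pd.Series(["Fully Paid", "Charged Off", "Issued"], name="loan_status")

# where: keep values where the condition is True, replace the rest.
via_where = ls.where(ls.eq("Fully Paid"), "Default")

# mask: replace values where the condition is True, keep the rest.
via_mask = ls.mask(ls.ne("Fully Paid"), "Default")

print(via_where.tolist())          # ['Fully Paid', 'Default', 'Default']
print(via_where.equals(via_mask))  # True
```

As far as I can tell, `df.update` works with a Series here because the Series taken from `df.loan_status` carries the name `loan_status`, so it lines up with the column of the same name.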


Option 4
numpy.where

import numpy as np

ls = df.loan_status.values
paid, dflt = 'Fully Paid', 'Default'
df.loc[:, 'loan_status'] = np.where(ls == paid, paid, dflt)
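
As a sanity check, the four options can be run side by side on copies of the same data. This is only a sketch: the sample DataFrame and the `option1` to `option4` wrapper functions are hypothetical names introduced here, and Option 4 uses `to_numpy()` instead of `.values` to get a plain NumPy array.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration.
original = pd.DataFrame(
    {"loan_status": ["Fully Paid", "Charged Off", "Default", "In Grace Period"]}
)
paid, dflt = "Fully Paid", "Default"

def option1(df):
    df = df.copy()
    df.loc[df.loan_status.ne(paid), "loan_status"] = dflt
    return df

def option2(df):
    df = df.copy()
    ls = df.loan_status
    df.update(ls.where(ls.eq(paid), dflt))
    return df

def option3(df):
    df = df.copy()
    ls = df.loan_status
    df.update(ls.mask(ls.ne(paid)).fillna(dflt))
    return df

def option4(df):
    df = df.copy()
    ls = df.loan_status.to_numpy()
    df.loc[:, "loan_status"] = np.where(ls == paid, paid, dflt)
    return df

# Each option runs on its own copy, so the original stays untouched.
results = [f(original)["loan_status"].tolist()
           for f in (option1, option2, option3, option4)]

print(results[0])                            # ['Fully Paid', 'Default', 'Default', 'Default']
print(all(r == results[0] for r in results))  # True
```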

08-24 23:16