python - Recode all values but one in a pandas dataframe column

I am trying to recode loan status data so that every observation is either 'Default' or 'Fully Paid'. Specifically, I want to recode every value != 'Fully Paid' as 'Default'.

Here are my values:

df.loan_status.unique()

array(['Fully Paid', 'Charged Off', 'Default', 'Late (31-120 days)',
   'In Grace Period', 'Late (16-30 days)',
   'Does not meet the credit policy. Status:Fully Paid',
   'Does not meet the credit policy. Status:Charged Off', 'Issued'], dtype=object)

I tried the following code, but every observation was recoded as 'Default':

statuses= df['loan_status'].unique()
for status in statuses:
    if status!='Fully Paid':
        df['loan_status']='Default'

Any suggestions on how to do this would be greatly appreciated!

Best answer

Option 1
I like this approach (Andras Deak / MaxU).

df.loc[df.loan_status.ne('Fully Paid'), 'loan_status'] = 'Default'
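
The loop in the question fails because `df['loan_status'] = 'Default'` assigns to the whole column, so the first status that is not 'Fully Paid' overwrites every row. Option 1 avoids this by assigning only to the rows selected by a boolean mask. Here is a minimal sketch of that idea; the sample data and the name `not_paid` are made up for illustration and are not from the question's dataset.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "loan_status": [
        "Fully Paid",
        "Charged Off",
        "Late (31-120 days)",
        "Fully Paid",
        "Issued",
    ]
})

# Boolean mask: True for each row whose status is not 'Fully Paid'.
not_paid = df["loan_status"].ne("Fully Paid")

# Assign only to the rows selected by the mask, not to the whole column.
df.loc[not_paid, "loan_status"] = "Default"

print(df["loan_status"].tolist())
# ['Fully Paid', 'Default', 'Default', 'Fully Paid', 'Default']
```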

Option 2
pd.Series.where

ls = df.loan_status
df.update(ls.where(ls.eq('Fully Paid'), 'Default'))

Option 3
pd.Series.mask

ls = df.loan_status
df.update(ls.mask(ls.ne('Fully Paid')).fillna('Default'))
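
`where` and `mask` are mirror images: `where` keeps the values for which the condition is True, while `mask` replaces them. The sketch below shows both on a small Series with made-up data. It also shows that a missing status compares as not equal to 'Fully Paid', so it becomes 'Default' too. If missing values should stay missing, one possible workaround (my assumption, not part of the answer above) is to add `notna()` to the condition.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one missing status, for illustration only.
ls = pd.Series(
    ["Fully Paid", "Charged Off", np.nan, "Issued"], name="loan_status"
)

# where: keep values where the condition is True, replace the rest.
via_where = ls.where(ls.eq("Fully Paid"), "Default")

# mask: replace values where the condition is True, keep the rest.
via_mask = ls.mask(ls.ne("Fully Paid"), "Default")

# Both give the same result; the missing value is recoded as well.
print(via_where.tolist())  # ['Fully Paid', 'Default', 'Default', 'Default']
print(via_mask.tolist())   # ['Fully Paid', 'Default', 'Default', 'Default']

# Assumed variant: leave missing values untouched.
keep_missing = ls.mask(ls.ne("Fully Paid") & ls.notna(), "Default")
```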

Option 4
numpy.where

import numpy as np

ls = df.loan_status.values
paid, dflt = 'Fully Paid', 'Default'
df.loc[:, 'loan_status'] = np.where(ls == paid, paid, dflt)
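
For completeness, here is a self-contained sketch of the `numpy.where` variant. `np.where(condition, x, y)` picks `x` where the condition is True and `y` elsewhere, so it acts as a vectorized if/else. The sample data, the `amount` column, and the name `is_paid` are hypothetical, and I use `to_numpy()` and plain column assignment here rather than `.values` and `.loc`.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({
    "loan_status": ["Fully Paid", "Charged Off", "In Grace Period", "Fully Paid"],
    "amount": [1000, 2500, 400, 1200],  # hypothetical extra column, left untouched
})

paid, dflt = "Fully Paid", "Default"

# Elementwise comparison on the underlying array gives a boolean array.
is_paid = df["loan_status"].to_numpy() == paid

# Pick `paid` where is_paid is True and `dflt` everywhere else.
df["loan_status"] = np.where(is_paid, paid, dflt)

print(df)
```

Only `loan_status` changes; other columns keep their values.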