Background

I have the following sample df:

import pandas as pd
df = pd.DataFrame({'Birthdate':['This person was born Date of Birth: 5/6/1950 and other',
                          'no Date of Birth: nothing here',
                          'One Date of Birth: 01/01/2001 last here'],
                  'P_ID': [1,2,3],
                  'N_ID': ['A1', 'A2', 'A3']})

 df
                                 Birthdate                 N_ID P_ID
    0   This person was born Date of Birth: 5/6/1950 a...   A1  1
    1   no Date of Birth: nothing here                      A2  2
    2   One Date of Birth: 01/01/2001 last here             A3  3

Goal

Replace the leading digits of the birthdate (the day and month part) with *BDAY*, e.g. 5/6/1950 becomes *BDAY*1950.
Desired output
                                 Birthdate                 N_ID P_ID
    0   This person was born Date of Birth: *BDAY*1950 a... A1  1
    1   no Date of Birth: nothing here                      A2  2
    2   One Date of Birth: *BDAY*2001 last here             A3  3

What I tried

Following "python - Replace first five characters in a column with asterisks", I tried this code: df.replace(r'Date of Birth: ^\d{3}-\d{2}', "*BDAY*", regex=True), but it does not give me the desired output.
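
A quick check with the standard `re` module hints at why that pattern finds nothing here. This is only a sketch: the variable names are made up for illustration, and the `123-45-6789` string is an assumed example of the hyphenated format the pattern seems to have been written for.

```python
import re

text = 'One Date of Birth: 01/01/2001 last here'

# The attempted pattern. Two things stop it from matching this text:
#   1. '^' only matches at the start of the string, so it can never
#      succeed after the literal 'Date of Birth: '.
#   2. '\d{3}-\d{2}' expects three digits, a hyphen, then two digits,
#      while the dates here are separated by slashes.
attempted = r'Date of Birth: ^\d{3}-\d{2}'
attempt_match = re.search(attempted, text)
print(attempt_match)

# The digit part does match a hyphenated string (assumed example input).
hyphen_match = re.search(r'\d{3}-\d{2}', '123-45-6789')
print(hyphen_match.group())
```

So the pattern needs to describe slash-separated digits and drop the mid-pattern `^`.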

Question

How do I get the desired output?

Best answer

Try this:

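# Match one or two digits, a slash, one or two digits, a slash (the day/month part).
# regex=True is needed on pandas 2.0+, where str.replace treats the pattern literally by default.
df['Birthdate'] = df.Birthdate.str.replace(r'[0-9]?[0-9]/[0-9]?[0-9]/', '*BDAY*', regex=True)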


Out[273]:
                                           Birthdate  P_ID N_ID
0  This person was born Date of Birth: *BDAY*1950...     1   A1
1                     no Date of Birth: nothing here     2   A2
2            One Date of Birth: *BDAY*2001 last here     3   A3
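
The pattern above replaces any `d/d/` run, wherever it appears in the text. If that turns out to be too broad, a stricter variant could require the `Date of Birth: ` label in front and a four-digit year behind. The following is a minimal sketch using only the standard `re` module; `BDAY_PATTERN` and `mask_birthdate` are hypothetical names, and the assumption that every date looks like `M/D/YYYY` or `MM/DD/YYYY` comes from the sample data only.

```python
import re

# Assumption: dates always follow the literal label 'Date of Birth: '
# and end in a four-digit year. The label is captured so it can be kept,
# and the lookahead checks for the year without consuming it.
BDAY_PATTERN = re.compile(r'(Date of Birth: )\d{1,2}/\d{1,2}/(?=\d{4}\b)')

def mask_birthdate(text):
    """Replace the day/month part of a labelled birthdate with *BDAY*."""
    return BDAY_PATTERN.sub(r'\1*BDAY*', text)

samples = [
    'This person was born Date of Birth: 5/6/1950 and other',
    'no Date of Birth: nothing here',
    'One Date of Birth: 01/01/2001 last here',
]
masked = [mask_birthdate(s) for s in samples]
for line in masked:
    print(line)
```

On the DataFrame this could be applied with something like `df['Birthdate'] = df['Birthdate'].map(mask_birthdate)`. A date without the label, such as `seen on 12/25/2019`, is left unchanged by this version, whereas the shorter pattern would mask it too.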

09-27 01:18