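I have a DataFrame like the one below.
For each row, I am trying to remove from the Main column the string that appears in the substring column.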
Main                      substring
Sri playnig well cricket  cricket
sri went out              NaN
Ram is in                 NaN
Ram went to UK,US         UK,US
My expected output is:
Main              substring
Sri playnig well  cricket
sri went out      NaN
Ram is in         NaN
Ram went to       UK,US
I tried
df["Main"].str.reduce(df["substring"])
but it did not work. Please help.

Best answer
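Here is one way using pd.DataFrame.apply. Note that np.nan == np.nan evaluates to False, so the check sub != sub is True only when sub is NaN. We can use this trick inside the function to decide when to apply the removal logic.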
import pandas as pd, numpy as np

df = pd.DataFrame({'Main': ['Sri playnig well cricket', 'sri went out',
                            'Ram is in', 'Ram went to UK,US'],
                   'substring': ['cricket', np.nan, np.nan, 'UK,US']})

def remover(row):
    sub = row['substring']
    # NaN is the only value that is not equal to itself
    if sub != sub:
        return row['Main']
    else:
        # drop every whitespace-separated token equal to the substring
        lst = row['Main'].split()
        return ' '.join([i for i in lst if i != sub])

df['Main'] = df.apply(remover, axis=1)
print(df)
               Main substring
0  Sri playnig well   cricket
1      sri went out       NaN
2         Ram is in       NaN
3       Ram went to     UK,US
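
The function above only removes the substring when it matches a whole whitespace-separated token. As a rough sketch of a variant, not part of the original answer, the code below uses pd.isna for the missing-value check and str.replace so that the substring is also removed when it sits inside a longer token. The helper name remove_substring is made up for illustration, and whether removing matches inside words is desirable depends on your data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Main': ['Sri playnig well cricket', 'sri went out',
             'Ram is in', 'Ram went to UK,US'],
    'substring': ['cricket', np.nan, np.nan, 'UK,US'],
})

def remove_substring(main, sub):
    # Hypothetical helper: pd.isna makes the NaN check explicit
    # instead of relying on sub != sub.
    if pd.isna(sub):
        return main
    # str.replace removes every occurrence of sub, including inside words;
    # split/join then collapses any leftover whitespace.
    return ' '.join(main.replace(sub, '').split())

# Pair the two columns row by row instead of using apply(axis=1).
df['Main'] = [remove_substring(m, s)
              for m, s in zip(df['Main'], df['substring'])]
print(df)
```

On the sample data this gives the same Main column as the expected output in the question.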
This question about Python, how to remove part of a DataFrame column's values based on another column, comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/50230233/