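I have two string columns in a data frame, and column1 contains some of the keywords listed in column2. I want to extract the keywords that appear in both column1 and column2 into a new column.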

df['column3'] = df.column1.apply(lambda x: df.column2[df.column2.str.contains(x)])


I expect output like this:

column1                     column2                 column3
A girl is going to market   girl market school      girl market
A girl is going to school   girl market school      girl school
The sky is blue in color    sky blue orange color   sky blue color

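Best answer

Use apply with axis=1, so that the function receives one row at a time and can compare the two columns of that row.

For example: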

# Keep each word of column2 that occurs in column1 of the same row (substring test)
df["column3"] = df.apply(lambda x: " ".join(i for i in x["column2"].split() if i in x["column1"]), axis=1)
print(df)


Output:

                     column1                column2         column3
0  A girl is going to market     girl market school     girl market
1  A girl is going to school     girl market school     girl school
2   The sky is blue in color  sky blue orange color  sky blue color
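
One caveat: the test i in x["column1"] checks for a substring of the whole sentence, so a keyword such as sky would also match inside skylight. If whole-word matching is wanted, something like the following sketch may help. It assumes words are separated by whitespace and that matching should be case-sensitive; the helper name matching_keywords is made up for this illustration and is not part of pandas.

```python
def matching_keywords(text, keywords):
    """Return the words of `keywords` that occur as whole words in `text`.

    Hypothetical helper: both arguments are whitespace-separated strings.
    The order of `keywords` is preserved in the result.
    """
    words = set(text.split())  # whole words of the sentence
    return " ".join(k for k in keywords.split() if k in words)


# Sample rows taken from the question: (column1, column2)
rows = [
    ("A girl is going to market", "girl market school"),
    ("A girl is going to school", "girl market school"),
    ("The sky is blue in color", "sky blue orange color"),
]

for column1, column2 in rows:
    print(matching_keywords(column1, column2))
```

On the data frame itself, the same helper could be applied row by row with df.apply(lambda x: matching_keywords(x["column1"], x["column2"]), axis=1).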

Regarding "python-3.x - How to extract matching keywords from two columns in a Pandas data frame?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/56325012/
