How to split the values in a DataFrame column and move them to a new column based on a condition in pandas

Problem description

I have a DataFrame df:

name                        Value
Sri is a cricketer          Sri,is
Ram player                  Ram
Ravi is a singer            is
cricket and foot is ball    and,is,foot

and a list:

my_list=["is", "foot"]
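To follow along, the frame and the list from the question could be rebuilt as below. This is a minimal sketch: the cell contents are copied from the table above, and nothing else about the original data is assumed.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "name": [
        "Sri is a cricketer",
        "Ram player",
        "Ravi is a singer",
        "cricket and foot is ball",
    ],
    "Value": ["Sri,is", "Ram", "is", "and,is,foot"],
})
my_list = ["is", "foot"]

print(df)
```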

I am trying to split df["Value"] on commas and move each value into a new column if it appears in my_list. My expected output is:

name                      Value  my_list
Sri is a cricketer        Sri    is
Ram player                Ram
Ravi is a singer                 is
cricket and foot is ball  and    is,foot

Please help me achieve this. Thanks in advance.

Recommended answer

Use str.findall and str.join:

my_list=["is", "foot"]
df['my_list'] = df['Value'].str.findall('(' + '|'.join(my_list) + ')').str.join(',')
print (df)
                       name        Value  my_list
0        Sri is a cricketer       Sri,is       is
1                Ram player          Ram         
2          Ravi is a singer           is       is
3  cricket and foot is ball  and,is,foot  is,foot
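One caveat with a plain alternation pattern is that it also matches inside longer tokens. The sketch below compares it with a stricter pattern that only accepts whole comma-separated tokens. The rows "this,foot" and "football" are hypothetical inputs added for illustration, and the lookaround-based pattern is one possible choice, not part of the original answer.

```python
import re
import pandas as pd

my_list = ["is", "foot"]
# Hypothetical values: "this" contains "is", "football" contains "foot"
values = pd.Series(["Sri,is", "this,foot", "football"])

# Pattern from the answer: matches anywhere, including inside other words
naive = values.str.findall("(" + "|".join(my_list) + ")").str.join(",")

# Stricter variant: the match must start at the beginning or after a comma,
# and must end at a comma or at the end of the string
alternatives = "|".join(re.escape(word) for word in my_list)
strict_pattern = r"(?<![^,])(?:" + alternatives + r")(?![^,])"
strict = values.str.findall(strict_pattern).str.join(",")

print(naive.tolist())
print(strict.tolist())
```

re.escape keeps the pattern safe if my_list ever contains characters that are special in regular expressions.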

Use split and take the intersection of sets:

my_list=["is", "foot"]
df['my_list']=df['Value'].str.split(',').apply(lambda x: set(x) & set(my_list)).str.join(',')
print (df)
                       name        Value  my_list
0        Sri is a cricketer       Sri,is       is
1                Ram player          Ram         
2          Ravi is a singer           is       is
3  cricket and foot is ball  and,is,foot  is,foot
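Because Python sets are unordered, the joined result of a set intersection is not guaranteed to come out as "is,foot" rather than "foot,is". If the order of the tokens matters, a list comprehension that filters against a set keeps them in their original positions. This is a sketch of that alternative; the last row is a hypothetical value added to show the ordering.

```python
import pandas as pd

my_list = ["is", "foot"]
wanted = set(my_list)  # set used only for fast membership tests

# The last value is hypothetical, added to show that input order is kept
values = pd.Series(["Sri,is", "Ram", "is", "and,is,foot", "foot,and,is"])

kept = values.str.split(",").apply(
    lambda tokens: ",".join(t for t in tokens if t in wanted)
)
print(kept.tolist())
```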

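Finally, remove the matched values from Value:

# regex=True is needed in pandas 2.0+, where str.replace matches literally by default
df['Value'] = (df['Value'].str.replace('(' + '|,'.join(my_list) + ')', '', regex=True)
                          .str.replace('[,]{2,}', ',', regex=True)
                          .str.strip(','))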
print (df)
                       name Value  my_list
0        Sri is a cricketer   Sri       is
1                Ram player   Ram         
2          Ravi is a singer             is
3  cricket and foot is ball   and  is,foot
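To see what each step of the chained cleanup contributes, the intermediate results can be inspected one at a time. The sketch below passes regex=True explicitly, since the default of str.replace has differed between pandas versions. The value "a,is,b" is a hypothetical row added so that the comma-collapsing step has something to do.

```python
import pandas as pd

my_list = ["is", "foot"]
pattern = "(" + "|,".join(my_list) + ")"  # expands to "(is|,foot)"

# "a,is,b" is a hypothetical value; the rest come from the question
values = pd.Series(["Sri,is", "Ram", "is", "and,is,foot", "a,is,b"])

# Step 1: delete the listed tokens (may leave stray commas behind)
step1 = values.str.replace(pattern, "", regex=True)
# Step 2: collapse runs of two or more commas into one
step2 = step1.str.replace("[,]{2,}", ",", regex=True)
# Step 3: trim commas left at either end
step3 = step2.str.strip(",")

print(step1.tolist())
print(step2.tolist())
print(step3.tolist())
```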

Alternatively:

my_list=["is", "foot"]

s1 = df['Value'].str.split(',')

df['my_list'] = s1.apply(lambda x: set(x) & set(my_list)).str.join(',')
df['Value'] = s1.apply(lambda x: set(x) - set(my_list)).str.join(',')
print (df)

                       name Value  my_list
0        Sri is a cricketer   Sri       is
1                Ram player   Ram         
2          Ravi is a singer             is
3  cricket and foot is ball   and  is,foot
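The set-based version assumes every cell in Value is a string. If the column can contain missing values, str.split leaves them as NaN and the set() call fails. One possible workaround, sketched here with a hypothetical missing cell, is to fill missing values with an empty string before splitting.

```python
import pandas as pd

my_list = ["is", "foot"]
wanted = set(my_list)

# The None entry is a hypothetical missing value, not part of the question
df = pd.DataFrame({"Value": ["Sri,is", None, "is"]})

# Replace missing cells with "" so that every row splits into a list
tokens = df["Value"].fillna("").str.split(",")

df["my_list"] = tokens.apply(lambda x: ",".join(set(x) & wanted))
df["Value"] = tokens.apply(lambda x: ",".join(set(x) - wanted))

print(df)
```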


10-14 23:20