Remove square brackets and single quotes

I have a dataframe like the one below, and I want to remove the square brackets, the single quotes ('), and the commas.

id  currentTitle1
1   ['@@@0000070642@@@']
2   ['@@@0000082569@@@']
3   ['@@@0000082569@@@']
4   ['@@@0000082569@@@']
5   ['@@@0000060910@@@', '@@@0000039198@@@']
6   ['@@@0000060910@@@']
7   ['@@@0000129849@@@']
8   ['@@@0000082569@@@']
9   ['@@@0000082569@@@', '@@@0000060905@@@', '@@@0000086889@@@']
10  ['@@@0000082569@@@']


I want the output to look like this:

id  currentTitle1
1   @@@0000070642@@@
2   @@@0000082569@@@
3   @@@0000082569@@@
4   @@@0000082569@@@
5   @@@0000060910@@@ @@@0000039198@@@
6   @@@0000060910@@@
7   @@@0000129849@@@
8   @@@0000082569@@@
9   @@@0000082569@@@ @@@0000060905@@@ @@@0000086889@@@
10  @@@0000082569@@@


I get the data from a regex cleanup step: df['currentTitle']=df['currentTitle'].str.findall(r'@{3}\d+@{3}')
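To show where the brackets and quotes come from, here is a minimal sketch. The two sample rows and the choice of `currentTitle_unclean` as the input column are assumptions for illustration. `str.findall` returns a Python list of matches for every row, and the brackets, quotes and commas are simply how pandas prints those lists.

```python
import pandas as pd

# Hypothetical sample rows modelled on the unclean column in the question.
df = pd.DataFrame({
    "currentTitle_unclean": [
        "account @@@0000082569@@@ - sales agent",
        "academic @@@0000060910@@@ , administrative, and @@@0000039198@@@ liaison",
    ]
})

# findall collects every regex match in a row into a list,
# so each cell of the new column holds a list, not a string.
df["currentTitle1"] = df["currentTitle_unclean"].str.findall(r"@{3}\d+@{3}")

print(df["currentTitle1"].tolist())
# [['@@@0000082569@@@'], ['@@@0000060910@@@', '@@@0000039198@@@']]
```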

Edit: posting the unclean data. Keep in mind that there are also blank rows, which are not included here.

id  currentTitle    currentTitle_unclean
1   @@@0000070642@@@    accompanying functions of @@@0000070642@@@ and business risk assessment - director
2   @@@0000082569@@@    account @@@0000082569@@@ - sales agent /representative at pronovias fashion group
3   @@@0000082569@@@    account manager/product @@@0000082569@@@ - handbags and accessories
4   @@@0000082569@@@    account @@@0000082569@@@ for entrepreneurs and small size companies
5   @@@0000060910@@@ @@@0000039198@@@   academic @@@0000060910@@@ , administrative, and @@@0000039198@@@ liaison coordinator
6   @@@0000060910@@@    account executive at bluefin insurance @@@0000060910@@@ limited
7   @@@0000129849@@@    account executive for interior @@@0000129849@@@ magazine inex
8   @@@0000082569@@@    account @@@0000082569@@@ high potential secondment programme
9   @@@0000082569@@@ @@@0000060905@@@ @@@0000086889@@@  account @@@0000082569@@@ @@@0000060905@@@ -energy and commodities @@@0000086889@@@ candidate
10  @@@0000082569@@@    account @@@0000082569@@@ paints, coatings, adhesives - ser, slo, cro

Best answer

You can use apply together with join:

df['currentTitle1'] = df['currentTitle1'].apply(' '.join)
print(df)
   id      currentTitle                               currentTitle_unclean  \
0   1  @@@0000070642@@@  accompanying functions of @@@0000070642@@@ and...
1   2  @@@0000082569@@@  account @@@0000082569@@@ - sales agent /repres...
2   3  @@@0000082569@@@  account manager/product @@@0000082569@@@ - han...
3   4  @@@0000082569@@@  account @@@0000082569@@@ for entrepreneurs and...
4   5  @@@0000060910@@@  @@@0000039198@@@   academic @@@0000060910@@@ ,...
5   6  @@@0000060910@@@  account executive at bluefin insurance @@@0000...
6   7  @@@0000129849@@@  account executive for interior @@@0000129849@@...
7   8  @@@0000082569@@@  account @@@0000082569@@@ high potential second...
8   9  @@@0000082569@@@  @@@0000060905@@@ @@@0000086889@@@  account @@@...
9  10  @@@0000082569@@@  account @@@0000082569@@@ paints, coatings, adh...

                                       currentTitle1
0                                   @@@0000070642@@@
1                                   @@@0000082569@@@
2                                   @@@0000082569@@@
3                                   @@@0000082569@@@
4  @@@0000039198@@@ @@@0000060910@@@ @@@000003919...
5                                   @@@0000060910@@@
6                                   @@@0000129849@@@
7                                   @@@0000082569@@@
8  @@@0000060905@@@ @@@0000086889@@@ @@@000008256...
9                                   @@@0000082569@@@


Or, as not_a_robot suggested:

df['currentTitle1'] = df['currentTitle1'].map(lambda x: ' '.join(x))
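As a self-contained check, the sketch below rebuilds two rows of the question's data by hand (the sample frame is an assumption) and joins each list with a space. Both `apply` and `map` return a new Series, so the result has to be assigned back to the column to take effect.

```python
import pandas as pd

# Hypothetical two-row sample: each cell holds a list of strings.
df = pd.DataFrame({
    "id": [1, 5],
    "currentTitle1": [
        ["@@@0000070642@@@"],
        ["@@@0000060910@@@", "@@@0000039198@@@"],
    ],
})

# ' '.join is called once per cell and turns each list into one string.
via_apply = df["currentTitle1"].apply(" ".join)
via_map = df["currentTitle1"].map(lambda x: " ".join(x))

# Assign back so the column itself is replaced.
df["currentTitle1"] = via_apply

print(df["currentTitle1"].tolist())
# ['@@@0000070642@@@', '@@@0000060910@@@ @@@0000039198@@@']
```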


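If you get the error:


  TypeError: can only join an iterable


then some values in the column are not lists. You can add a condition that keeps the original value for those entries: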

df['currentTitle1'] = df['currentTitle1'].apply(lambda x: ' '.join(x) if isinstance(x, list)
                                                                      else x)


Or replace them with an empty string:

df['currentTitle1'] = df['currentTitle1'].apply(lambda x: ' '.join(x) if isinstance(x, list)
                                                                      else '')

Regarding python - Pandas: remove square brackets and single quotes, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/44263331/
