This article covers how to join the cells of each row of a pandas DataFrame into a single string with a separator in Python.

Problem description

Given the following:

import numpy as np
import pandas as pd
df = pd.DataFrame({'col1' : ["a","b"],
            'col2'  : ["ab",np.nan], 'col3' : ["w","e"]})

I would like to create a column that joins the contents of all three columns into one string, separated by the character "*", while ignoring NaN.

So that I would get something like this, for example:

a*ab*w
b*e

Any ideas?

I just realised there are a few additional requirements: the method needs to work with ints and floats, and it also needs to handle special characters (e.g., letters of the Spanish alphabet).

Recommended answer

In [68]:

df['new_col'] = df.apply(lambda x: '*'.join(x.dropna().values.tolist()), axis=1)
df
Out[68]:
  col1 col2 col3 new_col
0    a   ab    w  a*ab*w
1    b  NaN    e     b*e
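
Outside an IPython session, the same idea can be sketched as a standalone script. The helper name join_row and its sep parameter are assumptions introduced here for illustration; they are not part of pandas or of the original answer. The sketch only works when every non-missing cell is already a string.

```python
import numpy as np
import pandas as pd


def join_row(row, sep="*"):
    # join_row is a hypothetical helper name, not a pandas function.
    # dropna() removes the missing cells; the remaining values must be strings.
    return sep.join(row.dropna().tolist())


df = pd.DataFrame({"col1": ["a", "b"],
                   "col2": ["ab", np.nan],
                   "col3": ["w", "e"]})

# axis=1 passes each row to join_row as a Series
df["new_col"] = df.apply(join_row, axis=1)
print(df["new_col"].tolist())
```

The row with the NaN in col2 contributes only its two remaining values, so no doubled separator appears.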

Update

If you have ints or floats, you can convert them to str first:

In [74]:

df = pd.DataFrame({'col1' : ["a","b",3],
            'col2'  : ["ab",np.nan, 4], 'col3' : ["w","e", 6]})
df
Out[74]:
  col1 col2 col3
0    a   ab    w
1    b  NaN    e
2    3    4    6

In [76]:

df['new_col'] = df.apply(lambda x: '*'.join(x.dropna().astype(str).values), axis=1)
df
Out[76]:
  col1 col2 col3 new_col
0    a   ab    w  a*ab*w
1    b  NaN    e     b*e
2    3    4    6   3*4*6
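
The reason for the conversion is that Python's str.join accepts only string items. A minimal sketch of the failure and the fix, using a single object-dtype Series as a stand-in for one row (the variable names are illustrative only):

```python
import numpy as np
import pandas as pd

# object dtype keeps 3 and 6 as Python ints next to the NaN
row = pd.Series([3, np.nan, 6], dtype=object)

try:
    "*".join(row.dropna().tolist())
    failed = False
except TypeError:
    # str.join rejects non-string items such as ints
    failed = True

# converting to str after dropping NaN makes the join succeed
joined = "*".join(row.dropna().astype(str))
print(failed, joined)
```

Calling dropna() before astype(str) matters: converting first would turn the missing value into text, and it would no longer be dropped.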

Another update

In [81]:

df = pd.DataFrame({'col1' : ["a","b",3,'ñ'],
            'col2'  : ["ab",np.nan, 4,'ü'], 'col3' : ["w","e", 6,'á']})
df
Out[81]:
  col1 col2 col3
0    a   ab    w
1    b  NaN    e
2    3    4    6
3    ñ    ü    á

In [82]:

df['new_col'] = df.apply(lambda x: '*'.join(x.dropna().astype(str).values), axis=1)
df
Out[82]:
  col1 col2 col3 new_col
0    a   ab    w  a*ab*w
1    b  NaN    e     b*e
2    3    4    6   3*4*6
3    ñ    ü    á   ñ*ü*á

My code still works with Spanish characters.
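
As an alternative that is not part of the answer above, the same result can be built with a plain list comprehension over the rows instead of apply. This is only a sketch under the assumption that the three columns are named col1, col2 and col3 as in the example; no performance comparison is claimed here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"col1": ["a", "b", 3, "ñ"],
                   "col2": ["ab", np.nan, 4, "ü"],
                   "col3": ["w", "e", 6, "á"]})

cols = ["col1", "col2", "col3"]

# itertuples(index=False, name=None) yields each row as a plain tuple;
# pd.notna skips missing cells and str() handles ints and floats
df["new_col"] = [
    "*".join(str(v) for v in row if pd.notna(v))
    for row in df[cols].itertuples(index=False, name=None)
]
print(df["new_col"].tolist())
```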

