Joining cells into a string with a separator in Python (pandas)
Question
Given the following:
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1' : ["a","b"],
                   'col2' : ["ab",np.nan], 'col3' : ["w","e"]})
I would like to create a column that joins the contents of all three columns into one string, separated by the character "*", while ignoring NaN.
For example, I would get something like this:
a*ab*w
b*e
Any ideas?
Just realised there are a few additional requirements: I need the method to work with ints and floats, and also to handle special characters (e.g., letters of the Spanish alphabet).
Recommended answer
In [68]:
df['new_col'] = df.apply(lambda x: '*'.join(x.dropna().values.tolist()), axis=1)
df
Out[68]:
  col1 col2 col3 new_col
0    a   ab    w  a*ab*w
1    b  NaN    e     b*e
Update
If you have ints or floats, you can convert them to str first, because str.join only accepts strings:
In [74]:
df = pd.DataFrame({'col1' : ["a","b",3],
'col2' : ["ab",np.nan, 4], 'col3' : ["w","e", 6]})
df
Out[74]:
  col1 col2 col3
0    a   ab    w
1    b  NaN    e
2    3    4    6
In [76]:
df['new_col'] = df.apply(lambda x: '*'.join(x.dropna().astype(str).values), axis=1)
df
Out[76]:
  col1 col2 col3 new_col
0    a   ab    w  a*ab*w
1    b  NaN    e     b*e
2    3    4    6   3*4*6
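The conversion step above can be wrapped in a small reusable function. This is only a minimal sketch, not part of the original answer: the name join_non_null and its sep parameter are hypothetical, chosen for illustration.

```python
import numpy as np
import pandas as pd


def join_non_null(df, sep="*"):
    """Join each row's non-NaN values into one string (hypothetical helper)."""
    # dropna() removes the missing cells; astype(str) makes ints and floats joinable
    return df.apply(lambda row: sep.join(row.dropna().astype(str)), axis=1)


df = pd.DataFrame({'col1': ["a", "b", 3],
                   'col2': ["ab", np.nan, 4],
                   'col3': ["w", "e", 6]})
df['new_col'] = join_non_null(df)
print(df['new_col'].tolist())  # ['a*ab*w', 'b*e', '3*4*6']
```

Computing the joined Series before assigning it keeps new_col itself out of the join if the function is called again on the same frame's original columns.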
Another update
In [81]:
df = pd.DataFrame({'col1' : ["a","b",3,'ñ'],
'col2' : ["ab",np.nan, 4,'ü'], 'col3' : ["w","e", 6,'á']})
df
Out[81]:
  col1 col2 col3
0    a   ab    w
1    b  NaN    e
2    3    4    6
3    ñ    ü    á
In [82]:
df['new_col'] = df.apply(lambda x: '*'.join(x.dropna().astype(str).values), axis=1)
df
Out[82]:
  col1 col2 col3 new_col
0    a   ab    w  a*ab*w
1    b  NaN    e     b*e
2    3    4    6   3*4*6
3    ñ    ü    á   ñ*ü*á
My code still works with Spanish characters.
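As a possible alternative to apply, the same result can be built with a plain list comprehension over the rows. This is a sketch of a different technique from the one in the answer, assuming Python 3 (where str is Unicode) and using the illustrative names cols and joined, which do not appear in the original.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': ["a", "b", 3, 'ñ'],
                   'col2': ["ab", np.nan, 4, 'ü'],
                   'col3': ["w", "e", 6, 'á']})

cols = ['col1', 'col2', 'col3']  # columns to join (illustrative name)

# Walk the rows as plain tuples, skip missing values, stringify the rest
joined = ['*'.join(str(v) for v in row if pd.notna(v))
          for row in df[cols].itertuples(index=False, name=None)]
df['new_col'] = joined
print(df['new_col'].tolist())  # ['a*ab*w', 'b*e', '3*4*6', 'ñ*ü*á']
```

One caveat applies to both approaches: in a purely numeric column, a NaN forces the column to float, so an integer 4 would be rendered as "4.0". In the example above the columns mix strings and numbers, so they are stored as object dtype and the integers keep their form.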