This article explains how to split a pandas column into two parts on a delimiter that may not be present in every value.

Problem description

One of the columns of my DataFrame looks something like this:

[application]
blah/3.14
xyz/5.2
abc
...
...

(representing software/version)

I'm trying to achieve something like this:

[application] [name]  [ver]
blah/3.14      blah    3.14
xyz/5.2        xyz     5.2 
abc            abc     na   <-- this missing value can be filled in with a string too
...  
...

As you can already tell, I'd like to split the column into two, using '/' as a delimiter. A Stack Overflow solution suggests something like this:

tmptbl = pd.DataFrame(main_tbl.application.str.split('/', 1).tolist(), columns= ['name', 'ver'])
main_tbl['name'] = tmptbl.name
main_tbl['ver'] = tmptbl.ver

This looks fine at first, but it crashes for values without '/', such as 'abc'.

What else can I try?

Recommended answer

Use str.split with the parameter expand=True, which returns a DataFrame:

main_tbl[['name','ver']] = main_tbl.application.str.split('/', expand=True)
print (main_tbl)
  application  name   ver
0   blah/3.14  blah  3.14
1     xyz/5.2   xyz   5.2
2         abc   abc  None
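
For a version that can be run on its own, here is a minimal sketch of the same idea. The sample data mirrors the question, except for the last row, which is a hypothetical value added to show why passing n=1 can help: it limits the split to the first '/', so no value produces more than two parts. The reindex call is likewise an assumption, not part of the original answer; it is there so that a second column exists even if no value contains the delimiter.

```python
import pandas as pd

# Sample data from the question; the last row is a hypothetical extra case.
main_tbl = pd.DataFrame(
    {"application": ["blah/3.14", "xyz/5.2", "abc", "tool/1.0/beta"]}
)

# Split on the first '/' only, so each value yields at most two parts.
# reindex keeps both columns present even if no value contains '/'.
parts = (
    main_tbl["application"]
    .str.split("/", n=1, expand=True)
    .reindex(columns=[0, 1])
)

main_tbl["name"] = parts[0]
main_tbl["ver"] = parts[1]  # missing (None/NaN) where there was no '/'

print(main_tbl)
```

Assigning the two columns one at a time avoids a length mismatch when the split yields a different number of columns than expected.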

If you need NaN instead of None, add replace (this assumes numpy is imported as np):

main_tbl.ver = main_tbl.ver.replace({None:np.nan})
print (main_tbl)
  application  name   ver
0   blah/3.14  blah  3.14
1     xyz/5.2   xyz   5.2
2         abc   abc   NaN
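
The question also notes that the missing version could be filled in with a string instead. A minimal sketch of that variant follows, assuming fillna is acceptable here; the placeholder text "unknown" is a hypothetical choice, not something from the original answer.

```python
import pandas as pd

main_tbl = pd.DataFrame({"application": ["blah/3.14", "xyz/5.2", "abc"]})

# Split on the first '/' and expand the result into two columns.
main_tbl[["name", "ver"]] = main_tbl["application"].str.split(
    "/", n=1, expand=True
)

# fillna treats both None and NaN as missing, so either is replaced.
main_tbl["ver"] = main_tbl["ver"].fillna("unknown")

print(main_tbl)
```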

10-29 16:57