This article covers how to create multiple columns from a single column.

Problem description

I am working on a data frame that has a column like the following:

         Products
1           A;B
2           A
3           D;A;C

I would like to turn it into:

          Has_A      Has_B        Has_C   ...
1           1          1            0
2           1          0            0

Also, as a further step, some rows contain something like "No products" or "None", and there are NaNs. I would like to put all of these into one column (if possible).

Any tips? Is this possible?

Thanks.

Recommended answer

You can mainly use str.get_dummies, which splits each string on a separator and returns one 0/1 indicator column per distinct value:

df = df['Products'].str.get_dummies(';').add_prefix('Has_')
print (df)
   Has_A  Has_B  Has_C  Has_D
0      1      1      0      0
1      1      0      0      0
2      1      0      1      1
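
To make the one-liner above easier to try, here is a minimal self-contained sketch. It assumes the question's three sample rows and a 1-based index (as displayed in the question); the names dummies and result are only illustrative. It also keeps the original Products column beside the indicators by joining on the index, which the answer itself does not do.

```python
import pandas as pd

# Sample data copied from the question; the 1-based index mirrors how it was displayed there.
df = pd.DataFrame({'Products': ['A;B', 'A', 'D;A;C']}, index=[1, 2, 3])

# Split each string on ';' and build one 0/1 indicator column per distinct token.
dummies = df['Products'].str.get_dummies(sep=';').add_prefix('Has_')

# Keep the original column next to the indicators (join aligns rows on the index).
result = df.join(dummies)
print(result)
```

Assigning to a new name instead of overwriting df keeps the source column available for later checks.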

Example:

This solution also uses replace with a dict, built by a dict comprehension, that maps the placeholder strings as well as NaN and None to a single label.

import numpy as np
import pandas as pd

df = pd.DataFrame({'Products': ['A;B', 'A', 'D;A;C', 'No prods', np.nan, 'None']})
print (df)
   Products
0       A;B
1         A
2     D;A;C
3  No prods
4       NaN
5      None

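# Placeholder strings that all mean "no product"
L = ['No prods', 'None']
# Map each placeholder, plus None and NaN, to one common label
d = {x: 'No product' for x in L + [None, np.nan]}
df['Products'] = df['Products'].replace(d)
df = df['Products'].str.get_dummies(';').add_prefix('Has_')
print(df)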
   Has_A  Has_B  Has_C  Has_D  Has_No product
0      1      1      0      0               0
1      1      0      0      0               0
2      1      0      1      1               0
3      0      0      0      0               1
4      0      0      0      0               1
5      0      0      0      0               1
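
As a possible variation (not part of the original answer), the missing values and the placeholder strings can be handled in two explicit steps with fillna and isin, which avoids relying on None and NaN as dict keys. This is only a sketch under the same sample data; the names placeholders, products and dummies are hypothetical.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Products': ['A;B', 'A', 'D;A;C', 'No prods', np.nan, 'None']})

# Hypothetical list of strings that should count as "no product"; extend as needed.
placeholders = ['No prods', 'None']

# Step 1: give missing values the common label.
products = df['Products'].fillna('No product')
# Step 2: overwrite the placeholder strings with the same label.
products = products.mask(products.isin(placeholders), 'No product')

# Step 3: one 0/1 indicator column per distinct token, as before.
dummies = products.str.get_dummies(sep=';').add_prefix('Has_')
print(dummies)
```

Either route should give the same indicator columns for this data; the two-step version just makes it clearer which rows were missing and which held a placeholder string.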

