Problem description
I am working on a data frame that has a column with the following contents:
Products
1 A;B
2 A
3 D;A;C
I would like to change it to:
Has_A Has_B Has_C ...
1 1 1 0
2 1 0 0
Also, as a further step, some rows contain values such as "No products" or "None", and there are NaNs; I would like to put all of these into a single column (if possible).
Any tips? Is this possible?
Thanks
Recommended answer
You can use str.get_dummies, mainly:
df = df['Products'].str.get_dummies(';').add_prefix('Has_')
print (df)
Has_A Has_B Has_C Has_D
0 1 1 0 0
1 1 0 0 0
2 1 0 1 1
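The one-liner above replaces df with the indicator columns. As a minimal, self-contained sketch (the variable names dummies and result are only illustrative and not part of the original answer), the same call can be kept in a separate frame and joined back so that the original Products column is preserved:

```python
import pandas as pd

# Example data taken from the question
df = pd.DataFrame({'Products': ['A;B', 'A', 'D;A;C']})

# Split each string on ';' and build one 0/1 indicator column per distinct value
dummies = df['Products'].str.get_dummies(sep=';').add_prefix('Has_')

# Keep the original column next to the indicators (join aligns on the index)
result = df.join(dummies)
print(result)
```

Whether to overwrite df or join the indicators back depends on whether the raw Products strings are still needed later.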
Sample:
There is also a solution that uses replace with a dict, created with a dict comprehension, so that NaN and None are handled as well.
import numpy as np
import pandas as pd

df = pd.DataFrame({'Products': ['A;B', 'A', 'D;A;C', 'No prods', np.nan, 'None']})
print (df)
Products
0 A;B
1 A
2 D;A;C
3 No prods
4 NaN
5 None
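# Placeholder strings that all mean "no product"
L = ['No prods','None']
# Map each placeholder, plus None and NaN, to one common label
d = {x: 'No product' for x in L + [None, np.nan]}
df['Products'] = df['Products'].replace(d)
# Split on ';' and build one 0/1 indicator column per distinct value
df = df['Products'].str.get_dummies(';').add_prefix('Has_')
print(df)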
Has_A Has_B Has_C Has_D Has_No product
0 1 1 0 0 0
1 1 0 0 0 0
2 1 0 1 1 0
3 0 0 0 0 1
4 0 0 0 0 1
5 0 0 0 0 1
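As a possible variation on the dict-based replace, the placeholder strings and the missing values can be handled in two separate steps. This is only a sketch under the assumption that 'No prods' and 'None' are the only placeholder strings in the data; the names placeholders, cleaned and flags are illustrative and do not come from the original answer:

```python
import numpy as np
import pandas as pd

# Example data taken from the answer above
df = pd.DataFrame({'Products': ['A;B', 'A', 'D;A;C', 'No prods', np.nan, 'None']})

# Assumed list of placeholder strings that all mean "no product"
placeholders = ['No prods', 'None']

# Step 1: map the placeholder strings to one label; step 2: fill missing values
cleaned = df['Products'].replace(placeholders, 'No product').fillna('No product')

# One 0/1 indicator column per distinct value, including the common label
flags = cleaned.str.get_dummies(sep=';').add_prefix('Has_')
print(flags)
```

Splitting the two steps keeps the missing-value handling explicit in fillna, which may be easier to read than putting None and NaN into the dict keys.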