Problem description
Here is the question I have in mind. Given a table:
   Id   type
0   1  [a,b]
1   2    [c]
2   3  [a,d]
I want to convert it into the following form:
   Id  a  b  c  d
0   1  1  1  0  0
1   2  0  0  1  0
2   3  1  0  0  1
I need a very efficient way to convert a large table. Any comments are welcome.
====================================
I have received several good answers and really appreciate your help.
A new problem has come up: my laptop does not have enough memory to generate the whole DataFrame with pd.get_dummies.
Is there any way to generate a sparse vector row by row and then stack them together?
Recommended answer
Try this:
>>> df
   Id    type
0   1  [a, b]
1   2     [c]
2   3  [a, d]
>>> df2 = pd.DataFrame([x for x in df['type'].apply(
...     lambda item: dict(map(
...         lambda x: (x, 1),
...         item))
...     ).values]).fillna(0).astype(int)
>>> df2.join(df)
   a  b  c  d  Id    type
0  1  1  0  0   1  [a, b]
1  0  0  1  0   2     [c]
2  1  0  0  1   3  [a, d]
It basically converts the list of lists into a list of dicts and constructs a DataFrame from it. Labels missing from a row come out as NaN, which fillna(0) turns into zeros.
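For the memory follow-up, one possible direction is to avoid building the dense table at all and record only the nonzero cells, row by row. The sketch below is not part of the original answer. It uses plain Python, and the function name encode_rows_sparse and its return layout are made up for illustration. It walks the lists once, assigns each new label the next free column index, and collects (row, column, count) triplets.

```python
from collections import Counter


def encode_rows_sparse(rows):
    """Encode an iterable of label lists as COO-style triplets.

    Hypothetical helper: only nonzero cells are stored, so memory grows
    with the number of labels present rather than rows * distinct labels.
    """
    col_of = {}  # label -> column index, assigned the first time it is seen
    row_idx, col_idx, data = [], [], []
    n_rows = 0
    for r, labels in enumerate(rows):
        n_rows = r + 1
        # Counter gives the frequency of each label within this row
        for label, count in Counter(labels).items():
            c = col_of.setdefault(label, len(col_of))
            row_idx.append(r)
            col_idx.append(c)
            data.append(count)
    columns = list(col_of)  # column names in first-seen order
    return row_idx, col_idx, data, columns, n_rows


# The 'type' column from the question, written as plain lists
types = [["a", "b"], ["c"], ["a", "d"]]
row_idx, col_idx, data, columns, n_rows = encode_rows_sparse(types)
```

If SciPy is available, these triplets should be usable as scipy.sparse.csr_matrix((data, (row_idx, col_idx)), shape=(n_rows, len(columns))), which keeps the result sparse instead of materializing every zero. Because rows is only iterated once, it can also be a generator that reads the table in chunks.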