How to split a column of tuples in a pandas dataframe?
Question
I have a pandas dataframe (this is only a small piece of it):
>>> d1
y norm test y norm train len(y_train) len(y_test) \
0 64.904368 116.151232 1645 549
1 70.852681 112.639876 1645 549
SVR RBF \
0 (35.652207342877873, 22.95533537448393)
1 (39.563683797747622, 27.382483096332511)
LCV \
0 (19.365430594452338, 13.880062435173587)
1 (19.099614489458364, 14.018867136617146)
RIDGE CV \
0 (4.2907610988480362, 12.416745648065584)
1 (4.18864306788194, 12.980833914392477)
RF \
0 (9.9484841581029428, 16.46902345373697)
1 (10.139848213735391, 16.282141345406522)
GB \
0 (0.012816232716538605, 15.950164822266007)
1 (0.012814519804493328, 15.305745202851712)
ET DATA
0 (0.00034337162272515505, 16.284800366214057) j2m
1 (0.00024811554516431878, 15.556506191784194) j2m
>>>
I want to split all the columns that contain tuples. For example, I want to replace the column LCV with the columns LCV-a and LCV-b.
How can I do that?
Edit:
The proposed solution does not work. Why?
>>> d1['LCV'].apply(pd.Series)
0
0 (19.365430594452338, 13.880062435173587)
1 (19.099614489458364, 14.018867136617146)
>>>
Edit:
This seems to work:
>>> d1['LCV'].apply(eval).apply(pd.Series)
0 1
0 19.365431 13.880062
1 19.099614 14.018867
>>>
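A likely explanation, offered as an assumption rather than something the output above proves: if apply(pd.Series) returns the tuples unchanged but apply(eval) fixes it, the column probably holds strings that look like tuples (for example after a CSV round trip), not tuple objects. The sketch below uses made-up data in that shape and swaps eval for ast.literal_eval, which parses Python literals only and so cannot run arbitrary code.

```python
import ast

import pandas as pd

# Hypothetical data: tuples stored as strings, not as tuple objects.
d1 = pd.DataFrame({
    "LCV": [
        "(19.365430594452338, 13.880062435173587)",
        "(19.099614489458364, 14.018867136617146)",
    ]
})

# Parse each string into a real tuple; literal_eval accepts literals only.
parsed = d1["LCV"].apply(ast.literal_eval)

# Now that the values are tuples, each one expands into a row with two columns.
split = parsed.apply(pd.Series)
print(split)
```

If the column already contains real tuples, the parsing step is unnecessary and literal_eval would raise an error on non-string input.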
Recommended answer
You can do this by applying pd.Series to that column:
In [13]: df = pd.DataFrame({'a':[1,2], 'b':[(1,2), (3,4)]})
In [14]: df
Out[14]:
a b
0 1 (1, 2)
1 2 (3, 4)
In [16]: df['b'].apply(pd.Series)
Out[16]:
0 1
0 1 2
1 3 4
In [17]: df[['b1', 'b2']] = df['b'].apply(pd.Series)
In [18]: df
Out[18]:
a b b1 b2
0 1 (1, 2) 1 2
1 2 (3, 4) 3 4
This works because each tuple is turned into a Series, which is then treated as a row of a dataframe.
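To apply the same idea to every tuple column at once, as the question asks, something like the following might work. It is a minimal sketch under stated assumptions: the helper name split_tuple_columns and the "-a"/"-b" suffixes are illustrative, column names are unique, and every tuple has exactly two elements. It builds the new columns with pd.DataFrame(values.tolist()) instead of apply(pd.Series), which is usually considerably faster on a large frame.

```python
import pandas as pd


def split_tuple_columns(df, suffixes=("a", "b")):
    """Return a copy of df with each 2-tuple column replaced by two columns.

    Hypothetical helper: assumes unique column names and 2-element tuples.
    """
    out = df.copy()
    for col in df.columns:
        values = df[col]
        # Skip empty columns and columns where any value is not a tuple.
        if values.empty or not values.map(lambda v: isinstance(v, tuple)).all():
            continue
        # One row per tuple, one column per tuple element.
        parts = pd.DataFrame(values.tolist(), index=df.index)
        parts.columns = [f"{col}-{s}" for s in suffixes]
        # Put the new columns where the original column used to be.
        position = out.columns.get_loc(col)
        out = out.drop(columns=col)
        for offset, name in enumerate(parts.columns):
            out.insert(position + offset, name, parts[name])
    return out


df = pd.DataFrame({
    "a": [1, 2],
    "LCV": [(1.0, 2.0), (3.0, 4.0)],
    "DATA": ["j2m", "j2m"],
})
result = split_tuple_columns(df)
print(result)
```

Non-tuple columns such as a and DATA pass through unchanged, and the original frame is not modified.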