This article covers converting a Pandas DataFrame or Panel to a 3D NumPy array.

Question

Setup:

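import numpy as np
import pandas as pd

pdf = pd.DataFrame(np.random.rand(4, 5), columns=list('abcde'))
# Give 'a' two repeated values so it acts as a group ID (.loc avoids chained assignment)
pdf.loc[2:, 'a'] = pdf.loc[0, 'a']
pdf.loc[:1, 'a'] = pdf.loc[1, 'a']
pdf.set_index(['a', 'b'])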

Output:

                         c           d           e
a           b
0.439502    0.115087     0.832546    0.760513    0.776555
            0.609107     0.247642    0.031650    0.727773
0.995370    0.299640     0.053523    0.565753    0.857235
            0.392132     0.832560    0.774653    0.213692

Each data series is grouped by the index ID a, and b is a time index for the other features of a. Is there a way to get pandas to produce a 3-D NumPy array that reflects the a groupings? Currently the data is read as two-dimensional, so pdf.shape outputs (4, 5). What I would like is an array of this variable form:

array([[[-1.38655912, -0.90145951, -0.95106951,  0.76570984],
        [-0.21004144, -2.66498267, -0.29255182,  1.43411576],
        [-0.21004144, -2.66498267, -0.29255182,  1.43411576]],

       [[ 0.0768149 , -0.7566995 , -2.57770951,  0.70834656],
        [-0.99097395, -0.81592084, -1.21075386,  0.12361382]]])

Is there a native Pandas way to do this? Note that the number of rows per a grouping in the actual data is variable, so I cannot just transpose or reshape pdf.values. If there isn't a native way, what is the best method for iteratively constructing the arrays from hundreds of thousands of rows and hundreds of columns?

Recommended answer

panel.values

will return a NumPy array directly. By necessity it has the most general dtype that can hold every value, since everything is packed into a single 3-D NumPy array. It will be a new array, not a view of the pandas data (no matter the dtype).
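
Panel was removed from pandas in later releases (0.25 onward), so the answer above only applies to older versions. As a rough sketch of how the same result might be obtained from a plain DataFrame, the example below pads shorter groups with NaN so that every group fits in one rectangular 3-D array. The helper name groups_to_3d, its parameters, and the choice of NaN padding are assumptions made for illustration and are not part of the original answer. It also assumes the grouping column contains no missing values.

```python
import numpy as np
import pandas as pd


def groups_to_3d(df, key, value_cols):
    """Stack the groups of `df` (grouped by column `key`) into one 3-D array.

    Hypothetical helper: groups shorter than the longest one are padded
    with NaN at the end. Returns (array, labels), where array has shape
    (n_groups, max_rows_per_group, len(value_cols)).
    """
    # Integer code per row identifying its group, plus the unique labels
    # in order of first appearance.
    codes, labels = pd.factorize(df[key])
    # Position of each row inside its own group: 0, 1, 2, ...
    pos = df.groupby(key, sort=False).cumcount().to_numpy()
    out = np.full((len(labels), pos.max() + 1, len(value_cols)), np.nan)
    # One vectorized assignment instead of a Python loop over groups.
    out[codes, pos] = df[value_cols].to_numpy(dtype=float)
    return out, labels


# Small made-up frame: group 1 has three rows, group 2 has two.
df = pd.DataFrame({
    "a": [1, 1, 1, 2, 2],
    "b": [0, 1, 2, 0, 1],
    "c": [10.0, 11.0, 12.0, 20.0, 21.0],
    "d": [0.1, 0.2, 0.3, 0.4, 0.5],
})
arr, labels = groups_to_3d(df, "a", ["c", "d"])
print(arr.shape)  # (2, 3, 2)
print(arr[1, 2])  # padding row for the shorter group: [nan nan]
```

Because the padding value is NaN, the result is always a float array, which matches the point above about everything being upcast to a single dtype. The fancy-indexed assignment avoids iterating over groups in Python, which may matter with hundreds of thousands of rows.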

