Question
What is the best way to store and analyze high-dimensional data in Python? I like the Pandas DataFrame and Panel, where I can easily manipulate the axes. Now I have a hypercube (dim >= 4) of data. I have been thinking of things like a dict of Panels, or tuples as Panel entries. I wonder if there is a high-dimensional Panel-like object in Python.
Update 20/05/16: Thanks very much for all the answers. I have tried MultiIndex and xarray, but I am not yet able to comment on either of them. For my problem I will try to use an ndarray instead, as I found that the labels are not essential and I can save them separately.
Update 16/09/16: I ended up using MultiIndex. The ways to manipulate it are pretty tricky at first, but I have gotten used to it by now.
Recommended answer
MultiIndex is most useful for higher-dimensional data, as explained in the docs and this SO answer, because it allows you to work with any number of dimensions in a DataFrame environment.
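As a rough illustration of that idea, here is a minimal sketch that flattens a 4-D array into a DataFrame whose row MultiIndex carries three of the dimensions and whose columns carry the fourth. The axis names and labels (year, region, product, price, volume) are invented for the example and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical axis labels for a 4-D data set (all names are made up).
years = [2014, 2015]
regions = ["north", "south"]
products = ["a", "b", "c"]
metrics = ["price", "volume"]

# A plain 4-D array with shape (2, 2, 3, 2).
cube = np.arange(2 * 2 * 3 * 2, dtype=float).reshape(2, 2, 3, 2)

# The first three dimensions become levels of the row MultiIndex;
# the last dimension becomes the columns.
index = pd.MultiIndex.from_product(
    [years, regions, products], names=["year", "region", "product"]
)
df = pd.DataFrame(cube.reshape(-1, len(metrics)), index=index, columns=metrics)

# Select a single cell by giving one label per level plus a column.
price = df.loc[(2015, "south", "b"), "price"]

# Take a cross-section along one dimension (drops the "region" level).
south = df.xs("south", level="region")

# Aggregate over all dimensions except "year".
by_year = df.groupby(level="year").sum()
```

Which dimension goes into the columns is a design choice; `stack` and `unstack` can move a level between rows and columns later if a different layout turns out to be more convenient.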
In addition to Panel, there is also Panel4D, currently in an experimental stage. Given the advantages of MultiIndex, I wouldn't recommend using either this or the three-dimensional version. I don't think these data structures have gained much traction in comparison, and they will indeed be phased out.
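For data that would otherwise go into a Panel, a MultiIndex can cover the same ground. The sketch below is only an illustration under assumed labels (the item, date and ticker names are hypothetical): it stores 3-D data as a Series with a three-level MultiIndex, pulls out a 2-D slice comparable to one Panel item, and recovers the raw ndarray.

```python
import numpy as np
import pandas as pd

# Hypothetical 3-D data laid out as (item, date, ticker).
items = ["close", "open"]
dates = pd.date_range("2016-01-01", periods=3)
tickers = ["X", "Y"]
values = np.arange(12, dtype=float).reshape(2, 3, 2)

index = pd.MultiIndex.from_product(
    [items, dates, tickers], names=["item", "date", "ticker"]
)
s = pd.Series(values.ravel(), index=index)

# A 2-D date x ticker frame for one item, similar to indexing a Panel by item.
open_frame = s.xs("open", level="item").unstack("ticker")

# The labels are only an index; the underlying array can be recovered by reshaping.
restored = s.values.reshape(len(items), len(dates), len(tickers))
```

The same pattern extends to four or more dimensions by adding levels to the index.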