This article describes how to convert data files into an "X, Y, Z, Data" format.

Problem description

I have three datasets. The first, named Data, holds my data: a table with 5 columns and 3 rows. Each column represents a specific location, identified by an (X, Y) pair, and each row represents a specific depth (Z). The second dataset holds the 5 X, Y locations (the columns of the first dataset), and a third file holds the 3 Z values (the rows of the Data table).

import numpy as np
Data = np.arange(1, 16).reshape(3, 5)  # holds the 'data' I am interested in (3 depths x 5 locations)
X = [0, 0, 1, 1, 2]  # create 'X', 'Y' values
Y = [0, 1, 0, 1, 0]
# this is the format I have the 'X' and 'Y' values: one (X, Y) pair per row
# (transpose rather than reshape(5, 2), which would scramble the pairs)
XY = np.array((X, Y)).T
Z = [-1, -5, -10]
z = np.array(Z)

I now want to combine everything into a new NumPy array (or pandas DataFrame) in X, Y, Z, Data format. For example, for the data given, the first 6 rows of the table should be:

X Y   Z Data  # header, added only to make reading easier
0 0  -1   1
0 0  -5   6
0 0 -10  11
0 1  -1   2
0 1  -5   7
0 1 -10  12

etc.

Any hint on how to do this would be great. I am thinking of using pandas to create the proper (multi)index columns, but I cannot find the right way to do so.

Recommended answer

Build a MultiIndex from X and Y, then use unstack.

In [4]: columns = pd.MultiIndex.from_arrays([X, Y])

In [5]: df = pd.DataFrame(Data, columns=columns, index=Z)

In [6]: df
Out[6]:
      0       1       2
      0   1   0   1   0
-1    1   2   3   4   5
-5    6   7   8   9  10
-10  11  12  13  14  15

In [7]: df1 = df.unstack().reset_index()

In [8]: df1.columns = ['X', 'Y', 'Z', 'Data']

In [9]: df1
Out[9]:
    X  Y   Z  Data
0   0  0  -1     1
1   0  0  -5     6
2   0  0 -10    11
3   0  1  -1     2
4   0  1  -5     7
5   0  1 -10    12
6   1  0  -1     3
7   1  0  -5     8
8   1  0 -10    13
9   1  1  -1     4
10  1  1  -5     9
11  1  1 -10    14
12  2  0  -1     5
13  2  0  -5    10
14  2  0 -10    15
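
The session above does not show its imports or setup, so here is a minimal self-contained sketch of the same idea. It assumes the X, Y locations arrive as a (5, 2) array with one pair per row, as the question describes; the names `long_df` and the level names passed to `names=` are my own additions. Naming the levels up front means the columns do not have to be renamed by hand afterwards.

```python
import numpy as np
import pandas as pd

Data = np.arange(1, 16).reshape(3, 5)  # 3 depths x 5 locations
# Assumed layout: one (X, Y) pair per row
XY = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [2, 0]])
Z = [-1, -5, -10]

# Name the column levels and the index so they carry through unstack()
columns = pd.MultiIndex.from_arrays([XY[:, 0], XY[:, 1]], names=["X", "Y"])
df = pd.DataFrame(Data, columns=columns, index=pd.Index(Z, name="Z"))

# unstack() returns a Series indexed by (X, Y, Z);
# reset_index(name=...) flattens it into a four-column table
long_df = df.unstack().reset_index(name="Data")
print(long_df.head(6))
```

Calling `long_df.to_numpy()` should give the same table as a plain NumPy array, if that is the form you need.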

I chose to make X, Y, and Z proper columns (reset_index()) instead of leaving them as a three-level MultiIndex. Generally, this is cleaner and more useful.
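
Since the question also accepts a plain NumPy array as the result, the same table can probably be built without pandas. This is only a sketch, not part of the original answer: it repeats each location once per depth, tiles the depths once per location, and reads Data column by column. The names `table`, `n_z`, and `n_loc` are mine.

```python
import numpy as np

Data = np.arange(1, 16).reshape(3, 5)  # 3 depths x 5 locations
X = np.array([0, 0, 1, 1, 2])
Y = np.array([0, 1, 0, 1, 0])
Z = np.array([-1, -5, -10])

n_z, n_loc = Data.shape

table = np.column_stack((
    np.repeat(X, n_z),  # each X value repeated once per depth
    np.repeat(Y, n_z),  # each Y value repeated once per depth
    np.tile(Z, n_loc),  # the full Z sequence repeated once per location
    Data.T.ravel(),     # transpose so values run location by location
))
print(table[:6])
```

This relies on the columns of Data being in the same order as the X, Y pairs and the rows in the same order as Z, which is what the question states.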


08-29 02:11