The following code

import pandas as pd

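# Load the tracking data: one row per frame, four columns per tracked point.
df = pd.read_csv('trace.data')

# Show the first two rows (label-based slice, inclusive of row 1).
print(df.loc[0:1, :])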


produces the following DataFrame:

   frame#  X-1  Y-1  Angle-1  Error-1  X-5  Y-5  Angle-5  Error-5  X-12  \
0       1  NaN  NaN      NaN      NaN  NaN  NaN      NaN      NaN   NaN
1       2  NaN  NaN      NaN      NaN  NaN  NaN      NaN      NaN   NaN

      ...      Angle-1355  Error-1355  X-1384  Y-1384  Angle-1384  Error-1384  \
0     ...             NaN         NaN     NaN     NaN         NaN         NaN
1     ...             NaN         NaN     NaN     NaN         NaN         NaN

   X-1408  Y-1408  Angle-1408  Error-1408
0     853    2340  283.262859           0
1     NaN     NaN         NaN         NaN

[2 rows x 801 columns]


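Each row holds the full set of measurements taken for a single image frame.

The first column is the frame number.

From the second column onward, each group of four consecutive columns gives the X position, Y position, angle, and error of one measured point.

The number i in X-i, Y-i, Angle-i, Error-i is the ID of that point.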
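To make the naming scheme concrete, here is a minimal sketch of how such column names decompose into a dimension name and a point ID. The short `columns` index below is a hypothetical excerpt of the real 801 columns, and the level names `dimension` and `point` are chosen here for illustration only.

```python
import pandas as pd

# Hypothetical excerpt of the header: frame number plus two tracked points.
columns = pd.Index(['frame#', 'X-1', 'Y-1', 'Angle-1', 'Error-1',
                    'X-5', 'Y-5', 'Angle-5', 'Error-5'])

# Skip the frame column, then split each name on the hyphen.
measure_cols = columns[1:]
parts = measure_cols.str.split('-', expand=True)  # MultiIndex of (dimension, point)
parts.names = ['dimension', 'point']

# Point IDs come back as strings; convert with int() if numeric IDs are needed.
point_ids = parts.get_level_values('point').unique().tolist()
dimensions = parts.get_level_values('dimension').unique().tolist()
print(point_ids)
print(dimensions)
```

Note that the split yields string IDs, so sorting them as-is would order '1355' before '5'.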

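I would like to reshape the DataFrame into one with these columns:


frame#
point ID (the i in X-i, Y-i, etc.)
dimension name (e.g. X, Y, etc.)
measurement (the actual measured value, float64)


How would a respectable pandas user do this?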

Best answer

df = pd.DataFrame({'frame': [1, 2],
                   'Angle-1': [1.6288175485083471, -0.16980795008048055],
                   'Angle-1355': [-0.23364001238956567, 0.10508954185705043],
                   'Angle-1384': [-0.1055306764132989, 1.5766485876766343],
                   'Angle-5': [1.0530749477672805, -0.58051944875155881],
                   'Error-1': [-0.22597615373237354, -0.067869089031437124],
                   'Error-1355': [-1.1205136108736824, 1.5398343350154859],
                   'Error-1384': [0.2072177497820725, 1.5802856128691691],
                   'Error-5': [-0.054906215727689098, -0.115633635459458],
                   'X-1': [1.2374207482997275, -0.74052859017582551],
                   'X-12': [-0.10554748111840574, 0.51297919944988468],
                   'X-1384': [2.2710928129358541, 2.2873598143523743],
                   'X-5': [-0.68576722189220918, 1.480319768103725],
                   'Y-1': [-0.72686786051739416, 1.662550986420245],
                   'Y-1384': [-1.384276797510166, 0.89414830326943084],
                   'Y-5': [-0.12183746322452065, 1.0471295991115857]})


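Given the sample DataFrame above, you can pop off the frame column and use a list comprehension to repeat it into a flat list, one copy per remaining column. Split the column names on the hyphen and reassign them to create a MultiIndex. Then concatenate new_frames horizontally with the melted DataFrame.

Voilà!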

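# Remove the frame column from df and keep it as a Series.
frames = df.pop('frame')
# Repeat the frame numbers once per remaining column, matching melt's row order.
new_frames = [i for j in range(df.shape[1]) for i in frames]

# Turn 'Angle-1' style names into a (dimension, point) MultiIndex.
df.columns = df.columns.str.split('-', expand=True)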

>>> (pd.concat([pd.DataFrame(new_frames), pd.melt(df)], axis=1, ignore_index=True)
     .rename(columns={0: 'frame', 1: 'dimension', 2: 'point', 3: 'measurement'}))
    frame dimension point  measurement
0       1     Angle     1     1.628818
1       2     Angle     1    -0.169808
2       1     Angle  1355    -0.233640
3       2     Angle  1355     0.105090
4       1     Angle  1384    -0.105531
5       2     Angle  1384     1.576649
6       1     Angle     5     1.053075
7       2     Angle     5    -0.580519
8       1     Error     1    -0.225976
9       2     Error     1    -0.067869
10      1     Error  1355    -1.120514
11      2     Error  1355     1.539834
12      1     Error  1384     0.207218
13      2     Error  1384     1.580286
14      1     Error     5    -0.054906
15      2     Error     5    -0.115634
16      1         X     1     1.237421
17      2         X     1    -0.740529
18      1         X    12    -0.105547
19      2         X    12     0.512979
20      1         X  1384     2.271093
21      2         X  1384     2.287360
22      1         X     5    -0.685767
23      2         X     5     1.480320
24      1         Y     1    -0.726868
25      2         Y     1     1.662551
26      1         Y  1384    -1.384277
27      2         Y  1384     0.894148
28      1         Y     5    -0.121837
29      2         Y     5     1.047130
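
As an alternative to the approach above (not part of the original answer), the frame column can be kept in place by passing it to melt as id_vars, which avoids building the repeated frame list by hand. The sketch below uses a small made-up `wide` DataFrame in the question's layout; the intermediate column name `column` and the NaN-dropping step are assumptions about what is wanted, since the real data is mostly NaN.

```python
import pandas as pd

NAN = float('nan')

# Made-up sample in the same wide layout as the question.
wide = pd.DataFrame({
    'frame': [1, 2],
    'X-1': [853.0, NAN], 'Y-1': [2340.0, NAN],
    'Angle-1': [283.26, NAN], 'Error-1': [0.0, NAN],
    'X-5': [NAN, 10.0], 'Y-5': [NAN, 20.0],
    'Angle-5': [NAN, 45.0], 'Error-5': [NAN, 1.0],
})

# Melt every measurement column while keeping the frame number on each row.
long = wide.melt(id_vars='frame', var_name='column', value_name='measurement')

# Split names such as 'X-1' into dimension 'X' and point ID 1.
long[['dimension', 'point']] = long['column'].str.split('-', expand=True)
long['point'] = long['point'].astype(int)

# Drop rows without a measurement and put the columns in the requested order.
long = (long.dropna(subset=['measurement'])
            .drop(columns='column')
            [['frame', 'point', 'dimension', 'measurement']]
            .reset_index(drop=True))
print(long)
```

Converting the point ID to int makes it sort numerically; skip the dropna step if the missing measurements should stay in the result.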

This post, "python - How to reshape a pandas.read_csv result into the following format?", is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/35589992/
