I'm new to Python and I'm trying to figure out how to load a data file that contains a block of data for each time step, for example:

TIME:,0
Q01 : A:,-10.7436,0.000536907,-0.00963283,0.00102934
Q02 : B:,0,0.0168694,-0.000413983,0.00345921
Q03 : C:,0.0566665
Q04 : D:,0.074456
Q05 : E:,0.077456
Q06 : F:,0.0744835
Q07 : G:,0.140448
Q08 : H:,-0.123968
Q09 : I:,0
Q10 : J:,0.00204377,0.0109621,-0.0539183,0.000708574
Q11 : K:,-2.86115e-17,0.00947104,0.0145645,1.05458e-16,-1.90972e-17,-0.00947859
Q12 : L:,-0.0036781,0.00161254
Q13 : M:,-0.00941257,0.000249692,-0.0046302,-0.00162387,0.000981709,-0.0135982,-0.0223496,-0.00872062,0.00548815,0.0114075,.........,-0.00196206
Q14 : N:,3797, 66558
Q15 : O:,0.0579981
Q16 : P:,0
Q17 : Q:,625

TIME:,0.1
Q01 : A:,-10.563,0.000636907,-0.00963283,0.00102934
Q02 : B:,0,0.01665694
Q03 : C:,0.786,-0.000666,0.6555
Q04 : D:,0.87,0.96
Q05 : E:,0.077456
Q06 : F:,0.07447835
Q07 : G:,0.140448
Q08 : H:,-0.123968
Q09 : I:,0
Q10 : J:,0.00204377,0.0109621,-0.0539183,0.000708574
Q11 : K:,-2.86115e-17,0.00947104,0.0145645,1.05458e-16,-1.90972e-17,-0.00947859
Q12 : L:,-0.0036781,0.00161254
Q13 : M:,-0.00941257,0.000249692,-0.0046302,-0.00162387,0.000981709,-0.0135982,-0.0223496,-0.00872062,0.00548815,0.0114075,.........,-0.00196206
Q14 : N:,3797, 66558
Q15 : O:,0.0579981
Q16 : P:,0,2,4
Q17 : Q:,786

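Each block contains a number of variables, and the number of data columns per variable can differ widely. The number of columns for a given variable may change from one time step to the next, but the number of variables per block is the same at every time step, and the number of exported variables is always known. The data file contains no information about the number of blocks (time steps).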
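To make the line format concrete, here is a minimal sketch of splitting one line into a variable name and a list of numbers. The helper name `parse_line` is hypothetical, and the sketch assumes every value token is numeric (an elided token such as `.........` would raise a `ValueError`). Because each line becomes its own Python list, the varying number of columns is not a problem at this stage.

```python
def parse_line(line):
    """Split one data line into (variable name, list of floats).

    Assumes the layout shown above: an optional 'Qnn : ' prefix, the
    variable name, then ':,' followed by comma-separated numbers.
    """
    head, _, tail = line.strip().partition(":,")
    # 'Q01 : A' -> 'A'; a plain 'TIME' header stays 'TIME'
    name = head.split(":")[-1].strip()
    # float() ignores surrounding whitespace, so ' 66558' parses fine
    values = [float(tok) for tok in tail.split(",") if tok.strip()]
    return name, values


print(parse_line("Q12 : L:,-0.0036781,0.00161254"))
print(parse_line("TIME:,0.1"))
```

Lines with one value and lines with dozens of values go through the same function; only the length of the returned list changes.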

Once read, the data should be loaded in a layout with one row per time step and one column per variable:
Time:  |  A:                                           |  B:
0      |  -10.7436,0.000536907,-0.00963283,0.00102934  |  ........
0.1    |  -10.563,0.000636907,-0.00963283,0.00102934   |  ........
0.2    |  ......                                       |  ........

If the number of data columns were the same for every variable and every time step, this would be a very simple problem.

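I guess I need to read the file line by line in two loops, one over the blocks and one over the lines within each block, and then store the input in an array (by appending?). Since I'm not yet very familiar with Python and NumPy, the changing number of columns per line has me a bit stuck for the moment.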
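The two-loop idea can also be done in a single pass, because every `TIME` line marks the start of a new block. The following is only a sketch under that assumption: `read_blocks` is a hypothetical name, the sample text is a shortened stand-in for the real file, and all value tokens are assumed to be numeric.

```python
import io


def read_blocks(lines):
    """Yield (time, {name: [floats]}) for each TIME block, in file order."""
    time, block = None, {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank separator between blocks
        head, _, tail = line.partition(":,")
        name = head.split(":")[-1].strip()
        values = [float(tok) for tok in tail.split(",") if tok.strip()]
        if name == "TIME":
            if time is not None:
                yield time, block  # previous block is complete
            time, block = values[0], {}
        else:
            block[name] = values
    if time is not None:
        yield time, block  # last block has no following TIME line


# Shortened, made-up sample in the same layout as the file above.
sample = io.StringIO(
    "TIME:,0\n"
    "Q01 : A:,-10.7436,0.000536907\n"
    "Q02 : B:,0\n"
    "\n"
    "TIME:,0.1\n"
    "Q01 : A:,-10.563\n"
    "Q02 : B:,0,0.01665694\n"
)
blocks = list(read_blocks(sample))
for time, block in blocks:
    print(time, block)
```

Since the number of blocks is not known in advance, a generator avoids having to count them first; passing an open file object instead of `sample` would read the real file lazily.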

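It would be great if someone could point me in the right direction, for example which functions I should use to do this reasonably efficiently.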

Best answer

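import pandas as pd

res = {}     # {time: {variable name: list of value strings}}
TIME = None  # key of the block currently being read

# Iterating over a file object reads it lazily, one line at a time.
with open('file.txt') as f:
    for line in f:
        # 'Q01 : A:,1,2' -> ['Q01', 'A', ',1,2'];  'TIME:,0' -> ['TIME', ',0']
        parts = [p.strip() for p in line.strip().split(':')]
        if parts[0] == 'TIME':
            TIME = parts[1].strip(',')
            res[TIME] = {}
            print('New time section start {}'.format(TIME))
            # here you can stop and work with the data from the previous period
            continue

        if len(parts) < 3 or TIME is None:
            continue  # blank or malformed line, or data before the first TIME
        res[TIME][parts[1]] = parts[2].strip(',').split(',')

# One column per time step, one row per variable.
df = pd.DataFrame.from_dict(res, orient='columns')
# for example, for TIME 0 (the keys are strings)
dfZero = df['0']
print(dfZero)


# One row per time step, one column per variable.
df = pd.DataFrame.from_dict(res, orient='index')

dfA = df['A']
print(dfA)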
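Note that the values stored in `res` are still strings. As a hedged follow-up, the sketch below converts them to floats with plain Python. The helper `to_floats` and the small `res` sample are hypothetical, and turning unparseable tokens into NaN is an assumption made here, not part of the answer above.

```python
import math


def to_floats(tokens):
    """Convert numeric strings to floats; unparseable tokens become NaN."""
    out = []
    for tok in tokens:
        try:
            out.append(float(tok))  # float() tolerates surrounding spaces
        except ValueError:
            out.append(math.nan)  # assumption: flag bad tokens, do not fail
    return out


# Hypothetical sample shaped like the `res` dictionary built above.
res = {
    "0": {"A": ["-10.7436", "0.000536907"], "N": ["3797", " 66558"]},
    "0.1": {"A": ["-10.563"], "N": ["3797", " 66558"]},
}

# Same nesting as `res`, but with float keys and float values.
numeric = {
    float(time): {name: to_floats(vals) for name, vals in block.items()}
    for time, block in res.items()
}
print(numeric[0.1]["A"])
```

Keeping the conversion in a separate step leaves the parsing loop unchanged and makes it easy to swap in a stricter policy, such as re-raising the error, if silent NaN values are not wanted.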


This question about importing a text file in Python where each line has a different number of columns comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/37354745/
